G. Sawitzki: Statistical Data Analysis 2011/2012
If you are preparing for an assigned task, please heed the seminar tips.
For the 2011/2012 course, you can provide your results in one of three forms:
- Web presentation
- Presentation (oral, plus keynotes handout)
- PDF documentation
For copyright reasons, access to some course material may be restricted. These links are shown in this style.
Projection Pursuit
Keywords
- Picturing
- Rotation
- Isolation
- Masking
PRIM 9 (John Tukey, 1973)
Literature
Peter J. Huber: Projection Pursuit. The Annals of Statistics, Vol. 13, No. 2. (Jun., 1985), pp. 435-475
G. P. Nason: Exploratory Projection Pursuit
Matthew Ward: Projection Pursuit: A Brief Overview
See also:
Data Sets
- PRIM
- See Section 5.2 in (Cook and Swayne, 2007). Provided as an example in ggobi.
- Data
Comments
- Color perception
- Documentation Data
- Chemical Diabetes
- Literature
- Mineral water
- See Section 7.3 in (Cook and Swayne, 2007), and R package classifly.
See also http://www.mineralwaters.org/ for comparisons and detailed analysis.
Brushing
Keywords
- Linking
- Scatterplot brushing
ORION (John A. McDonald, 1980+)
Data Sets
- Boston housing data
- original at UCI Machine Learning Repository.
- UNT version.
- Harrell version.
- Cars
- Henderson and Velleman (1981). Building Regression Models Interactively. Biometrics 37, 400.
Software: DataDesk (demo version)
Smoothing and Kernel Density Estimation
Data Sets
R:
data(faithful)
- Old Faithful Geyser Data
  A. Azzalini and A. W. Bowman: A look at some data on the Old Faithful geyser. Applied Statistics 39 (1990), pp. 357-365
Literature
- M. C. Jones, J. S. Marron and S. J. Sheather: A Brief Survey of Bandwidth Selection for Density Estimation. Journal of the American Statistical Association 91 (1996), pp. 401-407
- B. Turlach: Bandwidth Selection in Kernel Density Estimation: A Review
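As a minimal sketch of how the bandwidth choice changes a kernel density estimate of the Old Faithful eruption durations, the following uses only base R (`density`, `bw.nrd0`, `bw.nrd` and `bw.SJ` from the stats package). The three selectors, the Gaussian kernel and the plot settings are illustrative assumptions, not a course recommendation.

```r
# Kernel density estimates of Old Faithful eruption durations (minutes)
# under three bandwidth selectors from the stats package.
data(faithful)
x <- faithful$eruptions

# Bandwidth selectors: Silverman's rule of thumb, Scott's variation,
# and the Sheather-Jones plug-in method.
bandwidths <- c(
  nrd0 = bw.nrd0(x),
  nrd  = bw.nrd(x),
  SJ   = bw.SJ(x)
)
print(round(bandwidths, 3))

# One Gaussian kernel density estimate per bandwidth.
estimates <- lapply(bandwidths, function(h) {
  density(x, bw = h, kernel = "gaussian")
})

# Overlay the estimates on a histogram scaled to density.
hist(x, breaks = 30, freq = FALSE,
     main = "Old Faithful: eruption duration",
     xlab = "Duration (minutes)")
for (i in seq_along(estimates)) {
  lines(estimates[[i]], lty = i)
}
legend("top", legend = names(estimates), lty = seq_along(estimates))
rug(x)
```

On clearly bimodal data such as these, the normal-reference rules tend to smooth more heavily than a plug-in selector, so it is worth comparing the curves rather than relying on the default.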
Principal Component Analysis
Data
R:
library("UsingR")
data(fat)
Literature
- mkb92ma: Chapter 8.3
  K. V. Mardia, J. T. Kent and J. M. Bibby: Multivariate Analysis (1979)
- Branden2005Robust-classifi
  K. Branden and M. Hubert: Robust classification in high dimensions based on the SIMCA method. Chemometrics and Intelligent Laboratory Systems 79 (2005), pp. 10-21
In this paper we first investigate the robustness of the SIMCA method for classifying high-dimensional observations. It turns out that both stages of the algorithm, the estimation of principal components and the construction of a classification rule, can be highly disturbed by the presence of outliers. Therefore we propose a robust procedure RSIMCA which is based on a robust Principal Component Analysis method for high-dimensional data (ROBPCA). Various simulations and real examples reveal the robustness of our approach. (c) 2005 Elsevier B.V. All rights reserved.
One-Dimensional Diagnostics
Literature
- gs94oned
  G. Sawitzki: Diagnostic Plots for One-Dimensional Data. In: P. Dirschedl, R. Ostermann (eds.): Computational Statistics. Papers Collected on the Occasion of the 25th Conference on Statistical Computing at Schloss Reisensburg. Physica-Verlag, Heidelberg 1994, pp. 237-258
  Software and more information: http://www.statlab.uni-heidelberg.de/projects/onedim/.
In preparation
Dimension Reduction
Literature
- Li1991Sliced-Inverse-
  K.-C. Li: Sliced Inverse Regression for Dimension Reduction. Journal of the American Statistical Association 86 (1991), pp. 316-327
Resampling
Classification and Regression Trees, DART
Literature
- Breiman1984CART
  L. Breiman, J. Friedman, R. Olshen and C. Stone: Classification and Regression Trees (1984)
- 593439
  J. H. Friedman (Aug. 1996a): "Local Learning Based on Recursive Covering" (software)
See also...
Courses to look at
Andreas Buja (University of Pennsylvania):
Lectures on statistics and data analysis, Columbia University 2009
Heike Hofmann et al. (Iowa State University): Visualizing Quantitative Information
Ross Ihaka et al. (Auckland): Computational Data Analysis and Graphics
Hadley Wickham (Rice University): Data Visualisation
Data
D. F. Andrews and A. M. Herzberg: Data. XX, 442 pp.
(Springer 1985), Data sets online
D. Cook and D. F. Swayne: Interactive and Dynamic Graphics for Data Analysis
(Springer 2007), Data Descriptions (Feb 2007, PDF, 1.5Mb), Data: See
Data section of the book home page.
See also Data.
Literature
- (Belsley et al. 1980)
  Belsley, Kuh & Welsch, Regression Diagnostics, Wiley, 1980.
- (Cook and Swayne, 2007)
  D. Cook and D. F. Swayne: Interactive and Dynamic Graphics for Data Analysis
  (Springer 2007), Data Descriptions (Feb 2007, PDF, 1.5Mb)
Software
- DataDesk
  http://www.datadesk.com/
- ggobi
  www.ggobi.org
- R
  www.cran.r-project.org
$Source: /u/math/sa3/cvswww/www/www.statlab.uni-heidelberg.de/studinfo/da2011/index.html,v $
$Revision: 1.15 $
$Date: 2011/11/29 18:47:00 $