VOStat - HEAD 2004, Ashish Mahabal
http://www.vostat.org NSF DMS-0101360
VOStat Arming Astronomers with
Advanced Statistics
Caltech: A. Mahabal, M. Graham, S. G. Djorgovski, R. Williams
Penn State: J. Babu (PI), E. Feigelson
CMU: R. Nichol, D. Vanden Berk, L. Wasserman
Use of statistics
• 15000 astronomical studies per year
• 5% have “statistics” in their abstract
• 20% treat variable objects or multivariate datasets
Traditional methods
• Fourier transform (Fourier 1807)
• Least squares and chi-square (Legendre 1805, Pearson 1901)
• Kolmogorov-Smirnov test (Kolmogorov 1933)
• Principal Component Analysis (Hotelling 1936)
VOStat
• Web based service
• Simple and sophisticated statistical routines
• Large datasets
• Public domain (R)/ specially written
• General purpose and Virtual Observatory
VOStat
• ASCII / VOTable as input (can be used as an intermediate block in a VO-based pipeline)
• CGI routines as prototypes (a few thousand lines)
• Web services (Java GUI): hundreds of thousands of lines (limited by R's capabilities); distributed, multi-OS, multi-language
Examples of available functions
• Descriptive statistics (e.g. boxplot)
• Two- and k-sample tests (e.g. Wilcoxon rank-sum test)
• Density estimation (e.g. kernel smoothing)
• Correlation and regression (e.g. PCA)
• Censored data (e.g. survival analysis)
• Multivariate classification (e.g. hierarchical clustering)
• External functions (e.g. K-density)
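To make one of the listed functions concrete, here is a minimal Python sketch of a two-sample Wilcoxon rank-sum test. It is not VOStat's code (the service calls R routines); the function name `rank_sum_test`, the normal approximation without a tie correction, and the magnitudes are all assumptions made for illustration.

```python
import math

def rank_sum_test(x, y):
    """Wilcoxon rank-sum test via the normal approximation.

    Assumption: the variance is not corrected for ties, so the p-value
    is only approximate for small or heavily tied samples.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of equal values starting at i and give it the midrank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2           # Mann-Whitney U for sample x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided p-value
    return u, z, p

# Made-up magnitudes for two hypothetical samples.
sample_a = [14.1, 14.3, 14.6, 14.8, 15.0]
sample_b = [15.2, 15.5, 15.7, 16.0, 16.3]
u, z, p = rank_sum_test(sample_a, sample_b)
```

A small p suggests the two samples are shifted relative to each other; for samples this small an exact test would be preferable to the normal approximation.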
User-friendly GUI
• Columns are autoselected (and can be deselected)
• Parameter choices for functions are conveniently placed
• Can be used from your own webpages on tables residing elsewhere
Toy Demos
• Rediscovering HR diagram
• Rediscovering the Fundamental Plane (FP) of globular clusters
• Looking for outliers in color-color space
Rediscovering HR diagram
• Hyades stars (Hipparcos main catalog)
• Mean/median/boxplot
• Density estimation (Histogram)
• Kernel smoothing
• Correlation matrix
• X-Y plot
• Multivariate clustering
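The kernel smoothing step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a Gaussian kernel, a hand-picked bandwidth, and made-up B-V values rather than the Hipparcos Hyades sample; `gaussian_kde` is a hypothetical helper, not a VOStat routine.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per data point, centered on that point.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Made-up B-V colors: a bluer group and a smaller redder group.
bv = [0.42, 0.45, 0.51, 0.55, 0.58, 0.62, 0.95, 0.98, 1.02]
density = gaussian_kde(bv, bandwidth=0.05)

# Evaluate on a grid; the estimate should integrate to roughly 1.
step = 0.01
grid = [i * step for i in range(-100, 251)]
area = sum(density(x) for x in grid) * step
```

Unlike a histogram, the estimate does not depend on bin edges, but it does depend on the bandwidth: a larger value smooths the two groups together, a smaller one breaks them into individual spikes.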
• An X-Y plot of Vmag against B-V reveals the famous structure in the dataset: the color-magnitude diagram of bright stars, showing the main sequence, the giant branch (with red clump stars), and a few Hyades white dwarfs.
FP of Globular clusters
• Matrix of pairwise correlation coefficients
• Pairwise plots
• Principal Component Analysis
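The first and last steps above (pairwise correlation coefficients, then PCA) can be sketched in Python as follows. This is a sketch under assumptions, not the routine the service runs: PCA is performed on the correlation matrix, only the leading component is extracted (by power iteration), and the three columns are invented numbers rather than globular cluster parameters.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation coefficient for every pair of columns."""
    def standardize(col):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        return [(v - mean) / sd for v in col]

    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]

def leading_component(matrix, iterations=200):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        new = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    # Rayleigh quotient of the converged unit vector.
    eigenvalue = sum(
        v * sum(m * w for m, w in zip(row, vec)) for v, row in zip(vec, matrix)
    )
    return vec, eigenvalue

# Invented data: columns a and b track each other, column c does not.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
c = [2.0, -1.0, 0.5, 1.5, -0.5, 0.0]

corr = correlation_matrix([a, b, c])
axis, eigenvalue = leading_component(corr)
explained = eigenvalue / len(corr)   # fraction of total variance on this axis
```

When a group of parameters is highly correlated, most of the variance falls on the first component and the group's loadings in `axis` share a sign and similar size.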
• Core parameters as a group tend to be highly correlated, unlike the half-light parameters. This is indicative of the dynamical evolution driven by the core collapse.
Exploring outliers
• Palomar-QUEST synoptic sky survey
• 9 mix-and-match colors from 8 filters
• Aim: finding outliers in color-color space for spectroscopic follow-up
• 1000 random objects
Boxplot
• Reveals relationships between colors (mean, median, overlap, outliers)
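The numbers behind a boxplot can be sketched as below. The 1.5 × IQR fence is the common Tukey convention and is assumed here; the slides do not say which rule the service applies. `boxplot_summary` is a hypothetical helper and the color values are made up.

```python
import statistics

def boxplot_summary(values, whisker=1.5):
    """Quartiles plus the points lying outside the Tukey fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Made-up values of a single color for nine objects; one is very red.
color = [0.2, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 2.4]
summary = boxplot_summary(color)
```

Running the same summary on each color and comparing medians and interquartile ranges gives the side-by-side view that a multi-column boxplot shows graphically.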
Clustering
• K-means provides various cluster centers along with withinss (the within-cluster sums of squares) and a list of possible outliers
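A minimal Python sketch of k-means (Lloyd's algorithm) shows where the centers and the withinss values come from. How VOStat builds its list of possible outliers is not stated in the slides; ranking points by distance from their assigned center, as done here, is an assumption, and the starting centers and color-color points are invented.

```python
import math

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def kmeans(points, centers, iterations=50):
    """Lloyd's algorithm starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        labels = assign(points, centers)
        updated = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centers[k])   # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    labels = assign(points, centers)
    # Within-cluster sum of squared distances, one value per cluster.
    withinss = [sum(math.dist(p, centers[k]) ** 2
                    for p, lab in zip(points, labels) if lab == k)
                for k in range(len(centers))]
    return centers, labels, withinss

def farthest_points(points, centers, labels, n=1):
    """Assumed outlier rule: the n points farthest from their own center."""
    order = sorted(range(len(points)),
                   key=lambda i: math.dist(points[i], centers[labels[i]]),
                   reverse=True)
    return order[:n]

# Invented color-color points: two tight groups and one stray object.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.12, 0.18),
          (1.0, 1.1), (1.1, 1.0), (1.05, 1.05), (1.02, 1.08),
          (1.9, 1.2)]
centers, labels, withinss = kmeans(points, [(0.0, 0.0), (1.0, 1.0)])
candidates = farthest_points(points, centers, labels, n=1)
```

A cluster with a large withinss relative to its size is a hint that it contains a stray object or should be split, which is why the two outputs are useful together.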
K-density
• Probability - density association for outliers
Visual confirmation (found from 1000 random objects)
Summary
• Web-based
• VO compatible
• Public domain and specialized routines