Outline
• Class Intros
  – What are your goals?
  – What types of problems? Datasets?
• Overview of Course
• Example Research Project
Breadth vs. Depth vs. Relevancy
Class
Project
Question → Hypothesis → Data → Analytics → Charts → Answer
[Figure: Weight (pounds) vs. Height (inches)]
Are height and weight related?
Can we put a person on Mars by 2025?
What determines housing prices?
Location
Square Feet
Crime
[Figure: Number of Variables Analyzed (1 to 6+) by Week (1 to 7)]
Predictive Analytics, Data Analysis, Software, Statistics, Data Mining
Data Visualization - Mathematics
Mean
Standard Deviation
Correlation
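As a minimal sketch of the first two measures, the R snippet below computes a mean and a standard deviation. It assumes the `women` dataset that ships with base R (heights in inches, weights in pounds, 15 rows); this is a stand-in for illustration, not the course data.

```r
# `women` ships with base R: heights (inches) and weights (pounds) of 15 women.
heights <- women$height

# Mean: the arithmetic average of the values.
height_mean <- mean(heights)

# Standard deviation: typical spread around the mean.
# sd() uses the sample formula (n - 1 in the denominator).
height_sd <- sd(heights)

cat("Mean height:", height_mean, "inches\n")
cat("Standard deviation:", round(height_sd, 2), "inches\n")
```

A small standard deviation relative to the mean indicates that the values cluster tightly around the average.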
Temperature Variation Across Cities in 2011
[Figure: one temperature panel per city (axis ticks 30, 60, 90): San Francisco, Boston, Austin, San Diego, Tampa Bay]
Normal Distribution
Distribution of Height
Outliers
Identify
Remove?
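One common way to identify outliers is the 1.5 × IQR rule (the rule behind box-plot whiskers). The sketch below applies it in R; the height values are hypothetical and chosen so that one entry is obviously suspect. The slide does not say which rule the course uses, so treat this as one option among several.

```r
# Hypothetical heights in inches; 95 is a deliberately implausible entry.
heights <- c(61, 63, 64, 65, 66, 66, 67, 68, 70, 95)

# Identify: flag values more than 1.5 * IQR beyond the first or third quartile.
q <- quantile(heights, c(0.25, 0.75))
fence <- 1.5 * IQR(heights)
is_outlier <- heights < q[1] - fence | heights > q[2] + fence

outliers <- heights[is_outlier]
print(outliers)

# Remove? Compare the mean with and without the flagged values before deciding.
cat("Mean with outliers:   ", mean(heights), "\n")
cat("Mean without outliers:", mean(heights[!is_outlier]), "\n")
```

Whether to remove a flagged value is a judgment call: a data-entry error can be dropped, but a genuine extreme observation may be exactly what the analysis needs to explain.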
Correlation
• To what degree are two variables related?
[Figure: Weight (pounds) vs. Height (inches)]
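A short R sketch of the height and weight question, again assuming the `women` dataset bundled with base R rather than the data plotted on the slide. `cor()` returns the Pearson correlation coefficient, which ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to 1 (perfect positive relationship).

```r
# `women` ships with base R: heights (inches) and weights (pounds) of 15 women.
r <- cor(women$height, women$weight)  # Pearson correlation by default

cat("Correlation between height and weight:", round(r, 3), "\n")
```

A value close to 1 here would support the hypothesis that taller people in this sample tend to weigh more; it does not by itself show that one variable causes the other.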
Excel Pivot Table
Excel Analysis ToolPak
Write Code / Program
  – Input Data
  – Analyze
  – Graphics
Datasets, etc.
Enter Commands
View Results
R / RStudio
Currently, how many R Packages?
At the command line, enter dim(available.packages()) to count the packages (the first number returned is the count), or available.packages() to list them.
Correlation Matrix
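A correlation matrix holds the pairwise correlation between every pair of variables in a dataset. The sketch below builds one in R using `mtcars`, a dataset bundled with base R; the choice of dataset and columns is an assumption for illustration only.

```r
# `mtcars` ships with base R; pick a few numeric columns for readability.
vars <- mtcars[, c("mpg", "wt", "hp", "disp")]

# cor() on a data frame returns every pairwise Pearson correlation.
cor_matrix <- cor(vars)

print(round(cor_matrix, 2))
```

The diagonal is always 1 (each variable is perfectly correlated with itself) and the matrix is symmetric, so only the entries above or below the diagonal need to be read.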
Multivariate Regression
[Diagram: Y (Height) and the X's (X1, X2, X3, X4)]
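A regression with one Y and several X's can be fit in R with `lm()`. The sketch below is a stand-in under stated assumptions: it uses the `mtcars` dataset bundled with base R, with `mpg` as Y and four arbitrarily chosen columns playing the roles of X1 to X4. It does not reproduce the height model on the slide.

```r
# `mtcars` ships with base R. Here mpg is Y and four columns act as X1..X4.
fit <- lm(mpg ~ wt + hp + disp + drat, data = mtcars)

# One intercept plus one coefficient per X.
print(coef(fit))

# R-squared: the share of the variation in Y explained by the X's together.
r_squared <- summary(fit)$r.squared
cat("R-squared:", round(r_squared, 3), "\n")
```

Each coefficient estimates how much Y changes for a one-unit change in that X while the other X's are held fixed, which is what distinguishes this from running four separate one-variable correlations.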