Midterm Exam Review
General Information
• Date: 3/13/2014
• Time: 11:00-12:20
• Location: 101 Davis
• Closed book, closed notes
Topics
• Doing Data Science text: Ch. 2
– Statistical inference, exploratory data analysis, and the data science process
– Population and samples, sample sizes
– Data model
• Statistical model
• Algorithms
– Fitting a model
– Probability distributions
– EDA: plots, graphs and summaries
• One question
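As a reminder of what "fitting a model" means in practice, here is a minimal sketch of fitting a straight line by ordinary least squares. The function name `fit_line` and the toy data are illustrative assumptions, not taken from the text.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (assumed) generated exactly from y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The fitted parameters are the "model"; with noisy data the same formulas give the line that minimizes the sum of squared residuals.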
Topics (contd.)
• Doing Data Science: Ch. 3
• Comparison of algorithms and statistical models
• Three basic algorithms
– Linear regression
– K-NN (supervised classification)
– K-means (unsupervised clustering)
• Intuitive idea
• Algorithmic steps for each of these algorithms
• Representative examples
• Why and when would you use each of these algorithms?
• 2 questions
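The algorithmic steps for K-means can be sketched as follows. This is a minimal illustration, assuming the caller supplies the initial centroids (real implementations usually pick them at random); the function name `kmeans` and the sample points are hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(c)  # leave an empty cluster's centroid alone
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Two visually obvious groups (assumed toy data).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
```

Note that no labels are used anywhere, which is what makes K-means unsupervised; the only input besides the data is the number of clusters, implied here by the number of initial centroids.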
Topics: Lin & Dyer’s text
• Hadoop: HDFS as in Chapter 2
• MapReduce: MR data-flow including combiners and partitioners
• 2 questions
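To review where combiners and partitioners sit in the MR data-flow, here is a small in-memory simulation of a word-count job. It is a sketch only and does not use the Hadoop API; all function names are hypothetical, and the toy partitioner (sum of character codes modulo the number of reducers) stands in for a real hash partitioner.

```python
from collections import defaultdict


def map_phase(line):
    # Mapper: emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.split()]


def combine(pairs):
    # Combiner: pre-aggregate one mapper's output locally,
    # so fewer pairs cross the network during the shuffle.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    # Partitioner: decide which reducer receives a given key.
    # Deterministic toy hash, assumed here for reproducibility.
    return sum(ord(ch) for ch in key) % num_reducers


def reduce_phase(key, values):
    # Reducer: sum all partial counts for one key.
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # One {key: [values]} bucket per reducer models the shuffle.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split is handled by one mapper
        mapped = [pair for line in split for pair in map_phase(line)]
        for key, value in combine(mapped):
            buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer sees its keys in sorted order.
    return [dict(reduce_phase(k, vs) for k, vs in sorted(b.items()))
            for b in buckets]


splits = [["a b a"], ["b c"]]  # two input splits, one line each
result = run_job(splits)
print(result)
```

In the first split the combiner collapses the two `("a", 1)` pairs into a single `("a", 2)` before the shuffle. The partitioner guarantees that every value for a given key ends up at the same reducer, which is what lets each reducer produce a final count on its own.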
Bloomberg Tech Talk on ML
• Building Intelligent solution
• See the presentation
• Up to slide #16 (No NLP or MT)
• 1 question
Format
• 5 questions, not equally weighted
• HDFS: direct
• Ch. 2 DDS: direct
• MR and K-NN: a little tricky
• K-means: direct
• Questions will test your understanding of the concepts
• Example: what is the effect of a large K vs. a smaller K in K-NN?
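One way to build intuition for the example question is to run K-NN on a tiny data set with one mislabeled point. The sketch below assumes Euclidean distance and simple majority voting; the function name `knn_predict` and the data are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Class A sits near (1.5, 1.5); class B sits near (6.5, 6.5).
# One noisy "B" point is placed inside A's region (assumed toy data).
train = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"), ((2, 2), "A"),
    ((6, 6), "B"), ((6, 7), "B"), ((7, 6), "B"), ((7, 7), "B"), ((6.5, 6.5), "B"),
    ((1.5, 1.5), "B"),  # noise
]
query = (1.6, 1.6)  # clearly inside A's region

small_k = knn_predict(train, query, k=1)    # follows the single noisy neighbor
medium_k = knn_predict(train, query, k=5)   # noise is outvoted by local A points
large_k = knn_predict(train, query, k=10)   # every point votes: global majority
print(small_k, medium_k, large_k)
```

With a very small K the prediction is sensitive to noise (high variance); with a very large K the vote is dominated by the overall majority class and local structure is ignored (high bias). A moderate K balances the two.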
Seating for the exam
• Question, space for answer format
• Designated seating: Will let you know the plan