CS 438/697 - Syllabus Applied Machine Learning · 2015. 2. 23.


Instructor Information
Wei Ding, PhD
[email protected]
Phone (W): 617-287-6428
Class Schedule: TTH 4:00 - 5:15 PM, Wheatley W02-0124
Office Location: S-3-179 Science Building
Office Hours: Tuesday 2:30 PM – 4:00 PM; Thursday 2:30 PM – 4:00 PM

Note: The following link will assist you in forwarding your UMB email account to your personal account: http://www.umb.edu/it/getting_services/email/office365/o365_forward. Throughout the semester, I will communicate with you via your UMB email account. You may have e-mail redirected from your official UMass Boston address to another e-mail address at your own risk. The University will not be responsible for the handling of e-mail by outside vendors or by departmental servers.

Course Information
Course Title: CS 438/697 Applied Machine Learning
Credits: 3 credits
Online? no
Course Description: This class will teach the practical side of machine learning for applications, such as pattern recognition from images or building predictive classifiers. The emphasis will be on learning the process of applying machine learning effectively to a variety of problems rather than pure machine learning theory. This course does not assume any prior exposure to machine learning theory or practice. The class is a computer science elective course. We will cover a wide range of learning algorithms that can be applied to a variety of problems. In particular, we will cover topics such as decision trees, rule based classification, support vector machines, Bayesian networks, and clustering. In addition to the course textbook, we will have additional readings from research articles.

Context: This class is an elective of the graduate program in computer science to prepare students to develop practical data science skills.

Prerequisites: CS 310 Advanced Data Structures and Algorithms or permission of instructor
Prerequisite Skills: Before taking the class, students should understand how to structure and manipulate data in computing.
Course Objectives: By fully participating in this course, students should be able to:

1. understand the state-of-the-art classification algorithms.
2. understand the state-of-the-art clustering algorithms.
3. understand the state-of-the-art association rule mining algorithms.
4. know how to apply appropriate machine learning algorithms to a variety of problems.

Core Competencies: The objectives for this course focus on the following core competencies:

1. machine learning algorithms in software design
2. software problem formulation and implementation

Required Assignments:

Homework Assignment 1 (100 points)
Assigned Date: Tuesday, February 3, 2015
Due Date: 4:00 PM Tuesday, February 10, 2015

Educational Goal
Become familiar with the WEKA Workbench.

Requirements
Use the following learning schemes to analyze the iris data (in iris.arff):

Decision stump - weka.classifiers.DecisionStump
OneR - weka.classifiers.OneR
Decision table - weka.classifiers.DecisionTable -R
C4.5 - weka.classifiers.j48.J48
PART - weka.classifiers.j48.PART
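As background for the simplest scheme in the list above, the idea behind a one-rule learner can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WEKA's implementation: the function name one_r and the toy data are hypothetical, only nominal attributes are handled, and WEKA's OneR additionally deals with numeric attributes and missing values.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (attribute index, rule, training errors) for the best single-attribute rule."""
    best = None
    for attr in range(len(rows[0])):
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # Each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        # Keep the attribute whose rule makes the fewest training errors.
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical toy data (not the iris data): (petal size, colour) -> species.
rows = [("small", "blue"), ("small", "white"), ("large", "blue"), ("large", "white")]
labels = ["setosa", "setosa", "virginica", "virginica"]
attr, rule, errors = one_r(rows, labels)
```

Here the first attribute separates the two classes perfectly, so it is the one selected.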

Submission Requirements

1. Write a brief report that records how your investigations proceeded and what results you found for the following questions. Do not describe how to use the workbench or how the schemes in it work.
• How do the classifiers determine whether a flower is an Iris-setosa, Iris-versicolor, or Iris-virginica?
• What can you say about the accuracy of these classifiers when classifying a flower that has not been used for training?
• Why does OneR perform so badly?
• Which classifier performs the best?
• Do the decisions made by the best classifier make sense to you?
2. Attach the homework cover page to your report.
3. Submit the softcopy of your report via UMassOnline. Zero points for late submission.
4. Turn in the paper copy of your report in class. The paper copy should be bound firmly together as one pack (for example, but not limited to, stapled at the left corner). 5 points will be deducted for unbound homework.
5. A missing hard copy or soft copy results in 0 points.

Homework Assignment 2 (100 points)
Assigned Date: Thursday, February 19, 2015
Due Date: 4:00 PM Thursday, February 26, 2015

Educational Goal
Become familiar with WEKA Java APIs.

Requirements

• WEKA data (\Weka-3-6\data): Iris.arff, breast-cancer.arff, soybean.arff, ionosphere.arff, glass.arff

• Problems
1. Implement a simpleClassify function using a classifier provided by WEKA APIs.
   Resources:
   Netbeans IDE Tutorial for using the Weka API: http://www.cs.umb.edu/~ding/classes/480_697/homework/WekaJavaAPITutorial.pdf
   Sample code: http://www.cs.umb.edu/~ding/classes/480_697/homework/WekaTest.java
2. Use this simpleClassify function to classify the 5 WEKA data sets.
3. Implement a simpleClassify2 function, which can beat any of the following 5 baseline algorithms: Decision stump, OneR, Decision table, C4.5, PART. (Hint: ensemble learning: a weighted combination of different classifiers may achieve a better performance)
4. Write a report to explain the design idea of the simpleClassify2 function including a flowchart and the proposed algorithm in pseudocode. The report should include detailed experimental analysis.
5. Prepare a readme file to explain how to run the classifiers.

Submission Requirements

1. Submit the softcopy of your report, readme file, and Java source code via UMassOnline. Zero points for late submission.
2. Turn in the paper copy, including the cover page of your report, readme file, and Java source code, in class. The paper copy should be bound firmly together as one pack (for example, but not limited to, stapled at the left corner). 5 points will be deducted for unbound homework.
3. A missing hard copy or soft copy results in 0 points.

Homework Assignment 3 (200 points)
Assigned Date: Thursday, March 5, 2015
Due Date: 4:00 PM Thursday, March 26, 2015

Educational Goal
Apply WEKA to understand and analyze the flooding prediction problem with the aid of flooding data visualization; perform interdisciplinary teamwork between CS students and EEOS students.

Requirements

• Flooding data
http://www.cs.umb.edu/~yangmu/dataset/data_precipitation.zip (~50MB)
The zip file includes the raw data of:

1. The map files are:
   sampleLocations.shp
   worldmap.shp
   states.shp

2. The atmospheric variables (files) are:
   1000hPa geopotential height (Z1000.csv)
   500hPa geopotential height (Z500.csv)
   300hPa geopotential height (Z300.csv)
   850hPa temperature (T850.csv)
   850hPa zonal and meridional wind (U850.csv and V850.csv)
   300hPa zonal and meridional wind (U300.csv and V300.csv)
   Precipitable water (PW.csv)

   Geopotential height is in meters, temperature in Kelvin, wind in meters per second, and precipitable water in mm. The atmospheric data is from January 1, 2010 to December 31, 2010. Each atmospheric variable is stored as a 2-D matrix with each row being a daily average value and each column being a point in space. The columns are associated with the "ID" field in the map file of "samplelocations.shp". Each file represents 5,328 points in space between the equator and the North Pole (37 latitudes and 144 longitudes). For example, the data in column 4, row 6 of the Z1000.csv file represents the 1000hPa geopotential height on January 6, 2010 at the location marked by the point with "ID"=4 in the samplelocations.shp file.

3. Extreme Precipitation Cluster: There are frequent heavy precipitation events from early July to mid-August in the State of Iowa. Severe thunderstorm activity during August 8–11, 2010 in central and southeast Iowa resulted in major flooding from August 11–16, 2010. Two related "Blocking" events have been observed during July and August. In particular, the extreme precipitation cluster we are looking for begins on July 4, 2010.

   Description of Extreme Precipitation Cluster data:
   Column 1: Spatial average precipitation (22 stations covering Iowa) for each day (inch)
   Column 2: Spatial standard deviation of daily precipitation at the 22 stations for each day
   Column 3: Day of the month
   Column 4: Julian day (1-365)
   Column 5: Month
   Column 6: Year
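The row and column convention described above can be sketched as follows. This is a minimal illustration, assuming the CSV files are comma-separated and carry no header row (the syllabus does not say); the helper names row_to_date and read_cell and the three-column excerpt are hypothetical.

```python
import csv
import io
from datetime import date, timedelta

N_LATITUDES, N_LONGITUDES = 37, 144
N_POINTS = N_LATITUDES * N_LONGITUDES  # 5,328 grid points, one per column
FIRST_DAY = date(2010, 1, 1)           # row 1 holds the January 1, 2010 average

def row_to_date(row):
    """Map a 1-based row number to its calendar day in 2010."""
    return FIRST_DAY + timedelta(days=row - 1)

def read_cell(csv_text, row, column):
    """Return the value at a 1-based (row, column); the column number matches the "ID" field."""
    # Assumption: no header row, so row 1 is the first line of the file.
    rows = list(csv.reader(io.StringIO(csv_text)))
    return float(rows[row - 1][column - 1])

# Hypothetical 2-day, 3-point excerpt standing in for a file such as Z1000.csv.
sample = "101.0,102.5,99.0\n100.0,103.5,98.0\n"
value = read_cell(sample, row=2, column=3)  # day 2, point with ID 3
```

With this convention, row 6 corresponds to January 6, 2010, as in the Z1000.csv example above.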

• Goal: Understand the data using visualization.

• Tasks
1. Visualize the data points in a global map. For example: [figure not reproduced]
2. Visualize each factor (geopotential height, temperature, etc.) on the map using interpolation analysis in a global view. For the starting day of an extreme precipitation cluster, visualize the result at least 10 days ahead of it. For example: suppose 5/30/2011 is the starting day of a precipitation cluster, visualize the data from 5/20/2011 to 5/29/2011. For each factor, extract the data for a day in that range, and visualize it on the map using interpolation analysis. A sample of interpolation result is: [figure not reproduced]
3. Analyze your results and state your idea on how to find the most useful pattern leading to Precipitation Clusters using visualization. Describe your idea and state why your idea can work (show reasons using visualized results) or does not work (show reasons using visualized results). Tips: A pattern is a constrained factor. For example (the example may not be true in reality), low temperature (constraint of value) happened 5 days ago (constraint of time) at New York state (constraint of location), resulting in a precipitation cluster in Iowa.
4. Plot the precipitation value (State of Iowa) versus time, noted as a curve $c$.
5. Plot each factor value versus time, noted as curves $f_1, f_2, \ldots, f_n$, given some places close to Iowa. Please pick those spatial points based on your observation.
6. Let's assume that curve $c$ can be estimated by $f_1, f_2, \ldots, f_n$ using a linear combination. For flooding forecasting, we need to estimate $c$ $t$ days ahead. Therefore, the problem becomes finding the best weight vector $a$ in the estimate $\hat{c}(x) = \sum_i a_i f_i(x - t)$, such that $\mathrm{error} = \sum_x (\hat{c}(x) - c(x))^2$ is minimized (hint: this is a linear regression problem). Also try different $t$ and plot a figure with error versus $t$, where $t \ge 5$. Then study the results based on which $t$ yields the minimum error.
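The lagged least-squares fit in Task 6 can be sketched in plain Python as below. This is one possible approach under stated assumptions, not a prescribed solution: it solves the normal equations directly, uses no intercept term (matching the formula in Task 6), and the names solve and fit_lagged as well as the synthetic series are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def fit_lagged(factors, c, t):
    """Least-squares weights a for c[x] ~ sum_i a[i] * factors[i][x - t]; returns (a, error)."""
    days = range(t, len(c))  # only days for which the lagged factor value exists
    n = len(factors)
    # Normal equations: (F^T F) a = F^T c.
    gram = [[sum(factors[i][x - t] * factors[j][x - t] for x in days) for j in range(n)]
            for i in range(n)]
    moment = [sum(factors[i][x - t] * c[x] for x in days) for i in range(n)]
    a = solve(gram, moment)
    error = sum((sum(a[i] * factors[i][x - t] for i in range(n)) - c[x]) ** 2 for x in days)
    return a, error

# Synthetic series (not real data): c follows the two factors with a 5-day lag.
f1 = [float(k) for k in range(1, 13)]
f2 = [1.0 if k % 2 == 0 else 0.0 for k in range(12)]
c = [0.0] * 5 + [2.0 * f1[x - 5] + 0.5 * f2[x - 5] for x in range(5, 12)]

a, _ = fit_lagged([f1, f2], c, t=5)
errors = {t: fit_lagged([f1, f2], c, t)[1] for t in range(5, 8)}
best_t = min(errors, key=errors.get)
```

Plotting the values in errors against t gives the error-versus-t figure the task asks for; on real data the factors would be the columns picked in Task 5.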


7. Combine your conclusions from Tasks 3 to 6 and make another in-depth investigation cycle. Analyze whether results from Tasks 3 and 6 are confirmed. If not, try the check in the other direction. For example, check whether patterns observed in Task 3 can be observed and confirmed in the result of Task 6 (for example, the pattern has a relatively larger weight coefficient); check whether factors corresponding to high weight coefficients obtained in Task 6 imply any visible pattern using ArcGIS map visualization.

Submission Requirements

1. Write a brief report that records how your investigations proceeded and addresses the tasks above.
2. Attach the homework cover page to your report.
3. Submit the softcopy of your report and code via UMassOnline. Zero points for late submission.
4. Turn in the paper copy of your report in class. The paper copy should be bound firmly together as one pack (for example, but not limited to, stapled at the left corner). 5 points will be deducted for unbound homework.
5. A missing hard copy or soft copy results in 0 points.

Homework Assignment 4 (200 points)
Assigned Date: Tuesday, April 21, 2015
Due Date of Progress Report and Final Report:
Progress Report 1: 4:00 PM Tuesday, April 28, 2015
Progress Report 2: 4:00 PM Thursday, May 7, 2015
Final Report: 4:00 PM Tuesday, May 12, 2015

Educational Goal
Apply machine learning techniques to long-lead forecasting of extreme flood events; perform interdisciplinary teamwork between CS students and EEOS students.

Project Requirements

1. Flood data: Contact your TA Yang Mu to obtain the 23,011 observations over 63 years from January 1st, 1948 to December 31st, 2010. The data set includes 9 meteorological variables described in Homework 3 as well as in Table 1 of the paper handed out in class (referred to as the Flooding Paper in the rest of the assignment description).

Need additional information? Please talk to your professor Wei Ding and we will contact our domain scientists to obtain the information for you.

2. Project Topic: Every team must improve the class label definition in Section 4.1 Class Label of the Flooding Paper. You may design a totally different or better way to define the class label of extreme precipitation clusters (EPC).

• Your own idea on flood prediction: You can propose your project idea and report your preliminary results in Progress Report 1 due on April 28; or

• Suggested topic 1: Improve the method of feature construction in Section 4.3 Feature Space Construction of the Flooding Paper and then perform flood prediction afterward. Please find a better or different way to construct spatial and temporal meteorological variables.

• Suggested topic 2: Improve training set construction in Section 5.3 Hierarchical Re-sampling and then perform flood prediction afterward. Please find a better or different way to deal with an imbalanced training set.

• Suggested topic 3: Three steps are as follows.
1). Extreme Precipitation Cluster (EPC) study
We hypothesize that floods are caused by Precipitation Clusters (PCs). Therefore, we aim to find EPCs among PCs, because an EPC is hypothetically associated with extreme flood events. Open research problem: provide a formal definition of EPC and PC.
2). EPC pattern study
After we formally define EPC and PC in Step 1, we can identify patterns for PCs (this is what you have done in Homework 3). Our goal is to find 2 sets of patterns: a positive set (patterns that always occur in both PCs and EPCs) and a negative set (patterns that occur in PCs but not in EPCs).
3). EPC prediction
Design a machine learning algorithm that uses the positive and negative sets identified in Step 2 for EPC prediction.

[Figure: three-step workflow. Extreme Precipitation Cluster (EPC) study: identify and define EPC. Patterns study for EPC: observe human readable patterns and convert them to machine readable format. EPC prediction: predict EPC with designed patterns using a machine learning model.]

Report Requirements

1. Your reports must clearly
• Explain the motivation of the proposed method.
• Discuss why the proposed method is better.
• Describe how to quantitatively analyze the proposed method.
• Discuss how to design the experiments to evaluate the proposed method.
• Discuss how to validate the proposed method using various experiments.
• Illustrate the experimental results using figures and/or tables.

Submission Requirements

1. Attach the homework cover page to your report.
2. Submit the softcopy of your report and code via UMassOnline. Zero points for late submission.
3. Turn in the paper copy of your report in class. The paper copy should be bound firmly together as one pack (for example, but not limited to, stapled at the left corner). 5 points will be deducted for unbound homework.
4. A missing hard copy or soft copy results in 0 points.

Course Rubric:

Assignment/Deliverable                              Number  Grade %
1. Homework Assignment 1                            1       5%
2. Homework Assignment 2                            2       5%
3. Homework Assignment 3                            3       15%
4. Homework Assignment 4                            4       15%
Final Project Presentation                                  5%
Participation (as defined below)                            2.5%
Attendance (as defined below)                               2.5%
Examinations including the midterm and final exams          50%

Group work is required for Homework Assignments 3 and 4.
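As a quick check of how the rubric weights combine, the sketch below computes a weighted course percentage. It is illustrative only: the name course_percentage is hypothetical, each component is assumed to be scored on a 0-100 scale, and the midterm and final are treated as one combined examination score because the rubric does not state how the 50% is split.

```python
# Weights taken from the rubric above, in percent of the course grade.
WEIGHTS = {
    "Homework Assignment 1": 5.0,
    "Homework Assignment 2": 5.0,
    "Homework Assignment 3": 15.0,
    "Homework Assignment 4": 15.0,
    "Final Project Presentation": 5.0,
    "Participation": 2.5,
    "Attendance": 2.5,
    "Examinations": 50.0,  # assumption: midterm and final combined into one score
}

def course_percentage(scores):
    """Combine per-component scores (each on a 0-100 scale) into a course percentage."""
    return sum(WEIGHTS[name] * scores[name] / 100.0 for name in WEIGHTS)

# Hypothetical student who scores 80 on every component.
uniform = {name: 80.0 for name in WEIGHTS}
total = course_percentage(uniform)
```

The weights sum to 100, so a student with the same score on every component receives that score as the course percentage.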

Course Policies:
Participation - Participation includes completing all required reading and writing assignments prior to class, thoughtfully participating in discussions, and taking responsibility for helping create a positive learning environment by arriving promptly, listening respectfully, and participating constructively.
Attendance - You are responsible for material covered in any class that you do not attend.
Late Work – No late work is allowed.
Group Work

Term Project grading is calculated in the following manner:

1. The evaluation is divided into two phases:
Project Phase: In this phase the team score in the project is recorded.
Personal Evaluation Phase: This phase deals with the personal contribution of each member of the team while doing the project. This will be done after the project is finished. The criteria for the personal evaluation are as follows, from highly effective level to ineffective level:

1. Understanding the project and taking the initiative.
2. Decision Making (from collaborative down to unilateral)
3. Cooperation (members help others out, members do only own work)
4. Ability to handle conflict or differences (explore and solve conflicts, avoid or ignore)
5. Balance of participation (balance workload, a few do most of the work)
6. Focus on Schedule
7. Communication
8. Support and Team spirit (appreciation, no appreciation)

Each of you will evaluate your teammates. For example, you will be evaluated by 2 other team members in a 3-person team. Individual personal evaluation scores are confidential, but you will be notified of your overall score on personal evaluation. During the personal evaluation phase, you should consider your team members' contributions in the following parts:

1. Project design
2. Code implementation
3. Project testing
4. Project documentation
5. Term project report writing
6. Term project presentation

2. Percentage weight for each phase:
Project Phase 50%.
Personal Evaluation Phase 50%.

3. Calculation process for the final grade:

The terms used are as follows:
Project phase percentage --- project%.
Personal evaluation phase percentage --- personal%.
Term project score --- overall project.
Personal evaluation --- overall personal.
Maximum points in term project --- project max.
Maximum points in personal evaluation --- personal max.
Final points --- Final grade.

Final grade = {(overall project/project max)*(project%)} + {(overall personal/personal max)*(personal%)}
The final grade will be out of 100 points.

4. An example implementation of the above process:
There is a team with three members X, Y, Z.
Let the project maximum score be 100 points (actual score can be different).
Let the personal evaluation have a maximum of 100 points (actual score can be different).
Assume the team scored 100 points, the full mark, in the project.

Project score for each member is as follows (they will get the same score in this phase):
Member X: 100
Member Y: 100
Member Z: 100

Personal evaluation score for each member is as follows:
Member X: 100
Member Y: 70
Member Z: 50

Score of Member X:
Overall project score: 100
Overall personal evaluation score: 100
Final grade(X) = {(100/100)*50} + {(100/100)*50} = 50 + 50 = 100 out of 100 points

Score of Member Y:
Overall project score: 100
Overall personal evaluation score: 70
Final grade(Y) = {(100/100)*50} + {(70/100)*50} = 50 + 35 = 85 out of 100 points

Score of Member Z:
Overall project score: 100
Overall personal evaluation score: 50
Final grade(Z) = {(100/100)*50} + {(50/100)*50} = 50 + 25 = 75 out of 100 points
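The calculation above can be written as a short function. This is a sketch that follows the stated formula; the function and parameter names are hypothetical, and the default maxima and percentages are the ones used in the worked example.

```python
def final_grade(overall_project, overall_personal,
                project_max=100, personal_max=100,
                project_pct=50, personal_pct=50):
    """Term project grade out of 100 points: project share plus personal evaluation share."""
    project_part = (overall_project / project_max) * project_pct
    personal_part = (overall_personal / personal_max) * personal_pct
    return project_part + personal_part

# The worked example: the team scores 100 on the project; personal scores differ.
personal_scores = {"X": 100, "Y": 70, "Z": 50}
grades = {member: final_grade(100, score) for member, score in personal_scores.items()}
```

The entries of grades reproduce the 100, 85, and 75 points computed above for members X, Y, and Z.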

Grading
Grading: Grade type for the course is a whole or partial letter grade. (Please see table below)

Note: the lowest passing grade for a graduate student is a “C”. Grades lower than a “C” that are submitted by faculty will automatically be recorded as an “F”. Please see the Graduate Catalog for more detailed information on the University’s grading policy.

Grading Policy

Letter Grade   Percentage   Quality Points
A              93-100%      4.00
A-             90-92%       3.75
B+             87-89%       3.25
B              83-86%       3.00
B-             80-82%       2.75
C+             77-79%       2.25
C              73-76%       2.00
F              0-72%        0.0

INC (Quality Points: N/A): A grade of Incomplete (INC) is not automatically awarded when a student fails to complete a course. Incompletes are given at the discretion of the instructor. They are awarded when satisfactory work has been accomplished in the majority of the course work, but the student is unable to complete course requirements as a result of circumstances beyond his/her control. The student must negotiate with and receive the approval of the course instructor in order to receive a grade of incomplete.
IF (Quality Points: N/A): Received for failure to comply with contracted completion terms.
W (Quality Points: N/A): Received if withdrawal occurs before the withdrawal deadline.
AU (Quality Points: N/A): Audit (only permitted on space-available basis)
NA (Quality Points: N/A): Not Attending (student appeared on roster, but never attended class. Student is still responsible for tuition and fee charges unless withdrawal form is submitted before deadline. NA has no effect on cumulative GPA.)
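The percentage bands in the table can be read as a simple lookup, sketched below. The name letter_grade is hypothetical, and the handling of fractional percentages (for example 92.5 falling into the A- band) is an assumption, since the table only lists whole-number ranges.

```python
# (lower bound in percent, letter grade, quality points), taken from the table above.
GRADE_BANDS = [
    (93, "A", 4.00), (90, "A-", 3.75), (87, "B+", 3.25), (83, "B", 3.00),
    (80, "B-", 2.75), (77, "C+", 2.25), (73, "C", 2.00), (0, "F", 0.0),
]

def letter_grade(percentage):
    """Return (letter, quality points) for a course percentage between 0 and 100."""
    for lower, letter, points in GRADE_BANDS:
        # Bands are listed from highest to lowest, so the first match wins.
        if percentage >= lower:
            return letter, points
    raise ValueError("percentage must be non-negative")

example = letter_grade(85)
```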

Required Text(s):
Ian H. Witten, Eibe Frank, Mark A. Hall. 2011. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann. ISBN-10: 0321498054 | ISBN-13: 978-0321498052. ISBN 978-0-12-374856-0

Technical Requirements:
Students will have accounts at the Unix Lab and the Web Lab in the Computer Science Department.

Recommended Texts
None.

Other Reading:
The instructor will prepare lecture notes in PPT slides that will be distributed to the students in class.

Course Schedule

Week  Meeting  Topic & Class Activities                                    Readings                               Assignments
1.    Jan 27   Course Administration                                       Slides: Class Administration
      Jan 29   WEKA Data Mining Tutorial for First Time and Beginner       Witten & Frank, CH 1, 9-10
               Users (watch it; will be covered on the exam)
               Weka Intro
2.    Feb 3    Weka Intro (continued)                                                                             HWK 1 assigned
      Feb 5    Input: Concepts, instances, attributes                      Witten & Frank, CH 2-3.2
3.    Feb 10   Input: Concepts, instances, attributes (continued)                                                 HWK 1 due
      Feb 12   Output: Knowledge representation
4.    Feb 17   Output: Knowledge representation (continued)
      Feb 19   Output: Knowledge representation (continued)                                                       HWK 2 assigned
5.    Feb 24   Algorithms: The basic methods                               Witten & Frank, CH 3.3-4
      Feb 26   Algorithms: The basic methods (continued)                                                          HWK 2 due
6.    Mar 3    Algorithms: The basic methods (continued)
      Mar 5    Algorithms: Covering Algorithms: Constructing Rules                                                HWK 3 assigned
7.    Mar 10   Midterm Exam Review
      Mar 12   Midterm Exam
8.    Mar 17   Spring Vacation (no class)
      Mar 19   Spring Vacation (no class)
9.    Mar 24   Post-Midterm Exam Review
      Mar 26   Algorithms: Mining Association Rules                        Witten & Frank, CH 6.1, 6.2, 6.5       HWK 3 due
10.   Mar 31   Algorithms: Mining Association Rules (continued)
      Apr 2    Algorithms: Linear Models                                   Witten & Frank, CH 6.3, 6.4, 6.6, 6.7
11.   Apr 7    ArcGIS for Beginners                                        Slides: ArcGIS
      Apr 9    Algorithms: Feature Selection and Optimization              Witten & Frank, CH 7.1-7.5
12.   Apr 14   Algorithms: Feature Selection and Optimization (continued)
      Apr 16   Flood Forecasting: A Review
13.   Apr 21   Algorithms: Feature Selection and Optimization (continued)                                         HWK 4 assigned
      Apr 23   Algorithms: Semi-Supervised Learning                        Witten & Frank, CH 7.6-8
14.   Apr 28   Term Project Progress Report                                                                       HWK 4.I due
      Apr 30   Algorithms: Semi-Supervised Learning (continued)
15.   May 5    Term Project Progress Report
      May 7    Final Exam Review                                                                                  HWK 4.II due
16.   May 12   Term Project Progress Report                                                                       HWK 4.III due
17.   May 19   Final Exam

Methods of Instruction
Methods: We will learn from the recommended text book, various sources on the web, and slides that will be made available in the class.

Accommodations
The University of Massachusetts Boston is committed to providing reasonable academic accommodations for all students with disabilities. This syllabus is available in alternate format upon request. If you have a disability and feel you will need accommodations in this course, please contact the Ross Center for Disability Services, Campus Center, Upper Level, Room 211 at 617.287.7430. http://www.umb.edu/academics/vpass/disability/
After registration with the Ross Center, a student should present and discuss the accommodations with the professor. Although a student can request accommodations at any time, we recommend that students inform the professor of the need for accommodations by the end of the Drop/Add period to ensure that accommodations are available for the entirety of the course.

Academic Integrity and the Code of Student Conduct

Code of Conduct and Academic Integrity
It is the expressed policy of the University that every aspect of academic life--not only formal coursework situations, but all relationships and interactions connected to the educational process--shall be conducted in an absolutely and uncompromisingly honest manner. The University presupposes that any submission of work for academic credit is the student’s own and is in compliance with University policies, including its policies on appropriate citation and plagiarism. These policies are spelled out in the Code of Student Conduct. Students are required to adhere to the Code of Student Conduct, including requirements for academic honesty, as delineated in the University of Massachusetts Boston Graduate Catalogue and relevant program student handbook(s). http://www.umb.edu/life_on_campus/policies/community/code

You are encouraged to visit and review the UMass website on Correct Citation and Avoiding Plagiarism: http://umb.libguides.com/citations

Other Pertinent and Important Information

Homework:
All homework must be typed, not hand-written, and must be submitted with the given cover page.
Homework is due exactly at the prescribed time. No late homework is accepted.

Academic Integrity:
Providing answers for any examination when not specifically authorized by the instructor to do so, or informing any person or persons of the contents of any examination prior to the time the examination is given, is considered cheating.
The penalty for cheating will be extremely severe. Use your best judgment. If you are not sure about certain activities, consult the instructor. Standard academic honesty procedure will be followed for cheating, and active cheating automatically results in an F as the final grade. Please check the University Policy on Academic Standards and Cheating for additional information (http://www.umb.edu/life_on_campus/policies/academics/academic_honesty).

Attendance:
You are responsible for material covered in any class that you do not attend.

Incomplete Policy:
The grade incomplete (INC) is reported only where a portion of the assigned or required class work, or the final examination, has not been completed because of serious illness, extreme personal circumstances, or scholarly reasons at the request of the instructor. If your record is such that you would fail the course regardless of your missing work, you will fail.
Permission of the instructor must be obtained and the form for Grade Incomplete must be completed.
If you are receiving the grade of incomplete (INC), you are allowed one year in which to complete the course. The new grade must be submitted to the Registrar by the grading deadline for that semester, i.e., by the end of the next fall for the fall semester incompletes. The grade for any course not completed by this deadline will be converted to the grade of 'F'.

Coursework Difficulties: Please discuss all coursework matters with me sooner rather than later.

Withdrawing From This Course: Please refer to the written policies and procedures on formal withdrawal and add/change dates listed in the Graduate Studies Catalog.

You are advised to retain a copy of this syllabus in your personal file for use when applying for future degrees, certification, licensure, or transfer of credit.
