machine learning in biomedical informatics


Post on 23-Feb-2016









Machine Learning in BioMedical Informatics
SCE 5095: Special Topics Course

Instructor: Jinbo Bi
Computer Science and Engineering Dept.

# Course Information
- Instructor: Dr. Jinbo Bi
- Office: ITEB 233
- Phone: 860-486-1458
- Email: jinbo@engr.uconn.edu
- Web:
- Class time: Tue/Thur 3:30pm-4:45pm
- Location: ITEB 127
- Office hours: Tue/Thur 4:45-5:15pm
- HuskyCT: http://learn.uconn.edu (log in with your NetID and password)
- Illustration

# Introduction of the instructor and TA
- Instructor: Ph.D. in Mathematics
- Research interests: machine learning, data mining, optimization, biomedical informatics, bioinformatics
- TA: Jingyuan Zhang, a graduate student in my lab who has done some machine learning work previously


[Slide labels: color of flowers; cancer, psychiatric disorders]

# Course Information
Prerequisite: basics of linear algebra, calculus, optimization, and programming

Course textbooks (not required):
- Introduction to Data Mining (2005) by Pang-Ning Tan, Michael Steinbach, Vipin Kumar
- Pattern Recognition and Machine Learning (2006) by Christopher M. Bishop
- Pattern Classification (2nd edition, 2000) by Richard O. Duda, Peter E. Hart, and David G. Stork
- Additional class notes and copied materials will be given
- Reading material links will be provided

# Objectives
- Introduce students to the basic concepts of machine learning and to state-of-the-art machine learning algorithms
- Focus on some high-demand medical informatics problems, with hands-on experience applying data mining techniques

Format:
- Lectures, a micro-teaching assignment, quizzes, a term project

# Course Information
Grading:
- Micro-teaching assignment (1): 20%
- In-class/in-lab open-book, open-notes quizzes (3): 30%
- Term project (1): 40%
- Lab assignment (1): 10%

The lab assignment will not be graded; it serves as a warm-up exercise. As long as you turn it in, you will get the 10%.

# Policy
- Computers
- Participation in micro-teaching sessions is very important and itself accounts for 50% of the credit for the micro-teaching assignment
- Quizzes are graded by our teaching assistant with guidance from the instructor
- Final term projects will be graded by the instructor
- If you miss a quiz, there will be a take-home quiz to make up the credit

# Micro-teaching sessions
- Students in our class need to form THREE roughly even study groups
- The instructor will help balance the study groups
- Each study group will be responsible for teaching one specific topic chosen from the following:
  - Support Vector Machines
  - Spectral Clustering
  - Boosting (PAC learning model)

# Term Project
- Possible project topics will be provided as links; students are encouraged to propose their own
- Teams of 1-3 students can be created
- Each team needs to give two presentations: a progress report presentation (10-15 min) and a final presentation in the last week (15-20 min)
- Each team needs to submit a project report covering:
  - Definition of the problem
  - Data mining approaches used to solve the problem
  - Computational results
  - Conclusion (success or failure)

# Machine Learning / Data Mining
- Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. (ACM SIGKDD conference)
- The ultimate goal of machine learning is the creation and understanding of machine intelligence. (ICML conference)
- The main goal of statistical learning theory is to provide a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, and making decisions from a set of data. (NIPS conference)

# Traditional Topics in Data Mining / AI
- Fuzzy set and fuzzy logic
  - Fuzzy if-then rules
- Evolutionary computation
  - Genetic algorithms
  - Evolutionary strategies
- Artificial neural networks
  - Back propagation network (supervised learning)
  - Self-organization network (unsupervised learning, will not be covered)

# Challenges in traditional techniques
- Lack of theoretical analysis of the behavior of the algorithms
- Traditional techniques may be unsuitable due to:
  - Enormity of data
  - High dimensionality of data
  - Heterogeneous, distributed nature of data

[Figure labels: Machine Learning/Pattern Recognition, Statistics/AI, Soft Computing]

# Recent Topics in Data Mining
Supervised learning, such as classification and regression:
- Support vector machines
- Regularized least squares
- Fisher discriminant analysis (LDA)
- Graphical models (Bayesian nets)
- Boosting algorithms

These topics draw from machine learning domains.
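As one concrete instance of the supervised methods listed here, regularized least squares can be sketched in a few lines. This is a minimal illustration, not course material: it assumes the common ridge form, minimizing ||Xw - y||^2 + lam * ||w||^2, with no intercept term. The function names (`solve_linear_system`, `ridge_fit`, `predict`) and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return w minimizing ||Xw - y||^2 + lam * ||w||^2 (assumed ridge form)."""
    d = len(X[0])
    # Normal equations: (X^T X + lam * I) w = X^T y
    gram = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve_linear_system(gram, xty)


def predict(X, w):
    """Linear predictions X w."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]


if __name__ == "__main__":
    # Toy data (hypothetical): y = 1 * x1 + 2 * x2 exactly.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1.0, 2.0, 3.0]
    for lam in (0.0, 1.0):
        print(lam, ridge_fit(X, y, lam))
```

With `lam = 0` this reduces to ordinary least squares. Increasing `lam` shrinks the weights toward zero, trading a little training error for stability.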

# Recent Topics in Data Mining
Unsupervised learning, such as clustering:
- K-means
- Gaussian mixture models
- Hierarchical clustering
- Graph-based clustering (spectral clustering)

Dimension reduction:
- Feature selection
- Compacting the feature space into a low-dimensional space (principal component analysis)

# Statistical Behavior
- Many perspectives to analyze how the algorithm handles uncertainty
- Simple examples:
  - Consistency analysis
  - Learning bounds (upper bound on test error of the constructed model or solution)
- Statistical, not deterministic
  - With probability p, the upper bound holds: $P(\text{test error} \le \text{bound}) \ge p$
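To make "with probability p, the upper bound holds" concrete, here is a minimal sketch of one simple learning bound. It is an illustrative choice, not necessarily the bound covered in the course. It assumes a fixed model evaluated on n i.i.d. held-out examples with a 0/1 loss and applies the one-sided Hoeffding inequality. The function name `hoeffding_upper_bound` is hypothetical.

```python
import math


def hoeffding_upper_bound(empirical_error, n, p):
    """Upper bound on the true error that holds with probability at least p.

    Assumes a fixed model, n i.i.d. held-out examples, and a loss in [0, 1].
    One-sided Hoeffding: P(true - empirical >= eps) <= exp(-2 * n * eps**2).
    Setting the right-hand side to delta = 1 - p and solving for eps gives
    eps = sqrt(ln(1 / delta) / (2 * n)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    delta = 1.0 - p
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # An error rate can never exceed 1, so clip the bound.
    return min(1.0, empirical_error + eps)


if __name__ == "__main__":
    # Hypothetical numbers: 10% held-out error, confidence p = 0.95.
    for n in (100, 1000, 10000):
        print(n, hoeffding_upper_bound(0.10, n, 0.95))
```

The bound is statistical rather than deterministic: it can fail, but only with probability at most 1 - p, and it tightens as the number of held-out examples grows.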

