machine learning in biomedical informatics

Post on 13-Jan-2016

DESCRIPTION

Machine Learning in BioMedical Informatics. SCE 5095: Special Topics Course. Instructor: Dr. Jinbo Bi, Computer Science and Engineering Dept. Office: ITEB 233. Phone: 860-486-1458. Email: jinbo@engr.uconn.edu

TRANSCRIPT

Machine Learning in BioMedical Informatics
SCE 5095: Special Topics Course

Instructor: Jinbo Bi
Computer Science and Engineering Dept.

# Course Information
Instructor: Dr. Jinbo Bi
Office: ITEB 233
Phone: 860-486-1458
Email: jinbo@engr.uconn.edu
Web: http://www.engr.uconn.edu/~jinbo/
Time: Mon / Wed. 2:00pm - 3:15pm
Location: CAST 204
Office hours: Mon. 3:30-4:30pm
HuskyCT: http://learn.uconn.edu (login with your NetID and password)
Illustration

# Introduction of the instructor
Ph.D. in Mathematics
Previous work experience:
  Siemens Medical Solutions Inc.
  Department of Defense, Bioanalysis
  Massachusetts General Hospital
Research Interests
  Subtyping, GWAS
  Color of flowers; cancer, psychiatric disorders

http://labhealthinfo.uconn.edu/EasyBreathing

# Course Information
Prerequisite: Basics of linear algebra, calculus, and programming
Course textbooks (not required):
  Introduction to Data Mining (2005) by Pang-Ning Tan, Michael Steinbach, Vipin Kumar
  Pattern Recognition and Machine Learning (2006) by Christopher M. Bishop
  Pattern Classification (2nd edition, 2000) by Richard O. Duda, Peter E. Hart and David G. Stork
Additional class notes and copied materials will be given
Reading material links will be provided

# Course Information
Objectives:
  Introduce students to the basic concepts of machine learning and the state-of-the-art literature in data mining/machine learning
  Get to know some general topics in medical informatics
  Focus on some high-demand medical informatics problems, with hands-on experience applying data mining techniques
Format:
  Lectures, labs, paper reviews, a term project

# Survey
Why are you taking this course?
What would you like to gain from this course?
What topics are you most interested in learning about from this course?
Any other suggestions?

(Please respond before next Thursday. You can also log in to HuskyCT, download the MS Word file, fill it in, and email it to me.)

# Grading
In-class lab assignments (3): 30%
Paper review (1): 10%
Term project (1): 50%
Participation (1): 10%

# Policy
Computers
Assignments must be submitted electronically via HuskyCT
Make-up policy
  If a lab assignment or a paper review assignment is missed, there will be a final take-home exam to make it up
  If two of these assignments are missed, an additional lab assignment and a final take-home exam will be used to make them up

# Three In-class Lab Assignments
When an in-class lab assignment is given, the class meets in a computer lab and there is no lecture
The computer lab will be ITEB 138 (TA reserve)
Each assignment is due at the beginning of class one week after it is given
If the assignment is handed in one to two days late, 10 credits will be deducted for each additional day
Assignments will be graded by our teaching assistant

# Paper review
Topics of papers for review will be discussed
Each student selects 1 paper in each assignment, prepares slides, and presents the paper in class in 8-15 mins
The goal is to take a look at state-of-the-art research work in the related field
The paper review assignment is on topics of state-of-the-art data mining techniques

# Term Project
Possible project topics will be provided as links; students are encouraged to propose their own
Teams of 1-2 students can be created
Each team needs to give a presentation in the last 1-2 weeks of the class (10-15 min)
Each team needs to submit a project report covering:
  Definition of the problem
  Data mining approaches used to solve the problem
  Computational results
  Conclusion (success or failure)

# Final Exam
If you need the make-up final exam, it will be provided on May 1st (Wed)
Take-home exam
Due on May 9th (Thur.)

# BioMedical Informatics Topics
So many
Cardiac ultrasound image categorization
Computerized decision support for trauma patient care
Computer-assisted diagnostic coding

# Cardiac ultrasound view separation

Classification (or clustering)

Apical 4 chamber view

Parasternal long axis view

Parasternal short axis view
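As a rough illustration of the classification framing above, the sketch below assigns an image to one of the three views with a nearest-centroid rule. It assumes each image has already been reduced to a numeric feature vector. The two-dimensional feature values, the function names, and the training examples are all made up for illustration and do not come from the course material.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def fit_centroids(examples):
    """examples: dict mapping a view label to a list of feature vectors."""
    return {label: centroid(vecs) for label, vecs in examples.items()}

def predict_view(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Hypothetical 2-D features per image (not real ultrasound data).
train = {
    "apical 4 chamber": [[0.9, 0.1], [0.8, 0.2]],
    "parasternal long axis": [[0.1, 0.9], [0.2, 0.8]],
    "parasternal short axis": [[0.5, 0.5], [0.6, 0.4]],
}
centroids = fit_centroids(train)
print(predict_view(centroids, [0.85, 0.15]))
```

A real system would replace the toy vectors with features extracted from the images; the nearest-centroid rule is only one simple choice of classifier.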

# Trauma Patient Care
25 min of transport time/patient

High-frequency vital-sign waveforms (3 waveforms)
  ECG, SpO2, Respiratory
Low-frequency vital-sign time series (9 variables)
  Derived variables
    ECG heart rate
    SpO2 heart rate
    SaO2 arterial O2 saturation
    Respiratory rate
  Measured variables
    NIBP (systolic, diastolic, MAP)
    NIBP heart rate
    End tidal CO2
Discrete patient attribute data (100 variables)
  Demographics, injury description, prehospital interventions, etc.

Vital signs used in decision-support algorithms: HR, RR, SaO2, SBP, DBP
Propaq

Electrocardiogram, Photoplethysmogram
The waveform variables consist of ECG, photoplethysmogram, and respiratory signals recorded at approximately 182, 91, and 23 Hz, respectively
783 patients with any non-zero data at any time
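To make the idea of a decision-support algorithm over these vital signs concrete, here is a minimal sketch of a logistic-regression classifier that maps HR, RR, SaO2, SBP, and DBP to the probability of a binary outcome such as major bleeding. This is an assumption about what such an algorithm could look like, not the method used in the study. All rows and labels below are made up for illustration and carry no clinical meaning.

```python
import math

FEATURES = ["HR", "RR", "SaO2", "SBP", "DBP"]

def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Probability of the positive class for one standardized row."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up rows (HR, RR, SaO2, SBP, DBP) and made-up labels, illustration only.
rows = [
    [80, 16, 98, 120, 80],
    [75, 14, 99, 125, 82],
    [90, 18, 97, 115, 75],
    [130, 28, 92, 85, 50],
    [140, 30, 90, 80, 45],
    [125, 26, 93, 90, 55],
]
labels = [0, 0, 0, 1, 1, 1]

X, means, stds = standardize(rows)
w, b = fit_logistic(X, labels)
probs = [predict_proba(w, b, x) for x in X]
print([round(p, 2) for p in probs])
```

Standardizing the columns first keeps plain gradient descent stable even though the raw variables live on very different scales.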

# Trauma Patient Care
[Figure: heart rate, respiratory rate, saturation of oxygen, and blood pressure are used to make a prediction of major bleeding]

# Diagnostic coding
[Figure: workflow diagram. Patient notes (for example, patient 1 with notes A-E and patient 2 with notes F-G) are stored in the hospital document DB. Coders look up ICD-9 codes (for example, 428 heart failure, 250 diabetes) and criteria such as diagnosis, AMI, and SCIP, and record them per patient in the diagnostic code DB, which serves statistics, reimbursement, and insurance.]
Here is the example of the document in our database.

It would be helpful to have a system that highlights the evidence in the documents corresponding to an assigned code and presents it to the coder.
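A very simple version of such evidence highlighting can be sketched with keyword matching. The code below is only an illustration under stated assumptions: the ICD-9 codes 428 (heart failure) and 250 (diabetes) appear on the slide, but the keyword lists, the `[[...]]` marker style, the function name, and the sample note are all hypothetical. A learned model could take the place of the hand-written term lists.

```python
import re

# ICD-9 codes shown on the slide; the keyword lists are hypothetical.
CODE_TERMS = {
    "428": ["heart failure", "CHF"],
    "250": ["diabetes", "diabetic"],
}

def highlight_evidence(note, code, code_terms=CODE_TERMS):
    """Wrap every term associated with `code` in [[...]] markers."""
    # Try longer terms first so multi-word phrases win over shorter ones.
    terms = sorted(code_terms.get(code, []), key=len, reverse=True)
    if not terms:
        return note
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: "[[" + m.group(0) + "]]", note)

# Made-up note text, for illustration only.
note = "Patient with known diabetes admitted for congestive heart failure."
print(highlight_evidence(note, "428"))
print(highlight_evidence(note, "250"))
```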

# Machine Learning / Data Mining
Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information.
The ultimate goal of machine learning is the creation and understanding of machine intelligence.
The main goal of statistical learning theory is to provide a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, and making decisions from a set of data.

# Traditional Topics in Data Mining / AI
Fuzzy set and fuzzy logic
  Fuzzy if-then rules
Evolutionary computation
  Genetic algorithms
  Evolutionary strategies
Artificial neural networks
  Back propagation network (supervised learning)
  Self-organization network (unsupervised learning, will not be covered)

# Next Class
Continue with data mining topics
Review of some basics of linear algebra and probability

# Last Class
Described the syllabus of this course
Talked about the HuskyCT website (Illustration)
Briefly introduced 3 medical informatics topics
  Medical images: cardiac echo view recognition
  Numerical: trauma patient care
  Free text: ICD-9 diagnostic coding
Introduced a little about the definitions of data mining, machine learning, and statistical learning theory

# Challenges in traditional techniques
Lack of theoretical analysis of the behavior of the algorithms
Traditional techniques may be unsuitable due to
  Enormity of data
  High dimensionality of data
  Heterogeneous, distributed nature of data
[Figure labels: Machine Learning/Pattern Recognition, Statistics/AI, Soft Computing]

# Recent Topics in Data Mining
Supervised learning such as classification and regression
  Support vector machines
  Regularized least squares
  Fisher discriminant analysis (LDA)
  Graphical models (Bayesian nets)
  Others

Draw from Machine Learning domains
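Of the supervised methods listed above, regularized least squares is the easiest to write down. For a single feature and no intercept, it picks the weight w that minimizes sum((y_i - w * x_i)^2) + lam * w^2, which has the closed form w = sum(x_i * y_i) / (sum(x_i^2) + lam). The sketch below implements just that special case. The function name `ridge_1d`, the parameter name `lam`, and the data are illustrative assumptions.

```python
def ridge_1d(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2 over w (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]           # made-up data lying exactly on y = 2x
print(ridge_1d(xs, ys, 0.0))   # no regularization: ordinary least squares, w = 2.0
print(ridge_1d(xs, ys, 14.0))  # regularization shrinks w toward zero, w = 1.0
```

Larger values of `lam` trade fit on the training data for a smaller weight, which is the basic mechanism behind regularization.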

# Recent Topics in Data Mining
Unsupervised learning such as clustering
  K-means
  Gaussian mixture models
  Hierarchical clustering
  Graph based clustering (spectral clustering)
Dimension reduction
  Feature selection
  Compact feature space into low-dimensional space (principal component analysis)

# Statistical Behavior
Many perspectives to analyze how the algorithm handles uncertainty
Simple examples:
  Consistency analysis
  Learning bounds (upper bound on test error of the constructed model or solution)
Statistical, not deterministic
  With probability p, the upper bound holds: P(test error <= upper bound) > p
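One concrete instance of a probabilistic bound of this kind is the one-sided Hoeffding bound, chosen here as an illustration and not taken from the slides. It assumes a single fixed model evaluated on n independent held-out samples with a loss in [0, 1]. Under those assumptions, with probability at least p the true error is at most the empirical error plus sqrt(ln(1 / (1 - p)) / (2n)). The function name and the example numbers are hypothetical.

```python
import math

def hoeffding_upper_bound(empirical_error, n, p):
    """Upper bound on the true error that holds with probability at least p.

    Assumes one fixed model, n independent held-out samples, and a loss
    bounded in [0, 1]; delta = 1 - p is the allowed failure probability.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    delta = 1.0 - p
    return empirical_error + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

# Hypothetical numbers: 10% held-out error on 1000 samples, 95% confidence.
print(round(hoeffding_upper_bound(0.10, 1000, 0.95), 4))
```

The bound tightens as n grows and loosens as p approaches 1, which is the sense in which the guarantee is statistical and not deterministic.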