nlp cell - i2b2: informatics for integrating biology & the … steps in natural language...

NLP Cell
Qing Zeng-Treitler, PhD
Sergey Goryachev, MS

Outline
NLP overview
HITEx overview
Use cases
NLP cell in the hive

Free Text Data Is a Rich Source of Information
Rule out polymyalgia rheumatica versus early inflammatory arthritis versus diffuse osteoarthritis. She will continue on Tylenol as needed. I have urged her to start Aleve, 1 tablet twice a day with usual precautions. I have sent today for sedimentation rate, CRP, rheumatoid factor, ANA, and CBC. She is quite reluctant to have a blood drawn, but I have told her it is very important to help guide her treatment. I will be in touch with her when results are available, and we briefly discussed the possibility of starting steroids. I will arrange office follow up in 1 month.
Diagnoses
Lab Orders
Medications
Follow Up Plans
Treatment Options

Typical Steps in Natural Language Processing (NLP)
Morphological analysis (tokenizing)
Syntactic analysis (transforming linear sequences/sentences of words into structures, usually trees)
Semantic analysis (assigning meaning to the syntactic structures)
Discourse integration (interpreting a sentence in the context of adjacent sentences)
Pragmatic analysis (to understand the actual meaning of a piece of text)

I have sent today for sedimentation rate, CRP, rheumatoid factor, ANA, and CBC. She is quite reluctant to have a blood drawn, but I have told her it is very important to help guide her treatment. I will be in touch with her when results are available, and we briefly discussed the possibility of starting steroids.
Pronoun Verb Adjective Preposition Noun
Lab Tests
Time
Order
Dr. xxx 01/01/1999
Lab Test Order:
Patient Name: xxxx
Physician Name: xxxx
Date: 01/01/1999
Test: Sedimentation Rate…..

NLP Approaches
Symbolic
Empirical

Symbolic Approach
Tokenization
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Pragmatic Analysis
{
  Finding: Infiltration
  Certainty: High
  Distribution: Patchy
  Time of Appearance: New
  Location: Left Lower Lobe
}

Empirical (Statistical) Approach
Corpora
Supervised learning
Unsupervised learning
New (ADJ) patchy (ADJ) infiltrate (N) is (AUX) found (V) in (PREP) left (ADJ) lower (ADJ) lobe (N).
The (DEF) patient (N) complains (V) of (PREP) chest (N) pain (N).
…
The cardiovascular exam is regular.
Probability((DEF) (ADJ) (N) (V) (ADJ)) > Probability((N) (ADJ) (PREP) (V) (N))
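
The probability comparison on this slide can be illustrated with a tag-bigram model. The following is only a minimal sketch and not the method of any system named in these slides: it estimates add-one-smoothed tag transition probabilities from the two tagged sentences above and scores the two candidate tag sequences for "The cardiovascular exam is regular." The function names, the `<s>` start marker, and the smoothing choice are assumptions made for illustration, and word (emission) probabilities are ignored.

```python
from collections import Counter
from math import log

# Tag sequences of the two annotated sentences on the slide (a toy "corpus").
TAGGED_CORPUS = [
    ["ADJ", "ADJ", "N", "AUX", "V", "PREP", "ADJ", "ADJ", "N"],
    ["DEF", "N", "V", "PREP", "N", "N"],
]
START = "<s>"  # hypothetical sentence-start marker


def train_bigrams(corpus):
    """Count tag bigrams and how often each tag appears as a left context."""
    bigrams, contexts, tagset = Counter(), Counter(), set()
    for tags in corpus:
        prev = START
        for tag in tags:
            bigrams[(prev, tag)] += 1
            contexts[prev] += 1
            tagset.add(tag)
            prev = tag
    return bigrams, contexts, tagset


def log_prob(tags, bigrams, contexts, tagset):
    """Log-probability of a tag sequence under add-one smoothing."""
    vocab = len(tagset)
    total, prev = 0.0, START
    for tag in tags:
        total += log((bigrams[(prev, tag)] + 1) / (contexts[prev] + vocab))
        prev = tag
    return total


model = train_bigrams(TAGGED_CORPUS)
plausible = ["DEF", "ADJ", "N", "V", "ADJ"]
implausible = ["N", "ADJ", "PREP", "V", "N"]
print(log_prob(plausible, *model) > log_prob(implausible, *model))
```

With this toy corpus the first sequence scores higher, which mirrors the inequality on the slide. A realistic tagger would be trained on a much larger annotated corpus and would also model which words each tag tends to emit.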

NLP Applications in the Biomedical Domain
Information extraction of clinical data
Text mining of literature
Free text query and retrieval
Automated question answering
Speech recognition
Report generation
Decision support, adverse event detection, knowledge discovery, tailored health communication …

MetaMap Transfer (MMTx) - NLM
Syntactic parsing and lexical matching
Extraction of UMLS concepts from free text
Paragraph
Sentence
Token
Part of Speech
Noun Phrase
Concept

Medical Language Extraction and Encoding System (MedLEE) - Columbia
problem:mass
  certainty>> moderate certainty
  problemdescr>> palpable
  code>> UMLS:C0746412^mass palpable
ANY DECISION TO BIOPSY A PALPABLE MASS SHOULD BE MADE ON CLINICAL GROUNDS.
Sub-language semantic grammar
Its ability to detect clinical conditions in chest X-ray reports has been evaluated
“The natural language processor was not distinguishable from the physicians and was superior to all other comparison subjects.”

Symbolic Text Processor (SymText) - LDS
Syntactic + semantic (Bayesian Network)
“In extracting pneumonia related concepts from chest x-ray reports, the performance of the natural language processing system was similar to that of physicians and better than that of lay persons and keyword searches.”

Statistical Natural Language Processing of Medical Report - UCLA
Statistical parser and semantic interpreter
“Recall and precision reached a percentile in the mid 80's from a little over one hundred training sentences and reached recall 90% at precision 89% by one thousand training sentences.”

Outline
NLP overview
HITEx overview
Use cases
NLP cell in the hive

Rationale
Why another NLP system?
Open-source, easy-to-adopt solutions were lacking

HITEx Overview
Builds on top of the GATE NLP framework
Consists of:
  Collection of NLP components specifically created to extract clinical information
  Runtime environment management module
  UMLS database
  Supporting libraries
Uses a GATE document for inter-component communication
Components are assembled into task-specific NLP pipeline applications that parse the unstructured text records

HITEx Components
Currently, there are 12 components that are part of HITEx:
  Tokenizer
  Part-of-Speech Tagger
  Sentence Splitter
  Noun Phrase Splitter
  Sectionizer
  UMLS Concept Finder
  Smoking Status Finder
  Regular Expression Concept Finder
  Family History Finder
  Temporal Finder
  Negation Finder
  N-gram Finder

HITEx Pipeline
Dynamically created from HITEx components
The order of components is configurable
Each component’s parameters are configurable
The input of the pipeline is the unstructured report text
Each component may use the output of the previous component(s) as its input
The application’s output is a GATE document that contains task-specific results of NLP parsing
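
The pipeline idea described on this slide (configurable order, per-component parameters, a shared document passed from component to component) can be sketched as follows. This is an illustrative Python stand-in, not HITEx or GATE code: the `Document` class, the three toy components, the `REGISTRY` mapping, and the configuration format are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for a GATE document: raw text plus named annotation layers."""
    text: str
    annotations: dict = field(default_factory=dict)


def sentence_splitter(doc, pattern=r"[^.!?]+[.!?]"):
    # Naive splitter: a sentence is a run of text ending in '.', '!' or '?'.
    doc.annotations["sentences"] = [s.strip() for s in re.findall(pattern, doc.text)]


def tokenizer(doc):
    # Uses the previous component's output (sentences) as its input.
    doc.annotations["tokens"] = [
        re.findall(r"\w+|[^\w\s]", s) for s in doc.annotations["sentences"]
    ]


def keyword_finder(doc, keywords=()):
    # Records which configured keywords occur among the tokens (case-insensitive).
    wanted = {k.lower() for k in keywords}
    doc.annotations["concepts"] = [
        tok for sent in doc.annotations["tokens"] for tok in sent if tok.lower() in wanted
    ]


# Hypothetical registry that maps component names to implementations.
REGISTRY = {
    "sentence_splitter": sentence_splitter,
    "tokenizer": tokenizer,
    "keyword_finder": keyword_finder,
}


def run_pipeline(text, config):
    """Run the components named in `config`, in order, over one document."""
    doc = Document(text)
    for name, params in config:
        REGISTRY[name](doc, **params)
    return doc


# The component order and each component's parameters live in the configuration.
config = [
    ("sentence_splitter", {}),
    ("tokenizer", {}),
    ("keyword_finder", {"keywords": ["Tylenol", "Aleve"]}),
]
note = "She will continue on Tylenol as needed. I have urged her to start Aleve."
result = run_pipeline(note, config)
print(result.annotations["concepts"])
```

Because every component reads from and writes to the same document object, reordering or swapping components only requires a change to `config`, which is the property the slide emphasizes.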

Evaluation 1
The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard.
The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded.

Evaluation 2
The diagnoses and family history-related diagnoses extracted by HITEx from a set of 350 sentences were compared to those identified by a human rater.
The precision and recall of extracting all diagnoses were 85.12% and 86.93%, respectively. The precision and recall of differentiating family history from patient history diagnoses were 96.30% and 92.86%, respectively. Both the precision and recall of exact family member assignment were 92.31%.
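
For readers unfamiliar with the two measures reported above, the sketch below shows how precision and recall are typically computed from a set of extracted items and a reference set. The identifiers and counts are hypothetical and are not data from the evaluation; the function name is likewise an assumption.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for two collections of item identifiers."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example, NOT data from the evaluation: the system returned
# four diagnoses, three of which appear in a five-item reference set.
system_output = ["dx1", "dx2", "dx3", "dx9"]
reference = ["dx1", "dx2", "dx3", "dx4", "dx5"]
p, r = precision_recall(system_output, reference)
print(f"precision={p:.2%} recall={r:.2%}")
```

Precision penalizes spurious extractions (here `dx9`), while recall penalizes missed reference items (here `dx4` and `dx5`).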

Outline
NLP overview
HITEx overview
Use cases
NLP cell in the hive

Typical Work Flow
Define task
Obtain text files from the data mart
Design a HITEx pipeline
Iteratively test and refine
  Change configuration of the pipeline or individual modules
  Develop new modules
Evaluation
  Review results on the sentence level
  Evaluate information retrieval performance on the report/patient level

Case 1: Identify co-morbidities
Problem: Identify the patient’s co-morbidities in the unstructured medical report. Co-morbidities are all findings that are not principal diagnoses.
NLP solution:
  Identify document sections; exclude all sections that are categorized as related to principal diagnoses.
  Identify findings in the relevant sections; apply negation.
The main task is performed by the UMLS Concept Finder component.
Section identification and filtering are performed by the Sectionizer component.
Negation is performed by the Negation Finder.
All components are part of the HITEx core.

Co-morbidities Pipeline Application

Case 2: Identify smoking status
Problem: Identify the patient’s smoking status (Current, Past, Denies, Non-smoker) using the unstructured medical report data.
NLP solution:
  Build the classification model using manually annotated sentences (gold standard)
  Plug the model into the NLP pipeline application to determine the smoking status of each sentence that mentions smoking
The main task is performed by the Smoking Status Finder component (part of HITEx)
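
The two-step solution above (train a classifier on annotated sentences, then apply it to sentences that mention smoking) can be sketched with a small naive Bayes model. This is an assumption-laden illustration: the slides do not say which learning algorithm the Smoking Status Finder uses, and the training sentences below are invented stand-ins for a gold standard, not real annotations.

```python
from collections import Counter, defaultdict
from math import log

# Hypothetical annotated sentences standing in for a gold standard.
TRAINING = [
    ("she currently smokes one pack per day", "Current"),
    ("he is a current smoker", "Current"),
    ("she quit smoking ten years ago", "Past"),
    ("former smoker who quit in 1990", "Past"),
    ("patient denies smoking", "Denies"),
    ("he denies tobacco use", "Denies"),
    ("she has never smoked", "Non-smoker"),
    ("lifelong non-smoker", "Non-smoker"),
]


def train(examples):
    """Collect per-label word counts, label counts and the vocabulary."""
    word_counts, label_counts, vocab = defaultdict(Counter), Counter(), set()
    for sentence, label in examples:
        words = sentence.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def classify(sentence, model):
    """Return the label with the highest add-one-smoothed naive Bayes score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = log(count / total)
        for word in sentence.lower().split():
            score += log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("he denies smoking", model))
```

In a pipeline, a model like this would be trained once, stored, and then applied to each sentence that mentions smoking; a realistic model would need far more annotated sentences than this toy set.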

Smoking Status Pipeline Application

Case 3: Identify Erosion
Problem: Identify erosion in the unstructured medical records
Solution:
  Use regular expression matching to capture all occurrences of erosion-related keywords.
  Use negation detection to exclude all negated erosions.
The main task is performed by the Regular Expression Concept Finder component (part of HITEx), configured with special rules to capture erosions.
Negation Finder is used to identify negated erosions.

Erosions Pipeline Application

Outline
NLP overview
HITEx overview
Use cases
NLP cell in the hive

NLP Cell Overview
NLP Cell uses HITEx core as a back end
Installs as an optional cell in the I2B2 Hive
Communicates with the clients using the SOAP protocol
Provides the following services to its clients:
  getDiagnoses: returns a list of principal diagnosis codes.
  getDischargeMedications: returns a list of discharge medications.
  getSmokingStatus: returns the smoking status (e.g., current smoker).
  getAllConcepts: returns a list of all available concepts from a document (i.e., principal diagnoses, discharge medications, and smoking status).
  getCustomConcepts: returns a list of custom concepts, given a custom task configuration.

Cell Communication
Clients communicate with the cell using standard, pre-defined I2B2 XML request and response messages.
Client sends
  the type of operation to perform
  the unstructured text to perform the operation on
  custom pipeline configuration (optional)
Cell returns
  status of the request
  actual results in the form of concept codes, if the request was successful

NLP Cell Operation

NLP Cell Client
Available as an Eclipse plug-in
Installs into the I2B2 workbench
Provides an interface to run standard NLP pipelines on the user-specified report
Allows advanced users to build custom pipelines to solve complex NLP tasks
Provides a graphical interface for individual component configuration

Co-morbidities Pipeline Application
Original unstructured text

Co-morbidities Pipeline Application
Tokens and space tokens

Co-morbidities Pipeline Application
Sentences

Co-morbidities Pipeline Application
[NNP] [:] [NNP] [,] [NNP] [NNP] [NN] [:] [JJ]
[NNP] [NNP] [:] [CD] [NNP] [NNP] [:] [CD]
[NNP] [NNP] [:]
[NNP] [JJR] [NN] [JJ] [JJ] [NNS] [.]
[NNP] [NNP] [:]
[LS] [.] [JJR] [NN] [NN] [.]
[LS] [.] [NNP] [.]
[LS] [.] [NN] [IN] [NN] [NN] [.]
[LS] [.] [NNP] [NN] [.]
[NNP] [NNP] [:] [NN] [.]
…
Part-of-speech tags associated with tokens

Co-morbidities Pipeline Application
Noun phrases

Co-morbidities Pipeline Application
Section category: “Secondary Diagnoses”

Co-morbidities Pipeline Application
UMLS Concepts

Co-morbidities Pipeline Application
Negation status: not negated

Co-morbidities Pipeline Application
Co-morbidities:
===========================
1. [C0024050] Lower gastrointestinal hemorrhage (disorder)
2. [C0002871] Anemia
3. [C0004238] Atrial Fibrillation
4. [C0151636] Premature ventricular contractions

Smoking Status Pipeline Application
Smoking status: “Current Smoker”

Erosions Pipeline Application
Concept Type: Regular Expression Concept
Concept Name: Erosion Keyword
Concept Code: LCS-I2B2:erosion
Keyword: erosion
Negation status: Not Negated

Erosion detection rules
<?xml version="1.0" encoding="ISO-8859-1" ?>
<concepts>
  <concept>
    <def>
      <![CDATA[ (?i)(?m)(\W+|\d+)(ero[a-zA-Z]+)(\W+|\d+) ]]>
    </def>
    <capt_group_num>2</capt_group_num>
    <type>RegEx</type>
    <name>Erosion Keyword</name>
    <features>
      <feature>
        <name>code</name>
        <value>
          <![CDATA[ LCS-I2B2:erosion ]]>
        </value>
      </feature>
    </features>
  </concept>
  ………
</concepts>
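
To show how a rule like the one above might be applied, here is a minimal Python sketch that parses a single rule, runs its regular expression over each sentence, and flags hits preceded by a negation cue. It is not the HITEx implementation: the rule text is copied from the slide (without the XML declaration and the elided rules), while `load_rules`, `find_concepts`, the sentence splitting, and the short list of negation cues are hypothetical simplifications.

```python
import re
import xml.etree.ElementTree as ET

# One rule in the format shown on the slide (XML declaration and elided rules omitted).
RULES_XML = r"""
<concepts>
  <concept>
    <def><![CDATA[ (?i)(?m)(\W+|\d+)(ero[a-zA-Z]+)(\W+|\d+) ]]></def>
    <capt_group_num>2</capt_group_num>
    <type>RegEx</type>
    <name>Erosion Keyword</name>
    <features>
      <feature><name>code</name><value><![CDATA[ LCS-I2B2:erosion ]]></value></feature>
    </features>
  </concept>
</concepts>
"""

# Hypothetical negation cues; a real negation component would be more thorough.
NEGATION_CUES = re.compile(r"\b(no|not|without|denies|negative for)\b", re.IGNORECASE)


def load_rules(xml_text):
    """Parse <concept> elements into (compiled pattern, group number, name, code)."""
    rules = []
    for concept in ET.fromstring(xml_text.strip()).iter("concept"):
        # Strip the padding inside CDATA so the inline flags stay at the start.
        pattern = re.compile(concept.findtext("def").strip())
        group = int(concept.findtext("capt_group_num"))
        code = concept.findtext("features/feature/value").strip()
        rules.append((pattern, group, concept.findtext("name"), code))
    return rules


def find_concepts(text, rules):
    """Return one dict per keyword hit, flagged as negated if a cue precedes it."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for pattern, group, name, code in rules:
            for match in pattern.finditer(sentence):
                preceding = sentence[: match.start(group)]
                hits.append({
                    "name": name,
                    "code": code,
                    "keyword": match.group(group),
                    "negated": bool(NEGATION_CUES.search(preceding)),
                })
    return hits


rules = load_rules(RULES_XML)
report = "There is a new erosion at the second MCP joint. No erosions are seen on the left."
hits = find_concepts(report, rules)
for hit in hits:
    print(hit)
```

Two properties of the slide's pattern are worth noting: it requires a non-word character or digit on both sides of the keyword, so a keyword at the very start or end of the text is not captured, and it accepts any word beginning with "ero", so unrelated words may match unless the rule is tightened.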