BioMISS: Language Diversity of Computing
![Page 1: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/1.jpg)
The Language Diversity of Computing
Or, how to talk with a computer.
Jeremy Yang (Mgr., Systems & Programming)
Translational Informatics Div.
Dept. of Internal Medicine
University of New Mexico
BioMISS -- Thursday, Oct 15, 2015
![Page 2: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/2.jpg)
Language Diversity Examples
Python Perl Fortran C R
C++ Java Basic SQL Sparql
XML XSD XPath URLs bash
HTML HTTP ASCII UTF-8 regex
Scala ICD-10 Ruby OWL RDF
![Page 3: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/3.jpg)
A Working Definition of “Language”
● Coherent symbology (symbolic system)
![Page 4: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/4.jpg)
Languages: Some major advances
Timeline (1950 to 2010):
● FORTRAN (1953)
● COBOL (1960)
● C (1969)
● SQL (1979)
● C++ (1979)
● Perl (1987)
● Python (1989)
● HTML (1990)
● Java (1995)
● XML (1997)
● RDF (1999)
● Sparql (2008)
![Page 5: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/5.jpg)
Language merit vs. elitism
![Page 6: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/6.jpg)
Why do we care about languages?
● Compatibility
● Efficiency
● Usability
● Knowledge representation
● Intelligence
● Evolution
Naturellement!
![Page 7: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/7.jpg)
℅ Prof Harald Sack, Hasso Plattner Institute, U. Potsdam, MOOC: “Semantic Web Technologies”
![Page 8: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/8.jpg)
Programming paradigms
Object Oriented
● classes
● instances
● methods
● ~ nouns
Functional
● functions
● routines
● parameters
● ~ verbs
Programming paradigms are language paradigms.
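The noun/verb contrast can be sketched in a few lines of Python. This is an illustration only: the `Atom` and `Molecule` classes and the `add_mass` and `molecular_weight` functions are hypothetical names invented for this sketch, not part of any library named in this talk. The same calculation is written twice, once as an object with a method and once as functions applied to plain data.

```python
from dataclasses import dataclass
from functools import reduce


# Object-oriented style: model the domain as nouns (classes and instances)
# that carry their own behavior (methods).
@dataclass
class Atom:
    symbol: str
    mass: float


@dataclass
class Molecule:
    atoms: list

    def weight(self) -> float:
        """Method: the molecule (a noun) computes its own weight."""
        return sum(atom.mass for atom in self.atoms)


# Functional style: model the task as verbs (functions) applied to plain data.
def add_mass(total: float, atom: tuple) -> float:
    """Pure function: takes values as parameters and returns a new value."""
    _symbol, mass = atom
    return total + mass


def molecular_weight(atoms: list) -> float:
    """Fold add_mass over a list of (symbol, mass) tuples."""
    return reduce(add_mass, atoms, 0.0)


# Hypothetical example data: water, using standard atomic masses.
water_oo = Molecule([Atom("H", 1.008), Atom("H", 1.008), Atom("O", 15.999)])
water_fn = [("H", 1.008), ("H", 1.008), ("O", 15.999)]

print(round(water_oo.weight(), 3))
print(round(molecular_weight(water_fn), 3))
```

Both versions give the same number; what differs is the vocabulary the programmer thinks in.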
![Page 9: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/9.jpg)
Object Oriented Example:
CDK = Chemistry Development Kit
Open source Java package & API
Computers have “evolved” from numerical calculators to knowledge processors.
Knowledge representation and processing via language!
![Page 10: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/10.jpg)
Italian Music Terms
Choice of language should be guided by the domain.
![Page 11: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/11.jpg)
Q: So what is the problem?
A: Language gaps
CODE
JARGON
MEANING
“Interpretation”
MATH
![Page 12: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/12.jpg)
Q: So what is the problem?
A: Standards (so many!)
“Why can’t my iPhone talk to my ...”
● TV
● Audio system
● Car
● Medical records
![Page 13: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/13.jpg)
Q: So what is the problem?
A: Language shapes, empowers, limits thought. (Sapir-Whorf Hypothesis, aka Linguistic Relativity)
![Page 14: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/14.jpg)
Q: So what is the problem?
A: Abstraction
● Overgeneralizing
● Reality is concrete!
● But: abstraction organizes knowledge
● (a feature, not a bug!)
“We think in generalities, but we live in detail.” -- Alfred North Whitehead
![Page 15: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/15.jpg)
Abstraction: Shakespeare quotes
“Full of sound and fury, signifying nothing.”
![Page 16: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/16.jpg)
Abstraction: Ray Hudson Quotes
"On to this one quicker than a jackrabbit on a hot date. Look at this finish! That is beyond world class."
"Braver than a matador in a pink tutu he was."
"Racing Santander’s butcher men tried to hack down Xavi. Xavi dancing over the combine harvesters that are coming after him."
“He could make an onion cry.” (on Lionel Messi) "Where the insane becomes the routine with this man. He is nothing less than a ball whisperer."
![Page 17: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/17.jpg)
“You campaign in poetry. You govern in prose.” - Mario Cuomo
But maybe all language is poetic.
![Page 18: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/18.jpg)
Languages of Biomedical Knowledge
![Page 19: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/19.jpg)
Which cirrhosis? Specificity?
http://apps.who.int/classifications/icd10
![Page 20: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/20.jpg)
Translation and mapping terms
story
history
![Page 21: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/21.jpg)
Our project: Illuminating the Druggable Genome (IDG)
$4.9M
![Page 22: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/22.jpg)
Illuminating the Druggable Genome Knowledge Management Center (IDG-KMC)
Translational Informatics Division
Chief: Tudor Oprea, MD, PhD
IDG-KMC Workflow
![Page 23: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/23.jpg)
IDG-KMC Collaborator Network
![Page 24: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/24.jpg)
Slide ℅ Tudor Oprea
Heterogeneous data integration. Language diversity.
![Page 25: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/25.jpg)
IDG-KMC Language Challenge: Case #1: Drug Nomenclature
http://pasilla.health.unm.edu/tomcat/drugdb
![Page 26: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/26.jpg)
IDG-KMC Language Challenge: Case #2: Disease Nomenclature
![Page 27: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/27.jpg)
ICD
● The International Classification of Diseases (ICD) is the standard diagnostic tool for epidemiology, health management and clinical purposes.
● WHO
● Clinical emphasis
● Procedures (CM)
● EMR
● Versions
Disease Ontology
● The mission of the Disease Ontology (DO) is to provide an open source ontology for the integration of biomedical data that is associated with human disease.
● Academic network
● Research emphasis
● Community driven
● Continual updates
![Page 28: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/28.jpg)
Disease nomenclature
● Nosology, classification, ontology
● 17k codes in ICD-9. 155k codes in ICD-10.
● Implicit: Disease model of medicine
![Page 29: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/29.jpg)
My recent Dx: Otitis
Disease vs. Condition vs. Symptom vs. Phenotype
℅ WebMD
![Page 30: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/30.jpg)
IDG KMC: Gene expression vs. Tissues; Different sources, tissue terms.
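One common way to reconcile tissue labels from different sources is a synonym table that maps each source-specific label onto a shared term. The sketch below assumes that approach; it is not a description of the IDG-KMC pipeline. Every name in it (`TISSUE_SYNONYMS`, `normalize_tissue`, `merge_sources`, the gene placeholders and the tissue labels) is hypothetical and chosen only for illustration.

```python
# Hypothetical synonym table: every key and value here is illustrative,
# not taken from any real expression database or ontology.
TISSUE_SYNONYMS = {
    "liver": "liver",
    "hepatic tissue": "liver",
    "heart": "heart",
    "cardiac muscle": "heart",
    "kidney": "kidney",
    "renal cortex": "kidney",
}


def normalize_tissue(term: str):
    """Map a source-specific tissue label to a shared term, or None if unmapped."""
    key = " ".join(term.lower().split())  # case-fold and collapse whitespace
    return TISSUE_SYNONYMS.get(key)


def merge_sources(*sources):
    """Group (gene, tissue) records from several sources under shared tissue terms."""
    merged, unmapped = {}, set()
    for records in sources:
        for gene, tissue in records:
            shared = normalize_tissue(tissue)
            if shared is None:
                unmapped.add(tissue)  # surface gaps instead of guessing
            else:
                merged.setdefault(shared, set()).add(gene)
    return merged, unmapped


# Two made-up sources that describe the same tissues in different words.
source_a = [("GENE1", "Liver"), ("GENE2", "Cardiac  muscle")]
source_b = [("GENE1", "hepatic tissue"), ("GENE3", "Renal Cortex"), ("GENE4", "pons")]

merged, unmapped = merge_sources(source_a, source_b)
print(merged)
print(unmapped)
```

Mapping "renal cortex" to "kidney" throws away detail, which is the abstraction trade-off again: a shared vocabulary buys integration at the cost of specificity. Unmapped labels are returned rather than dropped, so the gaps in the vocabulary stay visible.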
![Page 31: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/31.jpg)
IDG-KMC: TCRD - Target Central Research Db
+------------+------------+--------+------+------------------------------------------------------------------+--------+-------+
| doid       | Disease    | zscore | conf | Protein                                                          | idgfam | tdl   |
+------------+------------+--------+------+------------------------------------------------------------------+--------+-------+
| DOID:13189 | Gout       |  3.512 |  1.8 | Alpha-protein kinase 1                                           | Kinase | Tbio  |
| DOID:13189 | Gout       |  3.214 |  1.6 | Serine/threonine-protein kinase SIK1                             | Kinase | Tchem |
| DOID:13189 | Gout       |  2.922 |  1.5 | Melanocortin receptor 3                                          | GPCR   | Tchem |
| DOID:13189 | Gout       |  2.797 |  1.4 | Taste receptor type 2 member 30                                  | GPCR   | Tbio  |
| DOID:13189 | Gout       |  2.576 |  1.3 | Taste receptor type 2 member 16                                  | GPCR   | Tbio  |
| DOID:13189 | Gout       |  2.379 |  1.2 | Hepatocyte nuclear factor 4-gamma                                | NR     | Tbio  |
| DOID:13189 | Gout       |  2.441 |  1.2 | Tyrosine-protein kinase SYK                                      | Kinase | Tchem |
| DOID:13189 | Gout       |  1.948 |  1.0 | cGMP-dependent protein kinase 2                                  | Kinase | Tchem |
| DOID:13189 | Gout       |  1.798 |  0.9 | Pannexin-1                                                       | IC     | Tbio  |
| DOID:13189 | Gout       |  1.517 |  0.8 | Taste receptor type 2 member 38                                  | GPCR   | Tbio  |
| DOID:13189 | Gout       |  1.565 |  0.8 | Transient receptor potential cation channel subfamily A member 1 | IC     | Tclin |
| DOID:13189 | Gout       |  1.531 |  0.8 | Transient receptor potential cation channel subfamily V member 1 | IC     | Tclin |
| DOID:13189 | Gout       |  1.388 |  0.7 | Adenosine kinase                                                 | Kinase | Tchem |
| DOID:13189 | Gout       |  1.427 |  0.7 | Interleukin-1 receptor-associated kinase 1                       | Kinase | Tchem |
| DOID:13189 | Gout       |  1.375 |  0.7 | Transient receptor potential cation channel subfamily M member 3 | IC     | Tbio  |
| DOID:13189 | Gout       |  1.255 |  0.6 | Free fatty acid receptor 4                                       | GPCR   | Tchem |
| DOID:13189 | Gout       |  1.231 |  0.6 | P2X purinoceptor 2                                               | IC     | Tbio  |
| DOID:13189 | Gout       |  1.198 |  0.6 | Proto-oncogene tyrosine-protein kinase Src                       | Kinase | Tclin |
| DOID:13189 | Gout       |  1.108 |  0.6 | Tribbles homolog 1                                               | Kinase | Tbio  |
| DOID:13189 | Gout       |  1.093 |  0.5 | Activin receptor type-1B                                         | Kinase | Tchem |
| DOID:13189 | Gout       |  1.048 |  0.5 | Transient receptor potential cation channel subfamily V member 2 | IC     | Tbio  |
+------------+------------+--------+------+------------------------------------------------------------------+--------+-------+
Disease-gene associations via literature text mining.
![Page 32: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/32.jpg)
Text mining, named entity recognition, term frequency
Natural language processing, Google, Watson, Siri, and the state of the art
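In its simplest form, named entity recognition plus term frequency comes down to two steps: split text into tokens, then count the tokens that match a dictionary of known names. The sketch below assumes that dictionary-lookup approach; production text-mining systems are far more elaborate, and this is not the method behind any table in this talk. The dictionary, the function names and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary of entity names; a real system would load
# curated vocabularies rather than a hand-written dict.
ENTITY_DICTIONARY = {
    "gout": "DISEASE",
    "src": "PROTEIN",
    "syk": "PROTEIN",
}


def tokenize(text: str):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(text: str) -> Counter:
    """Count how often each token occurs."""
    return Counter(tokenize(text))


def recognize_entities(text: str):
    """Return sorted (token, entity_type, count) for tokens found in the dictionary."""
    counts = term_frequency(text)
    return sorted(
        (tok, ENTITY_DICTIONARY[tok], n)
        for tok, n in counts.items()
        if tok in ENTITY_DICTIONARY
    )


# Made-up sentence used only to exercise the code; it states no real finding.
sample = "Gout was mentioned with SYK. Gout was also mentioned with Src and SYK."
print(recognize_entities(sample))
```

Exact dictionary lookup misses synonyms, abbreviations and spelling variants, which is where the language gaps described earlier reappear.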
![Page 33: BioMISS: Language Diversity of Computing](https://reader031.vdocuments.mx/reader031/viewer/2022030312/58ed67491a28ab4e428b4571/html5/thumbnails/33.jpg)
Language Diversity of Computers
Final Thought:
“Can we talk?”*
℅ Joan Rivers, 1933-2014