big data e inteligencia artificial y hpc en salud y ... · spanish node of elixir (elixir-es)...
TRANSCRIPT
Plan TL InfoDay21 October 19
Alfonso Valencia. Ph.D.ICREA ProfessorDirector Life Sciences Dept. BSCDirector Spanish Bioinformatics InstituteINB-ISCIII ELIXIR-ES
Big data e inteligencia artificial y HPC en salud y biomedicina
•DATOS
Datos sobre el tiempo
Datos sobre Biología: Genómica, Transcriptómica, Epigenómica, Proteómica, Metabolómica, Lipidómica, ….
Diez mil millones de cell phones
16
Spanish National Bioinformatics Institute (INB)Spanish Node of ELIXIR (ELIXIR-ES)
TransBioNet
network of Bioinformaticians in
Research Institutes of Spanish
Hospitals
(INB hosted)
ELIXIR:
European infrastructure for biological information
13
•DATOS
• INTELIGENCIA ARTIFICIAL
10/12/2019
Advances in AI and HPC go hand by handSince GPUs were first used in AI (2012), computing power available to generate AI models has increased exponentially – and improvements in computing power has been key for AI progress.
Petaflops/dayused to trainneural networks
10/12/2019
10/12/2019
BSC works on Medical Imaging
Glaucoma EpiretinalMembrane
Nevus Macular Degeneration
● Detecting retina pathologies
○ Trained models competitive with ophthalmologists
○ With Lenovo & Hospital Vall Hebron
● Learning from liver conditions
○ Learning about rare diseases
○ With Hospital Clinic
● Predicting and guiding in-vitro success
○ Finding the best embryo ASAP
○ With Hospital Clinic
● Supporting medical doctors on Rx review
○ Aid for Dr. in rural areas
○ With Asepeyo and ICS
By Ulises Cortes and Dario Garcia UPF & BSC
Predicting protein structure from the sequence is one of the fundamental problems in molecular biology.
It is the key to the prediction of the consequences of mutations in human diseases and to drug design
•DATOS
• INTELIGENCIA ARTIFICIAL
•HPC
The Evolution of the Research Paradigm
Numerical Simulation and
Big Data Analysis
• Reduce expense
• Avoid suffering
• Help to build knowledge where experiments are impossible or not affordable
Digital Twin for Future Medicine
Is this scenario possible? When?
By Mariano Vazquez, CASE - BSC
Simulations of biological systems at different levels
Model’s readouts
Drug target
~48 h simulation time, 30 min wall time~2500 cells
Slide from M. Ponce de L, BSC
By Victor Guallar ICREA & BSC
Exploration of the parameter
space
Monitoring and tailoring simulations
during execution time
Analysing the results of the simulations
Large Scale Simulations
Event Recognition System
AI / ML systems
Advances in AI and HPC go hand by handSince GPUs were first used in AI (2012), computing power available to generate AI models has increased exponentially – and improvements in computing power has been key for AI progress.
Petaflops/dayused to trainneural networks
From Jordi Torres
Text mining & Cognitive computing for melanoma research
Bio-Knowledge
corpus
Healthcare dictionarie
s
IBM Text Analytics Catalog
Content Analytics
graph
Bio-entities associations
Functional interpretation
Translational applications
16M abstracts
BSCText mining
WATSONCognitive
computing
In collaboration with IBM Spain
BSC Specialized info Watson General info
Corpus/entities extractionNetwork of entitiesInference of relationships
myastheniaazathioprine
nephrectomy
sarcoidosis
breast cancer
tyrosine kinase inhibitor
potency
apctetanus
hypophysectomy diplopia
dysphagia
lung cancer
aidselastosis
cheilitis
lobectomy
chest pain
leukoderma
autoimmunity pancreatic cancer
pancreatitislymphadenitis
toxoplasmosislymphadenopath
y
colon cancer
vitiligo
hepatomauveitis
ascitessplenectomyirritation
vulvectomy hyperactivityanemia
appendectomy
appendicitis
neurosurgery
sequela
dilation and curettage
endometriosis
fertility
carcinoid
thyroidectomy
hypercalcemia
goiter
photocoagulationamputation
hemosiderosis
scleritis
iritis
retinal detachmen
t
enucleation
atypicality
psoriasis
melanosis
exenteration
neurofibroma
teratoma gastrectomy
amaurosis
LOX
MIF
IL10
SMAD3CTGFSKI
MMP1CAV1TGFB1
NOTCH1 IRF8
IGF1
MST1
ITK
CD81
XIAP
FAS TP53
CCND
BRCA1
BRCA2
CDKN2
CDKN2B
MSH2
MLH1
PTEN
LTBP2
CD55
TFIL6
Word embeddings
100M tokens
v1 0,78 0,65 0,98
v2 0,23 0,12 0,32
v3 0,90 0,32 0,56
v4 0,08 0,43 0,65
v5 0,77 0,88 0,77
|V|
Evaluación intrínsica: cálculo similitud entre términos (sinónimos en SNOMED)
Evaluación extrínsica: comprobar su utilidad en otras tareas PLN (neuroNER)
Word2Vec
fastText
85M tokens
v1
v2
v3
v4
Word vectors
Wo
rd e
mb
edd
ings
MAPK3
H3K9 methylation
cerebellar granule cells
individualized Paediatric Cure (iPC)
Cloud-based virtual-patient models
for precision paediatric oncology
Gender and other biases …
Explainable Artificial Inteligence
10/12/2019