presentation - people isg (intelligent systems group) researching group donostia- san sebastián...
![Page 1: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/1.jpg)
PRESENTATION - people
ISG (Intelligent Systems Group) Research Group
http://www.si.ehu.es/isg
Donostia-San Sebastián, Computer Science Faculty, University of the Basque Country
- Group leader: Pedro Larrañaga
- Ph.D.: Jose Lozano, Endika Bengoetxea, Iñaki Inza
- Ph.D. students: Rosa Blanco, Jose L. Flores, Cristina González, Aritz Pérez, Ramón Sagarna, Guzmán Santafé
- Collaborator: Jose M. Peña (Ph.D., Aalborg University), Rubén Armañanzas
![Page 2: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/2.jpg)
RESEARCH TOPICS
Machine Learning – Data mining:
- Learning of Bayesian networks (learning the joint probability)
- Bayesian networks for (supervised – unsupervised) classification
- Preprocessing tasks: feature subset selection, discretization, imputation of missing values...
Optimization:
- Genetic Algorithms
- Estimation of Distribution Algorithms (EDAs)
- Bayesian networks for optimization in NP-hard problems
Applications:
- Medical applications (brain images, cirrhotic patients, breast cancer, skin melanoma, etc.)
- Bioinformatics: classification in DNA microarrays
- Software testing
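As an illustration of the EDAs listed under Optimization, here is a minimal sketch of a univariate EDA (UMDA) on the toy OneMax problem, which maximizes the number of ones in a bit string. The problem, the function name `umda_onemax`, and all parameter values are assumptions chosen for illustration; the slides do not say which EDA variants or benchmarks the group uses.

```python
import random


def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=0):
    """Univariate EDA (UMDA) maximizing the number of ones in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent Bernoulli model, one parameter per bit
    best = None
    for _ in range(generations):
        # Sample a new population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Truncation selection: keep the fittest individuals.
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Re-estimate the model from the selected individuals, clamping
        # the probabilities so that neither bit value becomes impossible.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, 0.05), 0.95)
    return best, probs


best, probs = umda_onemax()
print(sum(best), "ones out of", len(best))
```

Unlike a genetic algorithm, there is no crossover or mutation here: new candidates come only from sampling the estimated distribution. Replacing the independent per-bit model with a Bayesian network gives the more expressive EDAs that the topic list alludes to.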
![Page 3: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/3.jpg)
SEVERAL RESEARCH PROJECTS
- Data mining in bioinformatics
- Software testing
- ELVIRA project:
  - Open-source code for building and managing Bayesian networks (building, inference, propagation, abduction, classification, explanation...)
  - Written in Java
  - Developed jointly by 5 Spanish universities
  - http://leo.ugr.es/~elvira/
![Page 4: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/4.jpg)
DATA MINING IN BIOINFORMATICS
DNA microarrays
Human Genome Project (U.C. Santa Cruz)
http://genome.ucsc.edu
![Page 5: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/5.jpg)
A DNA microarray sample
- One of the developments within the Genome Project
- From the tissue to the scanned image: tissue, microarray chip, DNA, mRNA, hybridization on a microarray, fluorescent image scanning
- The scanned image reflects the expression level of thousands of genes at a time
![Page 6: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/6.jpg)
A DNA MICROARRAY COLLECTION
Rows: genes. Columns: cases, samples, biopsies, tissues, ‘cell lines’...
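A small sketch of this layout may help. The gene names, class labels, and expression values below are made up for illustration. Because classifiers usually expect one row per case, the genes-by-samples matrix is transposed before supervised learning.

```python
# Hypothetical toy collection: 4 genes (rows) x 3 samples (columns).
gene_names = ["gene_A", "gene_B", "gene_C", "gene_D"]
sample_labels = ["tumor", "normal", "tumor"]
expression = [
    [2.1, 0.3, 1.9],  # gene_A across the three samples
    [0.5, 0.4, 0.6],  # gene_B
    [1.2, 3.3, 1.0],  # gene_C
    [0.0, 0.1, 0.2],  # gene_D
]

# Transpose so that each row is one sample described by all its genes.
samples = [list(column) for column in zip(*expression)]

# Pair every sample with its class label for supervised learning.
dataset = list(zip(samples, sample_labels))
for features, label in dataset:
    print(label, features)
```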
![Page 7: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/7.jpg)
SEVERAL MICROARRAY DATASETS

| Dataset | Genes | Characteristic of each tissue |
| --- | --- | --- |
| Colon | 2,000 | Biopsy: ‘tumor’ vs. ‘normal’ |
| Leukemia | 7,129 | Leukemia type: AML, ALL |
| NCI-60 | 1,376 | 9 types of tumor |
| Alizadeh’00 | 2,984 | 2 types of lymphoma: ‘germinal center B-like’, ‘activated B-like’ |
| Chen’02 | 17,400 | Hepatocellular carcinoma vs. not liver cancer |
| Garber’01 | >24,000 | Subtypes of lung cancer |
![Page 8: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/8.jpg)
PROBLEM GOAL-TASK
The usual approach for biologists:
- Hierarchical clustering of genes
- Hierarchical clustering of tissues
Focusing on the specific nature of each tissue:
- Building a supervised model that accurately predicts the specific nature or characteristic of future and doubtful tissues:
  - cancer vs. normal
  - benign vs. malignant tumor
  - specific type of cancer, ...
![Page 9: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/9.jpg)
Our work: selection of relevant genes in DNA microarray SUPERVISED tasks
- Small area within bioinformatics
- Huge dimensionality (> 1,000 genes): a model cannot be learned directly from all genes, so gene selection is a crucial task
- Application goals:
  - Development of drugs that act on the relevant genes
  - Therapy development
  - Diagnostic purposes
- Supervised tasks (e.g., benign vs. malignant tumor)
- Literature: Golub et al.’99, Brazma’00, Friedman’00, Xing & Jordan’01...
- For a specific disease, 10-15 genes seem relevant
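To make the idea of keeping only a handful of relevant genes concrete, here is a minimal sketch of a filter-style ranking. It scores each gene with a signal-to-noise ratio in the spirit of Golub et al.’99 (difference of class means divided by the sum of class standard deviations) and keeps the top-scoring genes. The toy data, the function names, and the use of the population standard deviation are assumptions for illustration, not the group's actual procedure.

```python
from statistics import mean, pstdev


def signal_to_noise(values, labels, positive):
    """Signal-to-noise score of one gene for a two-class problem."""
    pos = [v for v, y in zip(values, labels) if y == positive]
    neg = [v for v, y in zip(values, labels) if y != positive]
    # A small constant avoids division by zero for constant genes.
    spread = pstdev(pos) + pstdev(neg) + 1e-9
    return (mean(pos) - mean(neg)) / spread


def rank_genes(expression, labels, positive, top_k):
    """Return the indices of the top_k genes by absolute score."""
    scores = [(abs(signal_to_noise(row, labels, positive)), g)
              for g, row in enumerate(expression)]
    scores.sort(reverse=True)
    return [g for _, g in scores[:top_k]]


# Hypothetical toy data: rows are genes, columns are samples.
labels = ["tumor", "tumor", "normal", "normal"]
expression = [
    [5.0, 5.2, 1.0, 1.2],  # gene 0: strongly over-expressed in tumors
    [2.0, 3.0, 2.5, 2.4],  # gene 1: uninformative
    [1.0, 2.0, 4.0, 5.0],  # gene 2: under-expressed in tumors
]
top = rank_genes(expression, labels, positive="tumor", top_k=2)
print(top)
```

A filter such as this scores every gene independently of any classifier, which keeps it cheap even with thousands of genes.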
![Page 10: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/10.jpg)
OUR APPROACH TO GENE SELECTION
- Search algorithms: sequential (forward), EDAs...
- Wrapper and filter evaluation functions
- Classification algorithms: naive Bayes and Bayesian networks, K-NN, IF-THEN rules...
- In-house software and freeware (ELVIRA, WEKA, MLC++...)
- Our ‘Talón de Aquiles’ (Achilles' heel, weak point): biological interpretation of induced models and selected genes, validity of the obtained recognition accuracy...
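The combination of a sequential forward search with a wrapper evaluation function can be sketched as follows. In a wrapper, each candidate gene subset is scored by the estimated accuracy of a classifier trained on that subset. The choice of 1-NN, leave-one-out estimation, the stopping rule, and the toy data are all assumptions made for this illustration; the slides do not specify these details.

```python
def loo_accuracy_1nn(samples, labels, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given gene indices."""
    correct = 0
    for i, x in enumerate(samples):
        best_dist, best_label = None, None
        for j, z in enumerate(samples):
            if i == j:
                continue  # the held-out sample cannot be its own neighbour
            dist = sum((x[g] - z[g]) ** 2 for g in genes)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(samples)


def forward_selection(samples, labels, max_genes):
    """Greedily add the gene that most improves the wrapper score."""
    selected, best_acc = [], 0.0
    n_genes = len(samples[0])
    while len(selected) < max_genes:
        candidates = [g for g in range(n_genes) if g not in selected]
        if not candidates:
            break
        scored = [(loo_accuracy_1nn(samples, labels, selected + [g]), g)
                  for g in candidates]
        # Highest accuracy wins; ties go to the lowest gene index.
        acc, gene = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best_acc:
            break  # stop when no candidate improves the estimate
        selected.append(gene)
        best_acc = acc
    return selected, best_acc


# Hypothetical toy data: rows are samples, columns are genes.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
samples = [
    [5.0, 0.3, 2.0],
    [5.1, 0.9, 1.0],
    [4.9, 0.1, 3.0],
    [1.0, 0.8, 2.1],
    [1.1, 0.2, 1.1],
    [0.9, 0.5, 2.9],
]
selected, acc = forward_selection(samples, labels, max_genes=2)
print(selected, acc)
```

With so few samples, an accuracy estimated inside the search tends to be optimistic, which relates to the concern about the validity of the obtained recognition accuracy noted above.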
![Page 11: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/11.jpg)
PUBLICATIONS IN BIOINFORMATICS
- R. Blanco, P. Larrañaga, I. Inza, B. Sierra (2004). “Gene selection for cancer classification using wrapper approaches”. International Journal of Pattern Recognition and Artificial Intelligence.
- I. Inza, P. Larrañaga, R. Blanco, A. J. Cerrolaza (2003). “Filter versus wrapper gene selection approaches in DNA microarray domains”. Artificial Intelligence in Medicine Journal. Special issue on “Data mining in Genomics and Proteomics”.
- I. Inza, B. Sierra, R. Blanco, P. Larrañaga (2002). “Gene selection by sequential search wrapper approaches in microarray cancer class prediction”. Journal of Intelligent and Fuzzy Systems. Special issue on Bioinformatics.
![Page 12: PRESENTATION - people ISG (Intelligent Systems Group) Researching Group Donostia- San Sebastián Computer Science Faculty - University](https://reader035.vdocuments.mx/reader035/viewer/2022070323/56649d805503460f94a63cc5/html5/thumbnails/12.jpg)
INTERESTING REFERENCES
Conferences:
- ISMB: International Conference on Intelligent Systems for Molecular Biology
- ECCB: European Conference on Computational Biology
- CAMDA: Critical Assessment of Microarray Data Analysis
- WABI: Workshop on Algorithms in Bioinformatics
Reference journal: “Bioinformatics”, plus special issues of machine learning journals on the topic
Web sites:
- Stanford Genomic Resources
- Stanford Microarray Database
- http://www.gene-chips.com/
- Hebrew University (N. Friedman, D. Pe’er, I. Nachman...)
- Tel Aviv University (R. Shamir)
- Human Genome Working Draft: http://genome.ucsc.edu