Post on 18-Dec-2015
AI and Bioinformatics
From Database Mining to the Robot Scientist
History of Bioinformatics
The definition of bioinformatics is debated.
In 1973, Herbert Boyer and Stanley Cohen invented DNA cloning.
By 1977, a method for sequencing DNA had been discovered.
In 1981, the Smith-Waterman algorithm for sequence alignment was published.
History of Bioinformatics
By 1981, 579 human genes had been mapped.
In 1985, the FASTP algorithm was published.
In 1988, the Human Genome Organisation (HUGO) was founded.
History of Bioinformatics
Bioinformatics was fuelled by the need to create huge databases.
AI and heuristic methods can provide key solutions for the new challenges posed by the progressive transformation of biology into a data-massive science.
Data mining
In 1990, the BLAST program was implemented. BLAST (Basic Local Alignment Search Tool) is a program for searching biosequence databases.
History of Bioinformatics
Scientists use computer scripting languages such as Perl and Python.
By 1991, a total of 1,879 human genes had been mapped.
In 1996, Genethon published the final version of the Human Genetic Map, concluding the first phase of the Human Genome Project.
History of Bioinformatics
Year  Subject name              Mbp (millions of base pairs)
1995  Haemophilus influenzae    1.8
1996  Baker's yeast             12.1
1997  E. coli                   4.7
2000  Pseudomonas aeruginosa    6.3
2000  A. thaliana               100
2000  D. melanogaster           180
2001  Human genome              3,000
2002  House mouse               2,500
Bioinformatics Today
There are several important problems where AI approaches are particularly promising:
Prediction of protein structure
Semiautomatic drug design
Knowledge acquisition from genetic data
Functional Genomics and the Robot Scientist
Robot scientist developed by University of Wales researchers
Designed for the study of functional genomics
Tested on yeast metabolic pathways
Utilizes logical and associationist knowledge representation schemes
Ross D. King, et al., Nature, January 2004
The Robot Scientist (image; source: BBC News)
Yeast Metabolic Pathways (figure)
Hypothesis Generation and Experimentation Loop (figure)
Ross D. King, et al., Nature, January 2004
Integration of Artificial Intelligence
Utilizes a Prolog database to store background biological information
Prolog can inspect biological information, infer knowledge, and make predictions
The optimal hypothesis is determined using machine learning, which weighs probabilities and associated cost
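The idea of weighing probabilities against cost can be illustrated with a small Python sketch. This is not the method from King et al.; it is a simplified stand-in that assumes each hypothesis predicts a yes/no outcome for each experiment and scores an experiment by the probability mass it is expected to eliminate per unit cost. The hypothesis names (H1 to H3), experiment names, priors, and costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Experiment:
    name: str                      # hypothetical experiment label
    cost: float                    # hypothetical cost units
    predictions: Dict[str, bool]   # hypothesis name -> predicted outcome


def expected_elimination(priors: Dict[str, float], exp: Experiment) -> float:
    """Expected probability mass of hypotheses ruled out by running exp.

    Assumes the priors sum to 1. Hypotheses are grouped by the outcome
    they predict; an outcome with mass m occurs with probability m and
    rules out the remaining 1 - m, giving sum(m * (1 - m)) = 1 - sum(m^2).
    """
    mass: Dict[bool, float] = {}
    for hypothesis, p in priors.items():
        outcome = exp.predictions[hypothesis]
        mass[outcome] = mass.get(outcome, 0.0) + p
    return 1.0 - sum(m * m for m in mass.values())


def choose_experiment(priors: Dict[str, float],
                      experiments: List[Experiment]) -> Experiment:
    """Pick the experiment with the best expected elimination per unit cost."""
    return max(experiments,
               key=lambda e: expected_elimination(priors, e) / e.cost)


def update(priors: Dict[str, float], exp: Experiment,
           observed: bool) -> Dict[str, float]:
    """Drop hypotheses that mispredicted the outcome, then renormalize."""
    kept = {h: p for h, p in priors.items() if exp.predictions[h] == observed}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("observation contradicts every hypothesis")
    return {h: p / total for h, p in kept.items()}


# Hypothetical priors and experiments, for illustration only.
priors = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
experiments = [
    Experiment("exp_a", 1.0, {"H1": True, "H2": False, "H3": False}),
    Experiment("exp_b", 4.0, {"H1": True, "H2": True, "H3": False}),
    Experiment("exp_c", 1.0, {"H1": True, "H2": True, "H3": True}),
]

best = choose_experiment(priors, experiments)
posterior = update(priors, best, observed=False)
print(best.name, posterior)
```

Repeating the choose, run, update cycle until one hypothesis remains gives a loop in the spirit of the hypothesis generation and experimentation cycle described in these slides. An experiment on which all hypotheses agree, like exp_c here, scores zero because its outcome cannot distinguish between them.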
Experimental Results
Performance similar to humans
Performance significantly better than “naïve” or “random” selection of experiments
For 70% classification accuracy: a hundredth the cost of random, a third the cost of naïve
Ross D. King, et al., Nature, January 2004
Major Challenges and Research Issues
Requires individuals with knowledge of both disciplines
Requires collaboration of individuals from diverse disciplines
Major Challenges and Research Issues
Data generation in biology/bioinformatics is outpacing methods of data analysis
Data interpretation and generation of hypotheses require intelligence
AI offers established methods for knowledge representation and “intelligent” data interpretation
The utilization of AI in bioinformatics is predicted to increase
References and Additional Resources
Ross D. King, Kenneth E. Whelan, Ffion M. Jones, Philip G. K. Reiser, Christopher H. Bryant, Stephen H. Muggleton, Douglas B. Kell & Stephen G. Oliver. Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist. Nature 427 (15 January 2004).
A Short History of Bioinformatics - http://www.netsci.org/Science/Bioinform/feature06.html
History of Bioinformatics - http://www.geocities.com/bioinformaticsweb/his.html
National Center for Biotechnology Information - http://www.ncbi.nih.gov
PubMed - http://www.pubmed.gov