conference on biological informatics 6-8 july 1998 australian academy of science, canberra,...

160
Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Upload: olivia-ray

Post on 27-Mar-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Conference on

BIOLOGICAL INFORMATICS

6-8 July 1998

Australian Academy of Science, Canberra, Australia

Page 2: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

What is Bioinformatics?( http://www.esp.org/rjr/canberra.pdf )

Robert J. RobbinsFred Hutchinson Cancer Research Center

1100 Fairview Avenue North, LV-101Seattle, Washington 98109

[email protected]://www.esp.org/rjr

(206) 667 2920

Page 3: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

AbstractAbstract

In the last 25 years, Moore's Law has transformed society, delivering exponentially better computers at exponentially lower prices. Bioinformatics is the application of powerful, affordable information technology to the problems of biology. With $2500 desktop PCs now delivering more raw computing power than the first Cray, bioinformatics is rapidly becoming the critical technology for 21st Century biology.

DNA is legitimately seen as a biological mass-storage device, making bioinformatics a sine qua non for genomic research. Others areas of biological investigation are equally information rich –– an exhaustive tabulation of the Earth's biodiversity would involve a cross–index of the millions of known species against the approximately 500,000,000,000,000 square meters of the Earth's surface.

Bioinformatics is also becoming a scholarly discipline in its own right, melding information science with computer science, seasoning it with engineering methods, and applying it to the most information rich component of the known universe –– the Biosphere.

Page 4: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

4

What is Bioinformatics?

Bioinformatics is:

• the use of computers in pursuit of biological research.

• an emerging new discipline, with its own goals, research program, and practitioners.

• the sine qua non for 21st Century biology.

• all of the above.

Bioinformatics is:

• the use of computers in pursuit of biological research.

• an emerging new discipline, with its own goals, research program, and practitioners.

• the sine qua non for 21st Century biology.

• all of the above.

Page 5: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

5

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

Page 6: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

6

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

Page 7: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

7

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

Page 8: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

8

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

Page 9: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

9

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

Page 10: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

10

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

• A connected, federated information infrastructure for biology is needed.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

• A connected, federated information infrastructure for biology is needed.

Page 11: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

11

Topics

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

• A connected, federated information infrastructure for biology is needed.

• Current support for public bio-information infrastructure seems inadequate.

• Biotechnology and information technology will be the “magic” technologies of the 21st Century.

• Moore’s Law constantly transforms IT (and everything else).

• Information Technology (IT) has a special relationship with biology.

• 21st-Century biology will be based on bioinformatics.

• Bioinformatics is emerging as an independent discipline.

• A connected, federated information infrastructure for biology is needed.

• Current support for public bio-information infrastructure seems inadequate.

Page 12: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Introduction

MagicalTechnology

Page 13: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

13

Magic

To a person from 1897, much current technology would seem like magic.

Page 14: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

14

Magic

To a person from 1897, much current technology would seem like magic.

What technology of 2097 would seem magical to a person from 1997?

Page 15: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

15

Magic

To a person from 1897, much current technology would seem like magic.

What technology of 2097 would seem magical to a person from 1997?

Candidate: Biotechnology so advanced that the distinction between living and non-living is blurred.

Information technology so advanced that access to information is immediate and universal.

Candidate: Biotechnology so advanced that the distinction between living and non-living is blurred.

Information technology so advanced that access to information is immediate and universal.

Page 16: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

16

Magic

To a person from 1897, much current technology would seem like magic.

What technology of 2097 would seem magical to a person from 1997?

Candidate: Biotechnology so advanced that the distinction between living and non-living is blurred.

Information technology so advanced that access to information is immediate and universal.

Candidate: Biotechnology so advanced that the distinction between living and non-living is blurred.

Information technology so advanced that access to information is immediate and universal.

Page 17: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Moore’s Law

Transforms InfoTech(and everything else)

Page 18: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

18

Moore’s Law: The Statement

Every eighteen months, the number of transistors that can be placed on a chip doubles.

Gordon Moore, co-founder of Intel...Gordon Moore, co-founder of Intel...

Page 19: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

19

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Time

100,000

10,000

1,000

100

10

Page 20: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

20

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

Page 21: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

21

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

Three Phases of Novel IT Applications

• It’s Impossible

Page 22: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

22

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

Page 23: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

23

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

• It’s Overdue

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

• It’s Overdue

Page 24: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

24

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 25: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

25

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 26: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

26

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 27: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

27

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 28: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

28

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

C

Page 29: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

29

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

C

Page 30: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

30

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

AA

C

Page 31: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

31

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

A

C

Page 32: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

32

Moore’s Law: The Effect

Per

form

ance

(con

stan

t cos

t)

Cos

t(c

onst

ant

perf

orm

ance

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

A

C

Relevance for biology?Relevance for biology?

Page 33: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

33

Cost (constant performance)

1975 1980 1985 1990 1995 2000 2005

1,000

10,000

100,000

1,000,000

10,000,000 UniversityPurchase

Page 34: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

34

Cost (constant performance)

1975 1980 1985 1990 1995 2000 2005

1,000

10,000

100,000

1,000,000

10,000,000 UniversityPurchase

DepartmentPurchase

Page 35: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

35

Cost (constant performance)

1975 1980 1985 1990 1995 2000 2005

1,000

10,000

100,000

1,000,000

10,000,000

RO1 GrantPurchase

UniversityPurchase

DepartmentPurchase

Page 36: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

36

Cost (constant performance)

1975 1980 1985 1990 1995 2000 2005

1,000

10,000

100,000

1,000,000

10,000,000

PersonalPurchase

RO1 GrantPurchase

UniversityPurchase

DepartmentPurchase

Page 37: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

37

Cost (constant performance)

1975 1980 1985 1990 1995 2000 2005

1,000

10,000

100,000

1,000,000

10,000,000

PersonalPurchase

RO1 GrantPurchase

UniversityPurchase

DepartmentPurchase

UnplannedPurchases

Page 38: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

IT-BiologySynergismIT-BiologySynergism

Page 39: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

39

IT is Special

Information Technology:

• affects the performance and the management of tasks

Information Technology:

• affects the performance and the management of tasks

Page 40: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

40

IT is Special

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

Page 41: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

41

IT is Special

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

• is incredibly plastic

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

• is incredibly plastic (programming and poetry are both exercises in pure thought)

Page 42: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

42

IT is Special

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

• is incredibly plastic

Information Technology:

• affects the performance and the management of tasks

• allows the manipulation of huge amounts of highly complex data

• is incredibly plastic (programming and poetry are both exercises in pure thought)

• improves exponentially (Moore’s Law)

Page 43: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

43

Biology is Special

Life is Characterized by:

• individuality

Life is Characterized by:

• individuality

Page 44: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

44

Biology is Special

Life is Characterized by:

• individuality

• historicity

Life is Characterized by:

• individuality

• historicity

Page 45: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

45

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

Life is Characterized by:

• individuality

• historicity

• contingency

Page 46: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

46

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information content

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information content

Page 47: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

47

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information content

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information contentNo law of large numbers... No law of large numbers...

Page 48: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

48

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information content

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information contentNo law of large numbers, since every living thing is genuinely unique.

No law of large numbers, since every living thing is genuinely unique.

Page 49: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

49

IT-Biology Synergism

• Physics needs calculus, the method for manipulating information about statistically large numbers of vanishingly small, independent, equivalent things.

• Physics needs calculus, the method for manipulating information about statistically large numbers of vanishingly small, independent, equivalent things.

Page 50: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

50

IT-Biology Synergism

• Physics needs calculus, the method for manipulating information about statistically large numbers of vanishingly small, independent, equivalent things.

• Biology needs information technology, the method for manipulating information about large numbers of dependent, historically contingent, individual things.

• Physics needs calculus, the method for manipulating information about statistically large numbers of vanishingly small, independent, equivalent things.

• Biology needs information technology, the method for manipulating information about large numbers of dependent, historically contingent, individual things.

Page 51: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

51

Biology is Special

For it is in relation to the statistical point of view that the structure of the vital parts of living organisms differs so entirely from that of any piece of matter that we physicists and chemists have ever handled in our laboratories or mentally at our writing desks.

For it is in relation to the statistical point of view that the structure of the vital parts of living organisms differs so entirely from that of any piece of matter that we physicists and chemists have ever handled in our laboratories or mentally at our writing desks.

Erwin Schrödinger. 1944. What is Life.Erwin Schrödinger. 1944. What is Life.

Page 52: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

52

Genetics as Code

[The] chromosomes ... contain in some kind of code-script the entire pattern of the individual's future development and of its functioning in the mature state. ... [By] code-script we mean that the all-penetrating mind, once conceived by Laplace, to which every causal connection lay immediately open, could tell from their structure whether [an egg carrying them] would develop, under suitable conditions, into a black cock or into a speckled hen, into a fly or a maize plant, a rhodo-dendron, a beetle, a mouse, or a woman.

[The] chromosomes ... contain in some kind of code-script the entire pattern of the individual's future development and of its functioning in the mature state. ... [By] code-script we mean that the all-penetrating mind, once conceived by Laplace, to which every causal connection lay immediately open, could tell from their structure whether [an egg carrying them] would develop, under suitable conditions, into a black cock or into a speckled hen, into a fly or a maize plant, a rhodo-dendron, a beetle, a mouse, or a woman.

Erwin Schrödinger. 1944. What is Life.Erwin Schrödinger. 1944. What is Life.

Page 53: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

53

One Human SequenceWe now know that Schrödinger’s mysterious human “code-script” consists of 3.3 billion base pairs of DNA.

Page 54: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

54

One Human Sequence

Typed in 10-pitch font, one human sequence would stretch for more than 5,000 miles. Digitally formatted, it could be stored on one CD-ROM. Biologically encoded, it fits easily within a single cell.

We now know that Schrödinger’s mysterious human “code-script” consists of 3.3 billion base pairs of DNA.

Page 55: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

55

One Human Sequence

For details on Ripley’s interest in DNA, see

HTTP://LX1.SELU.COM/~rjr/factoids/genlen.html

A variant of this factoid actually made it into Ripley’s Believe It or Not, but that’s another story...

Page 56: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

56

Bio-digital Information

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in the DNA of a side of beef than in all the hard drives of all the world’s computers.

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in the DNA of a side of beef than in all the hard drives of all the world’s computers.

Page 57: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

57

Bio-digital Information

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in the DNA of a side of beef than in all the hard drives of all the world’s computers.

• Storing all of the (redundant) information in all of the world’s DNA on computer hard disks would require that the entire surface of the Earth be covered to a depth of three miles in Conner 1.0 gB drives.

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in the DNA of a side of beef than in all the hard drives of all the world’s computers.

• Storing all of the (redundant) information in all of the world’s DNA on computer hard disks would require that the entire surface of the Earth be covered to a depth of three miles in Conner 1.0 gB drives.

Page 58: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Genomics:An ExampleGenomics:

An Example

Page 59: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

59

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– construction of a high-resolution genetic map of the human genome;

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

Page 60: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

60

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– construction of a high-resolution genetic map of the human genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

Page 61: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

61

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

– construction of a high-resolution genetic map of the human genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

Page 62: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

62

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

– development of capabilities for collecting, storing, distributing, and analyzing the data produced;

– construction of a high-resolution genetic map of the human genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

– development of capabilities for collecting, storing, distributing, and analyzing the data produced;

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

Page 63: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

63

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

– development of capabilities for collecting, storing, distributing, and analyzing the data produced;

– creation of appropriate technologies necessary to achieve these objectives.

– construction of a high-resolution genetic map of the human genome;

– production of a variety of physical maps of all human chromosomes and of the DNA of selected model organisms;

– determination of the complete sequence of human DNA and of the DNA of selected model organisms;

– development of capabilities for collecting, storing, distributing, and analyzing the data produced;

– creation of appropriate technologies necessary to achieve these objectives.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

USDOE. 1990. Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years.

Page 64: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

64

Infrastructure and the HGP

Progress towards all of the [Genome Project] goals will require the establishment of well-funded centralized facilities, including a stock center for the cloned DNA fragments generated in the mapping and sequencing effort and a data center for the computer-based collection and distribution of large amounts of DNA sequence information.

Progress towards all of the [Genome Project] goals will require the establishment of well-funded centralized facilities, including a stock center for the cloned DNA fragments generated in the mapping and sequencing effort and a data center for the computer-based collection and distribution of large amounts of DNA sequence information.

National Research Council. 1988. Mapping and Sequencing the Human Genome. Washington, DC: National Academy Press. p. 3

National Research Council. 1988. Mapping and Sequencing the Human Genome. Washington, DC: National Academy Press. p. 3

Page 65: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

65

GenBank Totals (Release 103)

DIVISION

Phage Sequences (PHG)Viral Sequences (VRL)

Bacteria (BCT)

Plant, Fungal, and Algal Sequences (PLN)

Invertebrate Sequences (INV)

Rodent Sequences (ROD)Primate Sequences (PRI1–2)

Other Mammals (MAM)Other Vertebrate Sequences (VRT)

High-Throughput Genome Sequences (HTG)

Genome Survey Sequences (GSS)Structural RNA Sequences (RNA)

Sequence Tagged Sites Sequences (STS)Patent Sequences (PAT)

Synthetic Sequences (SYN)Unannotated Sequences (UNA)

EST1-17

TOTALS

Entries

1,313 45,355 38,023

44,553

29,657

36,967 75,587 12,744 17,713

1,120

42,628 4,802 52,824 87,767 2,577 2,480

1,269,737

1,765,847

Base Pairs

2,138,810 44,484,848 88,576,641

92,259,434

105,703,550

45,437,309 134,944,314 12,358,310 17,040,159

72,064,395

22,783,326 2,487,397 18,161,532 27,593,724 5,698,945 1,933,676

466,634,317

1,160,300,687

Per Cent

0.184%3.834%7.634%

7.951%

9.110%

3.916%11.630%

1.065%1.469%

6.211%

1.964%0.214%1.565%2.378%0.491%0.167%

40.217%

100.000%

Per Cent

0.074%2.568%2.153%

2.523%

1.679%

2.093%4.280%0.722%1.003%

0.063%

2.414%0.272%2.991%4.970%0.146%0.140%

71.905%

100.000%

Page 66: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

66

Base Pairs in GenBank

0

200,000,000

400,000,000

600,000,000

800,000,000

1,000,000,000

1,200,000,000

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105

GenBank Release Numbers

9493929190898887 95 96 97

Page 67: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

67

Base Pairs in GenBank

0

200,000,000

400,000,000

600,000,000

800,000,000

1,000,000,000

1,200,000,000

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105

GenBank Release Numbers

9493929190898887 95 96 97

Growth in GenBank is exponential. More data were added in the last ten weeks than were added in the first ten years of the project.

Growth in GenBank is exponential. More data were added in the last ten weeks than were added in the first ten years of the project.

Page 68: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

68

Base Pairs in GenBank

0

200,000,000

400,000,000

600,000,000

800,000,000

1,000,000,000

1,200,000,000

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105

GenBank Release Numbers

9493929190898887 95 96 97

Growth in GenBank is exponential. More data were added in the last ten weeks than were added in the first ten years of the project.

Growth in GenBank is exponential. More data were added in the last ten weeks than were added in the first ten years of the project.

At this rate, what’s next...At this rate, what’s next...

Page 69: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

69

ABI Bass-o-Matic Sequencer

In with the sample, out with the sequence...

TGCGCATCGCGTATCGATAG

speed

gB/sec

EnterDefrost

7 8 9

4 5 6

1 2 3

0

+

-

Page 70: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

70

What’s Really Next

The post-genome era in biological research will take for granted ready access to huge amounts of genomic data.

The challenge will be understanding those data and using the understanding to solve real-world problems...

Page 71: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

71

Base Pairs in GenBank

0

20,000,000

40,000,000

60,000,000

80,000,000

100,000,000

120,000,000

140,000,000

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105

GenBank Release Numbers

9493929190898887 95 96 97

Net ChangesNet Changes

Page 72: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

72

Base Pairs in GenBank (Percent Increase)

0.00%

10.00%

20.00%

30.00%

40.00%

50.00%

60.00%

70.00%

80.00%

90.00%

100.00%

85 86 87 88 89 90 91 92 93 94 95 96

Year

Percent Increaseaverage = 56%

Percent Increaseaverage = 56%

Page 73: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

73

Projected Base Pairs

110

1001,000

10,000100,000

1,000,00010,000,000

100,000,0001,000,000,000

10,000,000,000100,000,000,000

1,000,000,000,00010,000,000,000,000

100,000,000,000,000

90 95 0 5 10 15 20 25

Year

Assumed annual growth rate: 50%(less than current rate)

Page 74: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

74

Projected Base Pairs

110

1001,000

10,000100,000

1,000,00010,000,000

100,000,0001,000,000,000

10,000,000,000100,000,000,000

1,000,000,000,00010,000,000,000,000

100,000,000,000,000

90 95 0 5 10 15 20 25

Year

Assumed annual growth rate: 50%(less than current rate)

Is this crazy?One trillion bp by 2015100 trillion by 2025

Is this crazy?One trillion bp by 2015100 trillion by 2025

Page 75: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

75

Projected Base Pairs

110

1001,000

10,000100,000

1,000,00010,000,000

100,000,0001,000,000,000

10,000,000,000100,000,000,000

1,000,000,000,00010,000,000,000,000

100,000,000,000,000

90 95 0 5 10 15 20 25

Year

Is this crazy?One trillion bp by 2015100 trillion by 2025

Is this crazy?One trillion bp by 2015100 trillion by 2025

500,00050,0005,000

Maybe not...Maybe not...

Projected database size, indicated as the number of base pairs per individual medical record in the US.

Page 76: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

21st CenturyBiology

Post-Genome Era

Page 77: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

77

The Post-Genome Era

Post-genome research involves:

• applying genomic tools and knowledge to more general problems

• asking new questions, tractable only to genomic or post-genomic analysis

• moving beyond the structural genomics of the human genome project and into the functional genomics of the post-genome era

Post-genome research involves:

• applying genomic tools and knowledge to more general problems

• asking new questions, tractable only to genomic or post-genomic analysis

• moving beyond the structural genomics of the human genome project and into the functional genomics of the post-genome era

Page 78: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

78

The Post-Genome Era

Suggested definition:

• functional genomics = biology

Suggested definition:

• functional genomics = biology

Page 79: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

79

The Post-Genome Era

An early analysis:An early analysis:

Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.

Page 80: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

80

Paradigm Shift in Biology

To use [the] flood of knowledge, which will pour across the computer networks of the world, biologists not only must become computer literate, but also change their approach to the problem of understanding life.

To use [the] flood of knowledge, which will pour across the computer networks of the world, biologists not only must become computer literate, but also change their approach to the problem of understanding life.

Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.

Page 81: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

81

Paradigm Shift in Biology

The new paradigm, now emerging, is that all the ‘genes’ will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis.

The new paradigm, now emerging, is that all the ‘genes’ will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis.

Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.

Page 82: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

82

Paradigm Shift in Biology

Case of Microbiology

< 5,000 known and described bacteria

5,000,000 base pairs per genome

25,000,000,000 TOTAL base pairs

If a full, annotated sequence were available for all known bacteria, the practice of microbiology would match Gilbert’s prediction.

If a full, annotated sequence were available for all known bacteria, the practice of microbiology would match Gilbert’s prediction.

Page 83: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

21st CenturyBiology

The Science

Page 84: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

84

Fundamental Dogma

The fundamental dogma of molecular biology is that genes act to create phenotypes through a flow of information from DNA to RNA to proteins, to interactions among proteins (regulatory circuits and metabolic pathways), and ultimately to phenotypes.

Collections of individual phenotypes, of course, constitute a population.

DNA

RNA

Proteins

Circuits

Phenotypes

Populations

Page 85: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

85

Fundamental DogmaDNA

RNA

Proteins

Circuits

Phenotypes

Populations

GenBankEMBLDDBJ

MapDatabases

SwissPROTPIR

PDB

Although a few databases already exist to distribute molecular information,

Although a few databases already exist to distribute molecular information,

Page 86: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

86

Fundamental DogmaDNA

RNA

Proteins

Circuits

Phenotypes

Populations

GenBankEMBLDDBJ

MapDatabases

SwissPROTPIR

PDB

Gene Expression?

Clinical Data ?

Regulatory Pathways?Metabolism?

Biodiversity?

Neuroanatomy?

Development ?

Molecular Epidemiology?

Comparative Genomics?

the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings.

the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings.

Although a few databases already exist to distribute molecular information,

Although a few databases already exist to distribute molecular information,

Page 87: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

21st CenturyBiology

The Literature

Page 88: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

88

P I R -- Beta Hemoglobin----------------------------------------------DEFINITION

[HBHU] Hemoglobin beta chain - Human, chimpanzee, pygmy chimpanzee, and gorilla

SUMMARY [SUM] #Type Protein #Molecular-weight 15867 #Length 146 #Checksum 1242

SEQUENCEV H L T P E E K S A V T A L W G K V N V D E V G G E A L G R L L V V Y P W T Q R F F E S F G D L S T P D A V M G N P K V K A H G K K V L G A F S D G L A H L D N L K G T F A T L S E L H C D K L H V D P E N F R L L G N V L V C

G D B -- Beta Hemoglobin---------------------------------------------- ** Locus Detail View **-------------------------------------------------------- Symbol: HBB Name: hemoglobin, beta MIM Num: 141900 Location: 11p15.5 Created: 01 Jan 86 00:00-------------------------------------------------------- ** Polymorphism Table **--------------------------------------------------------Probe Enzyme =================================beta-globin cDNA RsaI beta-globin cDNA,JW10+ AvaII Pstbeta,JW102,BD23,pB+ BamHI pRK29,Unknown HindIIbeta-IVS2 probe HphI IVS-2 normal HphI Unknown AvrII beta-IVS2 probe AsuI

Electronic Data Publishing

O M I M -- Beta Hemoglobin----------------------------------------------Title*141900 HEMOGLOBIN--BETA LOCUS [HBB; SICKLE CELL ANEMIA, INCLUDED; BETA-THALASSEMIAS, INCLUDED; HEINZ BODY ANEMIAS, BETA-GLOBIN TYPE, ...]

The alpha and beta loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. By autoradiography using heavy-labeled hemoglobin-specific messenger RNA, Price et al. (1972) found labeling of a chromosome 2 and a group B chromo- some. They concluded, incorrectly as it turned out, that the beta-gamma-delta linkage group was on a group B chromosome since the zone of labeling was longer on that chromosome than on chromosome 2 (which by this reasoning

GenBank -- Beta Hemoglobin----------------------------------------------DEFINITION [DEF] [HUMHBB] Human beta globin regionLOCUS [LOC] HUMHBB ACCESSION NO. [ACC]J00179 J00093 J00094 J00096 J00158 J00159 J00160 J00161

KEYWORDS [KEY]Alu repetitive element; HPFH; KpnI repetitive sequence; RNA polymerase III; allelic variation; alternate cap site;

SEQUENCEgaattctaatctccctctcactactgtctagtatccctcaaggagtggtggctcatgtcttgagctcaagagtttgatataaaaaaaaattagccaggcaaatgggaggatcccttgagcgcactcca

Page 89: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

89

Electronic Scholarly Publishing

HTTP://WWW.ESP.ORG

The ESP site is dedicated to the electronic publishing of scientific and other scholarly materials. Of particular interest are the history of science, genetics, computational biology, and genome research.

Page 90: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

90

Electronic Scholarly Publishing

The Classical Genetics: Foundations series provides ready access to typeset-quality, electronic editions of important publications that can otherwise be very difficult to find.

Page 91: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

91

Electronic Scholarly Publishing

“Hardy” (of Hardy-Weinberg) is a name well known to most students of biology.

Page 92: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

92

Electronic Scholarly Publishing

But how many have read, or even seen, all of Hardy’s biological writings?

This is it: A single, one-page letter to the editor of Science.

Page 93: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

93

Electronic Scholarly Publishing

http://www.esp.org/books/darwin/beagle

Entire monographs can be made instantly available to readers world-wide..

Page 94: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

94

Electronic Scholarly Publishing

Today’s computer technology was nearly unimaginable just ten years ago. The technol-ogy of ten years from now will also bring many surprises.

How is it that IT can maintain such an amazing rate of sustained change?

And what, if any, are the implications of that rate of change for biology?

Page 95: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

95

Traditional Publishing

ScientificLiterature

ResearcherResearcher

Print publication seems straightforward, ...Print publication seems straightforward, ...

Page 96: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

96

Traditional Publishing

ScientificLiterature

ResearcherResearcher

... with an infrastructure that is largely invisible, ... ... with an infrastructure that is largely invisible, ...

Creationand

PublicationInfrastructure

Distributionand

ManagementInfrastructure

Page 97: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

97

Traditional Publishing

Management

Classify

Acquire

Order

Catalog

Shelve

Circulate

Research

BenchWork

Analyze

AcquireLiterature

InterpretData

Synthesis

Write

Distribution

Pick

ProcessOrders

Store

Package

Address

Ship &Invoice

ResearchDraft

ScientificLiterature

Collections& Libraries

Publishing

Edit

Revise

Review

Typeset

Print

Bind

ResearcherResearcher

... yet essential.... yet essential.

Page 98: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

98

Electronic Publishing

Management

???

???

???

???

???

???

Research

BenchWork

Analyze

AcquireLiterature

InterpretData

Synthesis

Write

Distribution

Edit

Review

DatabaseOperation

UserAssistance

DistributeOutput

Access

DateSubmission

ScientificDatabase

Federations& Libraries

Publishing

Revise

Review

Submit

Accession

Edit

Release

ResearcherResearcher

Some of the needed infrastructure is undefined.Some of the needed infrastructure is undefined.

Page 99: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

21st CenturyBiology

The People

Page 100: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

100

Human Resources Issues

• Reduction in need for non-IT staff• Reduction in need for non-IT staff

Page 101: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

101

Human Resources Issues

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially “information engineers”

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially “information engineers”

Page 102: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

102

Human Resources Issues

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially “information engineers”

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially “information engineers”

In modern biology, a general trend is to convert expert work into staff work and finally into computation. New expertise is required to design, carry out, and interpret continuing work.

In modern biology, a general trend is to convert expert work into staff work and finally into computation. New expertise is required to design, carry out, and interpret continuing work.

Page 103: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

103

Human Resources Issues

Elbert Branscomb: “You must recognize that some day you may need as many computer scientists as biologists in your labs.”

Elbert Branscomb: “You must recognize that some day you may need as many computer scientists as biologists in your labs.”

Page 104: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

104

Human Resources Issues

Elbert Branscomb: “You must recognize that some day you may need as many computer scientists as biologists in your labs.”

Craig Venter: “At TIGR, we already have twice as many computer scientists on our staff.”

Elbert Branscomb: “You must recognize that some day you may need as many computer scientists as biologists in your labs.”

Craig Venter: “At TIGR, we already have twice as many computer scientists on our staff.”

Exchange at DOE workshop on high-throughput sequencing.

Exchange at DOE workshop on high-throughput sequencing.

Page 105: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

New Discipline of Informatics

New Discipline of Informatics

Page 106: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

106

What is Informatics?

Informatics

ComputerScience

Research

BiologicalApplicationPrograms

Page 107: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

107

What is Informatics?

Informatics combines expertise from:

• domain science (e.g., biology)

• computer science

• library science

• management science

Informatics combines expertise from:

• domain science (e.g., biology)

• computer science

• library science

• management science

All tempered with an engineering mindset...All tempered with an engineering mindset...

Page 108: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

108

What is Informatics?

ISIS

MedicalInformatics

MedicalInformatics

BioInformatics

BioInformatics

OtherInformatics

OtherInformatics

LibraryScience

LibraryScience

ComputerScience

ComputerScience

MgtScience

MgtScience

DomainKnowledge

EngineeringPrinciples

Page 109: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

109

Engineering Mindset

Engineering is often defined as the use of scientific knowledge and principles for practical purposes. While the original usage restricted the word to the building of roads, bridges, and objects of military use, today's usage is more general and includes chemical, electronic, and even mathematical engineering.

Engineering is often defined as the use of scientific knowledge and principles for practical purposes. While the original usage restricted the word to the building of roads, bridges, and objects of military use, today's usage is more general and includes chemical, electronic, and even mathematical engineering.

Parnas, David Lorge. 1990. Computer, 23(1):17-22.Parnas, David Lorge. 1990. Computer, 23(1):17-22.

Page 110: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

110

Engineering Mindset

Engineering is often defined as the use of scientific knowledge and principles for practical purposes. While the original usage restricted the word to the building of roads, bridges, and objects of military use, today's usage is more general and includes chemical, electronic, and even mathematical engineering.

Engineering is often defined as the use of scientific knowledge and principles for practical purposes. While the original usage restricted the word to the building of roads, bridges, and objects of military use, today's usage is more general and includes chemical, electronic, and even mathematical engineering.

Parnas, David Lorge. 1990. Computer, 23(1):17-22.Parnas, David Lorge. 1990. Computer, 23(1):17-22.

... or even information engineering.

Page 111: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

111

Engineering Mindset

Engineering education ... stresses finding good, as contrasted with workable, designs. Where a scientist may be happy with a device that validates his theory, an engineer is taught to make sure that the device is efficient, reliable, safe, easy to use, and robust.

Engineering education ... stresses finding good, as contrasted with workable, designs. Where a scientist may be happy with a device that validates his theory, an engineer is taught to make sure that the device is efficient, reliable, safe, easy to use, and robust.

Parnas, David Lorge. 1990. Computer, 23(1):17-22.Parnas, David Lorge. 1990. Computer, 23(1):17-22.

Page 112: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

112

Engineering Mindset

Engineering education ... stresses finding good, as contrasted with workable, designs. Where a scientist may be happy with a device that validates his theory, an engineer is taught to make sure that the device is efficient, reliable, safe, easy to use, and robust.

Engineering education ... stresses finding good, as contrasted with workable, designs. Where a scientist may be happy with a device that validates his theory, an engineer is taught to make sure that the device is efficient, reliable, safe, easy to use, and robust.

Parnas, David Lorge. 1990. Computer, 23(1):17-22.Parnas, David Lorge. 1990. Computer, 23(1):17-22.

The assembly of working, robust systems, on time and on budget, is the key requirement for a federated information infrastructure for biology.

Page 113: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

113

Informatics Triangle

BS

LSCS

MS

MS MS

Page 114: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

114

Informatics Triangle

BS

LSCS

compbio

bioinfo

SILS

Page 115: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

115

Informatics Triangle

MS

LSCS

CIS/IT MIS

SILS

Page 116: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

116

Informatics Triangle

BS

LSCS

MS

MS MS

IS

IS

ISIS

Page 117: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

117

What is Informatics?

MS

IS

BS

LSCS

Page 118: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Federated Information

Infrastructure

Federated Information

Infrastructure

Page 119: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

119

National Information Infrastructure

analog

commercialuses

non-commercialuses

digital

Edu Libother ResETC

Page 120: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

120

ODN Model

ODN Bearer ServiceLayer 1

ATM

Network TechnologySubstrate

LANs

SMDS

Wireless

Dial-upModems

FrameRelay

Point-to-PointCircuits

DirectBroadcastSatellites

Middleware ServicesLayer 3

ServiceDirectories

Privacy

ElectronicMoney

StorageRepositories

MultisiteCoordination

FileSystems

Security

NameServers

ApplicationsLayer 4

VideoServer

FaxAudioServer

FinancialServices

ElectronicMail

Tele-Conferencing

InteractiveEducation

InformationBrowsing

RemoteLogin

Open BearerService Interface

Transport Services andRepresentation Standards

Layer 2

(fax, video, audio, text, etc.)

A recent NRC report, Realizing the Information Future, laid out a vision of an Open Data Network model, in which any information appliance could be operated over generic networking protocols...

A recent NRC report, Realizing the Information Future, laid out a vision of an Open Data Network model, in which any information appliance could be operated over generic networking protocols...

Page 121: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

121

FIIST & NII

FIIST(science & technology)

FIIE(engineering)

FIIS(science)

FII(climatology)

FII(chemistry)

FII(geology)

FII(physics)

FII(biology)

FII(physiology)

FII(ecology)

FII(structural biology)

FII(• • •)

FII(geography)

FII(human)

FII(mouse)

FII(• • •)

FII(Arabidopsis)

FII(genomics)

FII(E. coli)

FII(zoology)

FII(botany)

FII(• • •)

FII(systematics)

FII(• • •)

analog

commercialuses

non-commercialuses

digital

Edu Libother ResETC

The research component of the NII contains a Federated Information Infrastructure for Science and Technology..

The research component of the NII contains a Federated Information Infrastructure for Science and Technology..

Page 122: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

122

FIIST

FIIST(science & technology)

FIIE(engineering)

FIIS(science)

FII(climatology)

FII(chemistry)

FII(geology)

FII(physics)

FII(biology)

FII(• • •)

FII(geography)

Page 123: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

123

FIIB

FII(biology)

FII(physiology)

FII(ecology)

FII(structural biology)

FII(human)

FII(mouse)

FII(• • •)

FII(Arabidopsis)

FII(genomics)

FII(E. coli)

FII(zoology)

FII(botany)

FII(• • •)

FII(systematics)

FII(• • •)

Page 124: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

124

Public Funding of Databases

Stand-alone Criteria:

• Is there a need?

• Will this meet the need?

• Can they do it?

• Is it worth it?

Page 125: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

125

Public Funding of Databases

Global Criteria:

• Does it adhere to standards?

• Will it interoperate?

• Is there commitment to federation?

• Is it worth it?

Page 126: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

126

Information Resources and the GII

Guiding Principles:

• Global value explosion

• Componentry

• Anonymous interoperability

• Technical scalability

• Social scalability

• Value additivity

Page 127: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Funding forBio-Information

Infrastructure

Funding forBio-Information

Infrastructure

Page 128: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

128

Call for Change

Among the many new tools that are or will be needed (for 21st-century biology), some of those having the highest priority are:

• bioinformatics

• computational biology

• functional imaging tools using biosensors and biomarkers

• transformation and transient expression technologies

• nanotechnologies

Among the many new tools that are or will be needed (for 21st-century biology), some of those having the highest priority are:

• bioinformatics

• computational biology

• functional imaging tools using biosensors and biomarkers

• transformation and transient expression technologies

• nanotechnologies

Impact of Emerging Technologies on the Biological Sciences: Report of a Workshop. NSF-supported workshop, held 26-27 June 1995, Washington, DC.

Impact of Emerging Technologies on the Biological Sciences: Report of a Workshop. NSF-supported workshop, held 26-27 June 1995, Washington, DC.

Page 129: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

129

The Problem

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT moves at “Internet Speed” and responds rapidly to market forces.

Page 130: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

130

The Problem

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

Page 131: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

131

The Problem

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

Page 132: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

132

The Problem

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult, and subject to political pressures.

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult, and subject to political pressures.

Page 133: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

133

The Problem

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult, and subject to political pressures.

• Federal-funding decision processes are ponderously slow and inefficient.

• IT moves at “Internet Speed” and responds rapidly to market forces.

• IT will play a central role in 21st Century biology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult, and subject to political pressures.

• Federal-funding decision processes are ponderously slow and inefficient.

Page 134: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

134

Federal Funding of Bio-Databases

The challenges:The challenges:

Page 135: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

135

Federal Funding of Bio-Databases

The challenges:

• providing adequate funding levels

The challenges:

• providing adequate funding levels

Page 136: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

136

Federal Funding of Bio-Databases

The challenges:

• providing adequate funding levels

• making timely, efficient decisions

The challenges:

• providing adequate funding levels

• making timely, efficient decisions

Page 137: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

IT Budgets

A Reality Check

Page 138: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

138

Rhetorical Question

Which is likely to be more complex:

• identifying, documenting, and tracking the whereabouts of all parcels in transit in the US at one time

• identifying, documenting, and analyzing the structure and function of all individual genes in all economically significant organisms; then analyzing all significant gene-gene and gene-environment interactions in those organisms and their environments

Which is likely to be more complex:

• identifying, documenting, and tracking the whereabouts of all parcels in transit in the US at one time

• identifying, documenting, and analyzing the structure and function of all individual genes in all economically significant organisms; then analyzing all significant gene-gene and gene-environment interactions in those organisms and their environments

Page 139: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

139

Business Factoids

United Parcel Service:

• uses two redundant 3 Terabyte (yes, 3000 GB) databases to track all packages in transit.

• has 4,000 full-time employees dedicated to IT

• spends one billion dollars per year on IT

• has an income of 1.1 billion dollars, against revenues of 22.4 billion dollars

United Parcel Service:

• uses two redundant 3 Terabyte (yes, 3000 GB) databases to track all packages in transit.

• has 4,000 full-time employees dedicated to IT

• spends one billion dollars per year on IT

• has an income of 1.1 billion dollars, against revenues of 22.4 billion dollars

Page 140: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

140

Business ComparisonsCompany Revenues IT Budget Pct

Bristol-Myers Squibb 15,065,000,000 440,000,000 2.92 %

Pfizer 11,306,000,000 300,000,000 2.65 %

Pacific Gas & Electric 10,000,000,000 250,000,000 2.50 %

K-Mart 31,437,000,000 130,000,000 0.41 %

Wal-Mart 104,859,000,000 550,000,000 0.52 %

Sprint 14,235,000,000 873,000,000 6.13 %

MCI 18,500,000,000 1,000,000,000 5.41 %

United Parcel 22,400,000,000 1,000,000,000 4.46 %

AMR Corporation 17,753,000,000 1,368,000,000 7.71 %

IBM 75,947,000,000 4,400,000,000 5.79 %

Microsoft 11,360,000,000 510,000,000 4.49 %

Chase-Manhattan 16,431,000,000 1,800,000,000 10.95 %

Nation’s Bank 17,509,000,000 1,130,000,000 6.45 %

Page 141: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

141

Federal Funding of Biomedical-IT

Appropriate funding level:

• approx. 5-10% of research funding

• i.e., 1 - 2 billion dollars per year

Appropriate funding level:

• approx. 5-10% of research funding

• i.e., 1 - 2 billion dollars per year

Source of estimate:

- Experience of IT-transformed industries.

- Current support for IT-rich biological research.

Source of estimate:

- Experience of IT-transformed industries.

- Current support for IT-rich biological research.

Page 142: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Conference on

BIOLOGICAL INFORMATICS

6-8 July 1998

Australian Academy of Science, Canberra, Australia

Page 143: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Conference on Biological Informatics

Conference Sessions

• Overview of Biological Informatics

• Biodiversity Informatics

• Environmental Informatics

• Molecular Informatics

• Medical / Neuroinformatics

• Teaching and Training in Informatics

Conference Sessions

• Overview of Biological Informatics

• Biodiversity Informatics

• Environmental Informatics

• Molecular Informatics

• Medical / Neuroinformatics

• Teaching and Training in Informatics

Page 144: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

ExtrasExtras

Page 145: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

145

Slides:

http://www.esp.org/rjr/canberra.pdfhttp://www.esp.org/rjr/canberra.pdf

Page 146: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

ExtrasExtras

Page 147: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

Basics

Business 101

Page 148: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

148

Market Forces

Vendors

productsservices

Buyers

$

purchases

In a simple market economy, vendors try to anticipate the needs of buyers and offer products and services to meet those needs.

Real users decide whether or not to buy a product or service, depending upon whether or not it meets a real need at a reasonable price.

In a simple market economy, vendors try to anticipate the needs of buyers and offer products and services to meet those needs.

Real users decide whether or not to buy a product or service, depending upon whether or not it meets a real need at a reasonable price.

Business 101 Insight:

Successful vendors target a niche and excel at meeting the needs of that niche.

Business 101 Insight:

Successful vendors target a niche and excel at meeting the needs of that niche.

Page 149: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

149

Market Forces

VentureCapital

Vendors

$

Buyers

$

StockOfferings

Funding to initiate the development of products and services come from investors, not from buyers.

Investors decide whether or not to provide start-up funding based upon the estimated ability of the vendor to create products and services that will meet real needs at competitive prices.

Funding to initiate the development of products and services come from investors, not from buyers.

Investors decide whether or not to provide start-up funding based upon the estimated ability of the vendor to create products and services that will meet real needs at competitive prices.

$

VendorInvestment

productsservices

$

purchases

Page 150: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

150

Federal Funding

Investors

Database

$

Users

productsservices

$

purchases

If biological databases were driven by market forces, individual users would choose what services they need and individual database providers would choose what services to make available.

Investors would provide start-up money on the likelihood of successful products and services being developed.

Ultimate success would depend on meeting the needs of real users. Decisions could be made rapidly, in response to changing needs and emerging opportunities.

If biological databases were driven by market forces, individual users would choose what services they need and individual database providers would choose what services to make available.

Investors would provide start-up money on the likelihood of successful products and services being developed.

Ultimate success would depend on meeting the needs of real users. Decisions could be made rapidly, in response to changing needs and emerging opportunities.

Page 151: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

151

Federal Funding

Agency

Database

Reviewers

OtherAgencies

AgencyAdvisors

Congress

productsservices

OMB $

$ DatabaseAdvisors

Users

Instead, funding decisions for grant-supported biological databases can follow a ponderously slow course, with almost no opportunity for real-time input from real users.

Even with the best of intentions at all levels, this process is slow, inefficient, risk-averse, and non-responsive to the real and changing needs of users.

Instead, funding decisions for grant-supported biological databases can follow a ponderously slow course, with almost no opportunity for real-time input from real users.

Even with the best of intentions at all levels, this process is slow, inefficient, risk-averse, and non-responsive to the real and changing needs of users.

Page 152: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

152

Federal Funding of Bio-Databases

Possible solutions:

• increase the direct support of federal service organizations providing information infrastructure (e.g., NCBI).

• reduce support for investigator-initiated, grant-funded public database projects.

• create market forces, initially through subsidization, later simply through direct support for affected science (e.g., NSFnet into internet).

Possible solutions:

• increase the direct support of federal service organizations providing information infrastructure (e.g., NCBI).

• reduce support for investigator-initiated, grant-funded public database projects.

• create market forces, initially through subsidization, later simply through direct support for affected science (e.g., NSFnet into internet).

Page 153: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

153

Federal Funding of Bio-Databases

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

Page 154: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

154

Federal Funding of Bio-Databases

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

Page 155: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

155

Federal Funding of Bio-Databases

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

Page 156: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

156

Federal Funding of Bio-Databases

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

• data stamps

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

• data stamps

Page 157: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

157

Federal Funding of Bio-Databases

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

• data stamps, AKA food (for-thought) stamps ?!

Creating market forces:

• stop supporting the supply side of biodatabases through slow, inefficient processes.

• start supporting the demand side through fast, efficient processes.

• provide guaranteed supplementary funding, redeemable only for access to bio-databases.

• data stamps, AKA food (for-thought) stamps ?!

Page 158: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

158

Food (for thought) Stamps

Funding Agencies could:

• provide a 10% supplement to every research grant in the form of “stamps” redeemable only at database providers.

• allow the “stamps” to be transferable among scientists, so that a market for them could emerge.

• provide funding only after the stamps have been redeemed at a database provider.

Funding Agencies could:

• provide a 10% supplement to every research grant in the form of “stamps” redeemable only at database providers.

• allow the “stamps” to be transferable among scientists, so that a market for them could emerge.

• provide funding only after the stamps have been redeemed at a database provider.

Page 159: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

159

Food (for thought) Stamps

Problems:

• how to estimate the amount of FFT stamps that would actually be redeemed (and thus the required budget set-aside).

• how to identify “approved” database providers.

• how to initiate the FFT system.

• etc etc

Problems:

• how to estimate the amount of FFT stamps that would actually be redeemed (and thus the required budget set-aside).

• how to identify “approved” database providers.

• how to initiate the FFT system.

• etc etc

Page 160: Conference on BIOLOGICAL INFORMATICS 6-8 July 1998 Australian Academy of Science, Canberra, Australia

160

Food (for thought) Stamps

Alternatives (if no solution emerges):

• increasingly inefficient research activities (abject failure will occur when it becomes simpler to repeat research than to obtain prior results).

• loss of access to bio-databases for public-sector research.

• movement of majority of “important” biological research into the private sector.

• loss of American pre-eminence (if other countries solve the problems first).

Alternatives (if no solution emerges):

• increasingly inefficient research activities (abject failure will occur when it becomes simpler to repeat research than to obtain prior results).

• loss of access to bio-databases for public-sector research.

• movement of majority of “important” biological research into the private sector.

• loss of American pre-eminence (if other countries solve the problems first).