WHAT THE IOT SHOULD LEARN FROM THE LIFE SCIENCES
• Computational biologist
• Research group leader
• Lecturer in genome biology
• Advisor at
• 2015 Fellow of the
Who is @BorisAdryan
DNA = storage of a blueprint
RNA = ‘active copy’ of DNA
protein = the building blocks of cells and tissues
LIFE AS WE KNOW IT
transcription
translation
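The two steps named above (transcription, then translation) can be sketched in a few lines of Python. This is a toy illustration rather than a bioinformatics tool: the function names `transcribe` and `translate` and the example sequence are assumptions made here, and the codon table is deliberately partial, covering only the codons the example uses.

```python
# Partial codon table from the standard genetic code; only what the example needs.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop codon
}


def transcribe(dna_coding_strand: str) -> str:
    """Transcription: the RNA copy reads like the coding strand, with T replaced by U."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Translation: read the RNA three letters (one codon) at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTAAATGGTAA"  # made-up example sequence, not a real gene
rna = transcribe(dna)
protein = translate(rna)
print(rna, protein)
```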
Gregor Johann Mendel, exhibited in the Library at the NIMR
• Reading DNA information
• Determining “the sequence of a gene” was a PhD in the early 1980s
• Data processing was mainly transcribing the observation into a research paper
BIOLOGY THEN AND NOW: SEQUENCE INFORMATION
Sanger sequencing ca. 1980
http://www.eplantscience.com
181,563,676,918 bases on 15th October 2014 (from 165,722,980,375 bases on 24th August 2014)
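To put the two database sizes quoted above into perspective, the growth rate can be worked out directly. The sketch below uses only the two figures and dates from the slide. The doubling time rests on the assumption that growth is exponential at a constant rate, which is a simplification made here and not a claim from the talk.

```python
import math
from datetime import date

# Figures and dates as quoted on the slide.
bases_start = 165_722_980_375
bases_end = 181_563_676_918
days = (date(2014, 10, 15) - date(2014, 8, 24)).days

added = bases_end - bases_start
growth_fraction = added / bases_start

# Assumption: constant exponential growth between the two dates.
doubling_time_days = days * math.log(2) / math.log(bases_end / bases_start)

print(f"{added:,} bases added in {days} days")
print(f"growth: {growth_fraction:.1%}")
print(f"implied doubling time: {doubling_time_days:.0f} days")
```

With these two figures the database grew by a little under 10% in 52 days.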
• We can sequence a human genome in half a day
• Sequence databases grow faster than storage capacity
• Data processing is the key step in scientific understanding
BIOLOGY THEN AND NOW: SEQUENCE INFORMATION
BIOLOGY THEN AND NOW: GENE ACTIVITY INFORMATION
• When are genes needed?
• Classical molecular biology workflow, taking days…
• Data is semi-quantitative; testing one gene at a time
Northern blot for d-vhl, ca. May 1999
• High-throughput gene expression profiling since mid-1990s
• Quantitative information for every gene in an organism
• Key challenge is the presentation and interpretation of the data
BIOLOGY THEN AND NOW: GENE ACTIVITY INFORMATION
26 ATP
• Signal transduction and metabolic pathways
• Characterisation of proteins and substrates that mediate chemical reactions
• Nobel prize material
BIOLOGY THEN AND NOW: BIOCHEMISTRY
• We know about 250k metabolites
• 100k protein structures
• on the order of 10k different chemical reactions
BIOLOGY THEN AND NOW: BIOCHEMISTRY
‣ Everything is connected
‣ Big, noisy, often unstructured data
‣ We are learning how biological entities depend on each other
www.thingslearn.com
Analytics, context integration, machine learning and predictive modelling for the IoT.
THERE’S NO ANALYTICAL FLEXIBILITY IN M2M/IOT
Matt Hatton, Machina Research, The BLN IoT ‘14
Internet replaces wire
It’s all about the connectedness
M2M
consumer
IoT
LIFE SCIENCE STRATEGIES DON’T WORK IN THE IOT
- There is no commonly accepted
  - ‘catalogue’ of things,
  - ‘ontology’ of things,
  - ‘data format’ of things,
  - ‘meta data’ for things.
- Most businesses are driven by revenue, not long-term strategic vision
- Service providers have no need to publish
- Data can be highly personal (cheap excuse)
unless they’re
WE FIXED OUR KNOWLEDGE REPRESENTATION PROBLEM
FORMALISING KNOWLEDGE
FORMALISING KNOWLEDGE WITH GENE ONTOLOGY
CURRENT GOVERNMENT INVESTMENTS INTO GENE ONTOLOGY
NIH alone has spent $44,616,906 on the ontology structure since 2001 (no data for UK/EU spending)
~100 full-time salaries for experts with domain-specific knowledge
~40,000 terms
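What formalising knowledge with an ontology buys you can be shown with a toy example. Terms are linked by "is a" relations, so annotating a gene with a specific term implicitly annotates it with every more general term as well. The sketch below is a simplified illustration: the handful of terms, the gene names and the helper functions are assumptions made here, not an excerpt from the Gene Ontology itself.

```python
# Toy ontology: each term maps to its more general parent terms ("is a").
# Simplified for illustration; not copied from the real Gene Ontology.
IS_A = {
    "molecular function": [],
    "binding": ["molecular function"],
    "nucleic acid binding": ["binding"],
    "DNA binding": ["nucleic acid binding"],
    "RNA binding": ["nucleic acid binding"],
}

# Hypothetical curated annotations: gene -> most specific term known.
ANNOTATIONS = {"gene_a": "DNA binding", "gene_b": "RNA binding", "gene_c": "binding"}


def ancestors(term: str) -> set:
    """Collect every more general term reachable through "is a" links."""
    found = set()
    stack = list(IS_A[term])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(IS_A[parent])
    return found


def genes_annotated_to(term: str) -> list:
    """Genes annotated with the term itself or with any more specific term."""
    return sorted(
        gene for gene, annotated in ANNOTATIONS.items()
        if annotated == term or term in ancestors(annotated)
    )


print(genes_annotated_to("nucleic acid binding"))
```

A query for a general term therefore finds genes that a curator only ever labelled with a specific one, which is the kind of analytical flexibility a shared vocabulary provides.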
Oct. 1995
TOWARDS MIAME AND DATA REPOSITORIES
cf. IoT
Nov. 1993
META DATA, SHARING AND DATA REPOSITORIES
founded in Nov. 1999
“But this is a complex and ambitious project, and is one of the biggest challenges that bioinformatics has yet faced. Major difficulties stem from the detail required to describe the conditions of an experiment, and the relative and imprecise nature of measurements of expression levels. The potentially huge volume of data only adds to these difficulties.”
Nature, Feb. 2000
Nov. 2000 Oct. 2002
Wide adoption as requirement for publication in scientific journals
META DATA, SHARING AND DATA REPOSITORIES
cf. IoT 2014
since 2003
Semantic Sensor Network Ontology
http://en.wikipedia.org/wiki/Silo
story
measurements + meta data
open, public repositories
human curators
ontology terms
community
PUBLISH OR PERISH
ok?
journal
informal exchange - no credit!
funders
assessment
The majority of this infrastructure is paid for by governments and charities
industry!
measurements + meta data
storage & provenance
human curators
ontology terms
user
PUBLISH OR YOU’RE NOT DOING IOT
ok?
Maybe the majority of this infrastructure should be paid for by governments?
company cloud
device registration
privileges, data, added value
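The "measurements + meta data" and "ontology terms" elements of the diagram above could look roughly like the following when applied to a sensor reading. This is a minimal sketch under stated assumptions: the field names, the required-metadata rule and the term identifier `EX:0001` are all hypothetical and do not come from the Semantic Sensor Network Ontology or any existing IoT standard.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Measurement:
    value: float
    unit: str
    timestamp: str  # ISO 8601, UTC
    device_id: str  # hypothetical ID issued at device registration
    ontology_terms: list = field(default_factory=list)  # hypothetical term IDs
    provenance: dict = field(default_factory=dict)      # who measured it, and how


# Hypothetical minimum that a repository might demand before accepting data.
REQUIRED_PROVENANCE = {"submitter", "calibration"}


def missing_metadata(m: Measurement) -> list:
    """List what a curator would have to ask for before the record is reusable."""
    missing = sorted(REQUIRED_PROVENANCE - set(m.provenance))
    if not m.ontology_terms:
        missing.append("ontology_terms")
    return missing


complete = Measurement(
    value=21.5, unit="degC", timestamp="2014-10-15T12:00:00Z", device_id="dev-42",
    ontology_terms=["EX:0001"],  # made-up ID standing in for "air temperature"
    provenance={"submitter": "example-org", "calibration": "2014-09-01"},
)
bare = Measurement(value=21.5, unit="degC", timestamp="2014-10-15T12:00:00Z",
                   device_id="dev-43", provenance={"submitter": "example-org"})

print(json.dumps(asdict(complete), indent=2))
print(missing_metadata(bare))
```

The point of the check is the same as in the journals' publication requirement: a bare number is only reusable by others once it carries enough agreed meta data.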
WHAT THE IOT SHOULD LEARN FROM THE LIFE SCIENCES
• Given the predicted importance and impact of the IoT, we cannot and should not leave the development of infrastructure to commercial stakeholders alone.
• We need a lot more incentives to participate and targeted investment from the government (“the funders”) into reliable infrastructure.
• It took the computational life sciences less than 4 years (!) to grow from a grassroots movement to having industry-scale, expandable infrastructure.
• Shared vision, dogmatic implementation, effective lobbying.
@BorisAdryan is interested to hear about IoT job opportunities.