vivo: sharing data for research discovery mike conlon university of florida mconlon@ufl.edu

VIVO: Sharing Data for Research

Discovery

Mike ConlonUniversity of Florida

mconlon@ufl.edu

Public, structured linked data about investigators interests, activities and accomplishments, and toolsto use thatdata to advancescience

VIVO Searchlight

Data Production

Producing Data

processOrg<-function(uri){ x<-xmlParse(uri) u<-NULL name<-xmlValue(getNodeSet(x,"//rdfs:label")[[1]]) subs<-getNodeSet(x,"//j.1:hasSubOrganization") if(length(subs)==0) list(name=name,subs=NULL) else { for(i in 1:length(subs)){ sub.uri<-getURI(xmlAttrs(subs[[i]])["resource"]) u<-c(u,processOrg(sub.uri)) } list(name=name,subs=u) }}

VIVO produces human and machine

readable formats

Software reads RDF from VIVO

and displays

Data Sharing

Photograph by J. G. Park. Flicker.com Photograph by Ell Brown Flicker.com

Information is stored using the Resource Description

Framework (RDF) as subject-predicate-object “triples”

Jane Smith

professor in

author ofhas affiliation with

Dept. of Genetic

s College of

Medicine

Journal article

Book chapte

Genetics Institute

Subject Predicate Object

A Web of Data – The Semantic Web

The Role of the Archive• Collate data, final semantics, ready for

consumption

Data Archive

Institutions record activities, interests, accomplishments

Data, Tools and Scientists

Data Consumption

Photograph by ScoopMedia. Flicker.com

Photograph by Janet Tarbox. Flicker.com

A Consumption ScenarioFind all faculty members whose genetic work

is implicated in breast cancer

VIVO will store information about faculty and associate to genes. Diseaseome associates genes to diseases.

Query resolves across VIVO and data sources it links to.

Data ReasoningData integration continues to be a serious bottleneck for the expectations of increased productivity in the pharmaceutical and biotechnology domain.

“Linked Life Data” integrates common public datasets that describe the relationships between gene, protein, interaction, pathway, target, drug, disease and patient and currently consist of more than 5 billion RDF statements.

The dataset interconnects more than 20 complete data sources and helps to understand the “bigger picture” of a research problem by linking previously unrelated data from heterogeneous knowledge.

From the LarKC (Large Knowledge Collider) http://www.larkc.eu/overview/

http://vivo.ufl.edu/individual/mconlon

vivo: sharing data for research discovery mike conlon university of florida mconlon@ufl.edu

technology7producing

structured linked data

institutions scientists

study science

advance science

world of science

scientists interests

compatible systems

Documents

conlon nancarrow - study no. 4 for player piano

presentation developed by: sean mccann dr. jamie ellis...

clinical immunology conleth feighery john jackson niall...

celebrating 50 years - conlon · pdf fileit is a very proud...

joan catoni conlon, director

ag econ & community development chair person email chris...

peter t. gallagher, paul a. conlon

luncheon program building better karen d. conlon, mcam ·...

living with africanized bees michael k. o’malley, afbee...

h conlon writing samples

gatoraid: identity management at the university of florida...

eddie conlon spanjolski gradjanski rat

| issue 19 insite - conlon construction _conlon_ project...

florida’s ephemeral cities erich kesse - kesse@ufl.edu -...

the workbook - ciara conlon

considerations when adapting books shelly voelker, m.ed.,...

laura conlon presentation 92204

1. 2 chap 1 preliminary concepts and linear finite elements...

evolving geometries in the precipitation patterns of...

mike conlon john corson- rikert paul albert