The Web-Enabled Research Commons: Applications, Goals, and Trends Thinh Nguyen October 2009


Page 1

The Web-Enabled Research Commons: Applications, Goals, and Trends

Thinh Nguyen

October 2009

Page 2

Use Case #1

NeuroCommons Project:

Science Commons project using Semantic Web to link massive amounts of data

Page 3

Page 4

[Figure labels: 27,266 papers; 4,563 papers; 41,985 papers; 10,365 papers; 128,437 papers]

Page 5

NeuronDB

BAMS

Literature

Homologene

SWAN

Entrez Gene

Gene Ontology

Mammalian Phenotype

PDSPki

BrainPharm

AlzGene

Antibodies

PubChem

MESH

Reactome

Allen Brain Atlas

credit: W3C HCLS

Page 6

NeuronDB

BAMS

Literature

Homologene

SWAN

Entrez Gene

Gene Ontology

Mammalian Phenotype

PDSPki

BrainPharm

AlzGene

Antibodies

PubChem

MESH

Reactome

Allen Brain Atlas

Page 7

Web page --links to--> Web page

making computers understand linkages

(the WWW)

Page 8

receptor --is located in--> Cell membrane

http://ontology.foo.org/receptor

directed, contextual links

Page 9

receptor --is located in--> Cell membrane

“URI” (unique names for things on the web)

receptor: http://ontology.foo.org/receptor

Cell membrane: http://ontology.foo.org/compartment

is located in: http://ontology.foo.org/is_located_in
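
As a rough illustration of the slide above, one statement can be written as a subject, predicate, object triple in which each part is a URI. This is a minimal Python sketch, not code from the presentation: the `Triple` type and the `local_name` helper are hypothetical names introduced here, and the ontology.foo.org addresses are the slide's own placeholder URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One directed, labeled link: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


BASE = "http://ontology.foo.org/"  # placeholder namespace taken from the slide

# "receptor is located in cell membrane", with every part named by a URI.
statement = Triple(
    subject=BASE + "receptor",
    predicate=BASE + "is_located_in",
    obj=BASE + "compartment",
)


def local_name(uri: str) -> str:
    """Return the last path segment of a URI (hypothetical display helper)."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


# A human-readable rendering; the URIs remain the unambiguous form.
readable = " ".join(local_name(part) for part in statement)
print(readable)
```

Because the link itself has a URI, a program can tell "is located in" apart from any other kind of connection, which a plain hyperlink between two Web pages cannot express.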

Page 10

receptor --is located in--> Cell membrane

channel --is located in--> Cell membrane

neuron --has--> Cell membrane
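
Once several statements share a node, they form a graph that can be searched by pattern. The sketch below, which assumes short text labels in place of full URIs and uses a hypothetical `match` helper, shows one way the three statements on this slide could be stored and queried in Python.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# The three statements on the slide, with short labels standing in for URIs.
GRAPH: List[Triple] = [
    ("receptor", "is located in", "cell membrane"),
    ("channel", "is located in", "cell membrane"),
    ("neuron", "has", "cell membrane"),
]


def match(
    graph: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with all non-None fields of the pattern."""
    for s, p, o in graph:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# What is located in the cell membrane?
located = [s for s, _, _ in match(GRAPH, predicate="is located in", obj="cell membrane")]

# Which things "have" a structure in which a receptor is located?
# This joins two patterns on the shared object node.
owners = [
    s
    for s, _, o in match(GRAPH, predicate="has")
    if any(match(GRAPH, subject="receptor", predicate="is located in", obj=o))
]

print(located)
print(owners)
```

The second query only works because both statements point at the same "cell membrane" node; that shared node is what lets separate facts be combined.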

Page 11

Cell membrane

“compartment”

“container”

“doohickey”

http://ontology.foo.org/compartment

using the web to integrate data and databases
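
The integration idea on this slide can be sketched as a lookup from each database's local word to one shared URI. Everything in the example below other than the slide's three labels and its placeholder URI is an assumption made for illustration: the source names, the record layout, and the entities are invented.

```python
from collections import defaultdict

COMPARTMENT = "http://ontology.foo.org/compartment"  # placeholder URI from the slide

# Each source uses its own word for the same concept; all map to one URI.
LABEL_TO_URI = {
    "compartment": COMPARTMENT,
    "container": COMPARTMENT,
    "doohickey": COMPARTMENT,
}

# Hypothetical records from three databases with different vocabularies.
records = [
    {"source": "db_a", "entity": "receptor", "where": "compartment"},
    {"source": "db_b", "entity": "channel", "where": "container"},
    {"source": "db_c", "entity": "transporter", "where": "doohickey"},
]

# Group entities by the shared URI rather than by the local label.
by_uri = defaultdict(list)
for record in records:
    uri = LABEL_TO_URI[record["where"]]
    by_uri[uri].append(record["entity"])

merged = dict(by_uri)
print(merged)
```

Grouping on the local labels would produce three unrelated buckets; grouping on the URI produces one, which is the sense in which the Web name integrates the databases.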

Page 12

Page 13

prefix go: <http://purl.org/obo/owl/GO#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix mesh: <http://purl.org/commons/record/mesh/>
prefix sc: <http://purl.org/science/owl/sciencecommons/>
prefix ro: <http://www.obofoundry.org/ro/ro.owl#>

select ?genename ?processname
where
{
  # Papers tagged with the MeSH term, and the genes those papers mention
  graph <http://purl.org/commons/hcls/pubmesh>
  { ?paper ?p mesh:D017966 .
    ?article sc:identified_by_pmid ?paper.
    ?gene sc:describes_gene_or_gene_product_mentioned_by ?article.
  }
  # Gene products whose function is realized as a process under GO_0007166
  graph <http://purl.org/commons/hcls/goa>
  { ?protein rdfs:subClassOf ?res.
    ?res owl:onProperty ro:has_function.
    ?res owl:someValuesFrom ?res2.
    ?res2 owl:onProperty ro:realized_as.
    ?res2 owl:someValuesFrom ?process.
    graph <http://purl.org/commons/hcls/20070416/classrelations>
    { {?process <http://purl.org/obo/owl/obo#part_of> go:GO_0007166}
      union
      {?process rdfs:subClassOf go:GO_0007166 } }
    ?protein rdfs:subClassOf ?parent.
    ?parent owl:equivalentClass ?res3.
    ?res3 owl:hasValue ?gene.
  }
  # Human-readable labels for the genes and processes found
  graph <http://purl.org/commons/hcls/gene>
  { ?gene rdfs:label ?genename }
  graph <http://purl.org/commons/hcls/20070416>
  { ?process rdfs:label ?processname}
}

Mesh: Pyramidal Neurons

Pubmed: Journal Articles

Entrez Gene: Genes

GO: Signal Transduction
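
The query on this slide joins patterns from several named graphs on shared variables such as ?paper, ?article, and ?gene. The Python sketch below imitates only the literature-to-gene part of that join and leaves out the Gene Ontology branch. It is a toy model under stated assumptions: the identifiers paper1, article1, geneA and their labels are invented, the "tagged_with" predicate is a stand-in for whatever the query's ?p variable matches, and the `subjects` and `objects` helpers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy stand-ins for two of the query's named graphs. Every identifier and
# label here is invented; only the sc: and rdfs: predicate names and the
# MeSH term come from the slide.
DATASET: Dict[str, List[Triple]] = {
    "pubmesh": [
        ("paper1", "tagged_with", "mesh:D017966"),
        ("paper2", "tagged_with", "mesh:OTHER"),
        ("article1", "sc:identified_by_pmid", "paper1"),
        ("article2", "sc:identified_by_pmid", "paper2"),
        ("geneA", "sc:describes_gene_or_gene_product_mentioned_by", "article1"),
        ("geneB", "sc:describes_gene_or_gene_product_mentioned_by", "article2"),
    ],
    "gene": [
        ("geneA", "rdfs:label", "Gene A"),
        ("geneB", "rdfs:label", "Gene B"),
    ],
}


def subjects(graph: str, predicate: Optional[str], obj: str) -> List[str]:
    """Subjects of triples in one named graph; predicate=None matches any."""
    return [
        s
        for s, p, o in DATASET[graph]
        if o == obj and (predicate is None or p == predicate)
    ]


def objects(graph: str, subject: str, predicate: str) -> List[str]:
    """Objects of triples in one named graph with this subject and predicate."""
    return [o for s, p, o in DATASET[graph] if s == subject and p == predicate]


def gene_names_for_mesh_term(term: str) -> List[str]:
    """Follow term -> paper -> article -> gene -> label, one join per step."""
    names: List[str] = []
    for paper in subjects("pubmesh", None, term):  # like "?paper ?p mesh:..."
        for article in subjects("pubmesh", "sc:identified_by_pmid", paper):
            for gene in subjects(
                "pubmesh", "sc:describes_gene_or_gene_product_mentioned_by", article
            ):
                names.extend(objects("gene", gene, "rdfs:label"))
    return names


result = gene_names_for_mesh_term("mesh:D017966")
print(result)
```

Each nested loop corresponds to one shared variable in the query; a real SPARQL engine performs the same joins declaratively and over far larger graphs.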

better answers through better formats:

Page 14

• reformat what we already have

• reformat into a commons, not a closed system

• get the materials into the emerging research web

Page 15

What data sharing protocol (legal and policy) best enables use of Web technology?

Page 16

“Licensing” Archetypes

• Public Domain: No restrictions on use or distribution, no contracts, copyright waived.

• Community Licenses: standard “open access” licenses, a range of rights, some rights reserved, available to all

• Private Licenses: custom agreements, varies by institution, privately negotiated, may be offered only to some

Page 17

Goals

• Interoperable: data from many sources can be combined without restriction

• Reusable: data can be repurposed into new and interesting contexts

• Administrative Burden: low transaction costs and administrative costs over time

• Legal Certainty: users can rely on legal usability of the data

• Community Norms: consistent with community expectations and usages

Page 18

Interoperability

• Public Domain ****
  – Can be combined with other data sources with ease

• Community Licenses *** / **
  – Depends on type of license: share-alike or copyleft are unsuitable, but attribution-only licenses are less problematic

• Private Licenses * / **
  – Depends on restrictions, but not scalable; permutations too large

Page 19

Reusable

• Public Domain ****
  – No restrictions on subsequent use

• Community Licenses ***
  – Depends on license, but some licenses such as NC / ND can be restrictive

• Private Licenses **
  – Depends on license, but typically restrictive

Page 20

Administrative Burden

• Public Domain ****
  – No paperwork or legal review needed

• Community Licenses ***
  – Little paperwork, but some legal review needed (attribution stacking issues)

• Private Licenses *
  – Large amounts of paperwork, frequent legal review needed

Page 21

Legal Certainty

• Public Domain **** / ***
  – Clear rights; generally irrevocable (copyright should be addressed)

• Community Licenses ***
  – Generally credible, good track record with open access and open source licenses

• Private Licenses **
  – Must be considered individually; few private licenses tested by time

Page 22

Community Norms

• Public Domain ***
  – Traditional method for scientific data sharing (citation)

• Community Licenses ***
  – Relatively new, but familiar to computer scientists and open source community (attribution)

• Private Licenses **
  – Tendency to emphasize private / individual interests rather than community norms

Page 23

Overall Grade

• Public Domain ***
  – Easiest and least restrictive form of sharing

• Community Licenses **
  – Can be used to implement community expectations, but can be burdensome / restrictive

• Private Licenses *
  – High transaction costs, burdensome, unpredictable

Page 24

Convergence

Page 25

CC0

• Released by Creative Commons in 2009

• Result of a 3-year policy exploration process

• Not a license but a waiver of copyright

Page 26

Why is it needed?

• “Borderline” copyright

• European sui generis database rights

• Varying legal standards for copyright protection in different countries

Page 27

CC0

• [deed]

Page 28

CC0

• Waiver of copyright

• Waiver of sui generis database rights

• Waiver of “neighboring rights”

• Does not affect trademarks or patents

• Only affects rights of person making assertion

Page 29

Use Case #2

• Coordination and Sustainability of International Mouse Informatics Resources (CASIMIR) (EU Project)

• Commentary in Letter to Nature (Sept 2009) recommends PD and use of CC0 for sharing mouse genomic data

• Recommendations endorsed by scientists, NIH representatives, Jackson Labs, and editors of top scientific journals

Page 30

Use Case #3

• Personal Genome Project - personalized medicine project from George Church lab

• Adopted CC0 to release sequence and medical data collected from volunteers

Page 31

Summary

• Solving some bioinformatics problems requires the ability to integrate massive quantities of data from diverse sources

• Public Domain sharing best fits this need

• CC0 waiver can be used to enrich public domain and provide clarity

Page 32

Thank You

• Thinh Nguyen ([email protected])

• On the Web: http://www.sciencecommons.org