GEN2PHEN GAM8 meeting Leiden - Identifiers for LSDBs
G. A. Thorisson, A. J. Webb, R. Dalgleish ULEIC / J. Muilu FIMM
GEN2PHEN 8th General Assembly Meeting, Leiden, Jan 24-25 2012
Identification of G2P databases - challenges and proposal for a solution
-- Overview --
✴ Identification difficulties - the Knowledge Centre perspective
✴ Or, why we need persistent identifiers for database resources
✴ Proposal to collaborate with the BioDBCore initiative
✴ standardizing registration & description of bio-databases
This work is published under the Creative Commons Attribution license (CC BY: http://creativecommons.org/licenses/by/3.0/) which means that it can be freely copied, redistributed and adapted, as long as proper attribution is given.
Gudmundur A. Thorisson <[email protected]> ULEIC
Adam J. Webb <[email protected]> ULEIC
Raymond Dalgleish <[email protected]> ULEIC
Juha Muilu <[email protected]> FIMM
Friday, 27 January 12
Linking resources
[Diagram: variant records (c.103C>T, c.321G>T, c.301C>T, c.465A>G, c.555G>T) linked across resources; labels: Person (DB maintainer, Submitter), Resource (Databases), External records / annotations]
URLs are unstable
http://subdomain.example.com/path/to/resource
• Domain names / subdomains can change
  – hgvbaseg2p.org -> gwascentral.org
  – server1.example.com -> server2.example.com
• Paths can change
  – e.g. /LOVD2/ changes to /LOVD3/
• LSDB genes can move
  – e.g. gene ADAM19 moves from one LOVD install to another
• Databases can merge
  – e.g. gene ADAM19 on two different installs is reconciled into a single install
• Gene name not suitable
  – > 1 database for a given gene
• gene.lovd.nl -> returns list of databases (or redirects if only 1 is known)
  – 1 to many
• lovd.nl/gene -> redirects to *one* database
  – 1 to one, but many resources do not receive identifiers
• These are locators, not identifiers
• Non-gene based resources
• Ideally the identifier should also operate as the locator (like DOIs via a DOI resolution service)
  – http://dx.doi.org/10.19192 resolves DOI 10.19192
IDENTIFIER -> DATA RESOURCE (1:1)
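The 1:1 identifier-to-resource idea can be sketched as a toy resolution service. This is only an illustration under stated assumptions: the `Resolver` class, its method names and the identifier `db:0001` are hypothetical, and the URLs reuse the example.com placeholders from the previous slide. It is not how DOI resolution or any real registry is implemented.

```python
class Resolver:
    """Toy resolution service (hypothetical): one persistent identifier, one current location."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, identifier: str, url: str) -> None:
        # Enforce the 1:1 rule: an identifier is only ever assigned once.
        if identifier in self._locations:
            raise ValueError(f"{identifier} is already registered")
        self._locations[identifier] = url

    def relocate(self, identifier: str, new_url: str) -> None:
        # The resource moved (new server, new path); the identifier does not change.
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier: str) -> str:
        # The identifier doubles as a locator by way of the resolver.
        return self._locations[identifier]


resolver = Resolver()
resolver.register("db:0001", "http://server1.example.com/LOVD2/")  # hypothetical identifier
before = resolver.resolve("db:0001")

# Domain and path both change, as in the "URLs are unstable" examples.
resolver.relocate("db:0001", "http://server2.example.com/LOVD3/")
after = resolver.resolve("db:0001")

print(before)
print(after)
```

Anything that cites `db:0001` keeps working after the move, because only the resolver's record is updated.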
Proposal to collaborate with BioDBCore
• BioDBCore aims
  – annotation - organize the bio-database ‘resourceome’
  – discovery - e.g. which protein sequence databases are available?
• Who’s behind it?
  – International Society for Biocuration
  – Resource catalogues: Bioinformatics Links, BioSiteMaps, NAR db-issue etc.
  – Working group includes reps from NAR and DATABASE journals, MIBBI, model organism databases, CASIMIR mouse informatics consortium, others
Persistent resource identifiers in BioDBCore
• They plan to use MIRIAM registry / ID resolution service
  – unique, persistent and unambiguous identification of various kinds of concepts
  – http://identifiers.org/ec-code/1.1.1.1
  – http://identifiers.org/pubmed/16333295
  – http://identifiers.org/doi/10.1038/nbt1156
• Decouples identification from location
• Many resources are already registered with MIRIAM
• Operated by EBI <-- long-term sustainability prospect
• Adoption by players in the LS Semantic Web community
  – URIs for identifying entities in biological information represented in RDF
  – http://lsrn.org, Shared Names, Bio2RDF, others
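The three example URIs above share one shape: host, then a namespace, then an identifier local to that namespace. A minimal sketch of taking such a URI apart and putting it back together is shown below. The function names are hypothetical and the code assumes only the `http://identifiers.org/<namespace>/<local id>` pattern visible in the examples; it does not contact identifiers.org or reflect its actual implementation.

```python
from urllib.parse import urlparse


def split_identifiers_org_uri(uri: str) -> tuple[str, str]:
    """Split an identifiers.org-style URI into (namespace, local identifier)."""
    parsed = urlparse(uri)
    if parsed.netloc != "identifiers.org":
        raise ValueError(f"not an identifiers.org URI: {uri}")
    # Only the first path segment is the namespace; the rest is the local ID,
    # which may itself contain slashes (as DOIs do).
    namespace, _, local_id = parsed.path.lstrip("/").partition("/")
    if not namespace or not local_id:
        raise ValueError(f"missing namespace or identifier: {uri}")
    return namespace, local_id


def build_identifiers_org_uri(namespace: str, local_id: str) -> str:
    """Rebuild the URI from its two parts."""
    return f"http://identifiers.org/{namespace}/{local_id}"


examples = [
    "http://identifiers.org/ec-code/1.1.1.1",
    "http://identifiers.org/pubmed/16333295",
    "http://identifiers.org/doi/10.1038/nbt1156",
]
parts = [split_identifiers_org_uri(uri) for uri in examples]
for namespace, local_id in parts:
    print(namespace, local_id)
```

Nothing in the URI says where the record is hosted, which is the sense in which identification is decoupled from location.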
How might this work?
• Using database URIs - plausible scenario
  – Persistent canonical URI: http://identifiers.org/biodbcore/10235900
  – Click URL, browser redirects to http://biodbcore.org/resource/10235900
  – BioDBCore metadata record for the database (akin to a “landing page” on an online journal site)
• BioDBCore “landing page” presents database metadata
  – Information *about* the “thing”
  – Name: Ehlers-Danlos Syndrome Variant Database
  – Main resource URL: https://eds.gene.le.ac.uk <-- the “thing” itself
  – [scope, data standards, other metadata]
• Location of database = the “thing” itself
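The scenario above has two hops: canonical URI to landing page, then landing page to the database itself. The sketch below simulates both with in-memory tables rather than real HTTP redirects. The URIs and the record values come from the slide's own plausible scenario; the `MetadataRecord` type, the two lookup tables and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Information *about* the database, as a landing page might present it."""
    name: str
    main_resource_url: str  # location of the "thing" itself


# Hop 1 (simulated redirect): persistent canonical URI -> landing page URL.
REDIRECTS = {
    "http://identifiers.org/biodbcore/10235900": "http://biodbcore.org/resource/10235900",
}

# Hop 2: landing page URL -> metadata record shown on that page.
LANDING_PAGES = {
    "http://biodbcore.org/resource/10235900": MetadataRecord(
        name="Ehlers-Danlos Syndrome Variant Database",
        main_resource_url="https://eds.gene.le.ac.uk",
    ),
}


def describe(canonical_uri: str) -> MetadataRecord:
    """Follow the redirect and return the metadata record for the database."""
    landing_page = REDIRECTS[canonical_uri]
    return LANDING_PAGES[landing_page]


def locate(canonical_uri: str) -> str:
    """Return the current location of the database itself."""
    return describe(canonical_uri).main_resource_url


record = describe("http://identifiers.org/biodbcore/10235900")
location = locate("http://identifiers.org/biodbcore/10235900")
print(record.name)
print(location)
```

If the database moved, only `main_resource_url` in the metadata record would need updating; the canonical URI used in citations would stay the same.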
Mutual benefits
• To GEN2PHEN / G2P community
  – Identification - slot into resource identifier scheme for bio-databases globally, build more detailed catalogues & annotation systems around this
  – Discovery - finding relevant LSDB and other G2P resources via a range of search/query tools outside the KC or LSDB lists
  – BioDBCore could possibly evolve into a sort of live “database publishing platform”, instead of the static “snapshot” of conventional papers
• To BioDBCore initiative
  – Acquire an entire category’s worth of metadata records & link to community
  – Extra pairs of eyes on what they’re doing, alternative perspective
  – Potential for further collaboration on contrib. tracking tools & ORCID integration
Open questions, known unknowns etc.
• BioDBCore quite new, many things remain in flux
  – e.g. the MIRIAM / identifiers.org technical details are vague
• DOIs for BioDBCore records - register database DOIs for fuller integration into publishing process?
• How will this work with existing LSDB lists?
Acknowledgements
• GEN2PHEN Consortium
  – http://www.gen2phen.org/about-gen2phen/partners
• Prof Anthony J. Brookes
• Bioinformatics Group, Leicester
This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.
Contact me!
<[email protected]> | <[email protected]>
http://www.linkedin.com/in/mummi
http://www.twitter.com/gthorisson
http://www.gthorisson.name
Published under the CC BY license (http://creativecommons.org/licenses/by/3.0/)