3 aquares toolsodnature.naturalsciences.be/downloads/aquares/aquares... · 2016-09-05 ·...
AquaRES tools for taxonomic editors & external users
Leen Vandepitte
On behalf of the WoRMS DMT
• AquaCache – tool for comparing taxonomic checklists
• Improved data services in the framework of AquaRES
– Taxon match services
– Occurrence checking services
– General quality control and data format checking services
• Hands-on demo of data services
• Data exchange and tools targeting international initiatives
AquaCache
• From the project description:
– We will design and build a central data cache linking the three databases FADA, WoRMS and RAMS.
– This data cache will be hosted at VLIZ and is primarily meant to act as an internal system for running common web services in terms of taxon matching and data cleaning & refinement.
– Each of the databases will retain its import, update mechanism and quality control.
• In reality:
– The goal of the AquaCache is to serve as an internal data management tool, helping the involved editors to search through and compare lists and to identify possible overlaps and discrepancies between lists available in more than one species register, e.g. on the level of higher classification or the status of a name.
– Lists can be uploaded into the AquaCache as Darwin Core Taxon files and need to include the Species Profile extension, indicating whether a taxon is marine, fresh, brackish or terrestrial.
– The functionalities of the AquaCache will be broadened in the future, depending on the needs of the involved editors.
– After the AquaRES project (2013-2016), the AquaCache will continue under the LifeWatch project, where its functionalities and applications can be further developed.
• AquaCache = management tool
• Search & compare the involved systems (“search”)
• Taxonomy:
– Green: exact match (taxon name + higher classification)
– Yellow: taxon match identifies inconsistencies between the names (e.g. spelling variation)
– Red: higher classification is different
• Environment:
– !: flag missing in WoRMS
– X: flags differ between WoRMS & FADA
– V: flags correspond between WoRMS & FADA
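The legend above can be read as a small decision rule. The following is a minimal sketch of that rule, not AquaCache's actual implementation: the `TaxonRecord` type, its field names and both function names are hypothetical, and the choice to report "red" before "yellow" when both the name and the classification differ is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaxonRecord:
    """One checklist entry (illustrative field names, not AquaCache's)."""
    name: str
    classification: tuple                    # e.g. ("Animalia", "Mollusca")
    environment: Optional[frozenset] = None  # e.g. frozenset({"marine"}); None = no flags


def taxonomy_colour(a: TaxonRecord, b: TaxonRecord) -> str:
    """Return 'green', 'yellow' or 'red' for two records already paired by a taxon match."""
    if a.classification != b.classification:
        return "red"     # higher classification is different (assumed to take priority)
    if a.name != b.name:
        return "yellow"  # matched, but the names are inconsistent (e.g. spelling)
    return "green"       # exact match on taxon name + higher classification


def environment_symbol(worms: TaxonRecord, fada: TaxonRecord) -> str:
    """Return '!', 'X' or 'V' for the environment flags of a WoRMS/FADA pair."""
    if worms.environment is None:
        return "!"       # flag missing in WoRMS
    return "V" if worms.environment == fada.environment else "X"
```

Two records with the same name and classification but different environment sets would come out as "green" for taxonomy and "X" for environment.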
Work in progress…
Demonstration of AquaCache: http://aquacache.lifewatch.be/
Improved data services in the framework of AquaRES:
An overview, including some demonstrations
A. Taxon match services
B. Occurrence checking services
C. General quality control and data format checking services
Taxon match services
• This service allows users to match their taxonomic list to available online standards, including
– WoRMS
– Catalogue of Life (CoL)
– Pan-European Species directories Infrastructure (PESI)
– Integrated Taxonomic Information System (ITIS)
– Index Fungorum (IF)
– Paleo-DB
– Global Names Index (GNI)
– International Plant Names Index (IPNI)
– FADA
– Interim Register of Marine and Non-Marine Genera (IRMNG)
– RAMS
– FishBase
(Slide legend marks each standard as either "Available" or "In development".)
• Option to search all of the listed taxonomic standards or just a selection
• Taxon match tool => available through LifeWatch: www.lifewatch.be/data-services
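To illustrate what matching a list against a standard involves, here is a minimal local sketch. It does not call the LifeWatch service or any of the registers listed above: the reference list is whatever the caller passes in, the function name `match_taxa` and the `cutoff` value are assumptions, and Python's `difflib` stands in for whatever fuzzy-matching algorithm the real tool uses.

```python
import difflib


def match_taxa(names, reference, cutoff=0.85):
    """Map each input name to (matched reference name or None, match type).

    Match type is 'exact', 'fuzzy' (e.g. a spelling variation) or 'none'.
    """
    # Case-insensitive lookup from normalised name to the reference spelling.
    lookup = {ref.lower(): ref for ref in reference}
    candidates = list(lookup)
    results = {}
    for name in names:
        key = " ".join(name.split()).lower()  # collapse stray whitespace
        if key in lookup:
            results[name] = (lookup[key], "exact")
            continue
        close = difflib.get_close_matches(key, candidates, n=1, cutoff=cutoff)
        if close:
            results[name] = (lookup[close[0]], "fuzzy")
        else:
            results[name] = (None, "none")
    return results
```

A name such as "Asterias rubbens" would be reported as a fuzzy match to "Asterias rubens" if the latter is in the reference list, which is the kind of result a user would then review by hand.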
Occurrence checking services
Plotting sampling locations on a map:
• Enables a quick visual quality check of the data
• Users are able to detect possible errors in the coordinates
Common mistakes:
– Switching of latitude and longitude
– Lack of a minus sign to indicate West or South
=> These kinds of flaws can easily be fixed by the user, improving the overall quality of the data.
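The two common mistakes above can also be suggested automatically when the expected sampling area is known. The sketch below is an illustration under that assumption, not part of the LifeWatch tools: the function name `diagnose_position`, the bounding-box parameter and the returned codes are all hypothetical.

```python
def diagnose_position(lat, lon, bbox):
    """Suggest why a decimal-degree position falls outside the expected area.

    bbox is (min_lat, max_lat, min_lon, max_lon) of the known sampling area.
    Returns one of: 'ok', 'switched', 'lat_sign', 'lon_sign',
    'both_signs', 'unexplained'.
    """
    min_lat, max_lat, min_lon, max_lon = bbox

    def inside(la, lo):
        return min_lat <= la <= max_lat and min_lon <= lo <= max_lon

    if inside(lat, lon):
        return "ok"
    if inside(lon, lat):
        return "switched"    # latitude and longitude were swapped
    if inside(-lat, lon):
        return "lat_sign"    # minus sign for South is missing (or superfluous)
    if inside(lat, -lon):
        return "lon_sign"    # minus sign for West is missing (or superfluous)
    if inside(-lat, -lon):
        return "both_signs"
    return "unexplained"     # needs a manual look on the map
```

The function only proposes an explanation; the user still decides whether to correct the record, in line with the visual check on the map.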
Comparing your own occurrences with documented distributions
• Taking this further, users can also compare their occurrences with the documented distributions in the taxonomic databases (WoRMS, RAMS, FADA).
• Detection of possible errors or gaps can go both ways:
– Gaps in your own data <=> gaps in the taxonomic database
– Errors in your own data <=> errors in the taxonomic database
• DEMO: Show on map tool
=> available through LifeWatch: www.lifewatch.be/data-services
=> Useable for marine & non-marine locations
• DEMO: Compare own occurrences to documented distributions
=> soon available through LifeWatch: www.lifewatch.be/data-services
=> “only as good as the available data”
=> Still under development!
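The "both ways" comparison comes down to two set differences per taxon. The sketch below assumes that occurrences and documented distributions have already been reduced to named areas per taxon; that reduction, the function name `compare_distributions` and the report keys are assumptions for illustration, not the service's actual data model.

```python
def compare_distributions(own, documented):
    """Compare own occurrences with documented distributions.

    Both arguments map a taxon name to a set of area names.
    For each taxon the report lists the areas found on one side only.
    """
    report = {}
    for taxon in sorted(set(own) | set(documented)):
        mine = own.get(taxon, set())
        theirs = documented.get(taxon, set())
        report[taxon] = {
            # Possible error in own data, or a gap in the taxonomic database.
            "only_in_own_data": sorted(mine - theirs),
            # Possible gap in own data, or an error in the taxonomic database.
            "only_in_database": sorted(theirs - mine),
        }
    return report
```

Neither list says which side is wrong; as the slide notes, the outcome is only as good as the available data.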
General quality control & data format checking services
From the project description:
• These services include e.g. mapping of the uploaded field names to a standard set of fields, highlighting non-matches or missing required fields, and checking of the data format of e.g. the date-related fields.
• These quality control steps primarily target data providers, allowing them to easily check the format and content of their data before submission.
• Such quality control services were specifically being developed for data that will contribute to (Eur)OBIS (cf. now largely replaced by IPT).
• Within this project, the different data formats used for WoRMS, FADA, SCAR-MarBIN, AntaBIF and BioFresh will be compared and where possible mapped to a common standard (e.g. Darwin Core) in order to build more generic web services for checking the quality and format of these data.
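The field-name mapping step described above could look roughly like the following sketch. It is an illustration only: the function names, the normalisation rule and the `synonyms` argument are assumptions, and which terms count as required is left to the caller rather than taken from any (Eur)OBIS specification.

```python
def normalise(name):
    """Lower-case a field name and drop spaces, underscores and punctuation."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def map_fields(uploaded, standard, required, synonyms=None):
    """Map uploaded field names to standard terms.

    Returns (mapped, unmatched, missing_required):
    mapped            dict of uploaded name -> standard term
    unmatched         uploaded names with no counterpart in the standard
    missing_required  required standard terms that no uploaded field maps to
    """
    index = {normalise(term): term for term in standard}
    for alias, term in (synonyms or {}).items():
        index[normalise(alias)] = term  # caller-supplied aliases (hypothetical)
    mapped, unmatched = {}, []
    for field in uploaded:
        term = index.get(normalise(field))
        if term is None:
            unmatched.append(field)
        else:
            mapped[field] = term
    missing_required = sorted(set(required) - set(mapped.values()))
    return mapped, unmatched, missing_required
```

With Darwin Core terms as the standard, a column called "Scientific_Name" would map to `scientificName`, while an unknown column would be highlighted as a non-match for the data provider to resolve.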
What has been done? => what is already out there?
• IPT – Integrated Publishing Toolkit (by GBIF)
# inherent checks when uploading your file(s)
– Check whether occurrenceIDs are provided & unique (core-extension)
– Check whether ‘basis of record’ is provided
– Check data format:
   • EventDate as ISO standard
   • Lat-lon as decimal degrees, with correct separator
   • Character encoding of file can be indicated (preferred: UTF-8)
   • IndividualCount field as ‘integer’
• Darwin Core Archive Validator
– Checks the Darwin Core Archives: inspects files & compares the mapped concepts to GBIF extensions
– Specific focus on unique identifiers that link the different extensions to the core table
– http://tools.gbif.org/dwca-reports/148-7656490821008157004.html
• LifeWatch web services
– Data format validation:
   • Latitude & longitude <> 0,0
   • Latitude & longitude between acceptable boundaries (latitude -90/90 & longitude -180/180)
   • EventDate in correct format
=> DEMO
• A lot of tools already exist… no use re-inventing the wheel…
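As an illustration of the kinds of checks listed above, the sketch below validates records held as plain dictionaries keyed by Darwin Core term names. It is not the code of the IPT, the Darwin Core Archive Validator or the LifeWatch web services; the function names and messages are hypothetical, and for simplicity only plain ISO calendar dates are accepted for `eventDate`, although date-times and ranges are also in use.

```python
from collections import Counter
from datetime import date


def validate_record(record):
    """Return a list of format problems found in one occurrence record."""
    problems = []
    if not record.get("basisOfRecord"):
        problems.append("basisOfRecord is missing")
    try:
        date.fromisoformat(str(record.get("eventDate") or ""))
    except ValueError:
        problems.append("eventDate is not an ISO date")
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
    except (KeyError, TypeError, ValueError):
        # Also catches a decimal comma such as "51,5".
        problems.append("coordinates are not decimal degrees")
    else:
        if lat == 0 and lon == 0:
            problems.append("coordinates are 0,0")
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append("coordinates are out of bounds")
    count = record.get("individualCount")
    if count not in (None, ""):
        try:
            int(str(count))
        except ValueError:
            problems.append("individualCount is not an integer")
    return problems


def check_occurrence_ids(records):
    """Return (number of records without occurrenceID, sorted duplicate IDs)."""
    ids = [r.get("occurrenceID") for r in records]
    missing = sum(1 for i in ids if not i)
    counts = Counter(i for i in ids if i)
    return missing, sorted(i for i, n in counts.items() if n > 1)
```

A data provider could run both functions over a file before submission and fix whatever is reported, leaving the tools named above as the authoritative checks.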
Data exchange with international initiatives
A. General overview
B. Catalogue of Life
C. Encyclopedia of Life
D. GBIF
Data exchange
• Largely automated
[Diagram: LifeWatch Taxonomic Backbone; label "GENETICS"]
WoRMS – provider to Catalogue of Life
• Memorandum of Understanding - 2009
– WoRMS as contributor to CoL, through its editorial network & global species databases
– Data will be displayed in original form, without editing
– Data are shared freely, but IPR remains with original custodians (= editors)
• Yearly updates of # Global Species Databases (2009-2012)
• Monthly automated updates to CoL, in defined exchange format (since 2013)
• 46 Global Species Databases delivered to CoL, with monthly updates
• 113,764 species names across 23 (sub)phyla
• Databases delivered:
– Brachiopoda, Cumacea, Ophiuroidea, Phoronida, Porifera, Proseriata - Kalyptorhyncha, Acanthocephala, Acoelomorpha, Cephalochordata, Foraminifera, Kinorhyncha, Merostomata, Octocorallia, Orthonectida, Rhombozoa
– Asteroidea, Bochusacea, Brachypoda, Bryozoa, Isopoda, Mystacocarida, Nemertea, Oligochaeta, Polychaeta, Remipedia, Tantulocarida, Thermosbaenacea
– Cestoda, Chaetognatha, Gastrotricha, Gnathostomulida, Mollusca, Monogenea, Myxozoa, Placozoa, Priapulida, Trematoda
– Brachyura, Echinoidea, Holothuroidea, Hydrozoa, Leptostraca, Scaphopoda, Tanaidacea, Xenoturbellida, Polycystina
1 species
35,030 species
• Encyclopedia of Life (EoL)
– EoL gets access to all the WoRMS content (<=> Catalogue of Life)
– MoU between WoRMS & EoL
– Selected information:
   • Accepted taxon names
   • Higher classification
   • Distributions
   • Selection of notes
– Data transfer based on monthly exports from WoRMS
http://eol.org/