SRI International Bioinformatics
Data Import / Export
Markus Krummenacker
Bioinformatics Research Group
SRI International
Q3 2012
Data Exchange Overview
Java API and Perl API: read & modify
BioPAX Export: since Pathway Tools 9.0 (Biopax.org)
Export of entire PGDB as a set of Flatfiles
Export of Reactions as SBML (sbml.org)
Import/Export of Pathways: between PGDBs
Import/Export of Selected Frames, for Spreadsheets
Import/Export of Compounds as Molfile, CML
Registering/Publishing PGDBs on WWW
Export PGDB as Genbank
BioWarehouse: Loader for Flatfiles, SQL access
  http://bioinformatics.ai.sri.com/biowarehouse/
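As an illustration of what can be done with an exported SBML file outside Pathway Tools, here is a minimal Python sketch that lists the reactions in an SBML document together with their reactants and products. It uses only the standard library and matches elements by local tag name, so it does not depend on one particular SBML level. The sample document, the reaction id R1, and the species ids A, B, C are made up for the example. They are not taken from a real Pathway Tools export, and a real export may carry more structure than this sketch reads.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def list_reactions(sbml_text):
    """Return [(reaction_id, [reactant ids], [product ids]), ...]."""
    root = ET.fromstring(sbml_text)
    reactions = []
    for elem in root.iter():
        if local_name(elem.tag) != "reaction":
            continue
        sides = {"listOfReactants": [], "listOfProducts": []}
        for child in elem:
            name = local_name(child.tag)
            if name in sides:
                sides[name] = [
                    ref.get("species")
                    for ref in child
                    if local_name(ref.tag) == "speciesReference"
                ]
        reactions.append(
            (elem.get("id"), sides["listOfReactants"], sides["listOfProducts"])
        )
    return reactions


# Made-up sample document, only for demonstrating the function.
EXAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="example">
    <listOfReactions>
      <reaction id="R1">
        <listOfReactants>
          <speciesReference species="A"/>
          <speciesReference species="B"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="C"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

for rxn_id, reactants, products in list_reactions(EXAMPLE_SBML):
    print(rxn_id, " + ".join(reactants), "->", " + ".join(products))
```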
Import/Export of Pathways, etc.
Export selected pathways (and related objects) as a file
Import this file into a different PGDB
Can be used for submitting pathways to MetaCyc.
  See http://metacyc.org/MetaCycPosting.shtml
Visit page of pathway (or object), right-click, and
  choose Edit->Add Object to File Export List
File->Export->Selected Objects to Lisp-Format File
File->Import->Frames from Lisp-Format File
Dump PGDB into Flatfiles
Export of entire PGDB as Flatfiles
Format Description: http://bioinformatics.ai.sri.com/ptools/flatfile-format.html
  Column delimited: 1 line per frame
  Attribute-value: 1 record per frame
Multiple slot values:
  Column delimited: several values per column
  Attribute-value: several lines for several values
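To make the attribute-value layout concrete (one record per frame, one line per slot value), here is a minimal Python sketch of a reader. It rests on several assumptions about the file syntax: each line reads "SLOT - value", a line containing only "//" ends a record, lines starting with "#" are comments, and a line starting with "/" continues the previous value. The format description linked above is the authority on these details. The frame ids and slot values in the sample are invented.

```python
def parse_attribute_value(text):
    """Parse attribute-value text into a list of {slot: [values]} dicts.

    Assumed syntax (verify against the format description):
      SLOT - value   one slot value per line
      //             end of record
      # ...          comment line
      /...           continuation of the previous value
    """
    records = []
    current = {}
    last_slot = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        if line.strip() == "//":
            if current:
                records.append(current)
            current, last_slot = {}, None
            continue
        if line.startswith("/") and last_slot is not None:
            # Append the continuation text to the most recent value.
            current[last_slot][-1] += "\n" + line[1:]
            continue
        slot, sep, value = line.partition(" - ")
        if not sep:
            continue  # not a "SLOT - value" line; ignored in this sketch
        # A repeated slot name means the slot has several values.
        current.setdefault(slot, []).append(value)
        last_slot = slot
    if current:
        records.append(current)
    return records


# Invented sample data, only for demonstrating the function.
EXAMPLE = """\
# comment line
UNIQUE-ID - FRAME-1
COMMON-NAME - first example frame
SYNONYMS - name one
SYNONYMS - name two
//
UNIQUE-ID - FRAME-2
COMMENT - a long comment
/continued on a second line
//
"""

for record in parse_attribute_value(EXAMPLE):
    print(record["UNIQUE-ID"][0], sorted(record))
```

Storing every slot as a list keeps single-valued and multi-valued slots uniform, which mirrors the "several lines for several values" convention.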
Frame Import/Export
Import/Export of Selected Frames, for Spreadsheets
  Allows external editing of frames, and also frame creation
  Detailed Description: UG section 5.6
Export: GUI for Frame selection, Slot selection
  Slots depend on selected class
  Caveat: value annotations in slots get lost!
  Direct or all instances under class can be exported
Import: Many choices for merging or replacing data values
File Format Choices, like the Flatfiles:
  Column delimited: 1 line per frame
  Attribute-value: 1 record per frame
Multiple slot values:
  Column delimited: several values per column
  Attribute-value: several lines for several values
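For the column-delimited layout (one line per frame, several values sharing one column), a reader could look roughly like the Python sketch below. It assumes tab-separated columns and a header row of slot names. The separator between multiple values inside one column is not stated here, so MULTI_VALUE_SEP is a hypothetical placeholder that should be replaced after consulting UG section 5.6. The frame ids and values are invented.

```python
import csv
import io

# Hypothetical placeholder: the real in-column separator is not given here.
MULTI_VALUE_SEP = ";"


def read_column_delimited(text, sep=MULTI_VALUE_SEP):
    """Read tab-delimited text into a list of {slot: [values]} dicts.

    Assumes the first row holds the slot names (an assumption of this sketch).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    frames = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        frame = {}
        for slot, cell in zip(header, row):
            # Split a multi-valued cell; an empty cell gives an empty list.
            frame[slot] = [v.strip() for v in cell.split(sep) if v.strip()]
        frames.append(frame)
    return frames


# Invented sample data, only for demonstrating the function.
EXAMPLE_TABLE = (
    "FRAME-ID\tCOMMON-NAME\tSYNONYMS\n"
    "FRAME-1\tfirst frame\tname one; name two\n"
    "FRAME-2\tsecond frame\t\n"
)

for frame in read_column_delimited(EXAMPLE_TABLE):
    print(frame["FRAME-ID"][0], frame["SYNONYMS"])
```

Because value annotations are lost on export, a round trip through a spreadsheet and back should be treated as lossy for any slot that carried them.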
Misc.
Export of a replicon as a Genbank file
  PathoLogic is the inverse, “Import”
  But: information loss, e.g. gene product comments have no feature qualifier in Genbank
Importing protein features from UniProt
  Connection to MySQL BioWarehouse needed
  See UG section 5.8
Importing Citations from PubMed