hmp 201512
Methods for detecting differential abundance in metagenomic data
Statistical and Visualization Methods for Metagenomic Analysis
Héctor Corrada Bravo
Center for Bioinformatics and Computational Biology
metagenomeSeq
16S differential abundance
R/Bioconductor infrastructure for metagenomic assays
Longitudinal data
metagenomeFeatures
Incipient attempt at regularizing 16S feature annotations in R/Bioconductor
E.g., greengenes13.5MgDb
msd16s
Example data, as infrastructure object
R/Bioconductor Strengths
Infrastructure objects: interoperability, speed up startup time for method development
Strict development practices: documentation, use cases, vignettes
Annotation infrastructure: again, interoperability across experiments and data types
Exploratory analysis
Reproducibility: vignettes, Rmarkdown, etc.
Recently, exploratory and interactive visualization: Shiny, epiviz
Integrative, visual and computational exploratory analysis of genomic data
Browser-based
Interactive
Integration of data
Reproducible dissemination
Communication with R/Bioconductor: epivizr package
software systems to support creative exploratory analysis of large genome-wide datasets...
Computed Measurements: create new measurements from integrated measurements and visualize
Summarization: summarize integrated measurements (computed on data subsets)
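The two ideas above (computed measurements and summarization over data subsets) can be illustrated with a minimal Python sketch. This is not the epiviz API: the measurement names, the feature identifiers, and the helper functions `computed_measurement` and `summarize` are all hypothetical, and measurements are assumed to be simple mappings keyed by a shared coordinate.

```python
from statistics import mean

# Hypothetical measurements, aligned on a shared coordinate (a feature id).
tumor = {"geneA": 8.0, "geneB": 5.0, "geneC": 2.0}
normal = {"geneA": 6.0, "geneB": 5.5, "geneC": 1.0}


def computed_measurement(a, b, fn):
    """Build a new measurement by applying fn to two measurements on their shared keys."""
    return {key: fn(a[key], b[key]) for key in a.keys() & b.keys()}


def summarize(measurement, subset, fn=mean):
    """Summarize a measurement over a subset of features (ignoring absent ones)."""
    values = [measurement[key] for key in subset if key in measurement]
    return fn(values)


# A computed measurement: the per-feature difference between two conditions.
diff = computed_measurement(tumor, normal, lambda x, y: x - y)

# A summary of that computed measurement over a user-selected subset.
subset_mean = summarize(diff, ["geneA", "geneC"])
```

Because the new measurement has the same shape as its inputs, it could be visualized or summarized with the same tools as any integrated measurement.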
Dynamically extensible: Easily integrate new data sources, data types and add new visualizations.
Data providers define coordinate space
One interpretation of Big Data is many sources of relevant contextual data
Easily access/integrate contextual data
Driven by exploratory analysis of immediate data
Iterative process
Visual and computational exploration go hand in hand
Visualization design goals
Context: integrate and align multiple data sources; navigate; search
Connect: brushing
Encode: map visualization properties to data on the fly
Reconfigure: multiple views of the same data
Visualization design goals
Data
Select and filter: tight-knit integration with R/Bioconductor
(current work) filters on visualization propagate to data environment
Model
New 'measurements' are the result of modeling; suggested by data context
Metagenomic Visualization
How to effectively navigate large datasets where features are organized hierarchically?
Metaviz: browser-based, interactive exploratory analysis of metagenomic data
Connection to R/Bioconductor with metavizr package
Built on metagenomeSeq and metagenomeFeatures infrastructure
Metaviz
Exploration of hierarchically organized features
Geared towards 16S for now
Hierarchical organization relevant to WGS
Integration is a big part of design
Framework designed for data integration
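To make the idea of hierarchically organized features concrete, here is a small Python sketch of rolling feature counts up to a chosen level of a taxonomy. It is an illustration only, not how Metaviz or metagenomeSeq implement aggregation; the three-rank hierarchy, the OTU identifiers, the sample names, the counts, and the `aggregate` helper are all assumptions made for the example.

```python
from collections import defaultdict

# Simplified rank order (an assumption; real 16S annotations carry more ranks).
RANKS = ["phylum", "class", "genus"]

# Hypothetical lineage for each feature (OTU), listed from phylum down to genus.
taxonomy = {
    "OTU1": ("Firmicutes", "Bacilli", "Lactobacillus"),
    "OTU2": ("Firmicutes", "Bacilli", "Streptococcus"),
    "OTU3": ("Bacteroidetes", "Bacteroidia", "Bacteroides"),
}

# Hypothetical per-sample counts for each feature.
counts = {
    "OTU1": {"s1": 10, "s2": 0},
    "OTU2": {"s1": 5, "s2": 7},
    "OTU3": {"s1": 1, "s2": 20},
}


def aggregate(counts, taxonomy, rank):
    """Sum per-sample counts over all features that share a lineage down to `rank`."""
    depth = RANKS.index(rank) + 1
    totals = defaultdict(lambda: defaultdict(int))
    for otu, per_sample in counts.items():
        node = taxonomy[otu][:depth]  # lineage prefix identifies the tree node
        for sample, n in per_sample.items():
            totals[node][sample] += n
    return {node: dict(samples) for node, samples in totals.items()}


by_phylum = aggregate(counts, taxonomy, "phylum")
by_genus = aggregate(counts, taxonomy, "genus")
```

Keying each node by its full lineage prefix, rather than by the rank name alone, keeps nodes distinct even if two branches of the tree reuse the same label.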
Acknowledgements
Brianna Lindsey, O. Colin Stine, Owen White, Anup Mahurkar: University of Maryland Baltimore
Jim Nataro: University of Virginia
NIGMS, Genentech
Florin Chelaru (now @ MIT)
Joseph Paulson (now @ Harvard)
Mihai Pop (@ UMD)