Scalable graph analytics for metagenomics and metaproteomics
Ananth Kalyanaraman @ HPCBio lab ([email protected]), Associate Professor, School of EECS, Washington State University, Pullman, WA
Research Areas: Parallel algorithms, Computational biology/bioinformatics, Graph algorithms, String algorithms, Parallel architectures
Workshop on Future Computing Platforms to Accelerate Next-Gen Sequencing (NGS) Applications, May 19, 2013, held in conjunction with IPDPS’13, Boston, MA
Applications: bioenergy alternatives; human health; environmental monitoring; soil and forest ecology; ocean microbiology; …
Environmental microbial community analytics
DNA, RNA, protein, mass spec/peptide
NGS
Data scale: #studies: >350; #samples: >2,500; #genic/ORF reads: >100M+; …
Funding relevance:
Image courtesy: www.genomesonline.org
Some graph-theoretic problems in environmental microbial community analytics
Problems: Network construction; Clustering; Community annotation; Network comparison; Heterogeneity; …
Source data: Protein/ORF sequence homology; Mass spectral library construction; Interaction networks (gene, protein)
Parallelism: mostly rudimentary/ad hoc in standard workflows; distributed memory (MPI, MapReduce); intra-node (multicore, GPUs)
Some challenges: inherits graph-related challenges and choice of architectures; availability of networks/inference; data integration; low sampling, species diversity; qualitative metrics; automated workflows; …
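To make the first two problems on this slide concrete, here is a minimal Python sketch of network construction followed by clustering, starting from pairwise sequence-homology hits. It is an illustration only, not the method presented in the talk: the ORF identifiers, the scores, the `min_score` cutoff, and the function names are all hypothetical, and connected components are used as the simplest possible stand-in for a clustering step.

```python
from collections import defaultdict


def build_graph(hits, min_score):
    """Network construction: keep an edge for each hit scoring at least min_score."""
    adjacency = defaultdict(set)
    for seq_a, seq_b, score in hits:
        # Register both sequences so unmatched ones become singleton clusters.
        adjacency[seq_a]
        adjacency[seq_b]
        if seq_a != seq_b and score >= min_score:
            adjacency[seq_a].add(seq_b)
            adjacency[seq_b].add(seq_a)
    return adjacency


def connected_components(adjacency):
    """Clustering stand-in: group sequences by connected component (iterative DFS)."""
    seen = set()
    clusters = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(members))
    return sorted(clusters)


# Hypothetical ORF identifiers and similarity scores, for illustration only.
hits = [
    ("orf1", "orf2", 0.9),
    ("orf2", "orf3", 0.8),
    ("orf3", "orf4", 0.2),  # below the cutoff, so no edge is created
    ("orf4", "orf5", 0.7),
]
clusters = connected_components(build_graph(hits, min_score=0.5))
print(clusters)
```

With these made-up hits, the weak orf3/orf4 hit is dropped, so the sequences fall into two groups. At the data scales quoted earlier in the deck, the pairwise hits would not fit this in-memory, single-threaded form, which is where the distributed-memory and intra-node options listed above come in.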
SIAM CSE'13, Boston, MA
Graphs are pervasive in Computational Biology
2/28/2013
[Figure labels, biological data and analyses: gene motifs; reads; genome; mRNA; protein database search; comparative genomics; phylogenetic tree; protein families; population genomics; …]
[Figure labels, graph formulations: string graphs; clique; probabilistic graph models; comparative network analysis; classical network analysis; trees, DAGs, TSP, ML; pattern matching]