BioTeam: Bhanu Rekepalli presentation at BICoB 2015
Post on 14-Jul-2015
Fast Genomic Sequence Searches with Symmetric Implementation of Parallel Blast
Bhanu Rekepalli (BioTeam), Eduardo Ponce and Greg Peterson (UTK)
BICoB 2015, March 10, 2015
“The Freedom To Discover”
BioTeam
• Independent consulting firm
• Staffed by scientists forced to learn IT, SW & HPC to get our own research done
• Assess, Design, Implement & Train
• Bridging the “gap” between science, IT & high performance computing since 2002
• Skilled Bio-IT Evolutionary Anthropologists (> 400 studied in the last year)
Outline
• The genomic data problem
• Highly-scalable parallel wrapper
• Parallel BLAST on Xeon Phi
• Optimizations
• Performance evaluation
• Conclusion
• Future work
The genomic data problem
• Advances in next-generation sequencing techniques are producing complete genomes faster than data analysis can process them
• Data is managed by community-centered databases (updated routinely)
  • e.g., GenBank, EMBL, NR, PDB
• Challenge: Bioinformatics research requires high-throughput processing and analytic tools to keep pace with the exponential growth in genomic data
  • Added to this, HPC is difficult to utilize
• Solution: modify algorithms and frameworks to allow scalable analytics in modern architectures
Intel Xeon Phi
• A Many Integrated Core (MIC) architecture for massive parallelism
• Programming models
  • Native – all code on MIC
  • Offload – main code on CPU, other on MIC
  • Symmetric – both CPU and MIC using message passing
  • Upload – main code on MIC, other on CPU
Case study: NCBI BLAST
• The “Swiss Army knife” of biologists
• BLAST (Basic Local Alignment Search Tool) aligns genomic chains of amino acids using fast heuristic algorithms to find regions of local similarity.
• Compares query sequences to sequence databases and calculates the statistical significance of matches.
• Sequencing programs: blastp, blastn, blastx, psi-‐blast
• Query file and formatted database (FASTA format)
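To make the query-file input concrete, here is a minimal sketch of reading FASTA-formatted queries and grouping them into fixed-size blocks, the unit a parallel wrapper could hand to each tool process. The function names (`read_fasta`, `split_into_blocks`), the block size and the example sequences are all hypothetical; this is not code from NCBI BLAST or HSP-wrap.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap; join them later
    if header is not None:
        yield header, "".join(chunks)


def split_into_blocks(records: List[Record], block_size: int) -> List[List[Record]]:
    """Group records into query blocks of at most block_size sequences."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


# Made-up example queries, for illustration only.
EXAMPLE = """>q1 hypothetical protein
MKTAYIAK
QRQISFVK
>q2
MSDNE
>q3
GATTACA
"""

records = list(read_fasta(EXAMPLE))
blocks = split_into_blocks(records, block_size=2)
for number, block in enumerate(blocks, start=1):
    print(number, [header for header, _ in block])
```

Keeping whole sequences together in a block means each block is a valid FASTA query file on its own.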
Highly-scalable parallel wrapper
• HSP-wrap
  • Software framework for scaling life science informatics applications to HPC environments via task parallelism
  • Bioinformatics and chemoinformatics domains
  • Portable → written in C/C++ and MPI
  • Load balance, parallel output, fault-tolerance, check-pointing
• Successfully ported tools
  • BLAST, HMMER, MUSCLE
  • DOCK6, AutoDock Vina, LINUS
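The task-parallel, load-balanced pattern can be sketched as follows. This is only an illustration: HSP-wrap itself is written in C/C++ with MPI, whereas this sketch swaps in Python threads and a shared queue as a stand-in for message passing. The names `run_master_worker` and `fake_tool` are hypothetical, and `fake_tool` merely counts residues instead of running BLAST.

```python
import queue
import threading


def run_master_worker(tasks, n_workers, tool):
    """Dynamic load balancing: each worker pulls the next task when it is idle."""
    work = queue.Queue()
    for index, task in enumerate(tasks):
        work.put((index, task))

    results = [None] * len(tasks)   # one slot per task, written by exactly one worker
    handled = [0] * n_workers       # tasks completed per worker

    def worker(worker_id):
        while True:
            try:
                index, task = work.get_nowait()
            except queue.Empty:
                return  # no work left, worker exits
            results[index] = tool(task)
            handled[worker_id] += 1

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, handled


def fake_tool(block):
    """Stand-in for a wrapped tool: returns the total residue count of a block."""
    return sum(len(sequence) for sequence in block)


# Made-up query blocks of uneven size.
query_blocks = [["MKT", "AY"], ["GATTACA"], ["MS", "DNE", "K"], ["A"]]
results, handled = run_master_worker(query_blocks, n_workers=3, tool=fake_tool)
print(results, sum(handled))
```

Because workers pull tasks rather than receiving a fixed share up front, faster workers naturally take more blocks, which is the property that matters when hosts of different speeds share one job.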
HSP-wrap: architecture (The Wrapper Approach)
[Figure: HSP-wrap architecture diagram. Labels: Lustre FS with Database (NR, Pfam, …), Input Queries and Results 1..N; Master Node with Database and Query Blocks 1..M; Worker Nodes [1..N] with Main Memory (Preload Database, Query Block), Tool Processes 1..P (BLAST, DOCK6, HMMER, …), Output Buffer, Compression and Compressed Buffer.]
Bhanu Rekapalli et al., BMC Bioinformatics 2013, 14: S3
HSP-wrap: memory management
• stdiowrap – module for file management
  • Function interposition to standard I/O calls
  • Minimal modification to original code (if any)
• Input file management
  • Files are mapped to main memory on-demand
  • Tracks parallel reads
• Output file management
  • Double buffered parallel support
  • Minimizes number of data transfers
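The double-buffered output idea can be illustrated with a small sketch. It is not the stdiowrap implementation: the class name `DoubleBufferedWriter`, its `capacity` parameter and the in-memory sink are assumptions, and the drain step here is synchronous, whereas a real parallel wrapper would drain the full buffer while the tool keeps writing into the other one.

```python
import io


class DoubleBufferedWriter:
    """Collect small writes in one buffer while the other is handed to the sink."""

    def __init__(self, sink, capacity):
        self.sink = sink            # any object with a write(bytes) method
        self.capacity = capacity    # bytes held before a transfer is triggered
        self.active = bytearray()   # buffer currently receiving tool output
        self.standby = bytearray()  # buffer being drained (empty between drains)
        self.transfers = 0          # number of writes issued to the sink

    def write(self, data: bytes) -> None:
        # Transfer only when the next record would overflow the active buffer.
        if self.active and len(self.active) + len(data) > self.capacity:
            self._swap_and_drain()
        self.active += data

    def _swap_and_drain(self) -> None:
        # Swap roles so new output lands in the empty buffer, then drain the
        # full one in a single transfer (synchronously in this sketch).
        self.active, self.standby = self.standby, self.active
        self.sink.write(bytes(self.standby))
        self.standby.clear()
        self.transfers += 1

    def close(self) -> None:
        if self.active:
            self._swap_and_drain()


sink = io.BytesIO()
writer = DoubleBufferedWriter(sink, capacity=16)
for i in range(10):
    writer.write(b"hit%02d\n" % i)  # ten 6-byte records
writer.close()
print(writer.transfers, len(sink.getvalue()))
```

With a 16-byte capacity, the ten 6-byte records reach the sink in five transfers instead of ten, which is the sense in which buffering minimizes the number of data transfers.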
Symmetric HSPH-BLAST
In symmetric execution, both Xeon and Xeon Phi processors are used as network hosts for distributed processing.
Xeon/Phi configuration | Input sequences | Worker nodes: [Xeon, Phi] | Physical cores: [Xeon, Phi] = Total
3x_8p   | 17000  | [2, 8]   | [48, 488] = 536
5x_16p  | 34000  | [4, 16]  | [80, 976] = 1056
9x_32p  | 68000  | [8, 32]  | [144, 1952] = 2096
17x_64p | 136000 | [16, 64] | [272, 3904] = 4176
Weak scaling parameters
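The weak-scaling parameters can be cross-checked with a few lines of arithmetic. The per-device core counts (16 per Xeon host, 61 per Xeon Phi) and the rule that one Xeon host is not a worker are not stated on the slide; they are assumptions inferred from the table (48 / 3, 488 / 8, and 3 hosts giving 2 Xeon workers).

```python
# Assumed core counts, inferred from the table rather than stated on the slide.
XEON_CORES_PER_HOST = 16
PHI_CORES_PER_CARD = 61

# (configuration label, Xeon hosts, Xeon Phi cards, input sequences)
CONFIGS = [
    ("3x_8p", 3, 8, 17000),
    ("5x_16p", 5, 16, 34000),
    ("9x_32p", 9, 32, 68000),
    ("17x_64p", 17, 64, 136000),
]


def summarize(label, xeon_hosts, phi_cards, sequences):
    """Derive worker counts, core counts and per-worker load for one configuration."""
    xeon_workers = xeon_hosts - 1          # assumption: one Xeon host is not a worker
    workers = xeon_workers + phi_cards
    xeon_cores = xeon_hosts * XEON_CORES_PER_HOST
    phi_cores = phi_cards * PHI_CORES_PER_CARD
    return {
        "label": label,
        "workers": [xeon_workers, phi_cards],
        "cores": [xeon_cores, phi_cores],
        "total_cores": xeon_cores + phi_cores,
        "sequences_per_worker": sequences / workers,
    }


rows = [summarize(*config) for config in CONFIGS]
for row in rows:
    print(row)
```

Under these assumptions every configuration assigns the same number of input sequences per worker node, which is what makes the study a weak-scaling one: the problem size grows in step with the resources.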
Conclusions
• Parallel wrappers can be adapted to current informatics applications to greatly improve processing throughput and scalability on supercomputing platforms, because these applications share similar programming models and I/O characteristics.
• The wrapped tools can be used to identify species, perform DNA mapping, infer functional and evolutionary relationships between sequences, and help identify members of gene families.
• Symmetric weak scaling studies showed linear speedup and balanced workload distribution for coarse-grained parallelization of BLAST.
• Finer-grained vectorization of the BLAST process could improve utilization of Xeon Phi processors, so we are collaborating with Intel engineers and NCBI BLAST developers to address this.
Future work
• Extend Highly-Scalable Parallel Hybrid Software Wrapper
  • Replication and fragmentation schemes for large data management
  • Input models: dot/cross product and hybrid approaches
• Make tools available as standard software modules on HPC architectures
• Integrate HSP-tools into scientific workflow pipelines to provide fast processing for high-impact scientific discovery
  • Incorporate with web-enabled science gateways
• Optimize NCBI BLAST for Xeon Phi processors and scale it with HSP-Wrap
• Adapt parallel wrappers to other primary tools used in informatics fields of the life sciences
• Build data analysis pipelines for novel data mining and large-‐scale knowledge discovery