
Netweb Technologies Delivers India’s Fastest Hybrid Supercomputer with Breakthrough Performance

The Intel® Xeon® processor and the new Intel® Xeon Phi™ coprocessor enabled Netweb Technologies to develop and deploy India’s fastest and largest hybrid supercomputer. With over 30,000 processing cores and a total of 14TB of memory, the next-generation PARAM Yuva II System at the Centre for Development of Advanced Computing (C-DAC) is enabling entirely new possibilities for research in India.

WHITE PAPER

Intel, Netweb Technologies & Tyrone

The discovery of key insights and answers to many of the world’s most challenging and complex problems is in large part possible today thanks to the exceptional performance and groundbreaking capabilities of modern supercomputers. In India, significant strides are now possible with the deployment of the next-generation PARAM Yuva II System at C-DAC. Using the combined supercomputing capabilities of the Intel® Xeon® processor E5-2670 and Intel® Xeon Phi™ coprocessor 5110P, highly parallel computing can be carried out faster than ever before. The result is swift assessment of complicated simulations and models, and the subsequent advancement of research and discovery in diverse areas including biotechnology, computational fluid dynamics, seismic, atmospheric, computational science, disaster mitigation, engineering and more.

The next-generation PARAM Yuva II System at the Centre for Development of Advanced Computing (C-DAC) offers peak performance of 524 Teraflops, requires only half the footprint of the previous system and consumes 35% less energy.


Netweb Delivers Fastest Supercomputer in India

Confronted with the task of deploying the most capable supercomputer in India, the Centre for Development of Advanced Computing (C-DAC) chose Netweb Technologies to deliver an ideal hybrid solution featuring cutting-edge Intel technologies. The next-generation PARAM Yuva II System deployed by Netweb went above and beyond the articulated project goals, balancing optimal performance with the lowest possible power consumption and the best possible performance per watt. Specific solution goals included a minimum computing performance of 300 Teraflops, solution implementation within a 6-week timeframe, and a setup compatible with existing C-DAC data center space, power, and cooling infrastructure constraints. Intel components played a vital role in meeting the performance and efficiency objectives of the new PARAM System. Beyond the key Intel® Xeon® processor E5-2670 and the Intel® Xeon Phi™ coprocessor 5110P, other Intel components included the Intel® Solid-State Drive 330 Series and Intel® Cluster Studio XE 2013 for Linux. With this cohesive foundation, Netweb was able to provide swift delivery, installation and benchmarking processes, ultimately deploying the supercomputing system in record time.

Netweb Utilizes Advanced Hybrid Technology

With their specialization in creating flexible solutions based on the latest technologies, and reputation as an experienced, trusted supplier, Netweb was chosen as the best fit to carry out C-DAC’s vision. Netweb successfully implemented a system featuring both Intel Xeon processors and the new Intel Xeon Phi coprocessors. The use of Intel components and hybrid technology allowed for a final technical computing solution with industry-leading performance for highly parallel applications, flexible execution models to accommodate a wide range of project types and workloads, extreme energy efficiency and better performance per watt, all within a unified hardware and software environment.

Netweb’s specialization in a concentrated product gamut enables them to create focused solutions highly tuned to specific project goals. With over 10 years of hands-on experience in high-performance computing (HPC), and more than 150 successful installations across India in various prestigious research and education labs, Netweb was identified as well-equipped to handle the complex task of developing and deploying India’s fastest supercomputer.

“The software was loaded on the servers and configured as per C-DAC's requirements. Since it was an HPC installation, there was a significant amount of work to be done beyond installation of basic server systems. We installed the operating system, configured it as an HPC, installed compilers and libraries and ran required benchmarks. We also had to ensure compatibility with other infrastructure such as the PFS storage and backup system. Running benchmarks for best results requires a considerable amount of fine-tuning, which needs domain expertise.”

Sanjay Lodha
Chief Executive Officer, Netweb Technologies


Intel® Xeon Phi™ Coprocessor Impacts Computing Performance

The addition of the Intel® Xeon Phi™ coprocessor to the Intel® Xeon® processor increased peak performance for each computing node from 332.8 Gigaflops to 2,354.5 Gigaflops.
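The per-node figures above can be reproduced from chip specifications. The following is a minimal sketch under assumptions that do not appear in this paper: each Intel Xeon processor E5-2670 is taken to have 8 cores at 2.6 GHz performing 8 double-precision operations per cycle, and each Intel Xeon Phi coprocessor 5110P is taken to have 60 cores at 1.053 GHz performing 16 double-precision operations per cycle. Two processors and two coprocessors per node follow the system description in the paper. The function and variable names are illustrative only.

```python
def peak_gflops(chips, cores, clock_ghz, flops_per_cycle):
    """Theoretical peak = chips x cores x clock (GHz) x operations per cycle."""
    return chips * cores * clock_ghz * flops_per_cycle

# Assumed chip specifications (not stated in the paper).
cpu_peak = peak_gflops(chips=2, cores=8, clock_ghz=2.6, flops_per_cycle=8)
phi_peak = peak_gflops(chips=2, cores=60, clock_ghz=1.053, flops_per_cycle=16)
node_peak = cpu_peak + phi_peak

print(f"Processors only: {cpu_peak:.2f} Gigaflops")
print(f"Processors plus coprocessors: {node_peak:.2f} Gigaflops")
```

Under these assumptions the processors contribute 332.8 Gigaflops and the full node about 2,354.56 Gigaflops, in line with the 332.8 and 2,354.5 Gigaflops quoted above.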

C-DAC Dedicated to Future-Forward HPC Solutions

C-DAC has emerged as a premier research and development (R&D) organization in IT&E (Information Technologies and Electronics) in India and is committed to strengthening India’s computing capabilities in light of ever-changing global developments. As an institution committed to high-end R&D, C-DAC has been at the forefront of the Information Technology (IT) revolution, building out their computing capacity through use of the latest technologies. Additionally, C-DAC applies their expertise in developing and deploying IT solutions to support a wide range of public and private organizations.

As a nodal agency, C-DAC supports over 200 users running different types of code for various HPC needs. According to Dr. Pradeep K Sinha, Senior Director of HPC at C-DAC, “The upgraded PARAM Yuva installed at C-DAC, Pune by Netweb Technologies is based on Intel Xeon Phi Many Integrated Core architecture coprocessors, and has become the most powerful supercomputer for India’s scientific community with theoretical peak performance that exceeds one half of a Petaflop. This largest system was supported by the India Department of IT, the Ministry of Communications & IT, and the government of India and will provide unprecedented computing power for performing research in the fields of biotechnology, computational fluid dynamics, seismic, atmospheric, computational science, disaster mitigation, engineering and other disciplines. This will pave the way for a wide range of achievements in science and technology for India.” C-DAC’s alignment with key research institutions was also critical in the successful uptake of the new PARAM System, allowing for maximum impact and complete utilization of the system’s advanced capabilities.

“The upgraded PARAM Yuva installed at C-DAC, Pune by Netweb Technologies is based on Intel Xeon Phi Many Integrated Core architecture coprocessors, and has become the most powerful supercomputer for India’s scientific community with theoretical peak performance that exceeds one half of a Petaflop. This largest system was supported by the India Department of IT, the Ministry of Communications & IT, and the government of India and will provide unprecedented computing power for performing research in the fields of biotechnology, computational fluid dynamics, seismic, atmospheric, computational science, disaster mitigation, engineering and other disciplines. This will pave the way for a wide range of achievements in science and technology for India. I would also like to express our full satisfaction and admiration for Netweb Technologies that completed the project in record time.”

Dr. Pradeep K Sinha
Senior Director, HPC
C-DAC, India

Advantages of Integrated Intel® Xeon® Processor E5-2670 and Intel® Xeon Phi™ Coprocessor 5110P in the Next-Gen PARAM Yuva II System

As Intel’s latest-generation processor able to deliver extremely high performance and exceptional energy efficiency for CPU-bound applications, the Intel® Xeon® processor E5-2670 was chosen as the best fit to meet C-DAC’s goals for their HPC deployment. Coupled with the Intel® Xeon Phi™ coprocessor 5110P, this hybrid solution enabled reduced latency, dramatic performance gains for demanding applications and highly parallel technical computing workloads, better energy efficiency, high throughput, improved performance per watt and more.

Utilization of Intel Technologies: Bill of Materials

• 222 x Tyrone* 2U Server Compute Nodes
• 2 x Tyrone 2U Server Compile Nodes
• 448 x Intel® Xeon® E5-2670 Processors Bundled with Node(s)
• 1808 x 8GB DDR3 Memory Modules Bundled with Node(s)
• 448 x 180GB Intel® Solid-State Drive 330 Series, Bundled with Node(s) (146GB SAS Hard Disk)
• 236 x MCX353A-FCBT ConnectX-3* Cards
• 224 x Rack/Rail Mounting Kits Bundled with Node(s)
• 448 x Intel® Xeon Phi™ 5110P Coprocessors
• 1 x Mellanox* MSX6536-NR 648-Port FDR-Capable Modular Chassis (648-Port FDR InfiniBand* Switch)
• 1 x Intel® Cluster Studio XE 2013 for Linux*, 25-user Floating License, 3 Years
• 224 x UFM (Unified Fabric Manager) for InfiniBand Switch

“The combination of Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors provide the unique capability to cover a very wide range of applications, taking full advantage of various levels of code parallelism. Using the same programming model for Xeon processors and Xeon Phi coprocessors allows single code optimization to deliver better performance on both processor and coprocessor.”

Raj Hazra
Vice President and General Manager of Technical Computing, Intel

PARAM System     Server Setup                     Peak Performance   Processing Cores
PARAM Yuva       4-way, 4U                        54 Teraflops       4,608
PARAM Yuva II    2-way, 2U (plus coprocessors)    524 Teraflops      30,000

The Intel Xeon Phi 5110P was selected as the best coprocessor accelerator available, bringing tremendous computing capability to each node with a minimal increase in power requirements. With the addition of the Intel Xeon Phi coprocessor, the peak performance of each node increased from 332.8 Gigaflops to 2,354.5 Gigaflops. Power consumption rose from 400 Watts to 980 Watts. Flexible execution models are also available for utilization with the combined capabilities of the Intel Xeon processor E5-2670 and Intel Xeon Phi coprocessor 5110P, allowing for unique, tailored models to accommodate each of C-DAC’s over 200 users running different types of code for various HPC needs.
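The per-node performance and power figures quoted above translate directly into performance per watt. The short sketch below uses only the numbers stated in the paper; the dictionary and function names are illustrative.

```python
# Per-node figures quoted in the paper.
cpu_only = {"gflops": 332.8, "watts": 400}
hybrid = {"gflops": 2354.5, "watts": 980}

def gflops_per_watt(node):
    """Peak Gigaflops delivered per watt of node power."""
    return node["gflops"] / node["watts"]

before = gflops_per_watt(cpu_only)
after = gflops_per_watt(hybrid)
print(f"{before:.2f} -> {after:.2f} Gigaflops per watt ({after / before:.1f}x)")
```

Peak performance per node rises roughly sevenfold while power rises about two and a half times, so peak performance per watt improves from about 0.83 to about 2.40 Gigaflops per watt.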

Additionally, with the Intel Xeon Phi coprocessor 5110P built on Intel’s 22nm process technology with 3-D Tri-Gate transistors, the Intel Xeon processor E5-2670 and Intel Xeon Phi coprocessor 5110P-based solution maintains the highest level of energy efficiency when running highly parallel applications. While the Intel Xeon E5-2670 processor is a versatile solution for most workloads, the Intel Xeon Phi 5110P coprocessor provides superior scalability and optimized performance when running highly parallel computing workloads.

Many smaller cores, more threads, and wider vector units compensate for the reduced speed of each individual core. The result is higher aggregate performance for workloads that can be subdivided into a sufficiently large number of simultaneous tasks. The Intel Xeon Phi coprocessor is Intel’s first processor to feature Intel® Many Integrated Core (Intel® MIC) architecture and utilizes a high degree of parallelism in smaller, lower-power Intel® processor cores. The result is advanced performance when running highly parallel applications. With the ability to carry out trillions of calculations per second, the Intel Xeon processor E5-2670 and Intel Xeon Phi coprocessor 5110P-based solution provides a solid foundation with performance optimization for virtually any workload.
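One way to see why this trade-off depends on the workload is an estimate in the style of Amdahl's law. The sketch below is an illustrative model, not a measurement or a method from the paper. It assumes the serial part of a job runs on one core and the parallel part is spread evenly over all cores, and the core counts and relative per-core speeds are hypothetical.

```python
def run_time(parallel_fraction, cores, core_speed):
    """Time for one unit of work: serial part on one core, the rest split evenly."""
    serial = (1.0 - parallel_fraction) / core_speed
    parallel = parallel_fraction / (cores * core_speed)
    return serial + parallel

# Hypothetical configurations: a few fast cores versus many slower cores.
few_fast = dict(cores=16, core_speed=1.0)
many_slow = dict(cores=120, core_speed=0.4)

for p in (0.50, 0.90, 0.99, 0.999):
    t_fast = run_time(p, **few_fast)
    t_slow = run_time(p, **many_slow)
    winner = "many slower cores" if t_slow < t_fast else "few fast cores"
    print(f"parallel fraction {p:.3f}: {winner} finish first")
```

In this model the many-core configuration only pulls ahead once roughly 97% of the work is parallel, which matches the point above that the workload must subdivide into a sufficiently large number of simultaneous tasks.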

Advantages of the New PARAM Yuva II Versus Previous-Generation PARAM Yuva System

The original homogeneous PARAM Yuva System was based on 4-way, 4U Intel® Xeon® processor-based systems with 4,608 processing cores and peak performance of 54 Teraflops, while the new hybrid PARAM Yuva II System is based on 2-way, 2U Intel® Xeon® processor-based systems with 2 coprocessors per system, a peak performance of 524 Teraflops and over 30,000 processing cores. The new system requires only half the footprint of the previous PARAM Yuva System, providing an opportunity for future upgrades and the expansion of C-DAC’s compute capacity. The new PARAM Yuva II System also consumes 35% less energy compared to the original PARAM Yuva. The new system delivers sustained performance of 360.8 Teraflops on the LINPACK benchmark.
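The figures in this comparison can be cross-checked with a few lines of arithmetic. The sketch below takes the peak, LINPACK, and chip-count numbers from the paper. The cores-per-chip values (8 for the Intel Xeon processor E5-2670 and 60 for the Intel Xeon Phi coprocessor 5110P) are assumptions that the paper does not state, and the variable names are illustrative.

```python
# Figures quoted in the paper.
yuva_peak_tf, yuva_cores = 54, 4608
yuva2_peak_tf, yuva2_linpack_tf = 524, 360.8
processors = coprocessors = 448  # counts listed in the bill of materials

# Assumed cores per chip (not stated in the paper).
CORES_PER_XEON_E5_2670 = 8
CORES_PER_XEON_PHI_5110P = 60

yuva2_cores = (processors * CORES_PER_XEON_E5_2670
               + coprocessors * CORES_PER_XEON_PHI_5110P)
peak_ratio = yuva2_peak_tf / yuva_peak_tf
linpack_efficiency = yuva2_linpack_tf / yuva2_peak_tf

print(f"Cores: {yuva2_cores:,} (previous system: {yuva_cores:,})")
print(f"Peak performance ratio: {peak_ratio:.1f}x")
print(f"LINPACK efficiency: {linpack_efficiency:.1%}")
```

Under these assumptions the core count comes to 30,464, consistent with "over 30,000", the peak performance is about 9.7 times that of the original system, and sustained LINPACK performance is about 69% of peak.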

Advantages of Intel® Solid-State Drives

The Intel® Solid-State Drive 330 Series was selected for local node storage due to its high performance and low power consumption when running data-intensive HPC workloads. Based on 25nm Multi-Level Cell (MLC) Intel® NAND Flash Memory, the Intel Solid-State Drive 330 Series combines storage reliability and responsiveness for a wide range of applications. Combined with other Intel components, the Intel Solid-State Drive 330 Series played a significant role in supporting the demanding data requirements of C-DAC’s technical computing solution.


Intel® Cluster Studio XE 2013 for Linux Boosts Performance

Intel® Cluster Studio XE is the first comprehensive HPC hybrid parallelism development suite specially created to support increased performance with highly parallel applications. It provides a comprehensive set of standards-driven C/C++ and Fortran development tools and parallel programming models, enabling more efficient optimization of HPC applications for the Intel Xeon processor and Intel Xeon Phi coprocessor. This uniformity can greatly reduce the complexity of developing, optimizing, and maintaining software code. Code can be optimized once for both Intel Xeon processors and Intel Xeon Phi coprocessors. The same techniques, such as scaling applications to many cores and threads, blocking data for hierarchical memory and caches, and effective use of single instruction, multiple data (SIMD), deliver optimal performance for both processor and coprocessor. The investment made in parallelizing code delivers benefits across the full range of computing environments.
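As an illustration of one of the techniques named above, blocking data for hierarchical memory and caches, the following sketch multiplies two matrices one tile at a time. It is written in plain Python purely to show the loop structure and says nothing about performance. It does not use Intel Cluster Studio XE or any Intel API, and production code for this system would more likely be C/C++ or Fortran. The function name and the `block` parameter are hypothetical.

```python
def matmul_blocked(a, b, block=2):
    """Multiply square matrices (lists of lists) one block-sized tile at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):            # tile over rows of c
        for kk in range(0, n, block):        # tile over the shared dimension
            for jj in range(0, n, block):    # tile over columns of c
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        a_ik = a[i][k]
                        # The innermost loop walks along one row of b and one
                        # row of c; in a row-major array this is contiguous
                        # memory, the pattern that suits SIMD vectorization.
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += a_ik * b[k][j]
    return c

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul_blocked(a, identity))
```

Each tile is small enough to be reused while it is still in fast memory, and the outer tile loops are the natural place to distribute work across cores and threads.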

Netweb Uses Tyrone* Servers to Deliver a Seamless Solution

For over a decade, Tyrone* servers have helped thousands of companies around the world run more efficiently, securely, and reliably. As a leading provider of servers, storage, back-up, and HPC solutions, Tyrone delivers results that combine speed, reliability, scalability and energy efficiency. Tyrone DI200T3R-28R4 servers based on Intel products were chosen due to their compliance with preset C-DAC limitations, including existing space and power constraints and support for the Intel Xeon Phi coprocessor. At the time the project was initiated, these boards had yet to be launched. The Tyrone DI200T3R-28R4 was one of the few servers in a position to ensure compatibility for seamless solution integration. The Tyrone server delivered maximum memory capacity and I/O flexibility, and provided a well-matched addition for integration with other Intel components.

“The system is the first installation in India with Intel based Tyrone Servers and Intel Xeon Phi, a debut product from Intel.”

Sanjay Lodha
Chief Executive Officer, Netweb Technologies

2U Tyrone* servers met the space and power constraints of C-DAC's solution while providing support for both the Intel® Xeon® processor and Intel® Xeon Phi™ coprocessor.


C-DAC required a solution that included 400 compute nodes in 28 racks, with a peak power consumption of 16kW per rack or lower. Netweb and Tyrone included 2U nodes, with 14 nodes per rack and a power consumption of 14.4kW per rack. In the future, up to 16 nodes per rack (up to 448 nodes in the 28 racks) can be accommodated. Through the use of Intel processors and Intel Xeon Phi coprocessors featuring the latest microarchitecture enhancements, C-DAC was able to significantly reduce their cooling infrastructure requirements while boosting computing capacity by ten times. Given the remaining data center space available, this computing capacity could be doubled—allowing for twenty times the original computing capacity within the same power and cooling envelope.
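The rack figures above can be checked against the per-rack power limit with a rough estimate. The sketch below uses the 28 racks, the 16kW limit, and the node densities from this section, together with the 980 Watts per hybrid node quoted earlier in the paper. It counts node power only, so it comes out lower than the 14.4kW per rack stated above. The likely reason is that the stated figure includes other rack equipment, but that is an assumption. The function and constant names are illustrative.

```python
RACKS = 28
RACK_POWER_LIMIT_KW = 16.0
NODE_POWER_KW = 0.98  # 980 W per hybrid node, as quoted in the paper

def rack_estimate(nodes_per_rack):
    """Return (total nodes, estimated node power per rack in kW)."""
    return nodes_per_rack * RACKS, nodes_per_rack * NODE_POWER_KW

for nodes_per_rack in (14, 16):
    total_nodes, rack_kw = rack_estimate(nodes_per_rack)
    within = rack_kw <= RACK_POWER_LIMIT_KW
    print(f"{nodes_per_rack} nodes/rack: {total_nodes} nodes, "
          f"{rack_kw:.2f} kW per rack, within limit: {within}")
```

At 16 nodes per rack the estimate gives 448 nodes and about 15.68kW per rack, which stays under the 16kW limit and agrees with the future capacity described above.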

Netweb Deploys Supercomputer in Record Time

Time was the greatest challenge faced by Netweb in the development and deployment of the next-generation PARAM Yuva II System. The project was set for completion in a 6-week timeframe, with 4 weeks for delivery and 2 weeks for installation and benchmarking. The usual timeline for an order of this magnitude is 12 weeks, making timely delivery a notable challenge, especially considering that the solution was based on newly launched Intel Xeon Phi coprocessors. Netweb’s considerable experience in deploying complex solutions with Intel technology aided in overall project success within the aggressive timeframe. The compatibility of Intel components and uniformity with Intel® Cluster Studio XE 2013 for Linux* were key in allowing Netweb to meet their goals in record time.

Closing

The final Netweb Technologies solution featured tremendous processing capabilities enabled by the Intel Xeon processor E5-2670 and Intel Xeon Phi coprocessor 5110P. C-DAC now manages India’s largest supercomputer built with hybrid technology, and the first Indian supercomputer to achieve more than 500 Teraflops of peak performance. The hybrid solution will continue to have a monumental impact in highly parallel applications for biotechnology, computational fluid dynamics, seismic, atmospheric, computational science, disaster mitigation, engineering and other disciplines. With the release of the new PARAM Yuva II System, supercomputing efforts in India have received a major boost, with many more installations planned for deployment in the next 5 years. Those involved in supercomputing efforts in India look forward to establishing an Exascale-level computing facility in the future, and to continually pushing the boundaries of innovation and scientific discovery in the years to come.

“The innovative technology used to build this system allows for stunning performance and leading energy efficiency. It was possible only due to active collaboration and the single mission work that the teams did together. For that, I want to thank the teams from C-DAC and Netweb for their vision, ability to quickly deal with the challenges, and delivering to the promise.”

Raj Hazra
Vice President and General Manager of Technical Computing, Intel

Learn more about Intel and HPC solutions at: www.intel.com/HPC

Learn more about Intel® Cluster Studio XE 2013 at: intel.ly/cluster-studio-xe

Learn more about Intel® Server products at: www.intelserveredge.com

Learn more about Netweb Technologies at: www.netwebindia.com

Learn more about C-DAC at: www.cdac.in

Learn more about Tyrone* Servers at: www.tyronesystems.com

Intel, the Intel logo, Xeon, Xeon Inside and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

Copyright © 2013 Intel Corporation. All rights reserved. 0513/GIP/CAF/PDF Please Recycle 328953-001US