Using POWER at Warwick University. Dugan Witherick. 8th July 2019 / University of Birmingham / Second PowerAI User Group Meeting


Page 1: Using POWER at Warwick University

Using POWER at Warwick University

Dugan Witherick

8th July 2019 / University of Birmingham / Second PowerAI User Group Meeting

Page 2: Using POWER at Warwick University

• Established in the 1960s.

• Creation supported by University of Birmingham Vice Chancellor.

• ~27,000 students (undergrad and postgrad)

• Ranked

• 9th in the UK (Guardian 2020 league table)

• 62nd in the world (QS World University Rankings 2020)

• within the UK top 10 for highest earnings in over 11 subjects 5 years after graduating (UK Gov 2018 LEO Dataset).

Who is Warwick University

Page 3: Using POWER at Warwick University

Where is Warwick University (not in Warwick)

Page 4: Using POWER at Warwick University

• One of the UK's leading research universities.

• Theme focused research e.g.:

• The Engineered World: from Molecules to Machines

• Life Sciences and Health

• Strong history of collaboration and partnerships including:

• The Monash Warwick Alliance

• National Automotive Innovation Centre (WMG, JLR, Tata)

Research at Warwick University

Page 5: Using POWER at Warwick University

• World-class technologies and expertise.

• Ready access to research critical tools.

• Responsibility of Pro-Vice Chancellor (Research).

• Includes:

• Advanced Bioimaging

• Electron Microscopy

• X-ray diffraction

• And...

Where do we fit in?

Research Technology Platforms

Page 6: Using POWER at Warwick University

• Located in the Department of Computer Science.

• Providing:

• Scientific desktop (based on Linux).

• Two local HPC clusters providing ~6000 cores (Tinis and Orac).

• Access to the HPC Midlands+ Tier2 (Athena) system.

• One SCRTP Director.

• Four "computing" staff.

• Two dedicated RSEs with additional Project Associate RSEs.

Scientific Computing (SCRTP)

Page 7: Using POWER at Warwick University

• Centre for Scientific Computing (CSC)

• Interdisciplinary research community based around the sharing of knowledge and expertise in computer modelling and simulation.

• Representatives from Departments including Physics, Maths, WMG and Warwick Medical School

• Department of Computer Science

Close Working Relationships

Page 8: Using POWER at Warwick University

• UK's national institute for data science and artificial intelligence.

• University of Warwick one of the five founding partners.

• Over twenty Turing Fellows at Warwick.

• Data science tools for high-performance computing (ATI Research Project).

The Alan Turing Institute and Warwick

Page 9: Using POWER at Warwick University

• Tissue Image Analytics (TIA) Lab

• Deep Learning for Imaging Data.

• Using/developing ML algorithms and data science platforms to understand and improve air quality over London.

• Crowd blackspot intelligence for 5G rollout (COCKPIT-5G).

AI at Warwick

Page 10: Using POWER at Warwick University

• 4 x 16 core Haswell with 2 x Xeon Phi 7120P (co-processor)

• 4 x 16 core Haswell with 2 x K80 dual-GPU

• CentOS 6, QDR InfiniBand

• 4 x Xeon Phi 7250F (Knights Landing)

• CentOS 7, Omni-Path

Accelerators for supporting AI workloads

Page 11: Using POWER at Warwick University

• Power8 Minsky S822LC

• 2 x IBM POWER8 3.259 GHz 8-core processors

• 16 cores per node

• 256 GB DDR4 memory

• 4 x NVIDIA P100 GPGPUs (SXM2 NVLink-enabled)

• Part of the Orac HPC Cluster (Broadwell, NetApp/Spectrum Scale, Omni-Path, CentOS 7, xCAT)

OpenPower TestBed

Page 12: Using POWER at Warwick University

• Version 1.5.4

• OS reinstall number one: CentOS 7.3 -> 7.5

• Not the simplest installation procedure.

• Non-relocatable dependencies!

• I think my "simplified" instructions to users may have put them off!

• User Question: "Where's theano?"

PowerAI Attempt 1

Page 13: Using POWER at Warwick University

• Version 1.6.0

• OS reinstall number two: CentOS 7.5 -> 7.6

• Much simpler installation

• All in a Conda channel (thank you).

• Some actual usage!

PowerAI Attempt 2
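
A question such as "Where's theano?" can be answered from inside the active Conda environment. The sketch below is an illustration rather than anything shown in the talk: it uses only the Python standard library to report which framework modules are importable. The module names in `FRAMEWORKS` are assumptions about what a user might look for, not a list of what any PowerAI release ships.

```python
from importlib.util import find_spec

# Top-level module names a user might look for (assumed list, adjust as needed).
FRAMEWORKS = ("torch", "tensorflow", "theano", "caffe")


def available_frameworks(names=FRAMEWORKS):
    """Map each module name to True if it can be imported in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_frameworks().items():
        print(f"{name:<12} {'found' if present else 'missing'}")
```

Running it after activating an environment gives users a quick check before they submit a job that depends on a particular framework.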

Page 14: Using POWER at Warwick University

• Demetris Marnerides

• Warwick Centre for Predictive Modelling

• Converting Low Dynamic Range (LDR) images to High Dynamic Range

• HDR displays readily available but most content still LDR.

Deep Learning for HDR Imaging

Page 15: Using POWER at Warwick University

• Convolutional Neural Networks to learn mapping from LDR to HDR

• PyTorch and OpenCV

• https://github.com/dmarnerides/hdr-expandnet

• https://arxiv.org/abs/1803.02266 (Initial Research)

ExpandNet

Page 16: Using POWER at Warwick University

• Kieran Kalair

• Mathematics for Real-World Systems Centre for Doctoral Training

• Analysing traffic data, particularly extreme cases:

• Accident

• Breakdown

• Random perturbation that causes a cascade of flow breakdown

Large Scale Traffic Data Analysis Problems

Page 17: Using POWER at Warwick University

• Time-Series models poor predictors for very short horizons.

• Neural Networks to improve predictions using UK motorway data.

• PyTorch

Improving Predictions

Image Credit: Jaroslaw Kilian / Shutterstock.com

Page 18: Using POWER at Warwick University

• Version 1.6.1.

• Sorry, Watson Machine Learning Community Edition.

• Please stop renaming your products!

• On my todo list.

PowerAI Attempt 3

Page 19: Using POWER at Warwick University

• How easy is it to build HPC applications currently used on Orac and Tinis on OpenPower?

• Affects support load (manual/ad-hoc builds take time).

• Currently use EasyBuild.

• Once built, do they produce the expected output?

• How well do these builds perform?

Migrating Non-AI Work to OpenPower
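
Checking whether a build produces "the expected output" usually cannot mean bit-for-bit equality, since floating-point results can differ slightly between architectures and compilers. A minimal sketch of a tolerance-based comparison follows. The function names and the default tolerances are assumptions chosen for illustration; suitable tolerances depend on the application.

```python
import math


def outputs_match(reference, candidate, rel_tol=1e-6, abs_tol=1e-12):
    """Return True if two sequences of floats agree within the given tolerances."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


def first_mismatch(reference, candidate, rel_tol=1e-6, abs_tol=1e-12):
    """Return the index of the first differing pair, or None if none differ.

    Only the overlapping prefix is compared when the lengths differ.
    """
    for i, (r, c) in enumerate(zip(reference, candidate)):
        if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    return None
```

Here `reference` would hold values (for example, energies per step) from the existing x86 build and `candidate` the same quantities from the OpenPower build.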

Page 20: Using POWER at Warwick University

• Use EasyBuild to build GPU accelerated HPC applications (commonly used at Warwick).

• EasyBuild fosscuda 2018b toolchain:

• GCC 7.3.0, CUDA 9.2, OpenMPI 3.1.1

• Test "identical" build on Haswell/K80 for baseline performance.

Testing Non-AI Support and Workloads
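
Comparing against a baseline comes down to simple arithmetic on throughput. The sketch below shows one way to express it; the function names are hypothetical, and the numbers in the usage section are placeholders, not measurements from the talk.

```python
def rate_from_run(steps, wall_seconds):
    """Throughput in time-steps per second for a completed run."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return steps / wall_seconds


def relative_performance(candidate_rate, baseline_rate):
    """Ratio of candidate to baseline throughput (> 1 means the candidate is faster)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return candidate_rate / baseline_rate


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not results from the talk).
    baseline = rate_from_run(steps=1_000, wall_seconds=500.0)
    candidate = rate_from_run(steps=1_000, wall_seconds=400.0)
    print(f"relative performance: {relative_performance(candidate, baseline):.2f}x")
```

Keeping the toolchain, input and step count identical on both systems is what makes the ratio meaningful.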

Page 21: Using POWER at Warwick University

• Large-scale Atomic/Molecular Massively Parallel Simulator.

• Classical Molecular Dynamics Code.

• Distributed by Sandia National Laboratories.

• Used by Warwick Physics.

LAMMPS

Page 22: Using POWER at Warwick University

• “Freeze” internal benchmark:

• 50 x 50 x 50 crystal in lattice units.

• Lennard-Jones Interactions.

• 100,000 steps.

• Patch Release 18 June 2019.

• GPU Package built with DOUBLE_DOUBLE precision.

• Built entirely by EasyBuild (no noticeable issues).

LAMMPS Testing
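
To get a feel for the size of this benchmark, the lattice dimensions can be turned into an atom count. The slide does not state the lattice type, so the sketch below assumes a face-centred cubic (fcc) lattice with 4 atoms per unit cell, which is common for Lennard-Jones test problems. That assumption, and the function names, are illustrative only.

```python
# Assumption: fcc lattice, 4 atoms per unit cell (not stated in the slides).
FCC_ATOMS_PER_CELL = 4


def atoms_in_box(cells_per_side, atoms_per_cell=FCC_ATOMS_PER_CELL):
    """Number of atoms in a cubic box of cells_per_side unit cells along each axis."""
    return atoms_per_cell * cells_per_side ** 3


def total_atom_steps(atoms, steps):
    """Total atom updates performed over a run (atoms multiplied by time-steps)."""
    return atoms * steps


if __name__ == "__main__":
    atoms = atoms_in_box(50)
    print(f"atoms: {atoms:,}")
    print(f"atom-steps over 100,000 steps: {total_atom_steps(atoms, 100_000):,}")
```

Under the fcc assumption the 50 x 50 x 50 box holds 500,000 atoms, so the 100,000-step run performs 5 x 10^10 atom updates.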

Page 23: Using POWER at Warwick University

[Chart: time-steps per second (higher is better), CPU and CPU+GPU runs, Power8 vs K80]

LAMMPS Freeze Test

Page 24: Using POWER at Warwick University

• Still using EasyBuild fosscuda 2018b toolchain.

• Manual build (toolchain loaded but LAMMPS built manually).

LAMMPS Testing Attempt 2

Page 25: Using POWER at Warwick University

[Chart: time-steps per second (higher is better), Manual and EasyBuild builds, Power8 vs K80]

LAMMPS Freeze GPU Test (EasyBuild vs Manual Build)

Page 26: Using POWER at Warwick University

• Empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).

• Developed at MRC Laboratory of Molecular Biology (Cambridge).

• Used by Warwick Life Sciences.

• Class3D standard benchmark using v. 3.0.6.

• Built entirely by EasyBuild.

RELION

Page 27: Using POWER at Warwick University

[Chart: run time in HH:MM (lower is better), K80 vs Power8]

RELION Class3D Test

Page 28: Using POWER at Warwick University

• Increase in "manual" or ad-hoc builds to get performance for some applications.

• Partial EasyBuild.

• Using toolchain but final build by hand.

• "Ad-hoc" builds using XL Compilers.

Non-AI Testing (conclusions so far)

Page 29: Using POWER at Warwick University

Email: [email protected]

Questions?