python on hpcc
Dr. Yongjun Choi
Python on HPCC
ICER Workshop, Oct/12/2020
Scope of this workshop
• What we want to do:
• Explain what MSU HPCC is doing to support Python users.
• Provide guidance to help users improve Python performance on the HPCC.
• Point out tools that support Python developers in HPC.
• What we assume:
• You know and use Python, or
• You know and use HPC and are curious about using Python in your own HPC work.
Getting Started with Python Resources
• https://www.python.org/about/gettingstarted/
• https://wiki.python.org/moin/BeginnersGuide/
• https://www.codecademy.com/learn/python/
• https://www.coursera.org/specializations/python/
• https://software-carpentry.org/lessons/
• https://pymotw.com/
• https://wiki.hpcc.msu.edu/display/ITH/Python
• https://www.youtube.com/watch?v=_uQrJ0TkZlc
• py4e.com
• ……
Python is a very popular language
Most popular coding Languages of 2020: www.tiobe.com/tiobe-index
https://insights.stackoverflow.com/survey/2020#technology-most-loved-dreaded-and-wanted-languages-loved
Why is Python so popular?
• Easy
• Clean, clear syntax
• Multi-paradigm, interpreted
• Automatic garbage collection (no manual memory management)
• Flexible, full-featured data structures
• Extensive standard libraries
• Open-source packages
The Scientific Python Stacks
• Primary Uses:
• Script workflows for both data analysis and simulations
• Perform exploratory, interactive data analysis and visualization
Python at MSU HPCC
• HPCC supports Python
• Maximizing Python performance can be challenging:
• Interpreted languages are difficult to optimize.
• By design, only one native thread can execute Python code at a time (the Global Interpreter Lock, GIL).
• Designed and implemented without considering realities of HPC.
Basic Guidelines for Python in HPC
• Identify and exploit parallelism at the core, node, and cluster levels
• Understand and apply NumPy array syntax and its broadcasting rules:
• https://numpy.org/doc/stable
• https://numpy.org/doc/stable/user/basics.broadcasting.html
• Measure your codes’ performance using profiling tools
• https://stackify.com/how-to-use-python-profilers-learn-the-basics/
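The broadcasting rules mentioned above can be illustrated with a small sketch. The array names and values are made up for illustration and are not taken from the workshop materials.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) with 4 features (columns).
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [5.0, 6.0, 7.0, 8.0],
                 [9.0, 10.0, 11.0, 12.0]])

# Column means have shape (4,); broadcasting stretches them across all 3 rows.
col_means = data.mean(axis=0)
centered = data - col_means            # (3, 4) - (4,) -> (3, 4)

# Row sums need shape (3, 1) so that each row is divided by its own sum;
# keepdims=True keeps the reduced axis: (3, 4) / (3, 1) -> (3, 4).
row_sums = data.sum(axis=1, keepdims=True)
normalized = data / row_sums

print(centered.shape, normalized.shape)
```

Both operations work on whole arrays at once, with no explicit Python loop over rows or columns.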
Python at HPCC
• HPC Module
• module avail python
• module spider python
• module load python
• module load Python/2.7.9
• Or install your own Python (many options, but we suggest Anaconda)
• System python (/usr/bin/python): risky, not recommended.
Using Python in HPC
• Limited packages: only a few widely used packages, such as NumPy and Matplotlib, are installed.
• Why? Python has a very large number of packages, modules, and libraries that researchers may want to use, and it is difficult for HPCC to keep up with all of them while avoiding conflicts between different versions.
• https://wiki.hpcc.msu.edu/display/ITH/Python
• Virtual environment (virtualenv)
• Anaconda
Virtualenv
• Based on HPC Python: users control the packages, HPC controls Python
• https://wiki.hpcc.msu.edu/x/xIEVAg
Anaconda (recommended)
• Easy to install.
• Install in your home or research space
• Fully controlled by users
• https://www.anaconda.com
• Download https://www.anaconda.com/products/individual
• https://wiki.hpcc.msu.edu/display/ITH/Using+conda
• Both pip and conda (Anaconda) can be used to install packages. However, it is better to stick to one of them.
• pip cannot uninstall packages that were installed with conda, and conda cannot uninstall packages that were installed with pip.
Jupyter notebook
• https://ondemand.hpcc.msu.edu/pun/sys/dashboard
Can my Python code be faster?
• Vectorization
• Avoid explicit loops where possible. Use NumPy instead.
• eg: ex01.py
• Parallelization (MPI, OpenMP, OpenACC, Thread)
• Workflows - eg: simultaneously launching with job-array (eg: ex02.py, and ex02.sb)
• Numba: has some restrictions, but it makes your code very fast!
• eg: https://murillogroupmsu.com/numba-versus-c/
• ex03.py
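To make the vectorization point above concrete, here is a minimal sketch comparing an explicit loop with a NumPy expression. It is not the workshop's ex01.py; the function names and the array size are made up for illustration.

```python
import time

import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: one interpreter iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(values):
    """NumPy version: the loop over elements runs in compiled code."""
    return float(np.dot(values, values))


# Hypothetical input: 200,000 random numbers with a fixed seed.
x = np.random.default_rng(0).random(200_000)

t0 = time.perf_counter()
slow = sum_of_squares_loop(x)
t1 = time.perf_counter()
fast = sum_of_squares_vectorized(x)
t2 = time.perf_counter()

print(f"loop: {t1 - t0:.4f} s, vectorized: {t2 - t1:.4f} s")
```

The two functions return the same value up to floating-point rounding. The timings depend on the machine, so measure them on the node you plan to use.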
Use Threaded Libraries
• Packages like NumPy and SciPy are already built with thread support via BLAS/LAPACK (e.g., MKL)
• Don’t reimplement solvers in pure Python
• Many of your favorite threaded libraries and packages already have bindings:
• PyTrilinos
• Petsc4py
• Elemental
• SLEPc
• Do not reinvent the wheel. If a method is not new, it has probably already been implemented very well.
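As a small illustration of relying on a library solver instead of writing one in pure Python, the sketch below solves a dense linear system with numpy.linalg.solve. The matrix is a made-up test case. Whether the call actually runs on several threads depends on the BLAS/LAPACK build that NumPy is linked against.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Hypothetical test system: a random matrix made strictly diagonally
# dominant, so that it is well conditioned.
A = rng.random((n, n)) + n * np.eye(n)
b = rng.random(n)

# numpy.linalg.solve hands the work to a compiled LAPACK routine.
x = np.linalg.solve(A, b)

# Check the solution by computing the residual norm ||Ax - b||.
residual = np.linalg.norm(A @ x - b)
print(f"residual norm: {residual:.2e}")
```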
Using Compiled Modules
• Methods of using pre-compiled, threaded, GIL-free code for speed include:
• Cython
• f2py
• pybind11
• SWIG
• Boost.Python
• ctypes
• Writing bindings in C/C++ (https://docs.python.org/3/extending/extending.html/)
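Of the options above, ctypes is the only one that needs no build step, so it makes a compact sketch. The example calls cos from the C math library. It assumes a Linux system such as the HPCC, and the fallback library name libm.so.6 is an assumption that holds on glibc-based systems.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library. The fallback name is an assumption for
# glibc-based Linux systems.
libm_path = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(libm_path)

# Declare the C signature double cos(double). Without this, ctypes would
# treat the return value as a C int.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

c_result = libm.cos(0.5)
print(c_result, math.cos(0.5))
```

Functions called through ctypes.CDLL release the GIL while the C code runs, which is what lets compiled code run alongside other Python threads.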
Profiling: cProfile, SnakeViz, VTune (Intel), etc.
• cProfile: https://docs.python.org/3/library/profile.html
• SnakeViz: https://jiffyclub.github.io/snakeviz/
• VTune: https://software.intel.com/content/www/us/en/develop/tools/vtune-profiler.html
• module load Vtune
• Check speed (time), calls (frequency), memory
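A minimal sketch of profiling with cProfile from inside a script, using only the standard library. The functions slow_sum and workload are made-up stand-ins for real code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately slow pure-Python loop (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    return [slow_sum(20_000) for _ in range(20)]


# Collect timing and call-count statistics for workload().
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Format the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists the number of calls and the time spent per function. To inspect a whole script in SnakeViz, write the statistics to a file with python -m cProfile -o program.prof my_script.py (both file names are placeholders) and open it with snakeviz program.prof.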
Parallelization: numba - parallel
• Automatic parallelization with numba
• Very easy to use: you need only a one-line decorator, @njit(parallel=True)
• More information:
• https://numba.pydata.org/numba-doc/latest/user/parallel.html
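A minimal sketch of the decorator in use. The function and the input array are made up for illustration. The try/except fallback is an addition so that the sketch still runs, serially and uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (an addition for this sketch): run as plain Python
    # when Numba is not available.
    def njit(*args, **kwargs):
        return lambda func: func
    prange = range


@njit(parallel=True)
def sum_of_squares(values):
    total = 0.0
    # prange marks the loop as parallel; Numba handles the accumulation
    # into `total` as a reduction across threads.
    for i in prange(values.shape[0]):
        total += values[i] * values[i]
    return total


x = np.arange(100_000, dtype=np.float64)
result = sum_of_squares(x)  # the first call includes compilation time
print(result)
```

Because the first call compiles the function, time the second and later calls when comparing against other implementations.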
Parallelization: numba - cuda
• Only works with NVIDIA GPU cards
• Easy to use (at least much easier to use than other languages).
• https://github.com/keipertk/pygpu-workshop
Parallelization: MPI
• MPI
• It is the HPC paradigm for inter-process communication
• MPI makes full use of HPC environments
• Well-supported tools exist
• Python-MPI bindings have been developed since 1996
Parallelization: MPI - mpi4py
• mpi4py
• Pythonic wrapping of the system’s native MPI
• Provides almost all MPI-1, 2 and common MPI-3 features
• Very well maintained
• Distributed with major Python distributions
• Portable and scalable
• Requires only NumPy, Cython (build only), and an MPI library
• https://mpi4py.readthedocs.io/en/stable/#
More Resources
• Getting help:
• Office hrs: Mon/Thur 1-2PM
• https://icer.msu.edu/contact
• Documentation
• https://wiki.hpcc.msu.edu