Computational Fluid Dynamics

http://www.nd.edu/~gtryggva/CFD-Course/

Grétar Tryggvason

Lecture 26, May 1, 2017

Large Problems & Computer Science Issues

As physical problems of interest become more complex and codes grow larger, software management becomes critical to successful development and maintenance.

Tools to streamline the development process have a long history, but have in recent years taken on an even greater role. Indeed, the expectation is that all but the simplest software projects today include tools for version control, regression testing, continuous integration, issue tracking, documentation, etc.

Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. Example: Git. Adapted from: http://git-scm.com/book/en/v2/Getting-Started-About-Version-Control

Continuous integration is the practice of merging all developer working copies with a shared mainline several times a day. Example: Jenkins. Adapted from: en.wikipedia.org/wiki/Continuous_integration

Jenkins has a Git plugin, which ties continuous integration to version control.

A documentation generator is a programming tool that generates software documentation intended for programmers or end users, or both, from a set of specially commented source code files, and in some cases, binary files. Example: Doxygen. Adapted from: http://en.wikipedia.org/wiki/Documentation_generator

An issue tracking system is a computer software package that manages and maintains lists of issues, as needed by an organization. Adapted from: http://en.wikipedia.org/wiki/Issue_tracking_system

In all cases several software packages, most with different capabilities and features, are available.

Regression testing is a type of software testing that seeks to uncover new software bugs, or regressions, in existing functional and non-functional areas of a system after changes such as enhancements, patches or configuration changes, have been made to them.

The intent of regression testing is to ensure that changes such as those mentioned above have not introduced new faults. One of the main reasons for regression testing is to determine whether a change in one part of the software affects other parts of the software. Adapted from: http://en.wikipedia.org/wiki/Regression_testing
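For a numerical code, a regression test often amounts to rerunning a small fixed case and comparing the result with output recorded from a trusted earlier version. The sketch below illustrates the idea only: the kernel `diffuse_step`, the stored `REFERENCE` values and the tolerance are hypothetical and are not taken from the course code.

```python
import math


def diffuse_step(f, r):
    """One explicit diffusion step on a 1D grid; the two end values stay fixed."""
    new = list(f)
    for i in range(1, len(f) - 1):
        new[i] = f[i] + r * (f[i - 1] - 2.0 * f[i] + f[i + 1])
    return new


# Output recorded from a trusted earlier version of the code (hypothetical).
REFERENCE = [0.0, 0.25, 0.375, 0.25, 0.0]


def regression_check(tol=1e-12):
    """Rerun the fixed test case and compare it with the stored reference."""
    f = [0.0, 0.0, 1.0, 0.0, 0.0]
    for _ in range(2):
        f = diffuse_step(f, 0.25)
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(f, REFERENCE))


print("regression test passed:", regression_check())
```

A continuous integration server would typically run checks of this kind automatically after each commit.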

http://en.wikipedia.org/wiki/Git_(software)

Git is a distributed revision control system with an emphasis on speed, data integrity, and support for distributed, non-linear workflows. Git was initially designed and developed by Linus Torvalds for Linux kernel development in 2005, and has since become the most widely adopted version control system for software development.

As with most other distributed revision control systems, and unlike most client–server systems, every Git working directory is a full-fledged repository with complete history and full version-tracking capabilities, independent of network access or a central server.

http://en.wikipedia.org/wiki/Darcs

Darcs is a distributed revision control system created by David Roundy. Key features include the ability to choose which changes to accept from other repositories, interaction with either other local (on-disk) repositories or remote repositories via SSH, HTTP, or email, and an unusually interactive interface. The developers also emphasize the use of advanced software tools for verifying correctness: the expressive type system of the functional programming language Haskell enforces some properties, and randomized testing via QuickCheck verifies many others. The name is a recursive acronym for Darcs Advanced Revision Control System.

http://en.wikipedia.org/wiki/Jenkins_(software)

Jenkins provides continuous integration services for software development. It is a server-based system running in a servlet container such as Apache Tomcat. It supports SCM tools including AccuRev, CVS, Subversion, Git, Mercurial, Perforce, Clearcase and RTC, and can execute Apache Ant and Apache Maven based projects as well as arbitrary shell scripts and Windows batch commands. The primary developer of Jenkins is Kohsuke Kawaguchi.

Builds can be started by various means, including being triggered by commit in a version control system, scheduling via a cron-like mechanism, building when other builds have completed, and by requesting a specific build URL.

Doxygen is a documentation generator, a tool for writing software reference documentation. The documentation is written within code, and is thus relatively easy to keep up to date. Doxygen can cross reference documentation and code, so that the reader of a document can easily refer to the actual code.

Doxygen supports multiple programming languages, in particular C++, C, C#, Objective-C, Java, Perl, Python, IDL, VHDL, Fortran, Tcl and PHP. Doxygen is free software, released under the terms of the GNU General Public License.

http://en.wikipedia.org/wiki/Doxygen

Docker is a software container platform. Developers use Docker to eliminate “works on my machine” problems when collaborating on code with co-workers.

Using containers, everything required to make a piece of software run is packaged into isolated containers. This makes for efficient, lightweight, self-contained systems and guarantees that software will always run the same, regardless of where it’s deployed.

https://www.docker.com/what-docker

https://en.wikipedia.org/wiki/Docker_(software)

While a number of software development methodologies and tools have been proposed, most of them have had little impact on the development of scientific software, particularly in academia, where most software is developed by graduate students with little formal training in software engineering.

Rightly or wrongly, this has been a considerable concern to funding agencies and government laboratories, and every indication is that in the near future there will be much higher expectations for the use of appropriate methodologies and tools for scientific software.

Several opportunities exist for training in various aspects of software development. Two of those are:

Software Carpentry: https://software-carpentry.org

Randy LeVeque’s Course on Scientific Computing: http://faculty.washington.edu/rjl/classes/am583s2014/

https://www.coursera.org/course/scicomp

Software Carpentry (https://software-carpentry.org), Version 5.3:

•  Programming the Unix shell
•  Version Control with Git
•  Version Control with Mercurial
•  Using Databases and SQL
•  Programming with Python
•  Programming with R
•  Programming with MATLAB

Version 4.0 contains some material not included in version 5.0, such as:

•  Make

Coursera Course on High Performance Scientific Computing

Programming-oriented course on effectively using modern computers to solve scientific computing problems arising in the physical/engineering sciences and other fields. Provides an introduction to efficient serial and parallel computing using Fortran 90, OpenMP, MPI, and Python, and software development tools such as version control, Makefiles, and debugging.

Large-Scale Simulations on Parallel Computers

Outline

•  Basic machine configurations
•  Parallelization
•  The Message Passing Interface (MPI) library

Parallel computing has become the way to increase computer power. All of the world's fastest computers are massively parallel (see http://www.top500.org) and parallel computers are becoming common in industry and at universities.

Machine configurations

SISD: Single Instruction, Single Data

SIMD: Single Instruction, Multiple Data

MIMD: Multiple Instruction, Multiple Data

Early SIMD: Pipeline and Vector Architectures

Elements of a vector are processed simultaneously

Each operation is broken down into elementary steps performed by each processor

MIMD: Multiple Instruction, Multiple Data

Shared Memory MIMD

[Figure: CPUs and memory modules connected through an interconnection network]

Distributed Memory MIMD

[Figure: CPU/memory pairs, each CPU with its own memory, connected through an interconnection network]

Increasingly, computers come in hybrid configurations with multiple cores, sharing memory, on each node.

[Figure: nodes, each with several CPUs sharing one memory, connected through an interconnection network]

Distributed Memory MIMD: Static Interconnection Networks

[Figure: examples of static networks: a fully connected interconnection network and a 3D hypercube]

The interconnection network can be either static or dynamic, and different vendors have elected to go with different configurations.

Parallelizing the solution to the heat equation

An Example

$$\frac{\partial f}{\partial t} = \alpha \left( \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \right)$$

Assume we want to solve this on a very large grid, say 20,000 by 20,000. The storage requirement is:

4 × 10^8 (grid points) × 8 (bytes per number) × 2 (numbers per grid node) = 6.4 GB

It used to be unlikely that this would fit into the RAM of one node, and we therefore had to divide the problem between several processors.
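The storage estimate can be reproduced with a few lines of arithmetic. In this sketch the function name `storage_bytes` is made up for illustration, and 1 GB is taken as 10^9 bytes.

```python
def storage_bytes(nx, ny, bytes_per_number=8, numbers_per_point=2):
    """Memory needed for an nx-by-ny grid holding several numbers per point."""
    return nx * ny * bytes_per_number * numbers_per_point


total = storage_bytes(20_000, 20_000)
print(total / 1e9, "GB")  # prints 6.4 GB
```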

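Domain Decomposition

[Figure: the full grid with indices i and j, starting at 0, with extents NX and NY]

[Figure: the grid divided into six rectangular blocks assigned to processors Pr 1 to Pr 6 (Pr 1, Pr 3, Pr 5 in one row; Pr 2, Pr 4, Pr 6 in the other)]

[Figure: one block, spanning i = sx to ex and j = sy to ey]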
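One way to obtain the block limits sx, ex, sy, ey for each processor is to split each index range into nearly equal contiguous pieces. The sketch below assumes 1-based inclusive indices, as in Fortran. The helper names `block_range` and `decompose` and the 3-by-2 processor grid are illustrative choices, not taken from the course code.

```python
def block_range(n, nparts, part):
    """Inclusive 1-based range (s, e) of block `part` (0-based) when n points
    are split into nparts nearly equal contiguous blocks."""
    base, extra = divmod(n, nparts)
    # The first `extra` blocks each receive one additional point.
    s = part * base + min(part, extra) + 1
    e = s + base - 1 + (1 if part < extra else 0)
    return s, e


def decompose(nx, ny, px, py):
    """Map each position (ip, jp) in a px-by-py processor grid to (sx, ex, sy, ey)."""
    return {
        (ip, jp): block_range(nx, px, ip) + block_range(ny, py, jp)
        for ip in range(px)
        for jp in range(py)
    }


blocks = decompose(20_000, 20_000, 3, 2)
for position, limits in sorted(blocks.items()):
    print(position, limits)
```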

Usually the grids must overlap

[Figure: the subgrids of Proc 1 and Proc 2 overlap along their shared edge. Each processor updates its interior points, and the two then swap boundary data.]
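The update-and-swap cycle can be tried out serially before any message passing is involved. The sketch below splits a 1D grid into two overlapping pieces, each with one extra "ghost" point, and checks that the split computation reproduces the single-grid result. It is a serial stand-in for two processors, and all names in it are illustrative.

```python
def step(u, r):
    """Explicit update of the interior points; u[0] and u[-1] are left unchanged."""
    interior = [u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
                for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def swap(left, right):
    """Copy each piece's outermost owned value into the neighbor's ghost point."""
    left[-1] = right[1]
    right[0] = left[-2]


def run_serial(u0, nsteps, r):
    u = list(u0)
    for _ in range(nsteps):
        u = step(u, r)
    return u


def run_split(u0, nsteps, r):
    mid = len(u0) // 2
    left = list(u0[:mid]) + [0.0]    # owned points plus a ghost point on the right
    right = [0.0] + list(u0[mid:])   # a ghost point on the left plus owned points
    for _ in range(nsteps):
        swap(left, right)            # swap boundary data
        left = step(left, r)         # update interior points of piece 1
        right = step(right, r)       # update interior points of piece 2
    return left[:-1] + right[1:]     # drop the ghost points and reassemble


u0 = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(run_split(u0, 5, 0.25) == run_serial(u0, 5, 0.25))  # prints True
```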

For 2D and 3D problems it is necessary to exchange lines and planes of data

Each processor updates the portion of the grid that resides on that node and swaps data with its neighbors

c     Advance f in time (explicit step; diffusivity taken as 1)
      subroutine adv_in_time(f, fo, b, sx, ex, sy, ey, h, dt)
      integer sx, ex, sy, ey, i, j
      double precision f(sx-1:ex+1,sy-1:ey+1),
     &                 fo(sx-1:ex+1,sy-1:ey+1),
     &                 b(sx-1:ex+1,sy-1:ey+1)
c     Copy the current solution into fo
      do 10 j = sy, ey
         do 10 i = sx, ex
            fo(i,j) = f(i,j)
 10   continue
c     Update the points owned by this processor
      do 20 j = sy, ey
         do 20 i = sx, ex
            f(i,j) = fo(i,j) + (dt/(h*h))*
     &         (fo(i-1,j) + fo(i,j+1) + fo(i,j-1) + fo(i+1,j)
     &          - 4.0*fo(i,j)) - h*h*b(i,j)
 20   continue
      return
      end

For the complete problem we must determine:

•  How the processors are connected
•  How to prepare the data to be sent if it is not contiguous in memory
•  How to transfer the data in the right order so the data is there when needed

In Message Passing, the processors explicitly send and receive data

Generic form:

Send(address, length, destination, tag)

Receive(address, length, source, tag, actlen)

actlen: length of the message received

In actual implementations the arguments are slightly more complex, due to the need to deal with data that is not contiguous in memory
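The generic send/receive pattern can be imitated inside a single program with threads and queues. The sketch below is a toy stand-in for a message-passing layer, not MPI: the class `Mailboxes`, its methods and the two-worker exchange are all hypothetical. The send here is buffered, so it returns at once, while the receive blocks until a message with a matching source, destination and tag arrives.

```python
import queue
import threading


class Mailboxes:
    """Toy in-process stand-in for a message-passing layer (not MPI)."""

    def __init__(self):
        self._boxes = {}
        self._lock = threading.Lock()

    def _box(self, source, dest, tag):
        with self._lock:
            return self._boxes.setdefault((source, dest, tag), queue.Queue())

    def send(self, data, source, dest, tag):
        # Copy the data so the sender may reuse its buffer right away.
        self._box(source, dest, tag).put(list(data))

    def receive(self, source, dest, tag):
        # Blocks until a matching message arrives (gives up after 5 seconds).
        data = self._box(source, dest, tag).get(timeout=5.0)
        return data, len(data)  # the second value plays the role of actlen


def exchange():
    """Two workers each send one message to the other and receive one back."""
    mail = Mailboxes()
    results = {}

    def worker(me, other, payload):
        mail.send(payload, source=me, dest=other, tag=0)
        results[me] = mail.receive(source=other, dest=me, tag=0)

    threads = [
        threading.Thread(target=worker, args=(0, 1, [1.0, 2.0])),
        threading.Thread(target=worker, args=(1, 0, [3.0])),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(exchange())
```

If the send also blocked until the matching receive was posted, two workers that both send first would wait on each other forever, which is why the order of transfers matters.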

The structure of the program

Initialize parallelization
Determine setup, including number of processors and connectivity
If master, determine size
Broadcast parameters
Do itime=1,MaxSteps
    Swap data
    Advance f
end do
Gather data/print
End parallelization

Measuring how successful the parallelization is:

$$\text{Speedup} = \frac{\text{Time for 1 processor}}{\text{Time for } n \text{ processors}}$$

Perfect speedup:

$$\text{Speedup} = \frac{T}{T/n} = n$$

Usually the performance degrades as n increases
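As a small worked example, the speedup can be tabulated from measured run times. The timings below are invented for illustration and only show the typical pattern of speedup falling behind the perfect value n.

```python
def speedup(t1, tn):
    """Speedup: time on 1 processor divided by time on n processors."""
    return t1 / tn


# Hypothetical wall-clock times in seconds, keyed by the number of processors.
timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}

for n, tn in timings.items():
    print(n, "processors: speedup", round(speedup(timings[1], tn), 2),
          "against a perfect", n)
```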

Weak Scaling versus Strong Scaling:

Often we are more interested in using additional processors to allow us to do a larger problem than in doing a problem of a given size faster.

Strong scaling: Time for a fixed problem size as the number of processors is increased

Weak scaling: Time for fixed work per processor as the number of processors is increased
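The two kinds of scaling lead to two different efficiency measures. The sketch below uses common definitions that are stated here as assumptions rather than taken from the slides: for strong scaling the ideal time on n processors is T1/n, and for weak scaling the ideal time stays at T1. The function names are illustrative.

```python
import math


def strong_scaling_efficiency(t1, tn, n):
    """Fixed total problem size: the ideal time on n processors is t1 / n."""
    return t1 / (n * tn)


def weak_scaling_efficiency(t1, tn):
    """Fixed work per processor: the ideal time on n processors is still t1."""
    return t1 / tn


def weak_scaling_grid(side_on_one, n):
    """Side of a square 2D grid that keeps the points per processor fixed."""
    return round(side_on_one * math.sqrt(n))


print(strong_scaling_efficiency(100.0, 16.0, 8))  # prints 0.78125
print(weak_scaling_grid(1000, 4))                 # prints 2000
```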

MPI: Message Passing Interface

What it is:

•  MPI is a library, not a language. It consists of subroutines that are called from FORTRAN, C, or C++ programs to facilitate parallelization of programs

•  MPI is a specification, not a particular implementation. All parallel computer vendors currently offer MPI implementations for their machines

•  MPI is designed for the message passing model

General structure of an MPI parallel program

      program main
      include "mpif.h"
      integer myid, numprocs, ierr
c     Initialize MPI
      call MPI_INIT( ierr )
c     Identity of each processor
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
c     Number of processors used
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )

c     Actual code

c     End MPI
      call MPI_FINALIZE( ierr )
      end

Run the same program on all processors. Set up problem on master processor by:

      if ( myid .eq. 0 ) then
c        Set up problem
      end if

The other processors are referred to as “slaves” or “workers”

Exchanging data

Data transfer needs to be synchronized

•  Blocking Send
•  Ordered Send
•  Sendrecv
•  Buffered Send
•  Nonblocking Isend

Send information to all processors (broadcast)

      call MPI_BCAST( data, brows, MPI_DOUBLE_PRECISION,
     &                master, MPI_COMM_WORLD, ierr )

Send and receive information

      call MPI_SENDRECV( a(sx,ey), nx, MPI_DOUBLE_PRECISION,
     &                   nbrtop, 0,
     &                   a(sx,sy-1), nx, MPI_DOUBLE_PRECISION,
     &                   nbrbottom, 0, comm2d, status, ierr )

Set up a “communicator” for a Cartesian grid:

c     Get a new communicator for a decomposition of the domain.
c     Let MPI find a "good" decomposition
c
      dims(1) = 0
      dims(2) = 0
      call MPI_DIMS_CREATE( numprocs, 2, dims, ierr )
      call MPI_CART_CREATE( MPI_COMM_WORLD, 2, dims,
     *                      periods, .true., comm2d, ierr )
c
c     Get my position in this communicator
c
      call MPI_COMM_RANK( comm2d, myid, ierr )
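What a Cartesian communicator provides can be mimicked with plain index arithmetic: a mapping between ranks and grid coordinates, and a way to find the neighbors in each direction. The sketch below assumes a 3-by-2 process grid and the row-major rank ordering that MPI uses for Cartesian topologies. The functions are illustrative stand-ins for the bookkeeping done by the library (the neighbor lookup resembles MPI_CART_SHIFT), not calls into MPI.

```python
def cart_coords(rank, dims):
    """Coordinates of `rank` in a 2D process grid with row-major ordering."""
    return rank // dims[1], rank % dims[1]


def cart_rank(coords, dims, periodic):
    """Rank at `coords`, wrapping in periodic directions; None if off the grid."""
    wrapped = []
    for c, d, p in zip(coords, dims, periodic):
        if p:
            c %= d
        elif not 0 <= c < d:
            return None  # plays the role of MPI_PROC_NULL
        wrapped.append(c)
    return wrapped[0] * dims[1] + wrapped[1]


def cart_shift(rank, dims, periodic, direction, disp=1):
    """(source, dest) neighbors of `rank` along `direction` for a shift by `disp`."""
    coords = list(cart_coords(rank, dims))
    lo, hi = coords[:], coords[:]
    lo[direction] -= disp
    hi[direction] += disp
    return cart_rank(lo, dims, periodic), cart_rank(hi, dims, periodic)


dims = (3, 2)  # assumed process grid for 6 processes
print(cart_coords(3, dims))                       # prints (1, 1)
print(cart_shift(3, dims, (False, False), 0))     # prints (1, 5)
print(cart_shift(3, dims, (False, False), 1))     # prints (2, None)
```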

Timing the program

      t1 = MPI_WTIME()
c     ... code being timed ...
      t2 = MPI_WTIME()

The elapsed time is t2 - t1.

Derived datatypes:

When data is not contiguous in memory, MPI allows us to set up a derived datatype. The simplest one is when we send every nth element of a vector
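The "every nth element" pattern arises naturally in Fortran, where arrays are stored column by column: the elements a(i,1), a(i,2), ... for fixed i are separated in memory by the length of a column. The sketch below only illustrates which memory locations such a strided description selects. In MPI a layout like this would be described with a vector datatype (MPI_TYPE_VECTOR, with a count, a block length and a stride). The helper names here are hypothetical.

```python
def strided_offsets(start, count, stride):
    """Offsets of `count` elements taken every `stride` positions from `start`."""
    return [start + k * stride for k in range(count)]


def pack(memory, offsets):
    """Gather the selected elements into a contiguous list."""
    return [memory[o] for o in offsets]


# A 4-by-3 array stored column by column, as Fortran does:
# a(i,j) sits at offset (j-1)*4 + (i-1), and holds the value 10*j + i.
ni, nj = 4, 3
memory = [10 * j + i for j in range(1, nj + 1) for i in range(1, ni + 1)]

# a(2,1), a(2,2), a(2,3): fixed i, varying j, so the elements are ni apart.
offsets = strided_offsets(start=1, count=nj, stride=ni)
print(offsets)                # prints [1, 5, 9]
print(pack(memory, offsets))  # prints [12, 22, 32]
```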

Several graphics-based programs can be used to analyze the performance of the code

Using MPI: Portable Parallel Programming with the Message-Passing Interface, by William Gropp, Ewing Lusk, and Anthony Skjellum. Published in 1999 by MIT Press, 371 pages.

http://www-unix.mcs.anl.gov/mpi/index.html

OpenMP

Designed for shared memory machines:

Much of the multithreading is automatic, but the user can insert directives such as:

!$OMP PARALLEL PRIVATE(TID)
!$OMP PARALLEL DO SHARED(f,fo,it,j) PRIVATE(i)
!$OMP END PARALLEL DO

to control the execution of the program

https://computing.llnl.gov/tutorials/openMP/

Many resources available

Hybrid MPI/OpenMP:

Use OpenMP for threads on cores sharing a memory and MPI for communication between the nodes. This results in much more complex code, and MPI is increasingly capable of providing the same functionality.

Graphics Processing Units (GPUs), originally designed for running computer displays, are increasingly used for large-scale computing. Programming those is still a challenge!

OpenGL: http://www.glprogramming.com/red/chapter01.html

CUDA: http://docs.nvidia.com/cuda/index.html

The “standard” ways to parallelize codes build mostly on technology developed a long time ago, and it seems likely that this will change, driven by new hardware and the need to make programming simpler. In addition to CUDA for GPUs, programming language extensions such as Unified Parallel C (UPC), Coarray Fortran (CAF) and Java Titanium seem to be promising. However, much is likely to happen in the next few years!

Conclusion

Processor clock speed has largely leveled off, so processor developers have increasingly explored making the processors more complex, such as by increasing the number of cores on a node and adding GPUs.

Heterogeneous processor architecture, along with a general trend towards more complex problems, will call for much more sophisticated software to manage the utilization of the hardware.