The 2D Ising Model on GPU Clusters

The 2D Ising Model on GPU Clusters. Benjamin Block, University of Mainz, Institute for Physics. Thanks to: Tobias Preis, Peter Virnau

Upload: benjamin-block

Post on 11-May-2015

DESCRIPTION

This talk was given by me at the Spring Meeting 2010 of the DPG at Regensburg today, in the division "Dynamics and Statistical Physics".

TRANSCRIPT

Page 1: The 2D Ising Model on GPU Clusters

The 2D Ising Model on GPU Clusters

Benjamin Block, University of Mainz, Institute for Physics

Thanks to: Tobias Preis, Peter Virnau

Page 2: The 2D Ising Model on GPU Clusters

Overview

• GPUs: Optimized for massively parallel processing

• Previous work: GPU Accelerated Ising Model

• Architecture specific optimization

• GPU clusters are becoming established, so a multi-GPU implementation is useful

T. Preis, P. Virnau, W. Paul, J. J. Schneider: GPU Accelerated Monte Carlo Simulation of the 2D and 3D Ising Model, J. Comp. Phys., 228 (2009)

Page 3: The 2D Ising Model on GPU Clusters

Ising Model (Ferromagnetism)

Panels: T >> T_C, T ~ T_C, T << T_C

Lattice of spins

Page 4: The 2D Ising Model on GPU Clusters

Metropolis Monte Carlo

Perform successive spin flips!

Probability: Metropolis criterion
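
The formula shown on this slide did not survive in the transcript. In its standard form, the Metropolis criterion accepts a proposed spin flip with probability min(1, exp(-ΔE / (k_B T))), where ΔE is the energy change caused by the flip. The following is a minimal sketch of one such update, assuming a square lattice with periodic boundaries, nearest-neighbour coupling J = 1 and k_B = 1; the function names are hypothetical and not taken from the talk.

```python
import math
import random


def delta_energy(spins, i, j, coupling=1.0):
    """Energy change from flipping spin (i, j) on a periodic n x n lattice."""
    n = len(spins)
    neighbours = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                  + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
    # For H = -J * sum(s_i * s_j), flipping s_ij changes the energy by this amount.
    return 2.0 * coupling * spins[i][j] * neighbours


def metropolis_step(spins, temperature, rng):
    """Attempt one single-spin flip; return True if the flip was accepted."""
    n = len(spins)
    i, j = rng.randrange(n), rng.randrange(n)
    d_e = delta_energy(spins, i, j)
    # Metropolis criterion: always accept if the energy does not increase,
    # otherwise accept with probability exp(-dE / T) (k_B = 1 assumed).
    if d_e <= 0.0 or rng.random() < math.exp(-d_e / temperature):
        spins[i][j] = -spins[i][j]
        return True
    return False
```

Calling `metropolis_step(spins, 2.0, random.Random(1))` repeatedly on a list of lists of +1/-1 values performs the successive spin flips named on the slide.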

Page 5: The 2D Ising Model on GPU Clusters

Parallelization of Metropolis Updates

Idea: Update non-interacting domains in parallel

Checkerboard Update
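
The checkerboard idea can be sketched as follows: colour the sites like a chessboard, so that every neighbour of a "black" site is "white" and vice versa. All sites of one colour can then be updated independently of each other. The sketch below runs the two half-sweeps sequentially on the CPU, only to illustrate the ordering; it assumes an even lattice size, periodic boundaries, J = 1 and k_B = 1, and the function name is hypothetical.

```python
import math
import random


def checkerboard_sweep(spins, temperature, rng):
    """One full lattice update: first all sites of colour 0, then colour 1."""
    n = len(spins)  # assumed even, so the colouring is consistent across the boundary
    for colour in (0, 1):
        # Sites of the same colour never neighbour each other, so on a GPU
        # the two inner loops could be executed by parallel threads.
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                neighbours = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                              + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                d_e = 2.0 * spins[i][j] * neighbours
                if d_e <= 0.0 or rng.random() < math.exp(-d_e / temperature):
                    spins[i][j] = -spins[i][j]
```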

Page 6: The 2D Ising Model on GPU Clusters

Programming the GPU

Execute the same code for different data in parallel

Utilize different kinds of memory

Slow global memory: store spin lattice

Fast shared memory: use for local computations

Page 7: The 2D Ising Model on GPU Clusters

Reduce slow memory access

Slow global memory

Fast shared memory

Idea: Store 4x4 spin blocks in 1 unit of GPU memory

For each parallel thread:

• Access 16 spins with one memory lookup

• Perform local computations in (fast) shared memory
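
As an illustration of storing a 4x4 block in one memory unit, the sketch below packs 16 spins into a single integer, one bit per spin, so that one read fetches the whole block. The bit layout (bit 4*row + col, 1 = spin up) and the function names are assumptions made for this example; the slide does not specify them.

```python
def pack_block(block):
    """Pack a 4x4 block of spins (+1 / -1) into one 16-bit integer."""
    word = 0
    for row in range(4):
        for col in range(4):
            if block[row][col] == 1:          # spin up -> bit set
                word |= 1 << (4 * row + col)
    return word


def unpack_block(word):
    """Recover the 4x4 block of +1 / -1 spins from a packed integer."""
    return [[1 if (word >> (4 * row + col)) & 1 else -1 for col in range(4)]
            for row in range(4)]
```

A thread would load one such word from global memory, unpack it into shared memory, and work on the 16 spins there.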

Page 8: The 2D Ising Model on GPU Clusters

Update scheme in shared memory

Integer array in shared memory

Perform computations (draw random number, evaluate Metropolis criterion)

Old spins XOR update pattern = new spins
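
The XOR step can be illustrated with bit-packed spins: if each spin is one bit of an integer, a pattern with a 1 at every accepted flip turns the old word into the new one in a single operation. This is a minimal sketch under that assumption; the function names are hypothetical.

```python
def build_flip_pattern(accepted_bits):
    """Build an update pattern with a 1 at every bit position whose flip was accepted."""
    pattern = 0
    for bit in accepted_bits:
        pattern |= 1 << bit
    return pattern


def apply_update(old_word, flip_pattern):
    """New spins = old spins XOR update pattern."""
    return old_word ^ flip_pattern
```

Because XOR with 1 toggles a bit and XOR with 0 leaves it unchanged, spins whose flip was rejected are untouched.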

Page 9: The 2D Ising Model on GPU Clusters

Performance measurement

How to measure performance? Single spin flips per time unit!

Fair comparison: heavily optimized CPU implementation

Chart: CPU previous, CPU optimized, GPU previous, GPU optimized; annotated speedups: ~ 20x and ~ 200x

Page 10: The 2D Ising Model on GPU Clusters

Multi GPU communication

Distribute spin lattice among many GPUs

Border information has to be passed between GPUs after each complete update step
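
The border exchange can be sketched on the CPU as follows: the lattice is cut into horizontal strips, one per GPU, and after each update step every strip receives copies of its neighbours' border rows. The strip layout, the periodic neighbour ordering and the function names are assumptions for this illustration; plain list copies stand in for the actual transfer between GPUs, which the slide does not describe.

```python
def split_rows(lattice, n_parts):
    """Split a lattice into n_parts horizontal strips of equal height."""
    rows = len(lattice) // n_parts  # assumes len(lattice) is divisible by n_parts
    return [[row[:] for row in lattice[p * rows:(p + 1) * rows]]
            for p in range(n_parts)]


def exchange_borders(strips):
    """For each strip, return (upper, lower) border rows copied from its neighbours."""
    n = len(strips)
    return [(strips[(p - 1) % n][-1][:],   # last row of the strip above
             strips[(p + 1) % n][0][:])    # first row of the strip below
            for p in range(n)]
```

Each strip needs only two rows from its neighbours per update step, which is why the communication cost grows with the lattice edge while the compute cost grows with the lattice area.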

Page 11: The 2D Ising Model on GPU Clusters

Multi-GPU Performance

Measure: Single spin flips per GPU

Communication overhead: bottleneck for small system sizes

Page 12: The 2D Ising Model on GPU Clusters

Simulation on GPU Clusters

• On 64 GPUs: 256 GB video memory!

• A lattice of 800,000 x 800,000 spins could be processed.

• Processing the whole lattice on 64 GPUs: 3 seconds!
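
The figures above can be cross-checked with a short calculation. This sketch assumes that 1 GB means 10^9 bytes and that "processing the whole lattice" refers to one update attempt per spin; both are interpretations, not statements from the slide.

```python
side = 800_000                      # lattice edge length from the slide
n_gpus = 64
total_memory_bytes = 256 * 10**9    # assumption: 1 GB = 10^9 bytes
sweep_seconds = 3.0                 # assumption: time for one update attempt per spin

n_spins = side * side
memory_per_gpu_gb = 256 / n_gpus
bits_available_per_spin = 8 * total_memory_bytes / n_spins
flips_per_second = n_spins / sweep_seconds
flips_per_gpu_per_ns = flips_per_second / n_gpus / 1e9

print(f"spins: {n_spins:.2e}")
print(f"memory per GPU: {memory_per_gpu_gb} GB")
print(f"bits available per spin: {bits_available_per_spin:.1f}")
print(f"spin flips per GPU per ns: {flips_per_gpu_per_ns:.2f}")
```

Under these assumptions the lattice holds 6.4 x 10^11 spins, each GPU contributes 4 GB, about 3.2 bits of memory are available per spin, and each GPU handles roughly 3.3 spin flips per nanosecond.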

Tesla S1070 unit

At the NEC Nehalem Cluster, Stuttgart: 128 GPUs

Page 13: The 2D Ising Model on GPU Clusters

Conclusion

• Optimization of both the CPU and the GPU code is important for a fair comparison

• The 2D Ising model is a good candidate for parallel processing on GPU clusters

• Submitted for publication in Computer Physics Communications

• Source code will be made available at www.tobiaspreis.de