GPU Performance Prediction

GreenLight Education & Outreach Summer Workshop, UCSD. La Jolla, California. July 1 – 2, 2009.

Javier Delgado, Gabriel Gazolla, Constantinos Menelaou, Lixi Wang, Mark Joselli

Outline

Motivation
Role in Energy Efficiency
Performance Modeling
GPU Programming for Weather Modeling
GPU Programming for BLAST
Model Testing
Conclusion

Benefits

GPU Performance Improvement Over Time

Source: nVidia.com

Sample Speedups

Source: nVidia.com

Role in Energy Efficiency

Idle GPU = wasted energy
Maximally-loaded GPU = a lot of power consumption
For example:
– Nvidia 8800 GTX consumes 137 W @ max load
– Intel Xeon LS5400 consumes 50 W @ max load
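
The trade-off behind these two figures can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each device draws its quoted max-load power for the whole run, and it ignores the host CPU that a GPU still needs as well as any idle draw. Under those assumptions, energy is power times runtime, so the GPU only saves energy if it speeds the job up by more than the ratio of the two wattages. The function names and the example speedups are hypothetical.

```python
# Energy = power x time. The wattages are the max-load figures quoted on the slide.
GPU_WATTS = 137.0  # Nvidia 8800 GTX at max load
CPU_WATTS = 50.0   # Intel Xeon LS5400 at max load


def energy_joules(power_watts: float, runtime_s: float) -> float:
    """Energy used by a device drawing constant power for runtime_s seconds."""
    return power_watts * runtime_s


def break_even_speedup(gpu_watts: float, cpu_watts: float) -> float:
    """Speedup above which the GPU run uses less energy than the CPU run."""
    return gpu_watts / cpu_watts


def gpu_saves_energy(cpu_runtime_s: float, speedup: float) -> bool:
    """True if a job sped up by `speedup` on the GPU uses less energy there."""
    gpu_runtime_s = cpu_runtime_s / speedup
    gpu_energy = energy_joules(GPU_WATTS, gpu_runtime_s)
    cpu_energy = energy_joules(CPU_WATTS, cpu_runtime_s)
    return gpu_energy < cpu_energy


if __name__ == "__main__":
    ratio = break_even_speedup(GPU_WATTS, CPU_WATTS)
    print(f"break-even speedup: {ratio:.2f}")
    for speedup in (2.0, 3.0, 10.0):  # hypothetical speedups, not measurements
        print(speedup, gpu_saves_energy(3600.0, speedup))
```

With these two wattages the break-even speedup is 137 / 50 = 2.74, so a port that is only twice as fast would cost more energy than the CPU run under this simplified model.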

Source: http://mark.zoomcities.com/images/gfx/GFXpowerchartby3d.png (which is derived from data from http://www.xbitlabs.com)

Power Consumption

http://www.xbitlabs.com/articles/video/display/gf8800gts320MB-roundup_8.html#sect0

http://www.xbitlabs.com/articles/video/display/xfx-gf-gtx285-gtx295_16.html

GPU Role in Energy Efficiency

But...

Source: John Michalakes and Manish Vachharajani

• And ...

Outline

Motivation
Role in Energy Efficiency
Hurricane Mitigation Overview
Performance Modeling
GPU Programming for BLAST
Model Testing
Conclusion

Motivation

Hurricanes cause financial and personal damage in coastal regions

Damage can be mitigated, but

Impact area prediction is inaccurate

Simulation using commodity computers is not precise

Alarming Statistics

40% of (small-medium sized) companies shut down within 36 months, if forced closed for 3 or more days after a hurricane

Local communities lose jobs and hundreds of millions of dollars to their economy

If 5% of businesses in South Florida recover one week earlier, then we can prevent $219,300,000 in non-property economic losses

Hurricane Andrew, Florida, 1992; Katrina, New Orleans, 2005; Ike, Cuba, 2008

Motivation for application profiling and performance prediction

Optimal usage of grid resources through “smarter” meta-scheduling
Many users overestimate job requirements
Reduced idle time for compute resources
Save utility and energy costs
Optimal resource selection for most expedient job return time
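
As a rough illustration of how predicted execution times could feed a meta-scheduler, the sketch below picks the resource with the earliest estimated job return time. It is not the scheduler used in this project: the `Resource` fields, the resource names, and the idea that return time equals queue wait plus predicted runtime are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Resource:
    name: str               # hypothetical resource label
    queue_wait_s: float     # estimated wait before the job starts
    predicted_run_s: float  # predicted execution time on this resource


def return_time(resource: Resource) -> float:
    """Estimated time until results come back: wait plus predicted runtime."""
    return resource.queue_wait_s + resource.predicted_run_s


def pick_resource(resources: Sequence[Resource]) -> Resource:
    """Choose the resource with the earliest estimated return time."""
    if not resources:
        raise ValueError("no resources to choose from")
    return min(resources, key=return_time)


if __name__ == "__main__":
    # Made-up numbers: the faster cluster loses because of its long queue.
    candidates = [
        Resource("big_busy_cluster", queue_wait_s=7200.0, predicted_run_s=600.0),
        Resource("small_idle_cluster", queue_wait_s=0.0, predicted_run_s=3000.0),
    ]
    best = pick_resource(candidates)
    print(best.name, return_time(best))
```

The point of the example is that the selection is only as good as `predicted_run_s`, which is why an accurate performance model matters more than a user's own (often overestimated) runtime request.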

Process

Typical Results on Large Clusters

Input: Marenostrum
– 8, 16, and 32 nodes
– 1 process per node

Output: Marenostrum
– 8, 16, 32, 64, 96, and 128 nodes

[Figure: actual vs. predicted execution time (s) plotted against number of nodes]
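
To give a feel for what "profile on a few small runs, predict the larger ones" can look like, here is a minimal sketch. The slides do not state the form of the model, so the curve T(n) = a + b / n (a fixed serial part plus a part that shrinks with node count) is purely an assumption, and the timings are synthetic values generated from a known curve, not measurements from Marenostrum.

```python
def fit_scaling_model(nodes, times):
    """Least-squares fit of T(n) = a + b / n. Returns (a, b)."""
    xs = [1.0 / n for n in nodes]  # the model is linear in x = 1/n
    count = len(xs)
    mean_x = sum(xs) / count
    mean_t = sum(times) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    b = sxt / sxx
    a = mean_t - b * mean_x
    return a, b


def predict(a, b, n):
    """Predicted execution time in seconds on n nodes."""
    return a + b / n


if __name__ == "__main__":
    # Synthetic "profiling" data from a known curve, NOT results from the talk.
    true_a, true_b = 100.0, 8000.0
    train_nodes = [8, 16, 32]
    train_times = [true_a + true_b / n for n in train_nodes]

    a, b = fit_scaling_model(train_nodes, train_times)
    for n in (8, 16, 32, 64, 96, 128):
        print(n, round(predict(a, b, n), 1))
```

Real applications rarely follow such a clean curve, since communication cost usually grows with node count, so a production model would need additional terms.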

Future Modeling Plans

Model execution time with different GPU configurations

Current GPU project objective: learn how to model GPU performance by porting WRF kernels to CUDA
– Test with different cards
– Test with different processor configurations
– Test with different numbers of nodes

Overview of GPU Benchmarking Project

Understand source code of existing CUDA-ported code

Understand old source code (Fortran)

Learn CUDA

Port another module

Benchmark

Learn WRF

Learn WRF

Learn CUDA

Learn Fortran

Status

Code has been compiled and executed
Regions of similarity are being identified
– Fortran Program: 1729 lines
– CUDA (C) Program: 1329 lines (incl init)
Currently figuring out necessary code logic of existing ported kernel
Preliminary documentation/report of findings

Purpose

BLAST is used extensively for sequence analysis
Provides a different kind of application for testing GPU performance improvements
Further improve our GPU programming and performance modeling knowledge

Status

Literature review of other GPU-based sequence analysis work

Learning how BLAST works

Long-running, Fault-tolerant Weather Prediction

Slight inaccuracies in the initial conditions of the domain can cause significant inaccuracies later

Third component of this project: account for this using perturbation analysis

The effects of perturbation on runtime must also be modeled
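
The sensitivity to initial conditions mentioned above can be demonstrated with a toy system. The sketch below uses the logistic map, a standard textbook example of chaotic behavior, as a stand-in. It is not a weather model and not the perturbation analysis planned for this project; the starting value, perturbation size, and step count are arbitrary choices for illustration.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), returning the start value plus `steps` iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, delta, steps):
    """Absolute gap at each step between a run and a slightly perturbed run."""
    base = logistic_trajectory(x0, steps)
    perturbed = logistic_trajectory(x0 + delta, steps)
    return [abs(p - q) for p, q in zip(base, perturbed)]


if __name__ == "__main__":
    # Two runs whose initial conditions differ by one part in a billion.
    gaps = divergence(x0=0.2, delta=1e-9, steps=60)
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(step, gaps[step])
```

Running an ensemble of such perturbed starts is one common way to account for this uncertainty, and each extra ensemble member adds runtime, which is why the cost of perturbation has to be part of the performance model.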

Conclusion

GPUs promise much faster job execution for different applications

In order to maximize resource utilization, application execution time should be predictable
– Especially for time-critical applications that take long to execute

Thank You

Questions?
