ou supercomputing center for education & research henry neeman, director ou supercomputing...

39
OU Supercomputing OU Supercomputing Center for Center for Education & Research Education & Research Henry Neeman, Director OU Supercomputing Center for Education & Research OU Information Technology University of Oklahoma

Upload: amie-mcbride

Post on 11-Jan-2016

227 views

Category:

Documents


1 download

TRANSCRIPT

OU Supercomputing OU Supercomputing Center forCenter for

Education & ResearchEducation & ResearchHenry Neeman, Director

OU Supercomputing Center for Education & ResearchOU Information Technology

University of Oklahoma

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 2

People

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 3

Things

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 4

What is Supercomputing?Supercomputing is the biggest, fastest computing

right this minute.Likewise, a supercomputer is one of the biggest,

fastest computers right this minute.So, the definition of supercomputing is constantly

changing.Rule of Thumb: a supercomputer is typically at

least 100 times as powerful as a PC.Jargon: supercomputing is also called High

Performance Computing (HPC).

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 5

What is Supercomputing About?

Size Speed

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 6

What is Supercomputing About? Size: many problems that are interesting to scientists

and engineers can’t fit on a PC – usually because they need more than a few GB of RAM, or more than a few 100 GB of disk.

Speed: many problems that are interesting to scientists and engineers would take a very very long time to run on a PC: months or even years. But a problem that would take a month on a PC might take only a few hours on a supercomputer.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 7

What is HPC Used For? Simulation of physical phenomena, such as

Weather forecasting Galaxy formation Oil reservoir management

Data mining: finding needles of information in a haystack of data, such as

Gene sequencing Signal processing Detecting storms that could produce tornados

Visualization: turning a vast sea of data into pictures that a scientist can understand

Moore, OKTornadic

Storm

May 3 1999[2]

[3]

[1]

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 8

What is OSCER? Multidisciplinary center Division of OU Information Technology Provides:

Supercomputing education Supercomputing expertise Supercomputing resources: hardware, storage, software

For: Undergrad students Grad students Staff Faculty Their collaborators (including off campus)

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 9

Who is OSCER? Academic Depts Aerospace & Mechanical Engr Biochemistry & Molecular Biology Biological Survey Botany & Microbiology Chemical, Biological & Materials Engr Chemistry & Biochemistry Civil Engr & Environmental Science Computer Science Economics Electrical & Computer Engr Finance History of Science

Industrial Engr Geography Geology & Geophysics Library & Information Studies Mathematics Meteorology Petroleum & Geological Engr Physics & Astronomy Radiological Sciences Surgery Zoology

More than 150 faculty & staff in 23 depts in Colleges of Arts & Sciences, Business, Engineering, Geosciences and Medicine – with more to come!

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 10

Who is OSCER? Organizations Advanced Center for Genome

Technology Center for Analysis & Prediction of

Storms Center for Aircraft & Systems/Support

Infrastructure Cooperative Institute for Mesoscale

Meteorological Studies Center for Engineering Optimization OU Information Technology Fears Structural Engineering

Laboratory Geosciences Computing Network Great Plains Network Human Technology Interaction Center Institute of Exploration &

Development Geosciences

Instructional Development Program Laboratory for Robotic Intelligence and

Machine Learning Langston University Mathematics

Dept Microarray Core Facility National Severe Storms Laboratory NOAA Storm Prediction Center OU Office of the VP for Research Oklahoma Climatological Survey Oklahoma EPSCoR Oklahoma Medical Research

Foundation Oklahoma School of Science & Math St. Gregory’s University Physics Dept Sarkeys Energy Center Sasaki Applied Meteorology Research

Institute

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 11

Center for Analysis & Prediction of Storms: daily real time weather forecasting

Oklahoma Center for High Energy Physics: particle physics simulation and data analysis using Grid computing

Biggest Consumers

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 12

Who Are the Users?Over 225 users so far: over 50 OU faculty over 50 OU staff over 100 students about 20 off campus users … more being added every month.Comparison: National Center for Supercomputing

Applications (NCSA), after 20 years of history and hundreds of millions in expenditures, has about 2100 users.*

* Unique usernames on cu.ncsa.uiuc.edu and tungsten.ncsa.uiuc.edu

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 13

What Does OSCER Do? Teaching

Science and engineering faculty from all over America learnsupercomputing at OU by playing with a jigsaw puzzle (NCSI @ OU 2004).

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 14

What Does OSCER Do? Rounds

OU undergrads, grad students, staff and faculty learnhow to use supercomputing in their specific research.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 15

Current OSCER Hardware Aspen Systems Pentium4 Xeon 32-bit Linux Cluster

270 Pentium4 Xeon CPUs, 270 GB RAM, 1.08 TFLOPs Aspen Systems Itanium2 cluster

66 Itanium2 CPUs, 132 GB RAM, 264 GFLOPs IBM Regatta p690 Symmetric Multiprocessor

32 POWER4 CPUs, 32 GB RAM, 140.8 GFLOPs IBM FAStT500 FiberChannel-1 Disk Server Qualstar TLS-412300 Tape Library

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 16

Coming OSCER Hardware (2005) NEW! Dell Pentium4 Xeon 64-bit Linux Cluster

1024 Pentium4 Xeon CPUs, 2240 GB RAM, 6.55 TFLOPs Aspen Systems Itanium2 cluster

66 Itanium2 CPUs, 132 GB RAM, 264 GFLOPs COMING! 2 x 16-way Opteron Cluster

16 AMD Opteron CPUs, 96 GB RAM, 128 GFLOPs COMING! Condor Pool: 750 student lab PCs COMING! National Lambda Rail Qualstar TLS-412300 Tape Library

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 17

Hardware: IBM p690 Regatta32 POWER4 CPUs (1.1 GHz)

32 GB RAM

218 GB internal disk

OS: AIX 5.1

Peak speed: 140.8 GFLOP/s*

Programming model: shared memory multithreading (OpenMP) (also supports MPI)*GFLOP/s: billion floating point

operations per second

sooner.oscer.ou.edusooner.oscer.ou.edu

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 18

270 Pentium4 XeonDP CPUs270 GB RAM~10,000 GB diskOS: Red Hat Linux Enterprise 3Peak speed: 1.08 TFLOP/s*

Programming model: distributed multiprocessing (MPI)*TFLOP/s: trillion floating point

operations per second

Hardware: Pentium4 Xeon Cluster

boomer.oscer.ou.eduboomer.oscer.ou.edu

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 19

56 Itanium2 1.0 GHz CPUs112 GB RAM5,774 GB diskOS: Red Hat Linux Enterprise 3Peak speed: 224 GFLOP/s*

Programming model: distributed multiprocessing (MPI)*GFLOP/s: billion floating point

operations per second

Hardware: Itanium2 Cluster

schooner.oscer.ou.eduschooner.oscer.ou.edu

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 20

1,024 Pentium4 Xeon CPUs2,240 GB RAM20,000 GB diskInfiniband & Gigabit EthernetOS: Red Hat Linux Enterp 3Peak speed: 6.5 TFLOPs*

Programming model: distributed multiprocessing (MPI)*TFLOPs: trillion calculations per sec

New! Pentium4 Xeon Cluster

topdawg.oscer.ou.edutopdawg.oscer.ou.eduDEBUTED AT #54 WORLDWIDE, #9 AMONG US UNIVERSITIES, #4 EXCLUDING BIG 3 NSF CENTERSwww.top500.org

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 21

Coming! National Lambda RailThe National Lambda Rail (NLR) is the next

generation of high performance networking.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 22

Coming! Condor Pool

Condor is a software package that allows number crunching jobs to run on idle desktop PCs.

OU IT is deploying a large Condor pool (750 desktop PCs) over the course of the Spring 2005.

When deployed, it’ll provide a huge amount of additional computing power – more than is currently available in all of OSCER today.

And, the cost is very very low.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 23

What is Condor?

Condor is grid computing technology: it steals compute cycles from existing desktop PCs; it runs in background when no one is logged in.Condor is like SETI@home, but better: it’s general purpose and can work for any

“loosely coupled” application; it can do all of its I/O over the network, not using

the desktop PC’s disk; it can use academic research community’s Grid

middleware such as Globus, but it doesn’t have to.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 24

Supercomputing at Night

Desktop PCs tend to be very active during the workday.

But at night, during most of the year, they’re idle.

So they can be used for number crunching.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 25

Condor

Condor is like SETI@home: it steals compute cycles from existing desktop PCs; it runs in background when no one is sitting at the

desk.Condor is better than SETI@home: it’s general purpose and can work for any

loosely coupled application; it can do all of its I/O over the network, not using

the desktop PC’s local disk.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 26

How is Condor Helpful? Low cost: either

nothing (if you have many Linux PCs), OR very little (if you have many Windows PCs).

Repurpose idle time on existing desktop PCs – you’ve already paid for the hardware, and most of the software is FREE!

Enable research that involves lots of computing. Share multiple independent Condor pools among

various institutions (flocking). It provides a quick Grid computing resource, to

get regional campuses used to Grids.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 27

Windows vs. Linux

Condor runs best on Unix flavors (e.g., Linux). The Windows version is clipped: it doesn’t support

automatic checkpointing or automatic job migration. However, desktop PC users demand a pure

Windows experience. So, to use Condor in OU IT student PC labs, we

need Windows and Linux at the same time. Solution: VMware

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 28

VMware

VMware is virtual machine software that allows a host OS and one or more guest OSes.

VMware is commercial software, not free. For Condor, OU runs Linux as the host OS and

Windows as the guest OS. This way, Condor can run directly on Linux and

therefore have the maximal set of features, but desktop users can have a pure Windows experience.

The lab PCs are configured to boot directly to the Windows login page, so desktop users never know.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 29

Linux/VMware/Windows/Condor

LinuxVMwareCondor

Windows

Desktop Applications

Science Applications

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 30

But I Don’t Want to Manage Linux!

WindowsVMware

Desktop Applications

Condor

Science Applications

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 31

Condor Setup at OU

LabPCs

Condor Users

Mgmt Nodes

FirewallDevice

Login Nodes

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 32

Current Status at OU

Pool of test machines in dorm PC lab Submit/management from Neeman’s desktop PC Rollout to multiple labs during summer Total rollout to 750 PCs by end of 2005

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 33

What Does OU Plan to Do w/Condor?

Loosely coupled problems: many small, independent jobs

High Energy Physics: D-Zero experiment Nanotechnology: Monte Carlo simulation Computational Chemistry: molecular dynamics Aerospace Engineering: parameter space searches ... and many others.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 34

NSF CI-TEAM Program

The NSF Cyberinfrastructure TEAM program is a brand new program.

It is providing grants of up to $250,000 for up to 2 years.

One of CI-TEAM’s goals is to expand Cyberinfrastructure – for example, supercomputing – to institutions and people that traditionally haven’t had much access.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 35

Our NSF CI-TEAM Project

The University of Oklahoma (OU) is leading an NSF CI-TEAM project, submitted May 27 2005.

The focus is on setting up Condor pools across the Great Plains region, and beyond.

The kickstart application will be BLAST (bioinformatics), but these Condor pools will be available for any appropriate application.

Most of the money in OU’s CI-TEAM proposal will go to institutions other than OU, for VMware licenses and PCs to manage the Condor pools.

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 36

CI-TEAM Participants So Far At OU

OSCER/IT Arts & Sciences: Botany & Microbiology; Chemistry & Biochemistry;

Mathematics; Physics & Astronomy; Zoology Engineering: Aerospace & Mechanical Engineering; Civil Engineering &

Environmental Science; Chemical, Biological & Materials Engineering; Computer Science; Electrical & Computer Engineering, Industrial Engineering

Medicine: Surgery, Radiological Sciences Other Academic Institutions in Oklahoma: Langston U. (minority

serving), Oklahoma Baptist U. (4 year), Oklahoma School of Science & Mathematics (high school), St. Gregory’s U. (4 year), U. Central Oklahoma (Masters-granting)

Academic Institutions outside Oklahoma: Contra Costa College of CA (2 year), Kansas State U. (PhD), U. Arkansas Fayetteville (PhD), U. Arkansas Little Rock (PhD), U. Kansas (PhD), U. Nebraska (PhD), U. Northern Iowa (Masters)

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 37

An Added Bonus

OSCER’s user eligibility policy is that anyone can have access to OSCER’s supercomputers, if they are on a project that has an OU faculty or staff member as the Principal or Co-Principal Investigator.

So, as a bonus, everyone at participating institutions who is on the CI-TEAM project – and their students – will get FREE access to OSCER’s supercomputers!

OU Supercomputing Center for Education & ResearchOneNet CIO Meeting Thursday August 11 2005 38

Join our CI-TEAM!

If OU’s CI-TEAM proposal is fully funded, it will create one of largest Condor flocks in the world: almost 4,000 PCs.

YOU CAN JOIN US! All you need is to commit some PCs – you create

your own Condor pool with OU’s help, and then let the rest of the CI-TEAM institutions use them.

CI-TEAM will pay for the VMware licenses, and for small schools also a PC as Condor manager.

QUESTIONS?

Thank youfor your attention.