TRANSCRIPT

Page 1:

UNIVERSITY OF CALIFORNIA, SAN DIEGO

SAN DIEGO SUPERCOMPUTER CENTER

Fran Berman

September 11, 2006

Dr. Francine Berman

Director, San Diego Supercomputer Center

Professor and High Performance Computing Endowed Chair, UC San Diego

Beyond Branscomb

Page 2:

The Branscomb Committee

• Charge: The Branscomb Committee was to assess the role of HPC for NSF constituent communities. The Committee focused in particular on four challenges.

• Challenge 1: How can NSF remove existing barriers to the evolution of HPC and make it broadly usable?

• Challenge 2: How can NSF provide scalable access to a pyramid of computing resources? What balance of computational resources should NSF anticipate and encourage?

• Challenge 3: How should NSF encourage broad participation in HPC?

• Challenge 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What role should NSF play with respect to the HPCC program and other agencies?

The Branscomb Report

TITLE: From Desktop to TeraFlop: Exploiting the U.S. Lead in High Performance Computing

AUTHORS: NSF Blue Ribbon Panel on High Performance Computing (Branscomb, Belytschko, Bridenbaugh, Chay, Dozier, Grest, Hays, Honig, Lane, Lester, McCrae, Sethian, Smith, Vernon)

DATE: August, 1993

Page 3:

The Branscomb Pyramid

• Major Recommendations from the Branscomb Report

• NSF should make investments at all levels of the Branscomb Pyramid as well as investments in aggregating technologies (today’s cluster and grid computing). NSF should make balanced investments.

• Increase support of HPC-oriented software, algorithm, and model development

• Coordinate and continue to invest in Centers. Develop allocation committees to facilitate use of resources in community.

• Develop an OSTP advisory committee representing states, HPC users, NSF Centers, computer manufacturers, computer and computational scientists to facilitate state-federal planning for HPC.

Page 4:

The Branscomb Pyramid, circa 1993

Page 5:

The Branscomb Pyramid, circa 2006

• Leadership Class – 100's of TFs

• Large-scale campus/commercial resources, Center supercomputers – 10's of TFs

• Medium-scale Campus/Commercial Clusters – 1's of TFs

• Small-scale, desktop, home – 10's of GFs

Page 6:

The Branscomb Pyramid and U.S. Competitiveness

• Leadership Class – Spots 1-10

• Large-scale resources, center supercomputers – Spots 11-50

• Medium-scale Campus/Commercial Clusters – Spots 51-500

• Small-scale, desktop, home – Everyone Else

According to the most recent Top500 list (June 2006):

• Leadership Class (1-10) – 6 US machines

• 5 machines (1, 3, 4, 6, 9) at national laboratories (LLNL, NASA Ames, Sandia) and 1 machine (2) at a U.S. corporation

• Large-scale (11-50) – 19 US machines

• 3 machines (23, 26, 28) at U.S. academic institutions (IU, USC, Virginia Tech)

• 2 machines (37, 44) at NSF centers (NCSA, SDSC)

• 5 machines (13, 14, 24, 25, 50) at DOE national laboratories (ORNL, LLNL, LANL, PNNL)

• 4 machines (20, 32, 33, 36) at other federal facilities (ERDC MSRC, Wright-Patterson, ARL, NAVOCEANO)

• 5 machines (19, 21, 31, 39, 41) at US corporations (IBM, Geoscience, COLSA)

• Medium-scale (51-500) – 273 US machines

• 38 are in the academic sector

Page 7:

Who is Computing on the Branscomb Pyramid?

• Leadership Class (1-10)

• DOE users, industry researchers, Japanese academics and researchers, German and French researchers

• Large-scale (11-50) (5 academic)

• Campus researchers, DOE and government users, industry users

• National open academic community at SDSC, NCSA, IU (around 50 TF in aggregate)

• Medium-scale (51-500) (38 academic)

• Campus researchers, federal agency users, industry users

• National open academic community on TeraGrid (not including above -- around 50 TF in aggregate)

[Pyramid figure: Leadership Class (Spots 1-10, 100's of TFs); Large-scale resources, center supercomputers (Spots 11-50, 10's of TFs); Medium-scale Campus/Commercial Clusters (Spots 51-500, 1's of TFs); Small-scale, desktop, home (Everyone Else, 10's of GFs)]

More than 15,000,000 students attend college.

The number of degrees in Science and Engineering exceeds 500,000.

There are ~2500 accredited institutions of higher education in the U.S.*

* Ballpark numbers

Page 8:

Competitiveness at all Levels

[Pyramid figure, annotated by level:]

• Leadership Class: Currently U.S. dominating – Top500 "bragging rights". Federal support required. Potential for breakthrough "pioneer" computational science discoveries.

• Mid-levels (large-scale resources and center supercomputers; medium-scale campus/commercial clusters): The focus of almost all academic and commercial R&D – the lion's share of new results and discoveries. No coordinated approach to national research infrastructure; wide variability in coverage, use, service, and support.

• Small-scale, desktop, home: Cost-effective, user-supported commercial model. IT-literate workforce.

Page 9:

Balancing Investments in Branscomb

• If HPC is to become the ubiquitous enabler of science and engineering envisioned in the Branscomb Report (and every report since), we need to re-focus on providing:

• Enough cycles to cover the broad needs of academic researchers and educators, on demand and without high barriers to access

• Usable and scalable software tools with useful documentation

• "You've got 1024 processors and you can only smile and wave at them" – HPC user

• Professional-class strategy for SW sharing, standards, and development environments

Branscomb Recommendations Revisited

NSF should make investments at all levels of the Branscomb Pyramid as well as investments in aggregating technologies (today’s cluster and grid computing).

NSF should make balanced investments.

Increase support of HPC-oriented software, algorithm, and model development

Coordinate and continue to invest in Centers.

Develop allocation committees to facilitate use of resources in community.

Develop an OSTP advisory committee representing states, HPC users, NSF Centers, computer manufacturers, computer and computational scientists to facilitate state-federal planning for HPC.

Page 10:

Fran’s “No User Left Behind” Initiative

“No User Left Behind” Goal: Sufficient and usable computational resources to support computationally-oriented research and education throughout the U.S. academic community

How? (Fran's 5-step program for computational health)

1. Do market research – what is adequate coverage for the university community? Where are the gaps in coverage in the US?

2. Get creative -- Work with the private sector and universities to develop a program for adequate coverage of computational cycles (we’re doing it with networking to K-12, no reason we can’t do it with computation for 12+)

3. Fund support professionals – every facility should have sys admins and help desk people – they should be part of a national organization which meets to exchange best practices and helps develop standards

4. Raise the bar on SW – the private sector should step up and work with academia to improve HPC environments. Professors and grad students cannot, on their own, provide robust SW tools with adequate documentation and evolutionary support

5. Get serious about data – many HPC applications involve significant data input or output – HPC efforts and data efforts must be coupled

Page 11:

On the Horizon: The Emerging Data Crisis Will Increasingly Impact Computational Users

• More academic, professional, public, and private users use their computers to access data than for computation

• Data management, stewardship and preservation fundamental for new advances and discovery

• Astronomy: NVO – 100+ TB

• Physics: Projected LHC data – 10 PB/year

• Geosciences: SCEC – 153 TB

• Life Sciences: JCSG/SLAC – 15.7 TB

Page 12:

Today’s Applications Cover the Spectrum

[Figure: applications plotted along two axes – Compute (more FLOPS) and Data (more BYTES) – in three overlapping groups: Home, Lab, Campus, Desktop Applications (e.g., Everquest, Quicken); Medium, Large, and Leadership HPC Applications (e.g., Molecular Modeling); and Data-oriented Science and Engineering Applications (e.g., PDB applications, TeraShake, NVO)]

• Large-scale data is required as input, intermediate, or output for many modern HPC applications

• Applications vary with respect to how well they can perform in distributed mode (grid computing)

The analogue of High Performance Computing (HPC) is High Reliability Data (HRD).

Page 13:

Applying Branscomb to Data: The Data Pyramid

Facilities

• National scale: National-scale data repositories, archives, and libraries. Maintained by professionals. High capacity, high reliability.

• Regional scale: Regional libraries and targeted data centers. Maintained by professionals. Medium capacity, medium-high reliability.

• Local scale: Private repositories. Supported by users or their proxies. Low capacity, low-medium reliability.

Target Collections

• National scale: Reference, nationally important, and irreplaceable data collections (PDB, PSID, Shoah, Presidential Libraries, etc.)

• Regional scale: Research and project data collections.

• Local scale: Personal data collections.

Page 14:

Adapting to a Digital World

[Figure: the Data Pyramid, with emerging commercial opportunities at the local scale]

Page 15:

Data Storage for Rent

• Cheap commercial data storage is moving us from a "Napster model" (data is accessible and free) to an "iTunes model" (data is accessible and inexpensive)

Page 16:

Amazon S3 (Simple Storage Service)

• Storage for Rent:

• Storage is $0.15 per GB per month

• $0.20 per GB data transfer (to and from)

• Write, read, and delete objects containing 1 GB-5 GB (the number of objects is unlimited); access is controlled by the user

• For $2.00+ ($0.15 × 12 months of storage plus a $0.20 ingest comes to $2.00 per GB-year), you can store for one year:

• Lots of high-resolution family photos

• Multiple videos of your children's recitals

• Personal documentation equivalent to up to 1000 novels, etc.

Should we store the NVO with Amazon S3?

The National Virtual Observatory (NVO) is a critical reference collection for the astronomy community of data from the world's large telescopes and sky surveys.

Page 17:

A Thought Experiment

• What would it cost to store the SDSC NVO collection (100 TB) on Amazon?

• 100,000 GB × $2 (1 ingest, no accesses, plus storage for a year) = $200K/year

• 100,000 GB × $3 (1 ingest, an average of 5 accesses per GB stored, plus storage for a year) = $300K/year

• Not clear:

• How many copies Amazon stores

• Whether the format is well-suited for NVO

• Whether the usage model would make the costs of data transfer, ingest, access, etc. infeasible

• Whether Amazon constitutes a "trusted repository"

• What happens to your data when you stop paying

• What about the CERN LHC collection (10 PB/year)?

• 10,000,000 GB × $2 (1 ingest, no accesses, plus storage for a year) = $20M/year
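These estimates are easy to reproduce. A minimal sketch in Python, assuming only the 2006 list prices quoted above (the function and variable names are illustrative, not any real API):

```python
# Back-of-the-envelope cost model for the thought experiment above,
# using the 2006 Amazon S3 list prices quoted earlier in the talk.
STORAGE_PER_GB_MONTH = 0.15   # dollars per GB per month
TRANSFER_PER_GB = 0.20        # dollars per GB transferred (in or out)

def yearly_cost(gigabytes, ingests=1, accesses_per_gb=0):
    """One year of storage rent plus transfer charges for ingest and reads."""
    storage = gigabytes * STORAGE_PER_GB_MONTH * 12
    transfer = gigabytes * TRANSFER_PER_GB * (ingests + accesses_per_gb)
    return storage + transfer

# NVO collection: 100 TB = 100,000 GB
print(yearly_cost(100_000))                     # 200000.0  -> $200K/year
print(yearly_cost(100_000, accesses_per_gb=5))  # 300000.0  -> $300K/year

# Projected LHC output: 10 PB/year = 10,000,000 GB
print(yearly_cost(10_000_000))                  # 20000000.0 -> $20M/year
```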

Page 18:

The most valuable research data is in the most danger.

[Data Pyramid figure, annotated by scale:]

• National scale: Reference and irreplaceable data require long-term preservation and reliable stewardship. No real sustainable plan.

• Regional scale: Universities and libraries can provide greater support, but they need help.

• Local scale: Emerging commercial opportunities.

Page 19:

Providing Sustainable and Reliable Data Infrastructure Incurs Real Costs

Entity at risk | Size | What can go wrong | Frequency | Minimum replicas needed to mitigate risk | Administrative support (FTEs)

File | ~2 MB | Corrupted media, disk failure | 1 year | 2 copies in a single system | System Admins

Tape | ~200 GB | + Simultaneous failure of 2 copies | 5 years | 3 homogeneous systems | + Storage Admin

System | ~10 TB | + Systemic errors in vendor SW, malicious user, or operator error that deletes multiple copies | 15 years | 3 independent, heterogeneous systems | + Database Admin, + Security Admin

Archive | ~1 PB | + Natural disaster, obsolescence of standards | 50-100 years | 3 distributed, heterogeneous systems | + Network Admin, + Data Grid Admin

Less risk means more replicas, more resources, and more people.
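To see why the table's replica counts buy so much safety, here is a toy calculation (the 5% per-replica failure rate is an assumed number for illustration, not the risk analysis behind the table):

```python
# Toy model of replica risk (assumed failure rates, not the analysis behind
# the table above). If each replica independently fails with probability p
# in a given period, the chance of losing every copy at once is p ** n.
def p_total_loss(p, replicas):
    """Probability that all replicas fail together, assuming independence."""
    return p ** replicas

p = 0.05  # assumed: each replica has a 5% chance of failing in the period
for n in (1, 2, 3):
    print(f"{n} replica(s): P(total loss) = {p_total_loss(p, n):.6f}")
# 1 replica(s): P(total loss) = 0.050000
# 2 replica(s): P(total loss) = 0.002500   <- "2 copies in a single system"
# 3 replica(s): P(total loss) = 0.000125   <- the 3-system tiers
```

The catch is the independence assumption: homogeneous systems share vendor-software and operator failure modes, which is why the higher tiers of the table call for independent, heterogeneous, geographically distributed systems and more kinds of administrators.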

Page 20:

Supporting Long-lived Data: What Happens if We Don't Preserve Our Most Important Reference Collections?


• Life sciences research would have only the resources available in roughly the 1970s – no PDB, no Swiss-Prot, no PubMed, etc.

• New discoveries from climate and other predictive simulation models which utilize longitudinal data would dramatically slow

• iTunes would store only current music; NetFlix would provide only current movies

• Federal, state, and local records would need to remain on paper. Without preservation, digital history is only as old as the current storage media.

Page 21:

Chronopolis: Using the Data Grid to Support Long-Lived Data

• Chronopolis provides a comprehensive approach to infrastructure for long-term preservation, integrating:

• Collection ingestion

• Access and services

• Research and development for new functionality and adaptation to evolving technologies

• Business model, data policies, and management issues critical to the success of the infrastructure

• Consortium: SDSC, the UCSD Libraries, NCAR, UMd, and NARA working together on long-term preservation of digital collections

Page 22:

Chronopolis – Replication and Distribution

• 3 replicas of valuable collections are considered reasonable mitigation for the risk of data loss

• Chronopolis Consortium will store 3 copies of preservation collections:

• “Bright copy” – Chronopolis site supports ingestion, collection management, user access

• “Dim copy” – Chronopolis site supports remote replica of bright copy and supports user access

• “Dark copy” – Chronopolis site supports reference copy that may be used for disaster recovery but no user access

• Each site may play different roles for different collections

[Figure: Chronopolis Federation architecture – three Chronopolis sites (UCSD, NCAR, UMd), with the bright, dim, and dark copies of collections C1 and C2 distributed so that each site holds a different mix of roles]
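A minimal sketch of that role-rotation idea, assuming a simple round-robin assignment policy (illustrative only: this is not Chronopolis's actual software, and the rotation rule is an assumption):

```python
# Illustrative sketch: give every collection one bright, one dim, and one
# dark copy, each at a different site, rotating assignments so each site
# plays different roles for different collections.
from enum import Enum

class Role(Enum):
    BRIGHT = "ingestion, collection management, user access"
    DIM = "remote replica of the bright copy, user access"
    DARK = "disaster-recovery reference copy, no user access"

SITES = ["UCSD", "NCAR", "UMd"]  # the three Chronopolis sites named above

def assign_roles(collections):
    """Map each collection to a site -> role assignment, rotating the
    bright/dim/dark roles so no single site serves every bright copy."""
    plan = {}
    for i, name in enumerate(collections):
        rotation = SITES[i % 3:] + SITES[:i % 3]
        plan[name] = dict(zip(rotation, Role))  # Role iterates in order
    return plan

for coll, sites in assign_roles(["C1", "C2"]).items():
    for site, role in sites.items():
        print(f"{coll}: {role.name} copy at {site} ({role.value})")
# C1 gets its bright copy at UCSD; C2 gets its bright copy at NCAR, so a
# site can hold C1's dim copy while serving as C2's bright site.
```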

Page 23:

Creative Business Models Needed to Support Long-lived Data

• Data preservation infrastructure need not be an infinite, increasing mortgage

• Creative solutions are possible:

• Relay funding

• Consortium support

• Recharge

• Use fees

• Hybrid models and other support mechanisms

These can be used to create sustainable business models.

[Figure: the Data Pyramid – national, regional, and local scales]

Page 24:

Beyond Branscomb

• Current competitions are providing a venue for a broader set of players and experts

• Our best and our brightest are becoming lean, mean competition machines – does this really serve the science and engineering community best?

• We're getting good at circling the wagons and pointing the guns inward; isn't it time we turned things around?

• What will it take for all of US to take the leadership to better focus CS infrastructure, research, and development efforts?

[Slide graphic labeled "Whining"]

Page 25:

Thank You

www.sdsc.edu