Service and Support for Science IT, Lunch Event, Autumn 2014

Upload: jose-barrett

Post on 14-Dec-2015


Page 1: Service and Support for Science IT, Lunch Event, Autumn 2014

Service and Support for Science IT

Lunch Event, Autumn 2014

Page 2

S3IT

Peter Kunszt

PhD in Theoretical Physics (University of Bern)

Postdoc in Astrophysics / Cosmology, Johns Hopkins University, Baltimore, USA: Sloan Digital Sky Survey Science Archive (Virtual Observatory), 3 years

CERN IT Department, Geneva: Data Management Section head and project manager for EU projects, 5 years

CSCS, Lugano: built the Swiss Tier 2 for CERN, Swiss Grid Initiative, 3 years

ETH Zürich: Head of SyBIT Projects for SystemsX.ch, 5 years

UZH: Heading S3IT

Page 3

Outline

Science is changing

Challenges due to the change

Addressing the challenge

Infrastructure

Organization of S3IT

Page 4

A Digital World

Scientific Discovery driven by new instrumentation

Page 5

An Exponential World

Scientific data doubles every year

Changes the nature of scientific computing

Cuts across disciplines: eScience

It becomes increasingly difficult to extract knowledge

[Figure: year axis 1970 to 2000 in five-year steps; value axis ticks 0.1, 1, 10, 100, 1000; series: CCDs, Glass]

Slide by Alex Szalay, JHU

Page 6

An Exponential World

Not only scientific data!

20% of the world’s servers go into data centers run by the “Big 5”

– Google, Microsoft, Yahoo, Amazon, eBay

Slide by Alex Szalay, JHU

Page 7

Science is Changing

THOUSAND YEARS AGO: science was empirical, describing natural phenomena

LAST FEW HUNDRED YEARS: theoretical branch using models, generalizations

LAST FEW DECADES: a computational branch simulating complex phenomena

TODAY: data-intensive science, synthesizing theory, experiment and computation with statistics ► new way of thinking required!

Slide by Alex Szalay, JHU

Page 8

Change of Culture

Single-person discoveries → Large collaborations

Page 9

Change of Culture

Single-person discoveries → Large collaborations

Citizen Science

Page 10

Luckily, there’s Moore’s Law

Page 11

Moore’s Law – no mo(o)re?

Page 12

Scientific Data Analysis Today

Data is produced everywhere and will never be at a single location

Data grows as fast as our computing/instrument power

Many labs have their own power-workstation or mini-cluster

Hitting the cooling and power wall

Moore’s law is not as it used to be – more complex solutions

Trouble storing even the produced data stream

Not scalable, not maintainable…

Page 13

Fire and forget...

Often, you do not want to be bothered with computing details

IT JUST NEEDS TO WORK!

Page 14

Widening Complexity Gap

Standard Computing

GAP

Research Needs

• Desktop computing, storage

• Helpdesk, support

• Internet, Wikipedia, ..

• Algorithms

• Models, Statistics

• Visualizations

• Data analysis

• Publication

• Local IT Resources

• Central IT Services

• Research laboratories

• Core Facilities

Page 15

Challenge: Scale Up

High Throughput Instruments

– Much larger data volumes

– Increased data complexity

Large Collaborations

– More people

– More experiments and measurements

– More coverage

BIG Everything

Page 16

Science IT

Connect IT and Science

Dedicated support for computations and data analysis

SPEED: faster time to solution

ACCESS to competitive infrastructure

ENABLE: remove barriers, new possibilities

Speed

Access

Enablement

Page 17

Sure, we all believe in miracles...

Page 18

Gray’s Laws of Data Engineering

Jim Gray:

Scientific computing is increasingly about data

Need scale-out solution for analysis

Take the analysis to the data!

Start with “20 queries”

Go from “working to working”

Slide from Alex Szalay, JHU

Page 19

What does that mean?

1. Understand your data = understand your problem

2. Reduce data wherever possible – think about what is worth keeping vs. reproducing

3. Focus on doable chunks of work and questions

4. Build programs and systems that can scale
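
As a small illustration of points 2 to 4, the sketch below reduces a data stream chunk by chunk and keeps only running summary statistics instead of the raw values. It is not from the talk: the function names `chunks` and `summarize_in_chunks` and the default chunk size are assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunks(values: Iterable[float], size: int) -> Iterator[list[float]]:
    """Yield successive lists of at most `size` items from `values`."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def summarize_in_chunks(values: Iterable[float], size: int = 1000) -> dict[str, float]:
    """Reduce a stream to count, mean, min and max, one chunk at a time."""
    count = 0
    total = 0.0
    lowest = float("inf")
    highest = float("-inf")
    for chunk in chunks(values, size):
        # Only the running aggregates survive; the chunk itself is discarded.
        count += len(chunk)
        total += sum(chunk)
        lowest = min(lowest, min(chunk))
        highest = max(highest, max(chunk))
    if count == 0:
        raise ValueError("no data")
    return {"count": count, "mean": total / count, "min": lowest, "max": highest}
```

Because each chunk is independent, the same pattern could in principle be spread over several machines and the partial aggregates merged afterwards.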

Page 20

And: No one size fits all – all research is different by nature

Page 21

Providing the ‘Miracle’: a lot of this is scientific work in itself!

Computer Science

Scientific Computing, Research Informatics

Data Science

• Department of Informatics

• Institute for Computational Science

• Domain Informatics – Bioinformatics, Medical Informatics, Geoinformatics, ....

Lots of PhD theses to come!

Page 22

Results of that research need to be applied!

Software engineering

Code optimization and scaling

Visualization

Applied statistical analysis

Automation of workloads

Data storage and management

Maintenance!!!

Students don’t have time for that and don’t get any recognition

Page 23

Like a Formula 1 Racing Team

Page 24

Teamwork in Science IT

[Diagram labels, grouped as internal and external: Informatikdienste; Zentrale Informatik; Local IT Support; Core Facilities; Domain Informatics; Department of Informatics; Institute for Computer Science; Faculties, Institutes, Departments; Research Groups; Projects; CSCS; Science IT @ETH, Univ of X; Vendors; Industrial Partners]

Page 25

Science IT as a Service

BOOTSTRAP: Consultancy

• Research context and perspective

• Categorization of problem in terms of Simulation, Data, Processing, Publication

• Map to available infrastructure

• Plan Support Service as a Project (time, cost, ..)

DELIVERY: Project execution

• Setup of infrastructure, software, integration

• Automation, analysis, visualization

• Training of the users on the workflow

• Continuous Support, feedback and iterative improvement

FINISH

• Conclusion and publication

• Reusability and sustainability measures

Page 26

Categorization of Infrastructure

Computer is the Research Instrument: ‘Supercomputing’

– Simulations of phenomena

– Needs the largest computers you can get

– Theoretical physics, astrophysics, mathematics, computational chemistry, biochemistry, meteorology

– Simulations also generate a lot of data, models.

– Continuous usage.

Our job: Provide access to necessary Infrastructure

– Support

– Maintenance

– Software optimization

– Data storage and handling

Page 27

Categorization of Infrastructure

Computer is a tool, a workhorse

– Statistical analysis, parameter studies

– ‘Big Data’ processing

– Visualization

– Life science, biochemistry, geography, medicine, digital humanities, banking and finance, computer science, ...

– Very heterogeneous requirements

– Non-continuous use

Can be large!

1. Server computing

– Interactive work, person sitting in front of the system

2. Cluster computing

– Automated workloads, many computers at once
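
To make "parameter studies" and "automated workloads, many computers at once" concrete, here is a minimal sketch that runs one function over a grid of parameters in parallel. It is an illustration only: `simulate` is a hypothetical stand-in for a real computation, and local threads stand in for cluster nodes, where a batch scheduler would normally distribute the runs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(growth: float, steps: int) -> float:
    """Hypothetical stand-in for one run: compound a unit value `steps` times."""
    value = 1.0
    for _ in range(steps):
        value *= 1.0 + growth
    return value


def parameter_study(growths, step_counts, workers=4):
    """Run `simulate` for every parameter combination and collect the results."""
    grid = list(product(growths, step_counts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the grid.
        results = list(pool.map(lambda p: simulate(*p), grid))
    return dict(zip(grid, results))


study = parameter_study([0.0, 1.0], [1, 3])
for parameters, value in study.items():
    print(parameters, value)
```

Each run is independent of the others, which is what lets this kind of workload scale out by adding machines.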

Page 28

Categorization of Infrastructure

1. Server computing

– Interactive work, person sitting in front of the system

2. Cluster computing

– Automated workloads, many computers at once

Our job: Provide access to individualized, custom servers and clusters

– Assure scalability

– Keep costs down

– Maintenance, support

– Automation tooling, workflows

– Data management and data processing

– Standardized processes

Page 29

Supercomputing

UZH has operated local supercomputing for almost 10 years

• Irchel Datacenter

• Supported and operated centrally

BUT: Getting ‘old’ – over 4 years

• Not competitive

• Not power-efficient

• Expensive in maintenance

Page 30

The Science Cloud

Scale by re-centralizing individual local infrastructure

– One hardware size fits all

– But individual delivery of clusters and servers!

– Possible due to virtualization

NEW

Physical Hardware

‘Virtual’ Simulated Hardware

Page 31

Infrastructure Today

Schrödinger Supercomputer

• Maintained by S3IT / ZI

• Supported by S3IT / ZI

• Standardized Tools

Local Computing

• Locally installed and maintained

• Own tools and developments

Page 32

Infrastructure 2015

UZH Science Cloud

• Maintained by S3IT / ZI

• Supported by S3IT / ZI

• Toolset: both standard and own tools

NEW

UZH HPC@CSCS

• Maintained by CSCS

• Supported by S3IT / ZI

• Standardized Toolset

Local Computing

• Locally maintained

• Own tools and developments

Page 33

Infrastructure 2016

UZH HPC@CSCS

• Maintained by CSCS

• Supported by S3IT

Local Computing

• Locally maintained

• Own tools and developments

UZH Science Cloud

• Maintained by S3IT

• Supported by S3IT

Page 34

Storage Storage Storage

Page 35

Cost of a Petabyte is not the problem

From backblaze.com, October 2014

Page 36

Reading petabytes in a short time is!

Current problems are not easy to parallelize and scale!

• 10-30TB ‘easy’

• 100-200TB doable

• 500TB+ very difficult

Moving 100TB over the network (sequentially)

• 1Gbps – 10 days

• 10Gbps – 1 day : but needs a dedicated connection!

• Physically? – FedEx
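
The transfer times above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes), a fully sustained line rate and no protocol overhead; the function name `transfer_days` is made up for this example.

```python
def transfer_days(terabytes: float, gigabits_per_second: float) -> float:
    """Days needed to move `terabytes` (decimal TB) at a sustained line rate."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9)
    return seconds / 86_400


for rate in (1, 10):
    print(f"{rate} Gbps: {transfer_days(100, rate):.1f} days")
```

Under these idealized assumptions the result is about 9.3 days at 1 Gbps and about 0.9 days at 10 Gbps, in line with the rounded figures on the slide. Real transfers over shared links would take longer.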

Page 37

Categorization of Science IT Problems

Simulation: Models, Theory

Data: Analysis, Management

Processing: Scaling, Automation

Publication: Share, Publish

Page 38

Managing the Data Lifecycle

Conscious data production, reduction, usage

• Ask the questions you need to ask as quickly as possible

• Delete data wherever possible, keep starting points (freezer?)

• Aim for reproducibility on all levels on first principles – document steps

Automated data processing pipelines

• Scale by automation

• Ability to re-run simulations and data analysis at the push of a button

• Extract and keep all metadata
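
One way to picture an automated, re-runnable pipeline that keeps its metadata is sketched below. It is an assumption-laden illustration, not a description of any S3IT tooling: `run_pipeline` and the two example steps are hypothetical, and the recorded fields (parameters, timestamps, SHA-256 checksums) are just one plausible choice.

```python
import hashlib
import json
from datetime import datetime, timezone


def run_pipeline(data: bytes, steps, parameters: dict) -> dict:
    """Apply each step in order and return the result plus a provenance record."""
    record = {
        "started": datetime.now(timezone.utc).isoformat(),
        "parameters": parameters,
        "input_sha256": hashlib.sha256(data).hexdigest(),
        "steps": [],
    }
    for step in steps:
        # Per-step keyword arguments are looked up by the step's function name.
        data = step(data, **parameters.get(step.__name__, {}))
        record["steps"].append(
            {"name": step.__name__, "output_sha256": hashlib.sha256(data).hexdigest()}
        )
    return {"output": data, "metadata": record}


def strip_blank_lines(data: bytes) -> bytes:
    """Hypothetical step: drop empty lines."""
    return b"\n".join(line for line in data.splitlines() if line.strip())


def keep_first(data: bytes, n: int = 2) -> bytes:
    """Hypothetical step: keep only the first `n` lines."""
    return b"\n".join(data.splitlines()[:n])


result = run_pipeline(
    b"a\n\nb\nc\n",
    steps=[strip_blank_lines, keep_first],
    parameters={"keep_first": {"n": 2}},
)
print(result["output"])
print(json.dumps(result["metadata"], indent=2))
```

Re-running with the same input and parameters yields the same checksums, which gives a cheap reproducibility check and lets intermediate data be deleted and regenerated on demand.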

Publication and archiving

• Publish everything to be easily reproducible by others

• Archive only what you need or what is mandatory

Page 39

Setup of Service and Support for Science IT

Director, Office

Infrastructure: HPC, Cloud, Storage

Projects: Life Science, Geoscience, Physics, Humanities, ...

Collaborations: National, International, Industrial

Page 40

Setup of Service and Support for Science IT

Informatikdienste

Zentrale Informatik

Vice President Law & Economics

Page 41

Organisation

[Diagram: Core Team, Site Teams and Embedded Experts (EE)]

EE = Embedded Experts: Project work directly in the groups.

Site Teams: Joining forces with local institute-IT experts. Support and provisioning of access and software on site.

Core Team: Consultancy in core competences, central infrastructure, project management

Page 42

Funding

• University Core Funding for the core team

• Site Teams: Funding through local institutes, faculties or 3rd party projects – service charges

• Embedded Experts: 3rd party projects – co-applicants on project proposals

Page 43

What to expect of S3IT

Science IT as a Service - Consulting

Support through projects

Software, workflows and infrastructure are optimized to the needs of the research problem, not the other way round

Our operational concept follows the cloud model to meet the very heterogeneous needs of the UZH research groups, while working with standardized, commodity hardware and software

Scalability, Extensibility, Reusability are our guiding principles

Page 44

Already a success story

• Started with 1 person in January

• 10 people now, 5 on core, 5 on other funds

• 20 project applications written, 10 projects already approved and running, 10 projects in the pipeline, some already pre-approved

• Many new project ideas as a result of consultation

• Projects include

• Life Science (imaging, genomics, proteomics)

• HPC optimization

• Digital humanities, art history, ..

• Industrial cooperations

Page 45

What next?

How long does the data growth continue?

How far can we scale?

What new technologies will make our life easier or harder?

Let’s find out together

Page 46

Please visit us

Next Lunch Events

Generating Hypotheses from Large Data: Lars Malmström, S3IT

Big Data in Art History: Thomas Hänsli, Digital Art History UZH

Visual Analytics, understanding interactions: Markus Grau; Business Alliances, Guido Oswald; SAS

Cognitive Systems, how ‘artificial intelligence’ will revolutionize research and education – Karin Fey, IBM

Cloud Services at the UZH – ZI

http://www.s3it.uzh.ch/