Vicky Rowley Solution Architect BIRN Coordinating Center - University of California San Diego E-x-t-e-n-d-i-n-g Rocks: The Creation and Management of Grid Systems for Biomedical Research OSGC Conference - May 14, 2008


Vicky Rowley

Solution Architect

BIRN Coordinating Center - University of California San Diego

E-x-t-e-n-d-i-n-g Rocks: The Creation and Management of Grid

Systems for Biomedical Research

OSGC Conference - May 14, 2008

BIRN is Data Storage (SRB) and Processing, but…

UNM

UMN

UI

Duke

UCSD

UCI

BWH

MGH

Yale

UCLA

Stanford

= Support existing sites

= Establish new sites

= Replicate for new community

Cluster

Cluster

Rocks Standard vs. Rocks for BIRN

Rocks Standard
• Cluster building focus
• Data processing focus
• Lots of big clusters

Rocks for BIRN
• Collaboration focus
• Data storage/sharing focus
• A few relatively small clusters
• Data Grid was needed before cluster processing was needed

Scientific Goal: classify patient status from morphometric results

[Figure: five-step workflow (steps 1 to 5) across the BIRN Data Grid. Labels: Data Donor Site (WashU), N=45; De-identification and upload; MGH Segmentation; JHU Shape Analysis of Segmented Structures; Large Scale Distributed Computing; BWH Visualization]

So what does BIRN _do_?

Large Deformation Diffeomorphic Metric Mapping using the TeraGrid

Preliminary Study:
• 46 hippocampus data sets
• 30,000 CPU hours, 4 TB data

Shape-derived metrics can be used to detect class-specific information

6 semantic dementia subjects

18 Alzheimer subjects

21 control subjects

SASHA: Shape Analysis Pipeline Results

The BIRN Collaboratory Today

Enabling collaborative research at 28 research institutions comprising 37 research groups.

How does Rocks make it do that?

Installs operating system software

Turns individual servers into a “Grid”
• Portals & web servers
• Data grid for access & management
• Compute clusters
• Database servers

Distributes, installs and updates 3rd party, domain-specific scientific software packages

Updates system software

What would be better?

Add/Improve security & performance monitoring

Detect and capture configuration changes

Track versions

Ideally, reduce, reuse, recycle…
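One way to approach "detect and capture configuration changes" is to fingerprint configuration files and compare snapshots taken at different times. The following is a minimal sketch under that assumption, not a description of BIRN's tooling; the function names and the file paths are hypothetical, and in-memory byte strings stand in for real files.

```python
import hashlib


def fingerprint(files):
    """Map each path to the SHA-256 hex digest of its content (bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_snapshots(old, new):
    """Return (added, removed, changed) path lists between two fingerprint maps."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed


# Hypothetical example: in-memory stand-ins for files on a server.
baseline = fingerprint({
    "/etc/httpd/conf/httpd.conf": b"Listen 80\n",
    "/etc/ssh/sshd_config": b"PermitRootLogin no\n",
})
current = fingerprint({
    "/etc/httpd/conf/httpd.conf": b"Listen 8080\n",
    "/etc/ssh/sshd_config": b"PermitRootLogin no\n",
    "/etc/motd": b"welcome\n",
})

added, removed, changed = diff_snapshots(baseline, current)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Storing each snapshot alongside a timestamp would also give a simple history of when a server drifted from its installed state.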

Desired System Qualities

Agile
• Fast response with updates
• Self-help for developers

Repeatable
• Tracking of versions
• Tracking of deployments

Modular/Flexible
• Handles unique site requirements
• Handles unique project requirements

Customizable

Scalable
• Highly automated
• Supports addition of several sites per year, plus additional projects over 5 years

BIRN/Rocks Software Stack

Basic System Software
• Operating System
• Security…

Server Definition Software
• Apache/Tomcat
• Globus…

Application Software
• Gridsphere
• HID
• Mediator
• Scientific Applications…

Custom BIRN Server

A BIRN Grid

Portal/Web BIRN Rack

GPOP

GComp

Nettools

NAS

MCAT

DB Server

Registry DB

UMLS

HID DB

Mediator

GAMA Server

MyProxy

Globus

CAS

HID

What’s involved in a single grid?

CVS, SVN & SRB Repos

Testbeds

Rocks Central & YUM

Rolls:
* RHEL4
* area51
* base
* birn
* birnafs
* birncondor
* birnportal
* birnsrb
* CentOS
* condor
* cvsserver
* freesurfer
* gama-naregi 1.0
* gama-naregi 4.1
* ganglia
* grid
* gridsphere
* hardwareutils
* hid
* hpc
* java
* kernel
* mediator
* nagios
* oracle
* postgres
* sciapps
* sge
* srb34
* tomcat
* updates-CentOS
* webserver

14 Rocks Rolls (-2 for OS)

17 Custom Rolls

Software Development & Integration

Rocks/YUM

Server

Testbeds

- update local CVS/SVN

- update tarballs

- update RPMs

- new config/install

BIRN-CC

- large source into SRB

- updates RPMs
  - Makefile
  - version.mk
  - *.spec.in

- updates XML (rare)
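The slide lists version.mk among the files touched when an RPM is updated. As an illustration only, the sketch below assumes version.mk holds simple Make-style assignments such as NAME, VERSION and RELEASE, and shows how a release number could be bumped programmatically. The helper names, the package name and the numbers are hypothetical.

```python
import re


def parse_version_mk(text):
    """Parse simple 'KEY = value' Make assignments into a dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z_]+)\s*[:?]?=\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def bump_release(text):
    """Return text with the integer RELEASE value incremented by one."""
    def repl(match):
        return f"{match.group(1)}{int(match.group(2)) + 1}"

    new_text, count = re.subn(r"(?m)^(\s*RELEASE\s*[:?]?=\s*)(\d+)", repl, text)
    if count != 1:
        raise ValueError("expected exactly one numeric RELEASE assignment")
    return new_text


# Hypothetical file content; the package name and numbers are made up.
sample = "NAME = example-tool\nVERSION = 1.2\nRELEASE = 3\n"
updated = bump_release(sample)
print(parse_version_mk(updated))
```

Keeping the bump in a script rather than a manual edit is one small way to support the "tracking of versions" quality listed earlier.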

Software Deployment

CVS

Development Area

-Integrate software from many diverse sources

-Version control at system and sub-component levels

-Rolling baseline

-Integration and Functional Testing

Staging Area

-Verify interoperation of latest code

-Support demonstration of latest development efforts without disruption to production

-Functional system/Beta Testing

Production Area

- Stable

- Reliable

- Facilitates research

Rocks Development Server

Rocks Staging Server

Rocks Production Server

SRB
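The three-area flow above (development, staging, production) can be sketched as a small promotion model. This is a hedged illustration of the idea, not the BIRN-CC implementation; the Pipeline class, its methods and the package name are all hypothetical.

```python
from dataclasses import dataclass, field

AREAS = ["development", "staging", "production"]


@dataclass
class Pipeline:
    """Tracks which package versions are deployed in each area."""
    deployed: dict = field(default_factory=lambda: {area: {} for area in AREAS})
    history: list = field(default_factory=list)

    def release(self, package, version):
        """Place a new version in the development area."""
        self._record("development", package, version)

    def promote(self, package, source):
        """Copy a package's version from one area to the next one."""
        index = AREAS.index(source)
        if index == len(AREAS) - 1:
            raise ValueError("production is the final area")
        if package not in self.deployed[source]:
            raise KeyError(f"{package} is not deployed in {source}")
        target = AREAS[index + 1]
        self._record(target, package, self.deployed[source][package])
        return target

    def _record(self, area, package, version):
        # Every placement is logged so deployments can be traced later.
        self.deployed[area][package] = version
        self.history.append((area, package, version))


pipeline = Pipeline()
pipeline.release("example-portal", "2.1")   # hypothetical package name
pipeline.promote("example-portal", "development")
pipeline.release("example-portal", "2.2")   # development moves ahead of staging
print(pipeline.deployed)
```

Because development can move ahead while staging and production keep their own versions, the latest work can be demonstrated without disturbing production.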

What we love?

Repeatability
• All the web servers are the same
• All the database servers are the same

Flexibility
• Mix & Match rolls

Level of automation
• Experienced person can “kick” a server in 5 minutes
• IPs, hostnames, software configuration done

Open Source

Result: Not one grid - Many! Not one project - Many!
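The "mix & match" idea, together with the claim that every server of a given type is the same, can be pictured as server types defined by sets of rolls. In the sketch below the roll names are taken from the roll list earlier in the talk, but which rolls belong to which server type is an assumption made purely for illustration, as are the function and variable names.

```python
# Roll names come from the slide's roll list; which rolls belong to which
# server type is an assumption made purely for illustration.
COMMON = {"base", "kernel", "birn"}

SERVER_TYPES = {
    "web": COMMON | {"webserver", "tomcat", "gridsphere", "birnportal"},
    "database": COMMON | {"postgres", "oracle"},
    "compute": COMMON | {"hpc", "sge", "sciapps", "freesurfer"},
}


def rolls_for(server_type, site_extras=()):
    """Return the sorted roll list for a server type plus site-specific extras."""
    if server_type not in SERVER_TYPES:
        raise ValueError(f"unknown server type: {server_type}")
    return sorted(SERVER_TYPES[server_type] | set(site_extras))


web_a = rolls_for("web")
web_b = rolls_for("web")
db_with_monitoring = rolls_for("database", site_extras=["nagios", "ganglia"])
print(web_a == web_b)
print(db_with_monitoring)
```

Two servers built from the same type get an identical roll list, while a site with extra needs adds rolls without changing the shared definition.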

What drives us nuts?

• Turnaround time for updates
• Steep learning curve
• RPM building not standard
• Build time large
• Software developers are not co-located with integrators
• Reinstalling to get updates is not an option
• Lack of advanced roll development training

More info?

See the project website: http://www.nbirn.net

Email vrowley_at_ucsd_dot_edu