Department of Biomedical Informatics

Service Oriented Bioscience Cluster at OSC

Umit V. Catalyurek
Associate Professor

Dept. of Biomedical Informatics
Dept. of Electrical & Computer Engineering

The Ohio State University

Origins of caBIG

• Goal: Enable investigators and research teams nationwide to combine and leverage their findings and expertise in order to meet the NCI 2015 Goal: “Relieve suffering and death due to cancer by the year 2015”

• Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network

Driving needs: cancer Biomedical Informatics Grid

• A multitude of “legacy” information systems, most of which cannot be readily shared between institutions
• An absence of tools to connect different databases
• An absence of common data formats
• A huge and growing volume of data must be collected, analyzed, and made accessible
• Few common vocabularies, making it difficult, if not impossible, to interlink diverse research and clinical results
• Difficulty in identifying and accessing available resources
• An absence of information infrastructure to share data within an institution, or among different institutions

What is caBIG?

• Common, widely distributed infrastructure that permits the cancer research community to focus on innovation

• Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange

• Collection of interoperable applications developed to common standards

• Cancer research data available for mining and integration

What is caGrid?

• A grid-based software infrastructure consisting of services, toolkits, APIs, and applications

• A production grid deployment of the core services provided by that infrastructure

• A community of developers leveraging that grid and infrastructure to provide applications and services to the cancer research community

What is caGrid?

• Development project of Architecture Workspace
• The Grid infrastructure for caBIG (the “G” in caBIG)
• Driven from use cases and needs of cancer research community
• Service Oriented Architecture
• Based on federation
• Model Driven
• Object-Oriented, Semantically-Annotated Data Virtualization

What is caGrid? cont…

• Builds on existing Grid technologies
• Provides additional enterprise Grid components
  • Grid Service Graphical Development Toolkit
  • Metadata Infrastructure
  • Advertisement and Discovery
  • Semantic Services
  • Data Service Infrastructure
  • Analytical Service Infrastructure
  • Identifiers
  • Workflow
  • Security Infrastructure
  • Client tooling

caGrid Community Involvement

• caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure which allows the community to do so
• Community members add value to the grid as applications, services, and processes (for example: shared workflows)
• caGrid provides the necessary core services, APIs, and tooling
• The real “value” of the grid comes from bringing this information to the “end user”
• Community members develop end user applications which consume the resources provided by the grid

caGrid @ OSC

• Goals:
  • Create an expandable caGrid Installation at OSC
  • Deploy Pilot Applications to demonstrate Service Oriented Access to HPC resources

• Dorian, GTS and Index services are deployed
  • cagrid-dorian01.osc.edu
  • cagrid-gts01.osc.edu
  • cagrid-index01.osc.edu

• SyncGTS is deployed alongside Dorian and Index for performance
• caGrid 1.2 was released this week, and we deployed it!

Pilot Application: TMA

• Image Mining for Performing Comparative Analysis of Expression Patterns in Tissue Microarrays
  • Project funded by NIH R01 (PI: David Foran, Co-PI: Joel Saltz)

• Development of innovative analysis methods for analysis of tissue microarrays
  • Computation of features, annotations of image data based on features

• Development of software support
  • to manage and share tissue microarray data and analysis results
  • to process large volumes of tissue microarray data on high performance systems

• Development of ability to share data and analytical resources using caGrid
  • Supports the Help Defeat Cancer project: 100,000 imaged histology specimens originating from breast, head & neck, and colorectal cancers

TMA Analytical Service Implementation

• TMA Application is a pipelined workflow
  • Several processing steps that need to be applied in sequence to the images

• Build a prototype workflow orchestration system
  • Wraps a program execution
    • Stages the data in
    • Invokes the executable
    • Retrieves the output files
  • Uses caGrid’s bulk data transfer to move files from host to host
  • Interacts with a scheduler to allocate resources for the execution
    • Executable can be a parallel/distributed application

• TMA user interface
  • Specify the workflow
    • List with executables and parameters
  • Invoke the service for the first stage
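
The stage-in / invoke / retrieve pattern above can be sketched in a few lines of Python. This is only an illustrative sketch, not the TMA service or any caGrid API: the `Stage` class, `run_stage`, `run_pipeline`, and the `{in}`/`{out}` placeholders are hypothetical names, a local file copy stands in for caGrid's bulk data transfer, and the executable is launched directly rather than through a scheduler.

```python
import shutil
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Stage:
    """One workflow step: an executable and its parameters (hypothetical structure)."""
    name: str
    command: list  # argv; "{in}" and "{out}" are replaced with the staged file paths


def run_stage(stage: Stage, input_file: Path, work_root: Path) -> Path:
    """Wrap one program execution: stage data in, invoke, retrieve the output."""
    work_dir = work_root / stage.name
    work_dir.mkdir(parents=True, exist_ok=True)

    # 1. Stage the data in (assumption: a local copy stands in for a host-to-host transfer).
    staged_in = work_dir / ("in_" + input_file.name)
    shutil.copy(input_file, staged_in)
    output_file = work_dir / "out.txt"

    # 2. Invoke the executable (assumption: run directly; a real system would ask a scheduler).
    argv = [a.replace("{in}", str(staged_in)).replace("{out}", str(output_file))
            for a in stage.command]
    subprocess.run(argv, check=True)

    # 3. Retrieve the output file so the next stage can consume it.
    if not output_file.exists():
        raise FileNotFoundError(f"stage {stage.name!r} produced no output")
    return output_file


def run_pipeline(stages: list, first_input: Path, work_root: Path) -> Path:
    """Apply the stages in sequence; each stage's output is the next stage's input."""
    current = first_input
    for stage in stages:
        current = run_stage(stage, current, work_root)
    return current


# Two toy "executables" written as one-line Python scripts, standing in for image-processing steps.
UPPER = "import sys; open(sys.argv[2], 'w').write(open(sys.argv[1]).read().upper())"
COUNT = "import sys; open(sys.argv[2], 'w').write(str(len(open(sys.argv[1]).read().split())))"


def demo() -> str:
    """Specify a two-stage workflow as a list of executables and parameters, then run it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "input.txt"
        source.write_text("core a\ncore b\n")
        stages = [
            Stage("upper", [sys.executable, "-c", UPPER, "{in}", "{out}"]),
            Stage("count", [sys.executable, "-c", COUNT, "{in}", "{out}"]),
        ]
        return run_pipeline(stages, source, root).read_text()


if __name__ == "__main__":
    print(demo())
```

Keeping each stage as plain data (a name plus an argument list) mirrors the "list with executables and parameters" idea: the workflow is specified once, and the driver only needs the first input to start the chain.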

What is next?

• Next Pilot Application: Prof. Dan Janies’ Supramap
  • http://supramap.osu.edu
  • Builds a phylogenetic tree and projects it onto a map of the planet
  • Computationally expensive

• Next Pilot Application(s): Your Application!?

• More Info: http://bmi.osu.edu and http://www.cagrid.org
• Contact: Umit V. Catalyurek email: [email protected]