Grid Computing Research and Applications Sornthep Vannarat Large scale Simulation Research Laboratory National Electronics and Computer Technology Center


Grid Computing Research and Applications

Sornthep Vannarat

Large scale Simulation Research Laboratory
National Electronics and Computer Technology Center


Outline

• Introduction to Grid computing
• Open Grid Service Architecture
• Bioinformatics applications on Grid
• Information Grid project
• GEO Grid project
• Knowledge Grid
• Web 2.0 and Grid computing
• Grid activities at NECTEC


Introduction to Grid computing

Large-scale Simulation Research Laboratory

Understand, innovate and manage problems through computer simulations

• Develop knowledge: computer simulation leads to the discovery of new knowledge, which is essential for developing advanced technology for the economy and for people's quality of life.

• Create innovation: applying computer simulation to engineering design leads to products of higher quality and capability, as well as efficient manufacturing processes that save energy and raw materials.

• Manage and solve problems: for environmental problems and natural disasters, computer simulation can help predict changes and the effects of various factors, leading to an understanding of the problem and supporting good planning.

Main activities
1. Building high-performance computer systems and large-scale data storage systems
2. Studying and applying virtualization middleware
3. Developing infrastructure and middleware for integrating computer systems and data
4. Developing software for building simulation models
5. Applying computer modeling to create knowledge, to support engineering design, and to manage and solve problems

Related technologies
• Cluster computing, grid computing, large-scale data storage systems
• Distributed processing: Web Services, XML, Java programming
• Numerical computation: finite element method (FEM) and computational fluid dynamics (CFD)


What is Grid computing?
• Next-generation computing platform and global cyberinfrastructure for solving large-scale problems in science, engineering, and business
• Grid Café [http://gridcafe.web.cern.ch/gridcafe/]
• The Web is a service for sharing information over the Internet; the Grid is a service for sharing computer power and data storage capacity over the Internet
• Ian Foster
– 1998: Computational Grid is a hardware and software infrastructure that provides dependable, consistent, and pervasive access to high-end computational capabilities
– 2000: Grid computing is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations
– 2002: Grid is a system that (1) coordinates resources that are NOT subject to centralized control, (2) uses standard, open, general-purpose protocols and interfaces, and (3) delivers non-trivial qualities of service


Status of Grid computing

• A promising work in progress

• Usable, but with a lot of effort

• WISDOM:

– EGEE docking project

– Find new inhibitors for proteins produced by Plasmodium falciparum

– Over 46 million docking simulations in 6 weeks using 1,700 computers in 15 countries, equivalent to 80 CPU-years

• Beyond computing power


Types of Grids

• Computing grid
• Data/storage grid
• Information grid
• Instrument grid
• Access grid


The Grid Problem
• Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources

From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”

• Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:
– central location
– central control
– omniscience
– existing trust relationships


Elements of the Problem
• Resource sharing
– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, …
• Coordinated problem solving
– Beyond client-server: distributed data analysis, computation, collaboration, …
• Dynamic, multi-institutional virtual orgs
– Community overlays on classic org structures
– Large or small, static or dynamic


Challenges

• To provide seamless access
• Heterogeneous environments
• Multiple administrative domains and autonomy issues
• Scalability
• Dynamicity/adaptability


Grid computing middleware
• “Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies”, Parvin Asadzadeh et al.
• UNICORE
– Uniform Interface to Computing Resources
– Ready-to-run Grid system including client and server software
– UNICORE 6.0.1 released 26 Nov 2007: WSRF-based implementation
• Globus Toolkit
– Developed by the Globus Alliance
– Open source software toolkit used for building grids, with services written in a combination of C and Java
– GT 4.0.5: OGSA, WSRF-based
• Legion, Gridbus
• EGEE’s gLite


One View of Requirements
• Identity & authentication
• Authorization & policy
• Resource discovery
• Resource characterization
• Resource allocation
• (Co-)reservation, workflow
• Distributed algorithms
• Remote data access
• High-speed data transfer
• Performance guarantees
• Monitoring
• Adaptation
• Intrusion detection
• Resource management
• Accounting & payment
• Fault management
• System evolution
• Etc. …


Layered Grid Architecture

Grid layers, top to bottom:
• Application
• Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource: “Sharing single resources”: negotiating access, controlling use
• Connectivity: “Talking to things”: communication (Internet protocols) & security
• Fabric: “Controlling things locally”: access to, & control of, resources

Shown alongside for comparison, the Internet Protocol Architecture: Application, Transport, Internet, Link


Open Grid Services Architecture


Open Grid Services Architecture

• Service-oriented architecture
– Key to virtualization, discovery, composition, local-remote transparency
• Leverage industry standards
– Internet, Web services
• Distributed service management
– A “component model for Web services”
• A framework for the definition of composable, interoperable services

“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002


Web Services
• XML-based distributed computing technology
• Web service = a server process that exposes typed ports to the network
• Described by the Web Services Description Language (WSDL), an XML document that contains
– Types of message(s) the service understands, and types of responses and exceptions it returns
– “Methods” bound together as “port types”
– Port types bound to protocols as “ports”
• A WSDL document completely defines a service and how to access it
• WSRF


Extension of WS

• Lifecycle management
• Stateful
• Subscribable


Writing a Grid Service

• Define the interface with WSDL, wsrp
• Implement the service (Java)
• Define the deployment parameters (WSDD, JNDI)
• Compile GAR file (Ant)
• Deploy service (GT4)


Notification

• Polling and pushing
• WS-Topics: topic trees
• WS-BaseNotification: subscribe, notify
• WS-BrokeredNotification: broker
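
The notification ideas listed above can be pictured with a small in-memory model. This is a sketch only, not the GT4 or WS-Notification API: the names `TopicBroker`, `subscribe` and `notify` are hypothetical, and the slash-separated topic paths are an assumption made for illustration. It shows push-style delivery over a topic tree, where a subscriber to a parent topic also receives notifications published on child topics.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Consumer = Callable[[str, str], None]  # called with (topic, message)


class TopicBroker:
    """Toy broker (hypothetical): topics are slash-separated paths forming a tree."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Consumer]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        self._subscribers[topic].append(consumer)

    def notify(self, topic: str, message: str) -> int:
        """Push a message to subscribers of the topic and of each ancestor topic.

        Returns the number of deliveries made.
        """
        delivered = 0
        parts = topic.split("/")
        for depth in range(1, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            for consumer in self._subscribers.get(prefix, []):
                consumer(topic, message)
                delivered += 1
        return delivered


received = []
broker = TopicBroker()
broker.subscribe("jobs", lambda t, m: received.append(("all-jobs", t, m)))
broker.subscribe("jobs/failed", lambda t, m: received.append(("failures", t, m)))
broker.notify("jobs/failed", "job 17 exited with an error")
broker.notify("jobs/done", "job 18 finished")
```

With pushing, consumers do nothing until `notify` calls them; a polling client would instead have to ask the service repeatedly whether anything changed.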


Lifecycle management

• Creation operation: factory service
• Access and destroy operations: instance service
• Destroy operation
– Immediate
– Scheduled (lease based)
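
The factory/instance split and the two destruction modes can be sketched as follows. This is a minimal illustration under stated assumptions, not WSRF or GT4 code: `CounterFactory`, `CounterInstance` and all method names are hypothetical, and time is passed in explicitly as `now` so the example stays deterministic.

```python
import itertools
from typing import Dict, Optional


class CounterInstance:
    """A stateful resource created and tracked by the factory."""

    def __init__(self, expires_at: Optional[float]) -> None:
        self.value = 0
        self.expires_at = expires_at  # None means no scheduled termination


class CounterFactory:
    """Hypothetical factory service: creates instances and manages their lifetime."""

    def __init__(self) -> None:
        self._instances: Dict[str, CounterInstance] = {}
        self._ids = itertools.count(1)

    def create(self, now: float, lease_seconds: Optional[float] = None) -> str:
        key = f"counter-{next(self._ids)}"
        expires = None if lease_seconds is None else now + lease_seconds
        self._instances[key] = CounterInstance(expires)
        return key

    def get(self, key: str, now: float) -> CounterInstance:
        self.sweep(now)
        return self._instances[key]  # KeyError if destroyed or expired

    def renew(self, key: str, now: float, lease_seconds: float) -> None:
        """Extend the lease of a live instance."""
        self.get(key, now).expires_at = now + lease_seconds

    def destroy(self, key: str) -> None:
        """Immediate destruction."""
        del self._instances[key]

    def sweep(self, now: float) -> None:
        """Scheduled destruction: drop instances whose lease has run out."""
        expired = [k for k, r in self._instances.items()
                   if r.expires_at is not None and r.expires_at <= now]
        for k in expired:
            del self._instances[k]


factory = CounterFactory()
a = factory.create(now=0.0, lease_seconds=60)
b = factory.create(now=0.0)                    # no lease: lives until destroyed
factory.get(a, now=10.0).value += 1
factory.renew(a, now=50.0, lease_seconds=60)   # lease now ends at t = 110
factory.destroy(b)                             # immediate destruction
```

A lease protects the service from clients that disappear without cleaning up: an instance that nobody renews is eventually reclaimed.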


Monitoring & Discovery

[Figure: GT4 Monitoring & Discovery. Each GT4 container hosts an Index service; services such as GRAM, RFT and user services are registered automatically in their container (WS-ServiceGroup); GridFTP is attached through an adapter that uses custom protocols for non-WSRF entities; clients (e.g., WebMDS) use registration and WSRF/WSN access.]


Security

• Privacy
• Integrity
• Authentication
• Authorization
• Non-repudiation


PKI

• Public Key Infrastructure
• Key-based encryption
• Symmetric and asymmetric encryption
• Public and private keys
• Digital signature
• Digital certificate
• Certificate Authority (CA)


GSI

• Grid Security Infrastructure
• Transport and message-level security
• Authorization schemes
• Credential delegation and single sign-on
• Different levels of security: container, service, and resource


OGSA-DAI
• An extensible framework for data access and integration
• Expose heterogeneous data resources to a grid through web services
• Interact with data resources
– Queries and updates
– Data transformation / compression
– Data delivery
– Application-specific functionality
• A base for higher-level services
– Federation, mining, visualisation, …
• Open Grid Forum DAIS Working Group
– DAIS (Database Access and Integration) specifications
– OGSA-DAI to be a reference implementation of DAIS


OGSA-DAI functionality
• Interaction with data resources
– Relational: MySQL, SQL Server, DB2, PostgreSQL, Oracle
– XMLDB: eXist, Xindice
– Files: text, binary, indexed
– SQL multi-resources: aggregation of OGSA-DAI services exposing relational resources
• Transformation and compression
– ZIP, GZIP, XSLT, ResultSet-to-WebRowSet, ResultSet-to-CSV, …
– WebRowSet projection, frequency distribution, random sample
• Delivery
– Local file, HTTP, SMTP, SOAP attachments, GridFTP, other OGSA-DAI services
• Resource creation and destruction
• Document-oriented interface: the service interface is resource agnostic


Bioinformatics Applications on Grid


Bioinformatics and Grid

• Bioinformatics applications often require high-performance computing and large data handling

• Tools: bioinformatics tools and web services

• Data:

– Public databases

– Biological knowledge: ontology and meta data

– unpublished data

• Grid computing meets the requirements

– Computing Grids

– Data Grids

– Knowledge Grids


Computing Grid

• High throughput computing

– Thousands of small independent tasks

• Grid computing vs. cluster computing

– Both aim at parallel and distributed computing

– They differ in network latency and robustness

– The frequency of task failures is much higher in grid computing

• Two types of high-throughput computing

– numerical processing

– symbolic processing


High throughput numerical processing

• Systems biology aims at modeling of biological dynamics in molecules, cells, organs and individuals

• Huge computational power is needed for

– molecular folding

– molecular docking

– spatiotemporal molecular interaction

– kinetic parameter estimation

• Problem decomposition techniques

– parameter sweep

– stochastic modeling
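
A parameter sweep decomposes one large study into many independent runs, one per point of a parameter grid. The sketch below is an illustration under assumptions: a local thread pool stands in for grid worker nodes, `toy_model` is a made-up model, and the simulated failure and the retry limit are hypothetical. It resubmits failed tasks, since task failures are frequent on a grid.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

Params = Tuple[float, float]


def run_with_retry(task: Callable[[Params], float], params: Params,
                   max_attempts: int = 3) -> float:
    """Run one independent task, resubmitting it if it fails."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(params)
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"gave up on {params}") from last_error


def sweep(task: Callable[[Params], float], grid: Iterable[Params],
          workers: int = 4) -> Dict[Params, float]:
    """Evaluate the task at every grid point; the points do not depend on each other."""
    points = list(grid)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: run_with_retry(task, p), points)
        return dict(zip(points, results))


_seen = set()
_lock = threading.Lock()


def toy_model(params: Params) -> float:
    """Made-up model: decay of x0 at rate k over 10 steps; fails once for k == 0.5."""
    k, x0 = params
    with _lock:
        first_time = params not in _seen
        _seen.add(params)
    if first_time and k == 0.5:
        raise RuntimeError("simulated worker failure")
    return x0 * (1 - k) ** 10


results = sweep(toy_model, product([0.1, 0.5, 0.9], [1.0, 2.0]))
```

Stochastic modeling decomposes in the same way, with the grid of parameter values replaced by a list of random seeds.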


WISDOM

• EGEE docking project

• Find new inhibitors for proteins produced by Plasmodium falciparum

• Over 46 million docking simulations in 6 weeks

• 1,700 computers in 15 countries

• Equivalent to 80 CPU-years
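
The figures on this slide can be related to each other with a little arithmetic. The calculation below is a rough sketch under assumptions that are not in the slides: a year of 365.25 days, exactly 46 million dockings (the slide says "over"), and one CPU per computer available for the full six weeks.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

dockings = 46_000_000   # "over 46 million" on the slide, so a lower bound
cpu_years = 80
computers = 1_700
weeks = 6

# Average CPU time spent on one docking simulation.
cpu_seconds = cpu_years * SECONDS_PER_YEAR
seconds_per_docking = cpu_seconds / dockings

# Nominal capacity if every computer gave one CPU for the whole six weeks.
capacity_cpu_years = computers * weeks * 7 * 24 * 3600 / SECONDS_PER_YEAR
busy_fraction = cpu_years / capacity_cpu_years

print(f"average CPU time per docking: {seconds_per_docking:.0f} s")
print(f"share of nominal capacity used: {busy_fraction:.0%}")
```

Under these assumptions each docking takes a little under a minute of CPU time, and the 80 CPU-years amount to roughly two fifths of what 1,700 single-CPU machines could deliver in six weeks.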


DIANE

• Enhanced version of WISDOM

• Light-weight framework

• Search for drugs for predicted variants of H5N1

• 2 million docking complexes with a size of 600 gigabytes

• 2,000 grid worker nodes in 17 countries


Limitations of EGEE Infrastructure

• Experiences from virtual screening projects

• Overall grid efficiency about 50 percent

• Major sources of failure

– Server license failure 23%

– Workload management failure 10%

– Site failure 9%


Study of kinetic pathways

• Estimation of ODEs for modeling of metabolic pathways and signal transduction pathways

• Genetic algorithms:

– Estimating optimal parameter fitting to biological experimental results

– High degrees of parallelism (multiple trials with initial conditions)

• Parameter-parameter dependencies:

– Calculating moment parameters, such as AUC, MRT, VRT
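
The slide does not expand the acronyms. Assuming the usual statistical-moment definitions from pharmacokinetics (AUC: area under the concentration-time curve, MRT: mean residence time, VRT: variance of residence time), the three parameters can be estimated from a sampled curve as sketched below. The trapezoidal rule and the exponential test curve are choices made for this illustration, not something the slides specify.

```python
import math
from typing import Sequence, Tuple


def trapezoid(y: Sequence[float], t: Sequence[float]) -> float:
    """Trapezoidal-rule integral of samples y taken at times t."""
    return sum((t[i + 1] - t[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(t) - 1))


def moments(t: Sequence[float], c: Sequence[float]) -> Tuple[float, float, float]:
    """Return (AUC, MRT, VRT) for concentration samples c at times t."""
    auc = trapezoid(c, t)                                     # integral of C(t)
    mrt = trapezoid([ti * ci for ti, ci in zip(t, c)], t) / auc
    vrt = trapezoid([(ti - mrt) ** 2 * ci for ti, ci in zip(t, c)], t) / auc
    return auc, mrt, vrt


# Test curve C(t) = exp(-k t): analytically AUC = 1/k, MRT = 1/k, VRT = 1/k**2.
k = 0.5
t = [i * 0.01 for i in range(6001)]   # 0 to 60 time units
c = [math.exp(-k * ti) for ti in t]
auc, mrt, vrt = moments(t, c)
```

Each candidate parameter set in a genetic-algorithm search needs its own simulated curve and moments, and the candidates are independent of one another, which is where the high degree of parallelism comes from.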


High throughput symbolic processing

• Sequence analysis: Homology searches, Genome comparisons, Genome-wide analyses

• Sequencing data are expected to increase more rapidly

– High-throughput DNA sequencing technologies

– Metagenomic projects

– Human resequencing projects

– Genome sequencing projects on other species

• Requires large databases, such as DNA and protein sequence databases

• Sharing and updating of biological databases on the grid are of key importance


Sharing biological databases

• Becoming more and more difficult and intractable

• Automatic updating of databases is necessary

• Concerns

– Duplicated database copying

– Disk overflow

– Unexpected shutdown

– Version management

– File checksum integrity verification

– Parallel and pipelined mechanisms for high-throughput data transfer
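
Of the concerns above, checksum verification is the easiest to illustrate. The sketch below assumes SHA-256 as the checksum (the slides do not name an algorithm) and uses made-up file names and contents; it compares a transferred replica against the checksum published for the source file.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large database files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replica_is_intact(path: str, expected_hex: str) -> bool:
    return sha256_of(path) == expected_hex


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "sequences.fasta")
    with open(source, "wb") as handle:
        handle.write(b">seq1\nACGTACGT\n")
    published = sha256_of(source)   # checksum distributed with the release

    replica = os.path.join(workdir, "replica.fasta")
    with open(replica, "wb") as handle:
        handle.write(b">seq1\nACGTACGA\n")   # one corrupted base

    good = replica_is_intact(source, published)
    bad = replica_is_intact(replica, published)
```

A site would run such a check after every transfer and before switching to the new database version, so that a truncated or corrupted copy is never used for searches.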


EGEE Framework

• EGEE provides a general framework for sharing replicas of biological databases represented

• Physical File Name (PFN)

• Logical File Name (LFN)

• Globally Unique Identifier (GUID)

• Replica Manager System (RMS)

– Replica Metadata Catalog (RMC)

– Replica Location Service (RLS)

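[Figure: logical file names LFN-1, LFN-2 and LFN-3 map to a single GUID (RMC); the GUID maps to the physical replicas PFN-1 and PFN-2 (RLS).]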
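
The naming scheme can be pictured as two lookup tables: logical names to a GUID, and the GUID to physical replicas. The sketch below is a toy model under assumptions, not the EGEE software: the class `ReplicaCatalog`, its methods, and the example names and URLs are hypothetical, and the two dictionaries only play the roles attributed here to the RMC and the RLS.

```python
import uuid
from typing import Dict, List


class ReplicaCatalog:
    """Toy catalog (hypothetical): LFN -> GUID -> list of PFNs."""

    def __init__(self) -> None:
        self._lfn_to_guid: Dict[str, str] = {}          # role of the RMC
        self._guid_to_pfns: Dict[str, List[str]] = {}   # role of the RLS

    def register(self, pfn: str, lfn: str) -> str:
        """Register a new file under a fresh GUID with one name and one replica."""
        guid = str(uuid.uuid4())
        self._lfn_to_guid[lfn] = guid
        self._guid_to_pfns[guid] = [pfn]
        return guid

    def add_alias(self, guid: str, lfn: str) -> None:
        """Give the same file another logical name."""
        self._lfn_to_guid[lfn] = guid

    def add_replica(self, guid: str, pfn: str) -> None:
        """Record another physical copy of the same file."""
        self._guid_to_pfns[guid].append(pfn)

    def locate(self, lfn: str) -> List[str]:
        """Resolve a logical name to every known physical replica."""
        return list(self._guid_to_pfns[self._lfn_to_guid[lfn]])


catalog = ReplicaCatalog()
guid = catalog.register("gsiftp://site-a.example.org/db/proteins.fasta", "lfn:/bio/proteins")
catalog.add_alias(guid, "lfn:/bio/proteins-latest")
catalog.add_replica(guid, "gsiftp://site-b.example.org/mirror/proteins.fasta")
replicas = catalog.locate("lfn:/bio/proteins-latest")
```

Because users refer only to logical names, replicas can be added, moved or removed without changing any job description.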


GADU

• Genome Analysis and Database Update system

• Automated, scalable, high-throughput computational workflow engine

• Executes bioinformatics tools (BLAST, BLOCKS, PFam, Chisel and InterPro)

• Public databases (NCBI RefSeq, PIR, InterPro and KEGG)


Homology Search

• GRID BLAST implementations have been developed and reported

– Prestaging of sequence databases to minimize the runtime overhead of transferring large sequence databases

– Database updates that keep data consistent across the data grid

– Dynamic load balancing of query sequences

– Assembling of the results from distributed jobs


Genome Comparison

• Most promising life science applications for grid computing

• Expandable and flexible large scale computing facility is needed

• E.g. Investigation of horizontal gene transfer among 354,606 ORFs extracted from more than 100 microbial genomes

– Used 229 CPUs located in 5 institutions

• Number of pair-wise sequence comparisons ∝ N²
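
The quadratic growth is easy to check for the example on this slide. The calculation below is a sketch under assumptions the slide does not state: an all-against-all comparison in which each unordered pair of ORFs is compared once, spread evenly over the 229 CPUs.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n sequences: n(n-1)/2, which grows as n squared."""
    return n * (n - 1) // 2


orfs = 354_606
cpus = 229

total = pairwise_comparisons(orfs)
per_cpu = total / cpus
growth = pairwise_comparisons(2 * orfs) / total   # effect of doubling the input

print(f"pairs to compare: {total:.3e}")
print(f"pairs per CPU:    {per_cpu:.3e}")
print(f"doubling N multiplies the work by about {growth:.2f}")
```

The total is on the order of 6 × 10^10 comparisons, and doubling the number of ORFs roughly quadruples it, which is why an expandable computing facility is needed as more genomes are sequenced.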


Integration of bioinformatics services

• Resourceome

– Uniform and secure interface

– Providing workflows

– Using Metadata and ontology

• Metadata, ontology, XML: fill the semantic gap of heterogeneous databases

• Framework: OGSA based on WSRF


BioGrid


RbsB in Different Formats

• DDBJ
• SWISS-PROT
• PDB


BioPfuga

• Workflow system integrating application programs

• Separates application programs into smaller parts

• Standardizes the data format for transferring data between different application programs


Bioinformatics workflow

• Necessary for end-users of bioinformatics web/grid services

• Taverna provides a workflow language and graphical user interface for building, running and editing workflows

• Semantic indexing system of bioinformatics services has become essential for choosing resources

• Searching functionally similar bioinformatics workflows is also important

• Bioinformatics ontology is essential for automatic generation of bioinformatics workflows


Secure Data Access

• Many bioinformatics databases are public and freely available

• But access to the data needs to be strictly controlled in distributed collaborative research (for example, clinical data)

• Public Key Infrastructure (PKI) is the predominant method for enforcing authentication

• Virtual Organization for Trials and Epidemiological Studies (VOTES) project uses Internet2 Shibboleth technology


Information Grid


Information Grid

• An open and flexible infrastructure that facilitates the integration of any information anywhere across heterogeneous data sources under a grid environment

• 3 essential components

– MDL: Marker Description Language

– Information Services

– Information Brokers

[Figure: an Application talks to Information Brokers, which use the Marker Directory Service and MDL documents to reach Information Services; each Information Service fronts a data source (RDBMS, File Server, XMLDB, FTP Server, Web Server).]


MDL: Marker Description Language

• A unified language that defines:

– standard schema model

– integration configuration model

– standard schema discovery model

[Figure: example MDL schema with Researcher, Project and Publication entities.]


Information Service

• Acts as an agent to publish information

• Responsibilities:

– Connect to an existing data source of an organization

– Transform a generic query (mdlQuery) into a source-specific query

– Transform the query result into the standard schema defined in the specified MDL document

[Figure: the Information Broker sends a generic query (mdlQuery) to the Information Service; the Information Service sends a specific query (SQL) to the RDBMS, receives the query result (table), and returns an MDL-based query result to the broker.]

Generic Information Service Tool

• Manual mapping
• RDBMS
• No authentication
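
The two transformations performed by an Information Service can be sketched with an in-memory SQLite database standing in for the organization's RDBMS. Everything concrete here is an assumption made for illustration: the slides do not show the mdlQuery syntax, so a plain dictionary is used, and the mapping, table, column and field names are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical manual mapping: standard-schema field -> local column.
MAPPING = {
    "table": "staff",
    "fields": {"researcher.name": "full_name", "researcher.field": "dept"},
}


def to_sql(mdl_query: Dict) -> Tuple[str, List[str]]:
    """Transform the generic query into SQL for the local schema."""
    fields = MAPPING["fields"]
    columns = ", ".join(fields[f] for f in mdl_query["select"])
    sql = f"SELECT {columns} FROM {MAPPING['table']}"
    params: List[str] = []
    if mdl_query.get("where"):
        clauses = []
        for field, value in mdl_query["where"].items():
            clauses.append(f"{fields[field]} = ?")   # values go in as parameters
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def answer(conn: sqlite3.Connection, mdl_query: Dict) -> List[Dict[str, str]]:
    """Run the specific query and relabel the rows with standard-schema field names."""
    sql, params = to_sql(mdl_query)
    rows = conn.execute(sql, params).fetchall()
    return [dict(zip(mdl_query["select"], row)) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (full_name TEXT, dept TEXT)")
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [("Researcher A", "simulation"), ("Researcher B", "networks")])

query = {"select": ["researcher.name"], "where": {"researcher.field": "simulation"}}
sql, params = to_sql(query)
result = answer(conn, query)
```

The broker never sees `staff`, `full_name` or `dept`; it only sees fields of the standard schema, which is what lets it combine answers from differently structured sources.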


Information Broker

• Acts as a broker of Information Services

• Responsibilities:

– Connect to Information Services

– Connect to other Information Brokers

– Discover potential Information Brokers and Services

– Integrate information

[Figure: an Information Broker receives an mdlQuery, forwards it to its Information Services and to other Information Brokers, and returns an integrated MDL-based result.]
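
The broker's fan-out and integration role can be sketched as follows. This is an illustration under assumptions: the class and function names are hypothetical, sources are modeled as plain callables so that another broker can be plugged in like a service, and "integrate" is taken to mean merging results and dropping exact duplicates, which the slides do not spell out. Discovery is omitted.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Source = Callable[[Dict], List[Record]]   # an Information Service or another broker


class InformationBroker:
    """Toy broker (hypothetical): forwards a query to every source and merges answers."""

    def __init__(self) -> None:
        self._sources: List[Source] = []

    def connect(self, source: Source) -> None:
        self._sources.append(source)

    def query(self, mdl_query: Dict) -> List[Record]:
        merged: List[Record] = []
        seen = set()
        for source in self._sources:
            for record in source(mdl_query):
                key = tuple(sorted(record.items()))
                if key not in seen:       # drop exact duplicates across sources
                    seen.add(key)
                    merged.append(record)
        return merged


def service_a(mdl_query: Dict) -> List[Record]:
    return [{"researcher.name": "Researcher A"}]


def service_b(mdl_query: Dict) -> List[Record]:
    return [{"researcher.name": "Researcher B"}, {"researcher.name": "Researcher A"}]


regional = InformationBroker()
regional.connect(service_b)

top = InformationBroker()
top.connect(service_a)
top.connect(regional.query)               # a broker connected to another broker

result = top.query({"select": ["researcher.name"]})
```

Since a broker exposes the same query interface as a service, brokers can be chained into hierarchies without the application noticing.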


Information Grid Deployment

[Figure: Information Grid deployment. An Application sends an MDL query to an Information Broker; brokers pass it to other Information Brokers and to Information Services found through the Marker Directory Service; each Information Service fronts a data source (RDBMS, File Server, XMLDB, FTP Server, Web Server) and returns results, which flow back to the Application.]


GEO Grid


Knowledge Grid


Knowledge Grid

• Tacit knowledge

– "We should start from the fact that we can know more than we can tell", Michael Polanyi, a 20th-century philosopher

• Knowledge represented on computers is just a part of our knowledge

• Grid as a place where people work together and create knowledge

• Sharing explicit and tacit knowledge

• This framework gives a meta-philosophical approach to rationalise the current Grid phenomenon.


Knowledge spiral theory

• Knowledge creation requires a cyclic process of knowledge conversion between tacit knowledge and explicit knowledge

– Socialization (tacit knowledge to tacit knowledge)

– Externalization (tacit knowledge to explicit knowledge)

– Combination (explicit knowledge to explicit knowledge)

– Internalization (explicit knowledge to tacit knowledge)


Socialization

• First step in forming a community

• Grid portals are helpful for attracting those who are interested in some specific field

• Must allow formation of user-defined communities

• Knowledge grids should provide facilities like those of social communication systems

– Participants form new communities

– Participants recruit other participants

• Face-to-face or off-site meetings will also be helpful in promoting mutual understanding in a community.


Externalization

• For example, publication of research papers

• Externalization is the essence of knowledge creation

• Knowledge grid should provide facilities for participants to publish their knowledge in a community

• Web-based dynamic content is one promising way of publishing knowledge


Combination

• Combination expands knowledge by the sharing of explicit knowledge in a community

• Synergy effects can be expected if participants bring together their own knowledge

• Grid portals and application-oriented grids play an essential role in this process


Internalization

• Internalization is a process of acquiring tacit knowledge by experience

• To make use of a grid for real world life science problems, a problem solving layer for bioinformatics must be developed

• Gridification of public databases and bioinformatics tools is a necessary but not sufficient condition

• A bioinformatics environment should provide secure facilities to deal with unpublished data, and customization facilities to develop one's own bioinformatics environment coordinated with the global bioinformatics environment


Web 2.0 and Grid computing


Web 2.0 Design Patterns

• The Long Tail

• Data is the Next Intel Inside

• Users Add Value: "architecture of participation"

• Network Effects by Default

• Some Rights Reserved: design for "hackability" and "remixability"

• The Perpetual Beta

• Cooperate, Don't Control

• Software Above the Level of a Single Device

• What Is Web 2.0... by Tim O'Reilly, http://www.oreilly.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

Web 2.0 Core Competencies

• Services, not packaged software, with cost-effective scalability

• Control over unique, hard-to-recreate data sources that get richer as more people use them

• Trusting users as co-developers

• Harnessing collective intelligence

• Leveraging the long tail through customer self-service

• Software above the level of a single device

• Lightweight user interfaces, development models, AND business models

What are we talking about?

• Communities & all that social stuff?

– Great, love it, should have done all this 20 years ago…

• Easier to use web interfaces?

– Love them as a user but they are (still) hard to build (tried JSF+AJAX+Swing Webflow - argh!!!)

– Is it worth the effort? Researchers are not occasional users!

• Existing web 2.0 applications?

– Each great individually but try using them in combination…

– How can I share my connotea™ bookmarks with my Facebook™ friends?

• REST as an architectural style?

– Good idea - for some applications - flipside of the Grid btw.

Web 2.0 and Grid computing

• Simplify user interface

• More flexible than (conventional) portal

• Software as a service

• Collaboration Grid

• Knowledge Grid

Tools and mashups based on web service infrastructure

http://www.chembiogrid.org/projects/proj_tools.html

Shared Bookmarking for Social Networks

• MSI-CIEC project to support tagging and online shared bookmarking.

– Pioneered by del.icio.us in 2003 (!)

• Bookmarking services allow you to

– Share links (URLs) with networks of friends

– Organize your links by mnemonic tags

– Find other interesting URLs by popularity (most bookmarked)

– Find interesting URLs by keywords

• When used collectively, tags form folksonomies.

– “Pave the cow paths”

– Typically about tagged URLs.

– But also about people who tag.

– Semantic Web Lesson: everything is a URI.
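The bookmarking operations listed above (share links, organize by tags, find by popularity, find by keyword, and look at the people who tag) can be sketched as a small in-memory data structure. This is only an illustrative sketch, not the MSI-CIEC or del.icio.us implementation; the class name `Folksonomy`, its methods, and the example URLs and user names are all hypothetical.

```python
from collections import defaultdict


class Folksonomy:
    """Toy shared-bookmark store (hypothetical; not any real service's API)."""

    def __init__(self):
        # tag -> set of (user, url) pairs that carry this tag
        self._by_tag = defaultdict(set)
        # url -> set of users who bookmarked it
        self._by_url = defaultdict(set)

    def bookmark(self, user, url, tags):
        """Record that `user` bookmarked `url` under the given tags."""
        self._by_url[url].add(user)
        for tag in tags:
            # Tags are compared case-insensitively (an assumption of this sketch).
            self._by_tag[tag.lower()].add((user, url))

    def urls_for_tag(self, tag):
        """Find URLs by keyword (tag), in sorted order."""
        return sorted({url for _, url in self._by_tag.get(tag.lower(), ())})

    def taggers(self, tag):
        """Folksonomies are also about the people who tag."""
        return sorted({user for user, _ in self._by_tag.get(tag.lower(), ())})

    def most_bookmarked(self, n=3):
        """Find URLs by popularity: number of distinct users who saved them."""
        ranked = sorted(self._by_url.items(),
                        key=lambda item: (-len(item[1]), item[0]))
        return [(url, len(users)) for url, users in ranked[:n]]


if __name__ == "__main__":
    store = Folksonomy()
    store.bookmark("alice", "http://a.example/grid", ["grid", "CFD"])
    store.bookmark("bob", "http://a.example/grid", ["Grid"])
    store.bookmark("bob", "http://b.example/web20", ["web2.0"])
    print(store.urls_for_tag("grid"))
    print(store.most_bookmarked(1))
```

Keying every record by URL and user reflects the "everything is a URI" lesson: the same identifiers serve for tagged resources and, in a fuller system, for the people who tag.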

Grid activities of NECTEC

Grid Activity Summary

• Grid Testbed

– CFD Application

– Virtualization

• Information Grid

• Grid CA

Grid Computing Testbed

• Internal Level

– NECTEC conducts tests of Globus Toolkit 4 for the following issues:

• Middleware

• Pre-WS and Web Services components

• CFD on Grid

• Gfarm file system

Grid Computing Testbed

• National Level

– NECTEC cooperates with the Thai National Grid Project (TNGP) to set Thailand Grid community standards.

Grid Computing Testbed

• International Level

– NECTEC has been an active member of the PRAGMA resources and data working group.

– Improve the interoperability of Grid middleware in the Asia Pacific region and make the Grid usable for scientists.

CFD on Grid

• Computation requirements for CFD are very high.

• Nevertheless, some major challenges still exist in CFD research; for example, turbulence research and very-large-scale CFD simulations.

• Grid provides dependable, consistent, pervasive and inexpensive access to high-end computational capability.

• We investigate the feasibility and scalability of cross-platform simulation paradigms for a fine-grained application such as CFD on our Grid testbed.

CFD on Grid

• Hardware

– Grid64 Cluster (Itanium2 1.4GHz, 4 nodes)

• Memory 4 GB

• Network 100 Mbps

– Grid3 Cluster (AMD Athlon 1.6GHz, 2 nodes)

• Memory 1 GB

• Network 100 Mbps

• Software

– Rocks Cluster version 4.1

– GT 4.0.2 (Pre-WS)

– MPICH-G2 version 1.3.7

– Intel Compiler version 8

– PBS scheduler

CFD on Grid

• Some Remarks

– Grid infrastructure that does not depend on public IP addresses.

– Ability to migrate jobs automatically.

– Dedicated Grid environment for fine-grained applications such as CFD.

– Improved algorithms for high-latency Grid infrastructure.
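Why a fine-grained application such as CFD is sensitive to a high-latency Grid can be illustrated with the usual latency-plus-bandwidth cost model for one message exchange. The sketch below is not taken from the NECTEC experiments: only the 100 Mbps link speed comes from the testbed description, while the function names, message size, compute time per step, and latency values are assumptions chosen for illustration.

```python
def exchange_time(message_bytes, latency_s, bandwidth_bps):
    """Time for one message: fixed latency plus transfer time."""
    return latency_s + (message_bytes * 8) / bandwidth_bps


def parallel_efficiency(compute_s, message_bytes, latency_s, bandwidth_bps,
                        exchanges_per_step=1):
    """Fraction of each time step spent computing rather than communicating.

    Assumes computation and communication do not overlap (a simplification).
    """
    comm_s = exchanges_per_step * exchange_time(
        message_bytes, latency_s, bandwidth_bps)
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    link_bps = 100e6          # 100 Mbps, as in the testbed description
    halo_bytes = 64 * 1024    # assumed boundary data exchanged per step
    compute_s = 0.01          # assumed compute time per step

    # Assumed latencies: same cluster, campus network, wide-area network.
    for label, latency_s in [("cluster", 100e-6),
                             ("campus", 2e-3),
                             ("wide-area", 50e-3)]:
        eff = parallel_efficiency(compute_s, halo_bytes, latency_s, link_bps)
        print(f"{label:10s} latency={latency_s * 1e3:7.2f} ms "
              f"efficiency={eff:.2f}")
```

Under this model, a smaller compute time per step (finer grain) makes the fixed latency term dominate. This is one way to read the remarks above about dedicated environments and latency-tolerant algorithms.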

NECTEC GOC CA

• A digital certificate issuer developed specifically to support authentication for Grid resources.

• Developed under X.509 Public Key Infrastructure by the Large Scale Simulation Research Laboratory (LSR), National Electronics and Computer Technology Center.

• The CA issues certificates to users, hosts and services.

• Current Status:

– Production Level CA under APGrid PMA

– ~ 10 Certificates issued (all for internal users)
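The distinction between user, host and service certificates can be sketched by looking at the subject name of a certificate. The example below assumes the slash-separated subject format and the `CN=host/<hostname>` and `CN=<service>/<hostname>` naming convention commonly used with Globus-style Grid certificates. The slides do not state NECTEC GOC CA's actual naming policy, so the subject strings, organization names, and function names here are all hypothetical.

```python
def parse_dn(dn):
    """Split a slash-separated subject DN into (attribute, value) pairs.

    A segment without '=' is treated as the continuation of the previous
    value, so 'CN=host/node1.example.org' stays together.
    """
    parts = []
    for segment in dn.strip("/").split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            parts.append((key, value))
        elif parts:
            key, value = parts[-1]
            parts[-1] = (key, value + "/" + segment)
    return parts


def subject_kind(dn):
    """Classify a subject as 'user', 'host' or 'service' (assumed convention)."""
    common_names = [value for key, value in parse_dn(dn) if key == "CN"]
    if not common_names:
        return "unknown"
    name = common_names[-1]
    if name.startswith("host/"):
        return "host"
    if "/" in name:
        return "service"
    return "user"


if __name__ == "__main__":
    # Hypothetical subjects for illustration only.
    for dn in ["/C=TH/O=Example Grid/CN=Somchai Example",
               "/C=TH/O=Example Grid/CN=host/node1.example.org",
               "/C=TH/O=Example Grid/CN=ldap/node1.example.org"]:
        print(subject_kind(dn), dn)
```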

NECTEC GOC CA

• Collaboration

Questions?