Running a Scientific Experiment
on the Grid
Vilnius, 13th May, 2008
by Tomasz Szepieniec IFJ PAN & CYFRONET
T.Szepieniec AT cyfronet.pl; 13 May 2008, Grid Open Day, Vilnius
Scientific Experiment
■ What is it?
set of observations performed in the context of solving a
particular problem or question, to retain or falsify a hypothesis
or research concerning phenomena. (wikipedia)
computations which a researcher needs to run to make
progress in his/her research
■ Examples:
Monte Carlo simulations
Massive protein folding for testing releases of folding software
Simulation of a scheduling process using a new algorithm
Preparing an advanced visualization of simulation results
Why run it on the Grid?
■ Because time matters
Single sequential job?
► NO - a processor on your desktop has the same power
Makes sense if the process is done in parallel
► high throughput computing
► multithreaded, parallel computation is our future!
■ Storing and sharing huge data is now possible
■ Access to specific machines
hardware supporting a specific type of computation
► huge parallel installations
software available at lower or no cost
GAUSSIAN VO
• VO for GAUSSIAN users
• operated in EGEE-II and EGEE-III by CYFRONET (Krakow, Poland)
• For users
– everyone who accepts the policy can join
– easy to start: ready scripts
– http://egee.grid.cyfronet.pl/Applications/gaussian-vo/
• For admins
– sites with a GAUSSIAN site licence can join
– http://egee.grid.cyfronet.pl/Applications/gaussian-vo-how-to-support/
Working group view: miracle of sharing resources
■ If there is a shortage of resources, SHARING is the solution
■ Typical stages of an experiment: Preparing, Computing, Analyzing, (Writing a paper)
This does not apply to every researcher (e.g. one solving the Sierpinski problem)
■ Sharing gives you more than you can obtain by keeping only your own part
[Figure: demand versus resources, with regions labelled "Unused resources" and "Unmet demand". Copied from P. Plaszczak, „Grid Computing”]
T.Szepieniec AT cyfronet.pl; 13 May 2008, Grid Open Day, Vilnius
Jump in?
Before jumping see the other side…
Additional effort required
■ Access to data becomes transfers
■ You should think about the following:
How many times do I need to use it?
Location and size of data
Speed-up including overheads
► Parallel execution
Licensed software
Other people who use the produced data
Security level required
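The "speed-up including overheads" point can be made concrete with a rough estimate. The sketch below is only an illustration: it assumes the work splits evenly, all jobs run at the same time, and every job pays the same fixed overhead (submission, queueing, data transfer). The function name and all numbers are hypothetical and do not come from the talk.

```python
def grid_speedup(total_cpu_minutes, n_jobs, overhead_minutes_per_job):
    """Estimated speed-up of n_jobs parallel grid jobs over one sequential run.

    Assumption: even split of work, all jobs run concurrently, and each job
    pays the same fixed overhead on top of its share of the computation.
    """
    sequential_time = total_cpu_minutes
    grid_time = total_cpu_minutes / n_jobs + overhead_minutes_per_job
    return sequential_time / grid_time


if __name__ == "__main__":
    # Hypothetical workload: 600 minutes of computation, 30 minutes overhead per job.
    for n in (1, 10, 100):
        print(n, "jobs ->", round(grid_speedup(600, n, 30), 2))
```

With these assumed numbers a single grid job is slower than the desktop run, and the gain from adding jobs flattens out once the fixed overhead dominates the per-job compute time.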
Also technical problems
■ Jobs sometimes fail – resubmission is required
■ Some sites are just wrongly configured!
■ We should not overload VOs but use them efficiently
■ For some failures, only the application operator
should decide what to do
■ Jobs are too quick (e.g. 10 min) = submission
overhead too large – we need to put more workload
into a single grid job
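Two of the points above, resubmitting failed jobs and packing more workload into each grid job, can be sketched as small helpers. This is a minimal sketch under stated assumptions, not grid middleware code: `submit` is a hypothetical callable standing in for whatever actually submits a job, and the 10% overhead target is an arbitrary illustrative threshold.

```python
import math


def jobs_needed(n_tasks, task_minutes, overhead_minutes, max_overhead_fraction=0.1):
    """Return (tasks_per_job, n_jobs) so that overhead is a small share of job time.

    From overhead / (overhead + k * task) <= f it follows that
    k >= overhead * (1 - f) / (f * task).
    """
    k = math.ceil(overhead_minutes * (1 - max_overhead_fraction)
                  / (max_overhead_fraction * task_minutes))
    k = max(1, min(k, n_tasks))  # at least one task, at most all tasks per job
    return k, math.ceil(n_tasks / k)


def run_with_resubmission(submit, max_attempts=3):
    """Call submit() until it succeeds or max_attempts is reached.

    `submit` is a hypothetical callable that returns a result or raises
    RuntimeError on failure. After the last failed attempt the error is
    re-raised, leaving the decision to the application operator.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except RuntimeError:
            if attempt == max_attempts:
                raise


if __name__ == "__main__":
    # Hypothetical: 1000 tasks of 10 minutes each, 5 minutes overhead per job.
    print(jobs_needed(1000, 10, 5))
```

Capping the number of attempts is a deliberate choice here: blind, unlimited resubmission would work against the point about using VOs efficiently.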
Virtual Laboratory is all we need
Figure copied from EU IST Virolab Project
Guideline #1
It is better to spend 60 minutes preparing a tool
than 3 minutes every day doing the work manually
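The arithmetic behind this guideline is easy to check. The helper below is a hypothetical illustration that assumes the tool removes the manual work entirely once it exists.

```python
import math


def break_even_days(tool_minutes, manual_minutes_per_day):
    """Number of days after which building the tool has paid for itself."""
    return math.ceil(tool_minutes / manual_minutes_per_day)


if __name__ == "__main__":
    print(break_even_days(60, 3))  # 20
```

With the numbers from the slide, the tool pays off after 20 days of use.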
Example: Rendering Application
by Krzysztof Abramowicz, Cyfronet
■ Application: Visualization for an L-system editor; the user wants to quickly create a movie showing the results.
■ Goal: Limit movie generation time
■ Means: Render scenes in parallel on the grid, using interactive
connections between master and workers
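The master/worker idea can be sketched locally. This is not the rendering application described on the slide: it uses a thread pool on one machine as a stand-in for grid workers, and `render_scene` is a hypothetical placeholder that only returns a file name instead of rendering anything.

```python
from concurrent.futures import ThreadPoolExecutor


def render_scene(scene_id):
    """Placeholder for rendering one scene; returns a hypothetical frame-file name."""
    return f"scene_{scene_id:03d}.png"


def render_movie(scene_ids, n_workers=4):
    """Master: hand scenes to workers and collect the results in scene order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in input order, even if workers finish out of order.
        return list(pool.map(render_scene, scene_ids))


if __name__ == "__main__":
    print(render_movie(range(5)))
```

Because scenes are independent of each other, the master only has to distribute scene numbers and put the finished frames back in order.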
Guideline #2
The best possible grid is invisible.
Example: Grid application without grids
by JUMC Team in EUChinaGrid Project, Kraków, Poland
Suggested conclusions
■ For researchers:
To have more time for science, you need tools that improve
your efficiency. The grid is one of them.
■ For grid developers and industry:
Between a user and grid middleware there is a gap that
you need to bridge with tools that hide everything that is not
necessary for a researcher to do science.