M. Danelutto
University of Pisa
IFIP WG 10.3, Nov. 1st, 2003
M. Danelutto IFIP WG10.3 -- Nov.1st 2003 2
• GRID: current status
• A RISC grid core
• Current experiments
• Conclusions
GRID: current status
• Characterization of GRID
• Tools
• Abstract machine view
• Current “GRID aware” applications
Characterization of GRID
• Target architectures for complex computations
  • Complex, multidisciplinary, multilanguage …
• Heterogeneous
  • Both hardware and software
• Dynamic
  • Instant (latency!) and long range (nodes going up & down)
• Distributed
  • Geographic-scale networks
Tools
• Middleware
  • Between hw & sw!
• Many flavours
  • One winner
• Complex functionalities
  • Scheduling, resource management, data/code staging, monitoring, etc.
• Everything is left to the application programmer!
[Figure: layer stack, from the lowest layer up: Hardware/OS; Middleware / services; Applications]
Abstract machine view
• Layer 0
  • OS services, common core (TCP, POSIX)
• Layer 1
  • Middleware (resource discovery/management, code/data staging, remote execution, security, monitoring)
• Layer 2
  • Programming environment (PSE, in some cases)
• Layer 3
  • Applications
Current “GRID aware” applications
• Case A
  • Embarrassingly parallel computations (CONDOR-like), no heterogeneity, no dynamicity
• Case B
  • Explicitly requested resources (compiler, CPU power, …), hand-made placement, no dynamicity (except the “instant” kind)
• Case C
  • Data-intensive/data-driven applications (no dynamicity, no heterogeneity)
• Case D
  • High-level parallel code, high-level requirement specs, automatic resource discovery, automatic adaptation, automatic restructuring upon dynamic changes that lower overall performance, …
A RISC grid core
• Perspective
• Basic functionalities
• Implementation
Perspective
• GRID components actively participate in offering services
• Subject
  • Entity needing services → entity providing services
• RGC (RISC grid core) basic service → computing services
[Figure: a set of GRID nodes. Two nodes announce “Want to be a GRID actor”; another node states “Need to compute A”.]
[Figure: the same nodes; three of them are labelled “Run RISC GRID RTS”.]
[Figure: the same nodes with an application manager. Labels: analyze application requirements, security, code staging, data staging.]
Basic functionalities
• Secure access (user classes, certificates, etc.)
• Code & data staging (on demand!)
• Secure data/code staging
• Introspection/reflection/meta info
• Remote control (monitor, checkpoint, interrupt & resume)
• Accounting
• All services managed from the application!
Secure access
• Identify classes of users
• Allow different sandbox levels
• Guarantee unique ids & certificates
• Use session certificates for each single computation
Code & data staging
• On demand
• No long-range consistency
• Completely application driven
• Caching allowed/enforced upon application directives
  • To enhance the performance of a single application run as well as of multiple runs
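The staging policy above (on demand, application driven, caching only upon application directives, no long-range consistency) can be sketched as follows. This is only an illustration: the class `StagingCache`, its methods and the item names are hypothetical and are not part of RGC or Lithium.

```python
from typing import Callable, Dict


class StagingCache:
    """On-demand staging with application-directed caching (hypothetical API)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch            # transfers an item from the application site
        self._cache: Dict[str, bytes] = {}
        self.transfers = 0             # number of actual remote transfers

    def get(self, name: str, cache: bool = False) -> bytes:
        """Stage `name` on demand; keep a local copy only if the application asks."""
        if name in self._cache:
            return self._cache[name]
        self.transfers += 1
        item = self._fetch(name)
        if cache:
            self._cache[name] = item
        return item

    def invalidate(self, name: str) -> None:
        """No long-range consistency: the application drops stale copies itself."""
        self._cache.pop(name, None)


def demo_transfers() -> int:
    store = {"worker.code": b"bytecode", "task-1": b"payload"}
    staging = StagingCache(lambda n: store[n])
    staging.get("worker.code", cache=True)   # transferred and cached
    staging.get("worker.code", cache=True)   # served from the local cache
    staging.get("task-1")                    # transferred, not cached
    staging.get("task-1")                    # transferred again
    return staging.transfers
```

In this sketch the code item is moved once and reused, which is the kind of saving the slide attributes to caching across multiple runs.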
Secure data/code staging
• Session certificates
• Data & code
  • On demand, default → encode
• Performance issues
  • Latency/bandwidth vs. cipher/decipher time
  • Critical? Very often latency is much larger!
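The latency/bandwidth versus cipher/decipher trade-off can be made concrete with a little arithmetic. The sketch below is an assumption-laden illustration: the cost model (latency plus size over bandwidth, one encryption plus one decryption pass) and all the numbers in the usage note are made up for the example and are not measurements from the talk.

```python
def transfer_time(size_bytes: int, latency_s: float, bandwidth_bit_s: float) -> float:
    """Time to move the payload: one network latency plus the transmission time."""
    return latency_s + 8 * size_bytes / bandwidth_bit_s


def cipher_time(size_bytes: int, cipher_bytes_s: float) -> float:
    """Time to cipher at the sender and decipher at the receiver (two passes)."""
    return 2 * size_bytes / cipher_bytes_s


def cipher_overhead(size_bytes: int, latency_s: float,
                    bandwidth_bit_s: float, cipher_bytes_s: float) -> float:
    """Ratio of cipher/decipher time to plain transfer time."""
    return (cipher_time(size_bytes, cipher_bytes_s)
            / transfer_time(size_bytes, latency_s, bandwidth_bit_s))
```

For instance, `cipher_overhead(65536, 0.05, 10e6, 20e6)` (a 64 KiB payload, 50 ms latency, a 10 Mbit/s link and a 20 MB/s cipher, all hypothetical values) stays below 0.1, i.e. ciphering adds under 10% to the transfer time in that setting.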
Introspection/reflection/meta info
• Needed to gather compatibility info
• Needed to set up application deployment
• Open format
  • E.g. XML
• Service
  • To be requested by the application manager
• Announce
  • When announcing node availability to the grid
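A node descriptor in an open format such as XML could look like the one parsed below. The element and attribute names (`node`, `cpu`, `runtime`, `service`) and the Java version are hypothetical, chosen only to show how an application manager might gather compatibility information; the slides do not define a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical announcement sent by a node when it joins the grid.
ANNOUNCE = """<node id="ws4">
  <cpu bogomips="4810.34"/>
  <runtime name="java" version="1.4"/>
  <service name="compute"/>
</node>"""


def parse_descriptor(xml_text: str) -> dict:
    """Turn the XML meta information into a plain dictionary."""
    root = ET.fromstring(xml_text)
    cpu = root.find("cpu")
    return {
        "id": root.get("id"),
        "bogomips": float(cpu.get("bogomips")),
        "runtimes": {r.get("name"): r.get("version") for r in root.findall("runtime")},
        "services": [s.get("name") for s in root.findall("service")],
    }


def is_compatible(desc: dict, runtime: str, min_bogomips: float) -> bool:
    """Compatibility check an application manager might run before deployment."""
    return runtime in desc["runtimes"] and desc["bogomips"] >= min_bogomips
```

The same descriptor can serve both uses named on the slide: returned on request to an application manager, or pushed when the node announces itself.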
Remote control
• Available to the application manager
  • To monitor code execution
  • To preempt computations that are no longer useful
  • To force different behaviour upon performance changes (loss w.r.t. the theoretical model)
Accounting
• Needed to bill users!
• Account CPU time
  • Different policies per user class
  • Different policies per type (idle time, …)
• Must be present to make the approach acceptable
  • Unlike the other points above!
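A minimal sketch of CPU-time accounting with different policies per user class and per type of time. The user classes, the rates and the split between "busy" and "idle" CPU seconds are assumptions made for the example; the slides only state that such policies should exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    rate_busy: float   # cost units per CPU second taken while the owner is active
    rate_idle: float   # cost units per CPU second taken from idle time


# Hypothetical user classes and rates.
POLICIES = {
    "guest": Policy(rate_busy=1.0, rate_idle=0.5),
    "partner": Policy(rate_busy=0.4, rate_idle=0.1),
}


def bill(user_class: str, busy_cpu_s: float, idle_cpu_s: float) -> float:
    """Charge accounted CPU time according to the policy of the user class."""
    policy = POLICIES[user_class]
    return policy.rate_busy * busy_cpu_s + policy.rate_idle * idle_cpu_s
```

With these made-up rates, the same usage (10 busy seconds and 20 idle seconds) costs a `guest` more than a `partner`, which is the per-class differentiation the slide asks for.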
[Figure: example scenario on a set of GRID nodes. Labels:]
• Appl. A, embarrassingly parallel: look for workers
• Reflection: may compute (three nodes)
• Start computation on two workers: code staging, data (task) staging
• Start computing
• Performance suboptimal (waiting for intermediate results): place another worker
[Figure: the same scenario with one more event: fault / user back office / … → stop computing.]
Implementation
• Server process on a well-known port
  • Peer-to-peer discovery/announce
  • Implements all basic RGC services
  • Possibly built on top of existing middleware
    • Globus
    • JXTA
    • …
  • Explicitly run by the machine owner!
• Compare Globus services with ours!
• Compare the amount of code in the p2p server placed on the machines!
• Compare the usage of services (application driven, even concerning server code staging)!
• Control/policies moved up to the application manager
[Figure: two layer stacks, from the lowest layer up. Currently: OS; Middleware (services); Application. RGC: OS; Middleware (services); Appl. manager; Application.]
Current experiments
• FIRB GRID.it
  • Three-year Italian national project
  • Basic research on grids
  • WP8: “advanced, structured, component-based parallel programming model for grids”
• In the meantime
  • Several prototypes already available
    • P3L, Lithium, ASSIST, …
Lithium
• Java-based, parallel, structured programming environment
  • RMI
  • Algorithmic skeletons (including farms, pipelines & divide&conquer)
  • Full Java library
  • Ongoing since 2001 (experimental)
[Figure: Lithium architecture. A task pool holds the fireable MDFi. One control thread per worker discovers a worker and then loops over fetch, dispatch and store results. Each worker is an RMI server that gets serialized code, starts idle, receives tasks and returns results.]
• Discovery → JINI, JXTA, …
• Code/data staging → Java serialization
• Security → SSL
• Reflection → Beans
• Remote control → RMI
• Accounting → ???
• Embarrassingly parallel computations
• Data at the application site
• Results at the application site
• TASK FARM dynamic template
  • Compute tasks ASAP
  • On the available resources
  • This is a CONDOR-like, Case A application schema!
• Task list (pool)
• Code computing f
• List of possible workers (static, dynamically requested from the GRID resource manager, or available once and for all)
• … go
• & wait for completion
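From the programmer's side, the three inputs listed above (a task pool, the code computing f, a set of workers) followed by "go" and "wait for completion" can be sketched in a few lines. Local threads stand in for grid workers here, and the function name `task_farm` is hypothetical; this is not the Lithium API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def task_farm(tasks: Iterable[T], f: Callable[[T], R], n_workers: int) -> List[R]:
    """Compute f on every task using n_workers workers, then wait for completion."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # "go": tasks are handed to whichever worker is free;
        # "wait": list() blocks until every result is back, in task order.
        return list(pool.map(f, tasks))
```

A call such as `task_farm([1, 2, 3, 4], lambda x: x * x, 2)` returns the squared values in task order, regardless of which worker computed each one.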
• Task farm manager
  • Looks for workers (p2p)
  • Assigns tasks for execution
  • When results come back
    • Stops looking for new workers (steady state)
    • Assigns tasks for execution to idle workers
• Workers
  • Just receive work to be computed (RGC: computation services only at the GRID peers!)
[State diagram: task farm manager, states “init” and “steady”. Transitions, as condition → action:]
• Wj found & Ti → assign Ti to Wj; adj + set timeout
• Idle & Wj found & Ti → assign Ti to Wj; adj + set timeout
• Wj sends Ri & Ti’ → assign Ti’ to Wj; adj + set timeout
• Timeout elapsed → Ti to be reassigned
• Timeout elapsed (k times) → Ti to be reassigned; Wj removed from the pool
• Wj sends Ri & Ti reassigned to Wj’ & Ti’ & Ti’’ → assign Ti’ to Wj; interrupt Wj’; assign Ti’’ to Wj’
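The timeout transitions of the manager can be illustrated with a small sequential simulation. Several simplifications are assumptions of this sketch, not statements about the prototype: a worker is modelled as a function that returns `None` when its timeout elapses, timeouts are counted in total rather than consecutively, and the duplicate-task interruption transition is left out.

```python
from collections import deque
from typing import Callable, Dict, Hashable, List, Optional, Tuple


def run_farm(tasks: List[Hashable],
             workers: Dict[str, Callable[[Hashable], Optional[object]]],
             k: int = 2) -> Tuple[dict, List[str]]:
    """Assign tasks to workers; reassign on timeout; drop a worker after k timeouts."""
    pending = deque(tasks)
    results: dict = {}
    timeouts = {name: 0 for name in workers}
    pool = list(workers)                       # workers discovered so far
    while pending:
        if not pool:
            raise RuntimeError("no workers left in the pool")
        for name in list(pool):                # iterate over a copy: pool may shrink
            if not pending:
                break
            task = pending.popleft()           # assign Ti to Wj
            outcome = workers[name](task)
            if outcome is None:                # timeout elapsed: Ti to be reassigned
                pending.append(task)
                timeouts[name] += 1
                if timeouts[name] >= k:        # k timeouts: Wj removed from the pool
                    pool.remove(name)
            else:
                results[task] = outcome        # Wj sends Ri
    return results, pool
```

With one responsive worker and one that always times out, every task still completes and the faulty worker ends up outside the pool.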
[State diagram: worker, states “init”, “steady” and “end”. Transitions, as event → action:]
• Send (“here I am”)
• Recv(Ti)
• Recv(Ti) → compute Ri from Ti, send Ri
• Recv(interrupt) & Ti → abort current Ti
• Recv(end)
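The worker side can be sketched as a loop over incoming messages. The message encoding (`"task"`, `"interrupt"`, `"end"` tuples) is hypothetical, and the sketch assumes that an interrupt carries the replacement task and that a task counts as "being computed" until the next message arrives; neither detail is spelled out in the diagram.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Message = Tuple[str, Optional[int]]


def run_worker(inbox: Iterable[Message],
               f: Callable[[int], int]) -> Tuple[str, List[tuple]]:
    """Process manager messages; return the final state and the messages sent back."""
    sent: List[tuple] = [("here I am", None)]   # init: announce availability
    state = "steady"
    current: Optional[int] = None               # task being computed, if any
    for kind, payload in inbox:
        if kind == "interrupt":
            current = payload                   # abort current Ti, switch to the new one
            continue
        if current is not None:
            sent.append(("result", f(current)))  # compute Ri from Ti, send Ri
            current = None
        if kind == "task":
            current = payload
        elif kind == "end":
            state = "end"
            break
    return state, sent
```

In the trace `task 2, task 3, interrupt with task 5, end`, the worker returns results for tasks 2 and 5 only, since task 3 is aborted by the interrupt.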
• Adaptivity
  • To faulty workers
  • To dynamic changes in network performance
• Fault tolerance
  • Faulty workers
• Heterogeneity
  • Java …
• Modified version of the Lithium prototype
  • RMI, serialization, …
• Simulator
  • Java program
  • Exact knowledge
  • … compared to measured
• Execute
  • A stream of independent tasks
    • Average execution time not known
    • Distribution not known
  • On a set of production workstations
    • Different HW
    • Different load
[Figure: chart per workstation (WS1 to WS6). x axis: workstation; y axis: tasks (normalized), from -3 to 7. Series: WS load, #task (normalized), WS power.]
• Run a stream of tasks
• On a set of production workstations
• First run
  • All workstations discovered at T0
• Second run
  • WS4 discovered after ½ of the tasks
• Third run
  • WS3 never discovered
  • WS2 discovered after 1/3 of the tasks
  • WS4 discovered after ½ of the tasks
[Figure: tasks per workstation for Run 1, Run 2 and Run 3. Workstations: WS1 (3189.84 BogoMIPS), WS2 (3189.84 BogoMIPS), WS3 (466.94 BogoMIPS), WS4 (4810.34 BogoMIPS). y axis: tasks per WS, from 0 to 60.]
• Workers assumed to fail
  • Due to internal problems
  • Due to network problems/delays
• Faults expressed as a percentage of the total number of tasks executed
• Efficiency measured
[Figure: efficiency vs. faults. x axis: faults (% of tasks), at 0.39, 0.78, 1.95 and 3.91; y axis: efficiency, from 0.84 to 1.]
• A stream of 1024 tasks executed
• With 2 or 10 workers
• 2 or 4 faults per worker (at random times)
• α = 0.15 or α = 0.85
RTT = α · RTT + (1 − α) · RTTcurrent
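The update rule above is an exponentially weighted moving average: the old estimate keeps weight α and the latest measurement gets weight 1 − α. The sketch below implements exactly that formula; the reading of RTT as a round-trip time estimate used to set the manager's timeouts is presumed from context, and the function names and sample values are illustrative.

```python
from typing import Iterable, List


def smooth_rtt(rtt: float, rtt_current: float, alpha: float) -> float:
    """One update step: RTT = alpha * RTT + (1 - alpha) * RTTcurrent."""
    return alpha * rtt + (1 - alpha) * rtt_current


def track(samples: Iterable[float], alpha: float, initial: float) -> List[float]:
    """Apply the update to a sequence of measurements, returning every estimate."""
    rtt = initial
    history = []
    for sample in samples:
        rtt = smooth_rtt(rtt, sample, alpha)
        history.append(rtt)
    return history
```

Starting from an estimate of 1.0 and observing a single measurement of 10.0, α = 0.15 moves the estimate to 8.65 while α = 0.85 moves it only to 2.35: a small α follows the latest measurement closely, a large α smooths it out.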
[Figure: extra (duplicated) tasks, y axis from 0 to 160, for each combination alpha/faults (total)/workers: 0.15/4/2, 0.85/4/2, 0.15/8/2, 0.85/8/2, 0.15/20/10, 0.85/20/10, 0.15/40/10, 0.85/40/10.]
Evolution
• Task farm manager → application schema manager
• Other managers (e.g. pipeline) → more complex application schemas
• Become an environment
  • Suitable to handle common grid-aware applications
[Figure: three task farm managers and one pipe manager, each exposing a “grid resource interface” and an “other manager interface”, on top of the GRID middleware.]
• Structure managers!
• Programming environment
  • Targeting GRIDs
  • GRID aware
  • Structured
  • Component based
  • With application managers
  • Sitting on top of common sense middleware (say GT3, Java/Jini/JXTA, …)
Related work
• So much activity on GRID …!
• Want to cite CONDOR
  • Limited application schema
  • Very efficient RTS
  • Included in recent toolkits
• Whereas RGC:
  • Unlimited application schemas
  • (Hopefully) efficient RTS
  • ???
Conclusions
• New methodology for GRID tool development proposed
• Joins skeleton/design pattern results with the GRID world
• Preliminary results
• Currently evolving
• …
marcod@di.unipi.it
www.di.unipi.it/~marcod
A RISC approach to the GRID