flexcomp lab the challenge of parallel architectures: an integrative approach
DESCRIPTION
בס"ד. FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach. A few slides are based on presentations by Katherine Yelick (UC-Berkeley) and David A. Wood (UW-Madison). D. Dayan, M. Goldstein, S. Mizrahi, R. B. Yehezkael JCT , January 2011 – שבט תשע "א. Motivation. - PowerPoint PPT PresentationTRANSCRIPT
FLEXCOMP Lab Presentation 1
FLEXCOMP Lab
The Challenge of Parallel Architectures:
An Integrative Approach
A few slides are based on presentations by Katherine Yelick (UC-Berkeley) and David A. Wood (UW-Madison).
D. Dayan, M. Goldstein, S. Mizrahi, R. B. Yehezkael
JCT, January 2011 – אתשע שבט"
בס"ד
FLEXCOMP Lab Presentation 2
Motivation
• Multicore processors are here! - Intel, AMD, Sun, … are shipping their multi-core processors.- The computing power of desktops, servers and the cloud, are being transformed by multi-core processors.
• But Parallel Programming is hard! - Few people have any real experience and knowledge to program multi-processor systems.- Parallel programming must be made easier: - What do application programmers need to easily program parallel systems? - What should programming languages and/or models provide
programmers in order to make parallel programming easier?
FLEXCOMP Lab Presentation 3
Making Parallel Programming Easier: a proposal
We intend to concentrate research efforts on threeinterrelated topics
Programming Hardware
Education for Parallel Thinking
FLEXCOMP Lab Presentation 4
Making Parallel Programming Easier: a proposal
We intend to concentrate research efforts on threeinterrelated topics
• Programming
– A Flexible Execution Embedded Language is in the making.
• Hardware
– Hardware support for Flexible Execution will be developed.
• Education for Parallel Thinking
– Educational Material appropriate for Parallelism was and will
be developed for parallel programming courses.
FLEXCOMP Lab Presentation 5
What was done in the past
•Flexible Algorithms – The idea
- A flexible algorithm allows various execution orders
(parallel / asynchronous), but the end result well defined.
Asynchronous means sequential in any order.
- Declarative notation with algorithmic style.
- Notationally simple and multifaceted.
•Flexible Algorithms - Course for Beginners
•Simple Flexible Language – SFL
- A prototype compiler was developed in 2002 by Isaac Dayan.
FLEXCOMP Lab Presentation 6
What was done in the past
Educational Approach
- Emphasis on flexible algorithms - early awareness of parallelism
- Reading - Executing – Understanding
- Converting flexible algorithms to hardware block diagrams (a.c.)
- Converting flexible algorithms to sequential algorithms (t.r.)
- Computational Induction - Deeper understanding
- Changes to (and writing) flexible algorithms
- Overall description of systems using flexible algorithms (a.c.)
- Broaden outlook of students - important in a first CS course.
FLEXCOMP Lab Presentation 7
Flexible Algorithms
Functional form: Parameters for values received (IN variables)Parameters for values returned (OUT variables)(no INOUT variables)
Function calls and compositions.
Set of statements- once only and composite assignments
FLEXCOMP Lab Presentation 8
Flexible Algorithms – An Example
Reversing part of a vector
function v' •= reverse(v, low, high);{if (low<high) {v'high •= vlow; v' •= reverse (v, low+1, high-1); // ** v'low •= vhigh;}else if (low=high) {v'high •= v high;};} // end reverse
** Also may be written as: reverse (low•=low+1; high•=high-1);
The values of low and high are not changed by the statements low •= low+1, high •= high-1.There are separate variables for each call or activation of a function.
FLEXCOMP Lab Presentation 9
Flexible Algorithms – An Example
Suppose that v=(1,2,3,4,5) and we wish to execute: r' •= reverse (v,0,4);
Parallel execution method
set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, _, 1 ){ r'2 •= v2; } ( 5, 4, _, 2, 1 ){ } ( 5, 4, 3, 2, 1 )5 lines
FLEXCOMP Lab Presentation 10
Flexible Algorithms – An Example
Sequential execution left to right with immediate execution of the function call at left
set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, 1 ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; r'0 •= v4; } ( _, _, _, _, 1 ){ r' •= reverse (v, 2, 2); r'1 •= v3; r'0 •= v4; } ( _, _, _, 2, 1 ){ r'2 •= v2; r'1 •= v3; r'0 •= v4; } ( _, _, _, 2, 1 ){ r'1 •= v3; r'0 •= v4; } ( _, _, 3, 2, 1 ){ r'0 •= v4; } ( _, 4, 3, 2, 1 ){ } ( 5, 4, 3, 2, 1 )9 lines
FLEXCOMP Lab Presentation 11
Flexible Algorithms – An Example
Sequential execution left to right with delayed execution of the function call at left
set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, 1 ){ r' •= reverse (v, 1, 3); } ( 5, _, _, _, 1 ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, _, 1 ){ r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, 2, 1 ){ r' •= reverse (v, 2, 2); } ( 5, 4, _, 2, 1 ){ r'2 •= v2;} ( 5, 4, _, 2, 1 ){ } ( 5, 4, 3, 2, 1 )9 lines
... הכל צפוי והרשות נתונה ...
FLEXCOMP Lab Presentation 12
Flexible Algorithms – Work to be done
Flexible Execution Embedded Language
Hardware Support for
Flexible Execution
Education for
Parallel Thinking
FLEXCOMP Lab Presentation 13
Education for Parallel Thinking
• Amdahl's Law.
• Integrated course on flexible algorithms and digital logic (also for secondary schools?).
• Advanced course on flexible algorithms and parallel programming.
FLEXCOMP Lab Presentation 14
Embedded Flexible Language (EFL)
// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }
Should annotations be allowed?
Annotations would indicate in the EFL code where parallelism should be used, since too much parallelism can be inefficient.
Annotations do not affect the results produced by the program and may be ignored by the EFL pre-compiler (e.g. if there is only one processor).
FLEXCOMP Lab Presentation 15
Embedded Flexible Language (EFL)
// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }
Interface between the Host language and EFL
- Flexible execution supported in EFL blocks.
- Only sequential execution allowed in (non-EFL) host language code.
We have been working on the design of this interface
FLEXCOMP Lab Presentation 16
MDA for EFL – Our Goals
Domain : EFL language
Aims : Grammar of the EFL language formal and precise Decomposition for separating the independent
components and dependant of the platform Facilitate the transformations to the PSMs and to the
code.
FLEXCOMP Lab Presentation 17
MDA application
An MDA application is composed of:
1. Platform independent model (PIM).
2. Platform(s) dependant model(s) (PSM).
3. Transformations : describe the passage of the source model to the various target platforms.
FLEXCOMP Lab Presentation 18
MDA - Model Transformations
(figure from: Bezivin, J. et Blanc, X., "MDA : Vers un Important Changement de Paradigme en Génie Logiciel" , 2002)
FLEXCOMP Lab Presentation 19
MDA – Overview of the PIM meta-model
FLEXCOMP Lab Presentation 20
MDA – Transformation from PIM to PSM
FLEXCOMP Lab Presentation 21
MDA – Code Generation from PSMs
• The syntactic translation
• The semantic translation
FLEXCOMP Lab Presentation 22
Hardware Support for Flexible Execution
Composite Assignments - Asynchronous execution
x f= e; // means the new value of x = (current value of x) f e; // f is a binary operator and e, e0, ..., en are expressions.
x = e0; // Initial value of x x f= e1; // Composite assignment
...x f= en; // Composite assignment
Asynchronous execution of composite assignments is well defined if ((a f b) f c) = ((a f c) f b).
Values of the expressions e0, ..., en may be computed in parallel.
Once only assignment is a special case of composite assignment.
FLEXCOMP Lab Presentation 23
Hardware Support for Flexible Execution
• Use of "or=" ("and=") composite assignment is suggested.
Capacitance based memories give the possibility of implementing these operations very simply.
Paradoxically, there is no need to read the contents of the memory to perform "or=", and this may be done in parallel. The end result is well defined.
Similarly regarding "and=".
FLEXCOMP Lab Presentation 24
Hardware Support for Flexible Execution
DRAMעיקרון פעולה
תהליך הכתיבה נעשה ע"י בחירת שורה ועמודה באמצעות קווי
הכתובת. ולכן המידע 1 במצב MUXה
גורם לטעינת או BUSשנמצא ב פריקת הקבל שנבחר באמצעות קווי
הכתובת.
בתהליך קריאה קווי הכתובת בוחרים את השורה עמודה
, 0 במצב MUXהמתאימה, ה דרך מעגלי BUSוהמידע יוצא ל
sense amplifierבגלל שרכיב הזיכרון הינו קבל שיש
לו זליגה עצמית, יש צורך ברענון הזיכרון, מקובל לעבוד בקצב רענון
.64msecשל
FLEXCOMP Lab Presentation 25
Hardware Support for Flexible Execution
OR DRAM for Parallel Computing
OR_DRAMדומה בפעולתו לDRAMרגיל ,
OR ההבדל שבתהליך הכתיבה נעשית פעולת .1 אלא רק 0הדיודות גורמות שלא ניתן לכתוב
ניתן לפנות לכתובת , כאשר רוצים לאפס תא בזיכרון דבר שיגרום 0 לCLRולהוריד את הקו
י "ניתן להשיג מחיקה ע, בזכרוןלמחיקת התא המתנה מספיק ארוכה לפריקת המטען
.מהקבל
זיכרון זה מהווה מרכיב חשוב למחשבים מקבילים היות , FLEX אלגוריטםהמתבססים על
פונקצייתוהבסיס של החישוב המקבילי הוא OR כמו . לזכרון והיא מתבצעת בעצם הכתיבה
. CPU של המידע לFECHINGכן אין צורך ב
. דבר שחוסך זמן רב בביצועי המחשב
FLEXCOMP Lab Presentation 26
The FLEXCOMP People
StaffDavid Dayan [email protected]
Moshe Goldstein [email protected]
Shimon Mizrahi [email protected]
Raphael B. Yehezkael [email protected]
StudentsOr Berlowitz Sarit Gutman
Max (Mordechai) Rabin Efrat Tamir
OthersIsaac Baron (JCT graduate, currently completing his PhD)
Isaac Dayan (JCT graduate, currently not at any college)