flexcomp lab the challenge of parallel architectures: an integrative approach

26
FLEXCOMP Lab Presentation 1 FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach A few slides are based on presentations by Katherine Yelick (UC-Berkeley) and David A. Wood (UW-Madison). D. Dayan, M. Goldstein, S. Mizrahi, R. B. Yehezkael JCT, January 2011 – טטט טטט דד"ד

Upload: kelly-jarvis

Post on 01-Jan-2016

17 views

Category:

Documents


3 download

DESCRIPTION

בס"ד. FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach. A few slides are based on presentations by Katherine Yelick (UC-Berkeley) and David A. Wood (UW-Madison). D. Dayan, M. Goldstein, S. Mizrahi, R. B. Yehezkael JCT , January 2011 – שבט תשע "א. Motivation. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 1

FLEXCOMP Lab

The Challenge of Parallel Architectures:

An Integrative Approach

A few slides are based on presentations by Katherine Yelick (UC-Berkeley) and David A. Wood (UW-Madison).

D. Dayan, M. Goldstein, S. Mizrahi, R. B. Yehezkael

JCT, January 2011 – אתשע שבט"

בס"ד

Page 2: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 2

Motivation

• Multicore processors are here! - Intel, AMD, Sun, … are shipping their multi-core processors.- The computing power of desktops, servers and the cloud, are being transformed by multi-core processors.

• But Parallel Programming is hard! - Few people have any real experience and knowledge to program multi-processor systems.- Parallel programming must be made easier: - What do application programmers need to easily program parallel systems? - What should programming languages and/or models provide

programmers in order to make parallel programming easier?

Page 3: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 3

Making Parallel Programming Easier: a proposal

We intend to concentrate research efforts on threeinterrelated topics

Programming Hardware

Education for Parallel Thinking

Page 4: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 4

Making Parallel Programming Easier: a proposal

We intend to concentrate research efforts on threeinterrelated topics

• Programming

– A Flexible Execution Embedded Language is in the making.

• Hardware

– Hardware support for Flexible Execution will be developed.

• Education for Parallel Thinking

– Educational Material appropriate for Parallelism was and will

be developed for parallel programming courses.

Page 5: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 5

What was done in the past

•Flexible Algorithms – The idea

- A flexible algorithm allows various execution orders

(parallel / asynchronous), but the end result well defined.

Asynchronous means sequential in any order.

- Declarative notation with algorithmic style.

- Notationally simple and multifaceted.

•Flexible Algorithms - Course for Beginners

•Simple Flexible Language – SFL

- A prototype compiler was developed in 2002 by Isaac Dayan.

Page 6: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 6

What was done in the past

Educational Approach

- Emphasis on flexible algorithms - early awareness of parallelism

- Reading - Executing – Understanding

- Converting flexible algorithms to hardware block diagrams (a.c.)

- Converting flexible algorithms to sequential algorithms (t.r.)

- Computational Induction - Deeper understanding

- Changes to (and writing) flexible algorithms

- Overall description of systems using flexible algorithms (a.c.)

- Broaden outlook of students - important in a first CS course.

Page 7: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 7

Flexible Algorithms

Functional form: Parameters for values received (IN variables)Parameters for values returned (OUT variables)(no INOUT variables)

Function calls and compositions.

Set of statements- once only and composite assignments

Page 8: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 8

Flexible Algorithms – An Example

Reversing part of a vector

function v' •= reverse(v, low, high);{if (low<high) {v'high •= vlow; v' •= reverse (v, low+1, high-1); // ** v'low •= vhigh;}else if (low=high) {v'high •= v high;};} // end reverse

** Also may be written as: reverse (low•=low+1; high•=high-1);

The values of low and high are not changed by the statements low •= low+1, high •= high-1.There are separate variables for each call or activation of a function.

Page 9: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 9

Flexible Algorithms – An Example

Suppose that v=(1,2,3,4,5) and we wish to execute: r' •= reverse (v,0,4);

Parallel execution method

set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, _, 1 ){ r'2 •= v2; } ( 5, 4, _, 2, 1 ){ } ( 5, 4, 3, 2, 1 )5 lines

Page 10: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 10

Flexible Algorithms – An Example

Sequential execution left to right with immediate execution of the function call at left

set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, 1 ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; r'0 •= v4; } ( _, _, _, _, 1 ){ r' •= reverse (v, 2, 2); r'1 •= v3; r'0 •= v4; } ( _, _, _, 2, 1 ){ r'2 •= v2; r'1 •= v3; r'0 •= v4; } ( _, _, _, 2, 1 ){ r'1 •= v3; r'0 •= v4; } ( _, _, 3, 2, 1 ){ r'0 •= v4; } ( _, 4, 3, 2, 1 ){ } ( 5, 4, 3, 2, 1 )9 lines

Page 11: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 11

Flexible Algorithms – An Example

Sequential execution left to right with delayed execution of the function call at left

set of statements r'{ r' •= reverse (v, 0, 4); } ( _, _, _, _, _ ){ r'4 •= v0; r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, _ ){ r' •= reverse (v, 1, 3); r'0 •= v4; } ( _, _, _, _, 1 ){ r' •= reverse (v, 1, 3); } ( 5, _, _, _, 1 ){ r'3 •= v1; r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, _, 1 ){ r' •= reverse (v, 2, 2); r'1 •= v3; } ( 5, _, _, 2, 1 ){ r' •= reverse (v, 2, 2); } ( 5, 4, _, 2, 1 ){ r'2 •= v2;} ( 5, 4, _, 2, 1 ){ } ( 5, 4, 3, 2, 1 )9 lines

... הכל צפוי והרשות נתונה ...

Page 12: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 12

Flexible Algorithms – Work to be done

Flexible Execution Embedded Language

Hardware Support for

Flexible Execution

Education for

Parallel Thinking

Page 13: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 13

Education for Parallel Thinking

• Amdahl's Law.

• Integrated course on flexible algorithms and digital logic (also for secondary schools?).

• Advanced course on flexible algorithms and parallel programming.

Page 14: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 14

Embedded Flexible Language (EFL)

// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }

Should annotations be allowed?

Annotations would indicate in the EFL code where parallelism should be used, since too much parallelism can be inefficient.

Annotations do not affect the results produced by the program and may be ignored by the EFL pre-compiler (e.g. if there is only one processor).

Page 15: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 15

Embedded Flexible Language (EFL)

// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }// Host language code (Python, Java, etc.) ……….. ………..EFL { ……….. ……….. }

Interface between the Host language and EFL

- Flexible execution supported in EFL blocks.

- Only sequential execution allowed in (non-EFL) host language code.

We have been working on the design of this interface

Page 16: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 16

MDA for EFL – Our Goals

Domain : EFL language

Aims : Grammar of the EFL language formal and precise Decomposition for separating the independent

components and dependant of the platform Facilitate the transformations to the PSMs and to the

code.

Page 17: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 17

MDA application

An MDA application is composed of:

1. Platform independent model (PIM).

2. Platform(s) dependant model(s) (PSM).

3. Transformations : describe the passage of the source model to the various target platforms.

Page 18: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 18

MDA - Model Transformations

(figure from: Bezivin, J. et Blanc, X., "MDA : Vers un Important Changement de Paradigme en Génie Logiciel" , 2002)

Page 19: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 19

MDA – Overview of the PIM meta-model

Page 20: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 20

MDA – Transformation from PIM to PSM

Page 21: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 21

MDA – Code Generation from PSMs

• The syntactic translation

• The semantic translation

Page 22: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 22

Hardware Support for Flexible Execution

Composite Assignments - Asynchronous execution

x f= e; // means the new value of x = (current value of x) f e; // f is a binary operator and e, e0, ..., en are expressions.

x = e0; // Initial value of x x f= e1; // Composite assignment

...x f= en; // Composite assignment

Asynchronous execution of composite assignments is well defined if ((a f b) f c) = ((a f c) f b).

Values of the expressions e0, ..., en may be computed in parallel.

Once only assignment is a special case of composite assignment.

Page 23: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 23

Hardware Support for Flexible Execution

• Use of "or=" ("and=") composite assignment is suggested.

Capacitance based memories give the possibility of implementing these operations very simply.

Paradoxically, there is no need to read the contents of the memory to perform "or=", and this may be done in parallel. The end result is well defined.

Similarly regarding "and=".

Page 24: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 24

Hardware Support for Flexible Execution

DRAMעיקרון פעולה

תהליך הכתיבה נעשה ע"י בחירת שורה ועמודה באמצעות קווי

הכתובת. ולכן המידע 1 במצב MUXה

גורם לטעינת או BUSשנמצא ב פריקת הקבל שנבחר באמצעות קווי

הכתובת.

בתהליך קריאה קווי הכתובת בוחרים את השורה עמודה

, 0 במצב MUXהמתאימה, ה דרך מעגלי BUSוהמידע יוצא ל

sense amplifierבגלל שרכיב הזיכרון הינו קבל שיש

לו זליגה עצמית, יש צורך ברענון הזיכרון, מקובל לעבוד בקצב רענון

.64msecשל

Page 25: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 25

Hardware Support for Flexible Execution

OR DRAM for Parallel Computing

OR_DRAMדומה בפעולתו לDRAMרגיל ,

OR ההבדל שבתהליך הכתיבה נעשית פעולת .1 אלא רק 0הדיודות גורמות שלא ניתן לכתוב

ניתן לפנות לכתובת , כאשר רוצים לאפס תא בזיכרון דבר שיגרום 0 לCLRולהוריד את הקו

י "ניתן להשיג מחיקה ע, בזכרוןלמחיקת התא המתנה מספיק ארוכה לפריקת המטען

.מהקבל

זיכרון זה מהווה מרכיב חשוב למחשבים מקבילים היות , FLEX אלגוריטםהמתבססים על

פונקצייתוהבסיס של החישוב המקבילי הוא OR כמו . לזכרון והיא מתבצעת בעצם הכתיבה

. CPU של המידע לFECHINGכן אין צורך ב

. דבר שחוסך זמן רב בביצועי המחשב

Page 26: FLEXCOMP Lab The Challenge of Parallel Architectures: An Integrative Approach

FLEXCOMP Lab Presentation 26

The FLEXCOMP People

StaffDavid Dayan [email protected]

Moshe Goldstein [email protected]

Shimon Mizrahi [email protected]

Raphael B. Yehezkael [email protected]

StudentsOr Berlowitz Sarit Gutman

Max (Mordechai) Rabin Efrat Tamir

OthersIsaac Baron (JCT graduate, currently completing his PhD)

Isaac Dayan (JCT graduate, currently not at any college)