Introduction Cell Processor

Upload: coolmirza143

Post on 29-Nov-2014

DESCRIPTION

shared by Mansoor Mirza

TRANSCRIPT

Page 1: Introduction Cell Processor

Introduction Cell Processor

Page 2: Introduction Cell Processor

Why Cell Processor

- Performance has traditionally improved by raising the clock frequency
  - Made possible by increasing transistor density
  - The clock frequency is the timing reference for a processor
- Power density limits this approach
  - Leakage currents increase as transistors shrink and density rises
  - Leakage increases idle power consumption

Page 3: Introduction Cell Processor

History of Cell Processor

- A powerful processor for the successor to the PS2
- Powerful multimedia and broadband network interface
- IBM contributed to shaping the concept of the Cell processor
- Collaboration with Toshiba
- STI Alliance (Sony, Toshiba, IBM)

Page 4: Introduction Cell Processor

History of Cell Processor

Development of Cell
- 1999: Sony proposed a partnership with IBM for the successor to the PS2
- 2001: STI alliance initiated the development of Cell
- 2004: first prototype of Cell
- 2005: Sony unveiled the PS3 at E3
- 2006: official release of the PS3; Cell SDK released by IBM
- 2008: IBM Roadrunner became the fastest supercomputer in the world (1.026 PFLOPS)

Page 5: Introduction Cell Processor

Overview of Cell

[Figure: overview of the Cell processor. Source: Matthew Scarpino]

Page 6: Introduction Cell Processor

Overview of Cell

[Figure: overview of the Cell processor. Source: 6.189 IAP 2007, MIT]

Page 7: Introduction Cell Processor

Cell components

- Memory Interface Controller (MIC)
- Bus Interface Controller (BIC)
- PowerPC Processor Element/Unit (PPE/PPU)
- Synergistic Processing Element/Unit (SPE/SPU)
- Element Interconnect Bus (EIB)
- Input/Output Interface (IOIF)

Page 8: Introduction Cell Processor

Cell components

MIC
- Connects the processor with system memory
- Two channels to system memory
- Extreme Data Rate Dynamic Random Access Memory (XDR DRAM)
  - Can support 8 data transfers per clock cycle
  - Provides high data flow at low frequency
- PS3 contains 256 MB of XDR DRAM
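The "high data flow at low frequency" point can be made concrete with a little arithmetic. The sketch below multiplies the two figures from the slide (two channels, 8 transfers per clock cycle) by two values that the slide does not state and that are assumed here for illustration: a 400 MHz memory clock and 32-bit channels.

```python
# Back-of-the-envelope peak bandwidth for the MIC.
CHANNELS = 2               # from the slide: two channels to system memory
TRANSFERS_PER_CLOCK = 8    # from the slide: 8 data transfers per clock cycle
CLOCK_HZ = 400_000_000     # assumption: 400 MHz XDR memory clock
CHANNEL_WIDTH_BYTES = 4    # assumption: each channel is 32 bits wide


def peak_bandwidth_bytes_per_second(
    channels=CHANNELS,
    transfers_per_clock=TRANSFERS_PER_CLOCK,
    clock_hz=CLOCK_HZ,
    width_bytes=CHANNEL_WIDTH_BYTES,
):
    """Peak bytes per second = channels * width * transfers per clock * clock."""
    return channels * width_bytes * transfers_per_clock * clock_hz


if __name__ == "__main__":
    print(peak_bandwidth_bytes_per_second() / 1e9, "GB/s")
```

Under these assumptions the peak works out to 25.6 GB/s; a different clock or channel width changes the result proportionally.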

Page 9: Introduction Cell Processor

Cell components

PPU
- Based on IBM PowerPC architecture
- RISC architecture
- Cell control center
  - Runs the operating system
  - Manages interrupts
  - Manages the L2 shared cache
  - Issues work to the SPUs

Page 10: Introduction Cell Processor

Cell components

[Figure: PPU. Source: Matthew Scarpino]

Page 11: Introduction Cell Processor

Cell components

PPU
- 64-bit architecture
- Supports SIMD
- Supports Cell-related functions
- Dual-threaded processor
- Computation power is reduced
  - The PPU is not the computational element in Cell
  - Reduces power consumption

Page 12: Introduction Cell Processor

Cell components

[Figure: functional units of the PPU. Source: Matthew Scarpino]

Page 13: Introduction Cell Processor

Cell components

- Instruction Unit (IU)
  - Fetches and executes instructions
- Load and Store Unit
  - Receives memory access requests
- Vector/Scalar Unit (VSU)
  - Contains the floating-point unit
  - Performs floating-point operations on individual or multiple operands

Source: Matthew Scarpino

Page 14: Introduction Cell Processor

Cell components

- Fixed-Point Unit (FXU)
  - Performs fixed-point operations
  - Arithmetic and logical operations
- Memory Management Unit (MMU)
  - Performs virtual memory management
- PPU registers
  - Provide quick access to operands
  - Some functional units can access only processor registers

Source: Matthew Scarpino

Page 15: Introduction Cell Processor

Cell components

- 32 general-purpose registers
- 32 floating-point registers
- Link register
  - Holds the branch address of the upcoming target
- Count register
  - Holds the branch address of the upcoming target, or
  - Holds a loop counter
- Fixed-point exception register
  - Holds carry and overflow bits for fixed-point operations

Source: Matthew Scarpino

Page 16: Introduction Cell Processor

Cell components

- Condition register
  - Holds the status of an arithmetic, logical, or comparison operation
- Floating-point status and control register
  - Status of scalar floating-point operations
- Vector registers
  - Contain data for vector operations
- Vector status and control register
  - Holds the saturation bit for vector operations
- Vector register save and restore register (VRSAVE)
  - Identifies the vector registers to save on a context switch

Source: Matthew Scarpino

Page 17: Introduction Cell Processor

Cell components

SPU
- Basic workhorse of Cell
- Designed to execute SIMD code
- Separate instruction set
- Takes work from the PPU
- Does not have any cache
- No virtual memory
- Each SPU has only 256 KB of memory

Page 18: Introduction Cell Processor

Cell components

SPU
- An SPU can directly access only its own 256 KB of memory
- Direct Memory Access (DMA) is required to transfer data to the SPU
- Memory alignment is required to pass data to the SPU
- Different methods exist to communicate with the PPU and other memory
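The alignment requirement can be illustrated with two small helpers. This is only a sketch: the slide does not give the boundaries, so the 16-byte (quadword) and 128-byte values below are assumptions, and the names `align_up` and `is_aligned` are hypothetical.

```python
QUADWORD = 16     # assumption: minimum alignment for general transfers
CACHE_LINE = 128  # assumption: alignment that typically performs best


def align_up(address, boundary=QUADWORD):
    """Round address up to the next multiple of boundary (a power of two)."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return (address + boundary - 1) & ~(boundary - 1)


def is_aligned(address, boundary=QUADWORD):
    """Return True when address is a multiple of boundary."""
    return address % boundary == 0
```

A buffer whose start address fails `is_aligned` would first be moved or padded to the address returned by `align_up` before it is handed to a transfer.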

Page 19: Introduction Cell Processor

Cell components

[Figure not recovered. Source: Matthew Scarpino]

Page 20: Introduction Cell Processor

Cell components

Purpose of SPU
- Take 128-bit data into a local register
- Apply an operation to it
- Save the result to local memory

Two distinct pipelines
- The even pipeline handles mathematical operations
- The odd pipeline handles everything else
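The load, operate, store pattern on 128-bit data can be modeled in a few lines. The sketch below is not SPU code: it treats a 16-byte `bytes` object as one 128-bit register holding four 32-bit floats, and the helper names are hypothetical.

```python
import struct

LANES = 4          # four 32-bit floats fill one 128-bit register
LAYOUT = ">4f"     # big-endian, four single-precision floats (16 bytes)


def pack_quadword(values):
    """Pack exactly four floats into one 16-byte quadword."""
    if len(values) != LANES:
        raise ValueError("a quadword holds exactly four 32-bit floats")
    return struct.pack(LAYOUT, *values)


def simd_add(a, b):
    """Add two quadwords lane by lane and return the packed result."""
    lanes_a = struct.unpack(LAYOUT, a)
    lanes_b = struct.unpack(LAYOUT, b)
    return struct.pack(LAYOUT, *(x + y for x, y in zip(lanes_a, lanes_b)))


if __name__ == "__main__":
    result = simd_add(pack_quadword([1, 2, 3, 4]), pack_quadword([10, 20, 30, 40]))
    print(struct.unpack(LAYOUT, result))
```

On real hardware the four additions happen in a single instruction; the Python loop only mirrors the data layout.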

Page 21: Introduction Cell Processor

Cell components

- SPU Control Unit (SCN)
  - Fetches and dispatches instructions
  - Performs branching and other control operations
- SPU even fixed-point unit
  - Handles logic and arithmetic operations
  - Performs floating-point comparisons and reciprocal estimates
- SPU odd fixed-point unit
  - Performs bit-level shifts, rotations, and shuffling

Page 22: Introduction Cell Processor

Cell components

- SPU floating-point unit
  - Performs floating-point operations
- SPU load/store unit
  - Performs loads and stores
  - Manages branch targets and DMA to the local store
- SPU channel and DMA unit
  - Communicates with the Memory Flow Controller
  - Controls DMA transfers

Page 23: Introduction Cell Processor

Cell components

SPU registers
- 128 general-purpose registers
- Floating-point status and control registers
  - Contain the status and results of floating-point operations

SPU Local Store (LS)
- Each SPU contains 256 KB of very low latency memory
- Takes the place of a cache for the SPU
- All data transfer is the responsibility of the programmer

Page 24: Introduction Cell Processor

Cell components

SPU Local Store (LS)
- Not a cache, just an SRAM
- Only one read or write operation per cycle
- Operations accessing the LS
  - DMA: transfers data from main memory to the LS
  - SPU load/store: reads or writes 16 bytes at a time
  - Instruction fetch: reads 128 bytes of the LS at once

Page 25: Introduction Cell Processor

Cell components

SPU Local Store (LS)
- Does not support virtual memory
- Tradeoff between cache coherence and fetching data into the LS
  - The LS is low-latency memory
  - Other processors rely on cache coherence protocols
  - Data is transferred to the LS over the high-throughput EIB via DMA instead of through cache coherence protocols
  - Makes the hardware simpler

Page 26: Introduction Cell Processor

Cell components

Communication between the SPU and the rest of the system
- DMA
- Mailboxes
- Events and signals

Page 27: Introduction Cell Processor

Cell components

DMA
- Transfers data to the LS
- Asynchronous in nature
  - The SPU continues its operation while the DMA transfer is in progress
- Transfers data in chunks whose size in bytes is a power of 2
- Provides controls to manage and synchronize the data transfer
- One DMA transfer can move at most 16 KB
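Because one DMA transfer moves at most 16 KB, a larger buffer has to be broken into several requests. A minimal sketch of that bookkeeping, with a hypothetical helper name and no real DMA calls:

```python
MAX_DMA_BYTES = 16 * 1024  # from the slide: one DMA transfers at most 16 KB


def split_transfer(total_bytes, max_chunk=MAX_DMA_BYTES):
    """Return (offset, size) pairs that cover total_bytes in order."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(max_chunk, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


if __name__ == "__main__":
    for offset, size in split_transfer(40 * 1024):
        print(f"request: offset={offset} size={size}")
```

Each pair would become one transfer request; the sketch does not enforce the size and alignment rules that real requests must also satisfy.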

Page 28: Introduction Cell Processor

Cell components

[Figure not recovered. Source: Matthew Scarpino]

Page 29: Introduction Cell Processor

Cell components

EIB
- Connects all the system components
- Consists of four data rings (two clockwise and two counter-clockwise)
- One ring is for control signals
- One bus cycle can transfer 16 bytes of data
- Each ring can carry three DMA requests simultaneously
- Each DMA takes at least 8 cycles to complete
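The EIB figures combine into two simple estimates: how many bus cycles a transfer occupies, and how many bytes the rings can move per bus cycle at best. The sketch assumes that all four rings carry data at once and ignores arbitration and contention, so both numbers are upper bounds.

```python
BYTES_PER_BUS_CYCLE = 16   # from the slide: 16 bytes per bus cycle
MIN_CYCLES_PER_DMA = 8     # from the slide: a DMA takes at least 8 cycles
RINGS = 4                  # from the slide: four data rings
TRANSFERS_PER_RING = 3     # from the slide: three requests per ring at once


def transfer_cycles(size_bytes):
    """Bus cycles for one transfer, never fewer than the 8-cycle minimum."""
    cycles = (size_bytes + BYTES_PER_BUS_CYCLE - 1) // BYTES_PER_BUS_CYCLE
    return max(MIN_CYCLES_PER_DMA, cycles)


def peak_bytes_per_bus_cycle():
    """Best case: every ring carries its maximum number of transfers."""
    return RINGS * TRANSFERS_PER_RING * BYTES_PER_BUS_CYCLE


if __name__ == "__main__":
    print(transfer_cycles(128), "cycles for 128 bytes")
    print(peak_bytes_per_bus_cycle(), "bytes per bus cycle at peak")
```

The 8-cycle minimum is consistent with a 128-byte transfer at 16 bytes per cycle, which suggests small transfers use the bus less efficiently than full 128-byte ones.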

Page 30: Introduction Cell Processor

Cell components

MFC (Memory Flow Controller)
- Coprocessor that communicates between the SPU and the EIB
- Processes data transfers without interrupting the SPU
- The SPU requests the MFC to get the data
- The MFC handles the rest of the data transfer
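The division of labor between the SPU and the MFC is what makes double buffering possible: the SPU asks for the next chunk and keeps computing on the current one. The sketch below models this with a single worker thread standing in for the MFC; `mfc_get` and `process_with_double_buffering` are hypothetical names, not Cell SDK functions.

```python
from concurrent.futures import ThreadPoolExecutor


def mfc_get(main_memory, offset, size):
    """Stand-in for a DMA 'get': copy a slice of main memory to a local buffer."""
    return list(main_memory[offset:offset + size])


def process_with_double_buffering(main_memory, chunk_size):
    """Sum main_memory chunk by chunk while the next chunk is being fetched."""
    total = 0
    with ThreadPoolExecutor(max_workers=1) as mfc:  # one worker plays the MFC
        pending = None
        for offset in range(0, len(main_memory), chunk_size):
            # Request the next chunk, then work on the one already requested.
            next_transfer = mfc.submit(mfc_get, main_memory, offset, chunk_size)
            if pending is not None:
                total += sum(pending.result())  # wait only when data is needed
            pending = next_transfer
        if pending is not None:
            total += sum(pending.result())
    return total


if __name__ == "__main__":
    print(process_with_double_buffering(list(range(100)), 16))
```

The call to `result()` plays the role of waiting for a transfer to complete; everything between `submit` and `result()` is work the SPU performs while the transfer is in flight.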

Page 31: Introduction Cell Processor

Cell components

Mailboxes
- Simplest way to transfer data between the PPU and an SPU
- Can transfer only 4 bytes of data at a time
- Provide one-to-one communication
- Mailbox channels
  - Outgoing mailbox
  - Outgoing interrupt mailbox
    - Holds data for the outside world and causes an interrupt if applicable
  - Incoming mailbox
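A mailbox behaves like a very short queue of 4-byte messages. The sketch below models that behavior only; the class is hypothetical, and the queue depths used in the example (four entries for the incoming mailbox, one for each outgoing mailbox) are assumptions that do not appear on the slide.

```python
from collections import deque


class Mailbox:
    """A bounded FIFO of 32-bit messages."""

    def __init__(self, depth):
        self.depth = depth
        self._entries = deque()

    def write(self, value):
        """Queue one 32-bit message; return False when the mailbox is full."""
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("a mailbox message is exactly 4 bytes")
        if len(self._entries) >= self.depth:
            return False
        self._entries.append(value)
        return True

    def read(self):
        """Return the oldest message, or None when the mailbox is empty."""
        return self._entries.popleft() if self._entries else None


if __name__ == "__main__":
    incoming = Mailbox(depth=4)   # assumed depth: PPU to SPU
    outgoing = Mailbox(depth=1)   # assumed depth: SPU to PPU
    incoming.write(0x1000)        # e.g. the PPU passes a small command word
    outgoing.write(incoming.read() + 1)
    print(hex(outgoing.read()))
```

Because a message is only 4 bytes, mailboxes suit status codes, small commands, or addresses of larger buffers that are then moved by DMA.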

Page 32: Introduction Cell Processor

Cell components

Events and signals
- Commonly used for DMA notifications
- Signals can be sent directly to the outside world
- Signals can provide one-to-many style communication

Page 34: Introduction Cell Processor

Software development of Cell

- Different instruction sets for the SPU and the PPU
- Different compilers are required to compile the two kinds of code
- The SPU code is embedded in the PPU executable

Page 35: Introduction Cell Processor

Software development of Cell

Tools to compile an application for Cell
- PPU compiler: ppu-gcc
- SPU compiler: spu-gcc
- Embedding SPU code in the PPU executable: ppu-embedspu

Page 36: Introduction Cell Processor

Software development of Cell

Cell simulator
- Full System Simulator
- Emulates all system components
- Can provide cycle-accurate information
- Provides a graphical interface to see and interact with system components

Page 37: Introduction Cell Processor

Software development of Cell

[Figure not recovered. Source: IBM Full System Simulator user guide]

Page 38: Introduction Cell Processor

Software development of Cell

Three modes
- Fast mode
- Simple mode
- Cycle mode

Graphical visualization of SPU and PPU
- Provides debugging and profiling information
- Provides system utilization information

Page 39: Introduction Cell Processor

Software development of Cell


Page 40: Introduction Cell Processor

Software development of Cell

[Figure not recovered. Source: Matthew Scarpino]