Anshul Kumar, CSE IITD, CSL718 : Main Memory, CPU-Cache-Main Memory Performance, 9th Mar, 2006


Page 1: CSL718 : Main Memory

CPU-Cache-Main Memory Performance

Anshul Kumar, CSE IITD

9th Mar, 2006

Page 2: A Simple Model

$t_{av} = t_c + p_m \cdot t_{c.miss}$

where

$t_{av}$ = average memory access time as seen by CPU

$t_c$ = cache access time

$p_m$ = miss probability (consider only read misses, if write penalties are hidden by buffers)

$t_{c.miss}$ = cache miss penalty

[Figure: CPU, Cache, Memory]
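The simple model can be evaluated directly. The sketch below is only an illustration; the function name and the sample numbers (1-cycle cache, 5% read-miss probability, 20-cycle miss penalty) are assumptions, not values from the slides.

```python
def average_access_time(t_c: float, p_m: float, t_c_miss: float) -> float:
    """t_av = t_c + p_m * t_c.miss, with all times in the same unit."""
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must be a probability in [0, 1]")
    return t_c + p_m * t_c_miss


# Hypothetical numbers, chosen only for illustration.
t_av = average_access_time(t_c=1.0, p_m=0.05, t_c_miss=20.0)
print(f"t_av = {t_av:.2f} cycles")
```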

Page 3: Cache miss penalty

Depends on

• Various cache policies

– Read policy

– Load policy

– Write policy

– Write buffers etc.

• Main memory organization

– Interleaving

– Page mode

Page 4: Read Policies

Sequential Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 2)$

Concurrent Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Sequential Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Concurrent Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot T$

[Figure: cache and memory timing diagram for each policy, drawn with segments of length 1 and T]
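The four read-policy expressions differ only in the extra cycles added to T on a miss. A small sketch that tabulates them follows; the dictionary name, function name and sample values of p_m and T are assumptions made for illustration.

```python
# Extra cycles added to T on a miss, taken from the four T_eff expressions.
MISS_OVERHEAD = {
    "sequential simple": 2,
    "concurrent simple": 1,
    "sequential forward": 1,
    "concurrent forward": 0,
}


def t_eff(policy: str, p_m: float, T: float) -> float:
    """T_eff = (1 - p_m) * 1 + p_m * (T + overhead) for the given read policy."""
    return (1.0 - p_m) * 1.0 + p_m * (T + MISS_OVERHEAD[policy])


# Hypothetical workload: 10% miss probability, T = 10 cycles.
for name in MISS_OVERHEAD:
    print(f"{name:20s} T_eff = {t_eff(name, 0.10, 10.0):.2f}")
```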

Page 5: Load policies

Example: 4 AU block (AUs 0 1 2 3), cache miss on AU 1

• Block Load

• Load Forward

• Fetch Bypass (wrap-around load)

[Figure: the block of AUs 0 1 2 3 under each load policy]

Page 6: Analyzing Write Policies: CPU time

Policy                    Read hit   Read miss   Write hit   Write miss
Hit: WB, Miss: WB         1          Tb + i      1           1
Hit: WB, Miss: WTWA       1          Tb + i      1           1
Hit: WB, Miss: WTNWA      1          Tb + i      1           1
Hit: WT, Miss: WB         1          Tb + i      1           1
Hit: WT, Miss: WTWA       1          Tb + i      1           1
Hit: WT, Miss: WTNWA      1          Tb + i      1           1

i depends on read policy

Page 7: Analyzing Write Policies: Bus time

Policy                    Read hit   Read miss    Write hit   Write miss
Hit: WB, Miss: WB         0          Tb (2-Pc)    0           Tb (2-Pc)
Hit: WB, Miss: WTWA       0          Tb (2-Pc)    0           Tb (2-Pc) + Tw
Hit: WB, Miss: WTNWA      0          Tb (2-Pc)    0           Tw
Hit: WT, Miss: WB         0          Tb (2-Pc)    Tw          Tb (2-Pc)
Hit: WT, Miss: WTWA       0          Tb           Tw          Tb + Tw
Hit: WT, Miss: WTNWA      0          Tb           Tw          Tw

Page 8: Interleaving with Fast Page Mode

$T_{line\,access} = T_a + T_c \left( \frac{L}{m} - 1 \right) + T_{bus} \left( L - \frac{L}{m} \right)$

Page 9: A Refined Model

$t_{av} = t_c + p_m \cdot (t_{c.miss} + t_{interference} + t_{w\text{-}interference} + t_{IO\text{-}interference})$

where

$t_{interference}$ = interference among line transfers

$t_{w\text{-}interference}$ = interference between word writes and line transfers

$t_{IO\text{-}interference}$ = interference between I/O and line transfers

Page 10: Interference among line transfers

What happens when another miss occurs in the interval $t_{busy} = t_{m.miss} - t_{c.miss}$?

$t_{interference}$ = additional delay due to this

= expected number of misses during $t_{busy}$ × delay per miss

= $(\lambda \cdot t_{busy} \cdot p_m) \cdot (t_{busy} / 2)$

where $\lambda$ = memory request rate of processor

[Figure: timeline marking $t_c$, $t_{c.miss}$ and $t_{m.miss}$; CPU blocked, CPU executing, memory busy]
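The line-transfer interference term is a product of two factors: the expected number of misses that fall inside the busy interval, and an average delay of half that interval. A minimal sketch follows; the function name and sample values are assumptions, and the request-rate symbol is taken to be λ (requests per cycle).

```python
def line_transfer_interference(lam: float, p_m: float, t_busy: float) -> float:
    """t_interference = (lam * t_busy * p_m) * (t_busy / 2)."""
    expected_misses = lam * t_busy * p_m   # misses arriving while memory is busy
    delay_per_miss = t_busy / 2.0          # on average, half the busy interval remains
    return expected_misses * delay_per_miss


# Hypothetical values: 0.5 requests/cycle, 4% miss probability, 10 busy cycles.
print(line_transfer_interference(lam=0.5, p_m=0.04, t_busy=10.0))
```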

Page 11: Interference: I/Os and writes

delay = prob that memory is busy when request arrives × average waiting period

What happens when memory is found to be busy serving one request and some other requests are waiting?

[Figure: timeline of request arrivals against the memory-busy interval, with requests marked served, waiting, served]

Page 12: I/O Interference

$t_{IO\text{-}interference}$ = delay due to I/O contention

= probability that memory is occupied with I/O × average time taken to complete ongoing I/O

= (probability that memory is occupied with I/O) × $(t_{service} + t_{IO\text{-}wait}) / 2$

$t_{service}$ = time to service (block read/write time)

$t_{IO\text{-}wait}$ = waiting time

= 0, if CPU has a higher priority

> 0, otherwise (estimate using queuing model)

Page 13: Write Interference Delay

$t_{w\text{-}interference}$ = probability that a write-through is occupying the memory when a read miss occurs × average time taken to complete the ongoing write

Page 14: Memory performance using queuing model

[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]

Statistical behaviour of arrivals?

Statistical behaviour of service?

Model nomenclature: arrival / service / number (of servers)

M / G / 1      G : General
M / M / 1      M : Poisson/Exponential
M / D / 1      D : Constant
M_B / D / 1    M_B : Binomial

Page 15: Modeling memory requests

T : interval (memory cycle time)

$\tau$ : processor cycle

prob of a request in one cycle = $p$

prob of no request in one cycle = $1 - p$

prob of no request in $T/\tau$ cycles = $(1 - p)^{T/\tau}$

prob of at least one req in $T/\tau$ cycles = $1 - (1 - p)^{T/\tau}$

prob of k requests in n (= $T/\tau$) cycles = $\binom{n}{k} p^k (1 - p)^{n-k}$ (Binomial distribution)

expected no. of requests in n cycles = $n p$
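The binomial request model can be checked numerically with the standard library. In this sketch the function names and the sample values of n and p are assumptions chosen only for illustration.

```python
from math import comb


def prob_k_requests(k: int, n: int, p: float) -> float:
    """Probability of exactly k requests in n processor cycles (binomial)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least_one(n: int, p: float) -> float:
    """Probability of one or more requests in n processor cycles."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical case: memory cycle spans n = 8 processor cycles, p = 0.25.
n, p = 8, 0.25
mean = sum(k * prob_k_requests(k, n, p) for k in range(n + 1))
print(f"P(at least one) = {prob_at_least_one(n, p):.4f}, expected requests = {mean:.4f}")
```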

Page 16: Poisson Approximation

If processor cycles are small (i.e., $\tau \to 0$, $p \to 0$, $n \to \infty$, $n p \to \lambda T$),

Binomial distribution → Poisson distribution, request rate = $\lambda$

prob of k requests in interval T = $\frac{(\lambda T)^k}{k!} e^{-\lambda T}$

expected no. of requests in interval T = $\lambda T$

Interval between two consecutive requests has an exponential distribution:

prob (inter-arrival interval > t) = $e^{-\lambda t}$, i.e. prob (inter-arrival interval ≤ t) = $1 - e^{-\lambda t}$
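The limiting behaviour can be observed by holding n p fixed while n grows. The sketch below compares the binomial and Poisson probabilities; the helper names and the choice λT = 2 are assumptions for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k: int, lam_T: float) -> float:
    """(lam*T)^k * exp(-lam*T) / k!"""
    return lam_T**k * exp(-lam_T) / factorial(k)


lam_T = 2.0  # hypothetical expected number of requests in interval T
errors = []
for n in (10, 100, 1000):
    p = lam_T / n  # shorter processor cycle: more cycles, smaller p, same n*p
    err = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam_T)) for k in range(11))
    errors.append(err)
    print(f"n = {n:5d}  max |binomial - Poisson| = {err:.6f}")
```

The largest pointwise gap shrinks as n increases, which is the sense in which the binomial model tends to the Poisson model.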

Page 17: Modeling Service

• Each request is served in constant time

e.g. cache write-through requests, cache block transfer requests

or

• Service time has an exponential distribution

e.g. I/O requests with varying block sizes, where small blocks are more common than large blocks

Page 18: M / G / 1 Model

Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$

Average queue length = $Q = \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$

where

$\rho$ = occupancy of server = $\lambda / \mu$

$\mu$ = average service rate

$c^2 = \mu^2 \sigma_t^2$, with $\sigma_t^2$ = variance of service time

Page 19: Special cases: M/M/1, M/D/1

M/M/1: c = 1

Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{1 - \rho}$

Average queue length = $Q = \frac{\rho^2}{1 - \rho}$

M/D/1: c = 0

Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$

Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$
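The M/G/1 expressions and their two special cases can be evaluated with one function. This is a sketch under the assumption that c² = μ²σ_t² (squared coefficient of variation of service time); the function name and sample rates are illustrative only.

```python
def mg1_wait_and_queue(lam: float, mu: float, c2: float) -> tuple[float, float]:
    """Return (T_w, Q) for an M/G/1 queue.

    Q   = rho^2 * (1 + c^2) / (2 * (1 - rho))
    T_w = Q / lam
    """
    rho = lam / mu
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < rho < 1 for a steady state")
    q = rho**2 * (1.0 + c2) / (2.0 * (1.0 - rho))
    return q / lam, q


# Hypothetical rates: lam = 0.5 requests per unit time, mu = 1.0.
print("M/M/1 (c^2 = 1):", mg1_wait_and_queue(0.5, 1.0, c2=1.0))
print("M/D/1 (c^2 = 0):", mg1_wait_and_queue(0.5, 1.0, c2=0.0))
```

For the same occupancy, constant service time halves both the waiting time and the queue length relative to exponential service.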

Page 20: M/D/1 with low server occupancy

Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$

Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$

When $\rho$ is small, $T_w \approx \frac{1}{\lambda} \cdot \frac{\rho^2}{2} = \frac{1}{2} \lambda \left( \frac{1}{\mu} \right)^2$

Compare this with the line-transfer interference term $\frac{1}{2} \lambda \, p_m \, t_{busy}^2$
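A quick numerical comparison shows how close the low-occupancy approximation is to the exact M/D/1 waiting time. The function names and sample rates below are assumptions for illustration.

```python
def md1_wait_exact(lam: float, mu: float) -> float:
    """T_w = (1/lam) * rho^2 / (2 * (1 - rho)) for M/D/1."""
    rho = lam / mu
    return rho**2 / (2.0 * lam * (1.0 - rho))


def md1_wait_low_occupancy(lam: float, mu: float) -> float:
    """Approximation for small rho: T_w ~ (1/2) * lam * (1/mu)^2."""
    return 0.5 * lam / mu**2


for lam in (0.01, 0.1, 0.5):  # mu fixed at 1.0, so rho equals lam
    exact = md1_wait_exact(lam, 1.0)
    approx = md1_wait_low_occupancy(lam, 1.0)
    print(f"rho = {lam:4.2f}  exact = {exact:.5f}  approx = {approx:.5f}")
```

The ratio exact/approx is 1/(1 - ρ), so the approximation understates the wait by about ρ at low occupancy.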

Page 21: Designing buffer to hold the queue

How to design a buffer so that buffer overflow, or stalling due to buffer full, is within a certain limit?

For the M/M/1 model,

prob (queue size > buffer size BF) = $\rho^{BF+1}$

Choose BF so that this probability is below a desired value.
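The sizing rule can be turned into a short search for the smallest BF that meets a target. This is a sketch assuming the M/M/1 overflow probability ρ^(BF+1); the function name and the sample target are hypothetical.

```python
def min_buffer_size(rho: float, max_overflow_prob: float) -> int:
    """Smallest BF such that rho**(BF + 1) is below max_overflow_prob."""
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < rho < 1")
    if not 0.0 < max_overflow_prob < 1.0:
        raise ValueError("target must be a probability strictly between 0 and 1")
    bf = 0
    while rho ** (bf + 1) >= max_overflow_prob:
        bf += 1
    return bf


# Hypothetical target: overflow probability below 1% at 50% occupancy.
print(min_buffer_size(rho=0.5, max_overflow_prob=0.01))
```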

Page 22: Open and Closed Queues

[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]

• Processor is not blocked by queuing delays and request rate remains unaffected – Open queue

• Processor is blocked due to queuing delays and request rate drops – Closed queue

Page 23: Open and Closed Queues

[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]

                   Queued for service    Being served
Time               $T_w$                 $1/\mu$
Number (open)      $Q = \lambda T_w$     $\rho = \lambda / \mu$
Number (closed)    $Q_a$                 $\rho_a$

occupancy (open q) = occupancy (closed q) + waiting (closed q)

$\rho = \rho_a + Q_a$

Page 24: M/D/1 Closed Queue

Reduced request rate = $\lambda_a$

Reduced occupancy = $\rho_a = \lambda_a / \mu$

Requests being served = $\rho_a$

Requests waiting = $\frac{\rho_a^2}{2 (1 - \rho_a)}$

$\rho = \rho_a + \frac{\rho_a^2}{2 (1 - \rho_a)}$

$\rho_a = (1 + \rho) - \sqrt{(1 + \rho)^2 - 2 \rho} = (1 + \rho) - \sqrt{\rho^2 + 1}$
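The closed-queue occupancy can be checked by substituting it back into the balance between offered occupancy, requests in service and requests waiting. The sketch below assumes the closed form ρ_a = (1 + ρ) - sqrt(1 + ρ²); the function names are illustrative.

```python
from math import sqrt


def closed_queue_occupancy(rho: float) -> float:
    """Achieved occupancy rho_a of an M/D/1 closed queue for offered occupancy rho."""
    return (1.0 + rho) - sqrt(1.0 + rho**2)


def md1_requests_waiting(rho_a: float) -> float:
    """Q_a = rho_a^2 / (2 * (1 - rho_a))."""
    return rho_a**2 / (2.0 * (1.0 - rho_a))


for rho in (0.2, 0.5, 0.9):
    rho_a = closed_queue_occupancy(rho)
    total = rho_a + md1_requests_waiting(rho_a)  # should reproduce rho
    print(f"rho = {rho:.1f}  rho_a = {rho_a:.4f}  rho_a + Q_a = {total:.4f}")
```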

Page 25: Deriving queue length, wait time

Let $t_i$ = time during which request i is being served

$r_i$ = no. of arrivals during $t_i$

$n_i$ = queue length at the end of $t_i$, including item in service

Assume occupancy of server = $\rho = \lambda / \mu < 1$, so that the process reaches a steady state

Expected values:

$E(t_i) = E(t) = T = 1/\mu$

$E(r_i) = E(r) = \lambda E(t) = \lambda / \mu = \rho$

$E(n_i) = E(n) = N$

Page 26: Relating $n_{i+1}$ to $n_i$

$n_{i+1} = n_i$ + arrivals – departures

two cases need to be considered:

i) $n_i \neq 0$

ii) $n_i = 0$

[Figure: queue holding $C_{i+3}$, $C_{i+2}$, $C_{i+1}$ behind $C_i$, with $n_i$ marking the queue length]

Page 27: When $n_i \neq 0$

$C_{i+1}$ arrived before $C_i$ left

$n_{i+1} = n_i + r_{i+1} - 1$

[Figure: time axis with $C_i$ served during $t_i$, $C_i$ leaves, $C_{i+1}$ served during $t_{i+1}$, $C_{i+1}$ leaves]

Page 28: When $n_i = 0$

$C_{i+1}$ arrived after $C_i$ left

$n_{i+1} = n_i + 1 + r_{i+1} - 1 = n_i + r_{i+1}$

[Figure: time axis with $C_i$ served during $t_i$, $C_i$ leaves, $C_{i+1}$ arrives, $C_{i+1}$ served during $t_{i+1}$, $C_{i+1}$ leaves]

Page 29: Combining the two cases

$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$

where $\delta_i = 0$ when $n_i \neq 0$, and $\delta_i = 1$ when $n_i = 0$

note that $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$

$E(n_{i+1}) = E(n_i) + E(r_{i+1}) - 1 + E(\delta_i)$

in steady state, $E(n) = E(n) + E(r) - 1 + E(\delta)$

that is, $E(\delta) = 1 - E(r) = 1 - \rho$

Since $E(\delta) = 1 - \text{prob}(n \neq 0)$, this gives $\text{prob}(n \neq 0) = \rho$

Page 30: Combining the two cases

$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$

Squaring both sides:

$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i^2 + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i + 2 n_i \delta_i$

Using $\delta_i^2 = \delta_i$ and $n_i \delta_i = 0$:

$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i$

$E(n_{i+1}^2) = E(n_i^2) + E[(r_{i+1} - 1)^2] + E(\delta_i) + 2 E[n_i (r_{i+1} - 1)] + 2 E[(r_{i+1} - 1) \delta_i]$

In steady state $E(n_{i+1}^2) = E(n_i^2)$, so

$0 = E[(r - 1)^2] + E(\delta) + 2 E[n (r - 1)] + 2 E[(r - 1) \delta]$

With $E(r) = \rho$, $E(\delta) = 1 - \rho$, and $r_{i+1}$ independent of $n_i$ and $\delta_i$:

$0 = E(r^2) - 2 \rho + 1 + (1 - \rho) + 2 E(n) (\rho - 1) + 2 (\rho - 1)(1 - \rho)$

$2 E(n) (1 - \rho) = E(r^2) - 2 \rho^2 + \rho$

Page 31: continued

$2 E(n) (1 - \rho) = E(r^2) - 2 \rho^2 + \rho$

$N = E(n) = \frac{E(r^2) - 2 \rho^2 + \rho}{2 (1 - \rho)} = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)}$

This is valid for G/G/1

Page 32: Consider Poisson arrival

$P(r_i) = \frac{(\lambda t_i)^{r_i}}{r_i!} e^{-\lambda t_i}$

mean $E(r_i) = \lambda t_i$

variance $\sigma_{r_i}^2 = \lambda t_i$

$\sigma_{r_i}^2 = E(r_i^2) - |E(r_i)|^2$

$E(r_i^2) = \sigma_{r_i}^2 + |E(r_i)|^2 = \lambda t_i + \lambda^2 t_i^2$

Take expectation over i:

$E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$

Page 33: continued

Service time: mean $E(t) = 1/\mu$, variance $\sigma_t^2$

$E(t^2) = \sigma_t^2 + [E(t)]^2 = \sigma_t^2 + 1/\mu^2$

Recall $E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$

Therefore, $E(r^2) = \lambda / \mu + \lambda^2 (\sigma_t^2 + 1/\mu^2) = \rho + \lambda^2 \sigma_t^2 + \rho^2$

$N = E(n) = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)} = \rho + \frac{\lambda^2 \sigma_t^2 + \rho^2}{2 (1 - \rho)} = \rho + \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$

where $c^2 = \mu^2 \sigma_t^2$
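The result N = ρ + ρ²(1 + c²)/(2(1 - ρ)) can be sanity-checked by simulating the recurrence n_{i+1} = n_i + r_{i+1} - 1 + δ_i directly. The sketch below does this for constant service time (M/D/1, c² = 0); the function names, seed, run length and rates are assumptions, and the simulated mean is only a statistical estimate.

```python
import random


def mg1_mean_in_system(rho: float, c2: float) -> float:
    """N = rho + rho^2 * (1 + c^2) / (2 * (1 - rho))."""
    return rho + rho**2 * (1.0 + c2) / (2.0 * (1.0 - rho))


def simulate_md1_mean_n(lam: float, mu: float, departures: int = 100_000, seed: int = 1) -> float:
    """Average n_i over many departures, with Poisson arrivals and constant service."""
    rng = random.Random(seed)
    service = 1.0 / mu
    n, total = 0, 0
    for _ in range(departures):
        # r = arrivals during one service time, counted by summing
        # exponential inter-arrival gaps until they exceed the service time.
        r, elapsed = 0, rng.expovariate(lam)
        while elapsed <= service:
            r += 1
            elapsed += rng.expovariate(lam)
        delta = 1 if n == 0 else 0
        n = n + r - 1 + delta
        total += n
    return total / departures


print("formula   :", mg1_mean_in_system(0.5, 0.0))
print("simulation:", simulate_md1_mean_n(lam=0.5, mu=1.0))
```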

Page 34: Direct Derivation for M/M/1

$P(n; t)$ = prob that there are n req in the system at time t (in queue + in service)

$P(n; t + \Delta t) = P(n; t)(1 - \lambda \Delta t - \mu \Delta t) + P(n-1; t) \lambda \Delta t + P(n+1; t) \mu \Delta t$

$P(0; t + \Delta t) = P(0; t)(1 - \lambda \Delta t) + P(1; t) \mu \Delta t$

Prob of more than one event in $\Delta t$ is neglected ($\Delta t^2$ term)

Page 35: Direct Derivation for M/M/1

$dP(n; t)/dt = P(n; t)(-\lambda - \mu) + P(n-1; t) \lambda + P(n+1; t) \mu$

$dP(0; t)/dt = P(0; t)(-\lambda) + P(1; t) \mu$

In steady state, we can drop ";t" and the derivatives tend to 0:

$0 = P(n)(-\lambda - \mu) + P(n-1) \lambda + P(n+1) \mu$

$0 = P(0)(-\lambda) + P(1) \mu$

Rearranging:

$\lambda P(n) - \mu P(n+1) = \lambda P(n-1) - \mu P(n)$

$\lambda P(0) - \mu P(1) = 0$

Therefore $\lambda P(n-1) - \mu P(n) = 0$, i.e. $P(n) = \rho P(n-1)$

Page 36: Direct Derivation for M/M/1

$P(n) = \rho P(n-1)$

$P(n) = \rho^n P(0)$

$\sum_{i=0}^{\infty} P(i) = 1 \Rightarrow P(0) \sum_{i=0}^{\infty} \rho^i = \frac{P(0)}{1 - \rho} = 1$

$P(0) = 1 - \rho$ and $P(n) = (1 - \rho) \rho^n$

$E(n) = \sum_{i=0}^{\infty} i P(i) = (1 - \rho) \sum_{i=0}^{\infty} i \rho^i = (1 - \rho) \frac{\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho}$
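The steady-state distribution can be built numerically from the relation P(n) = ρ P(n-1) and compared with the closed forms. In this sketch the function name, the truncation point and the value of ρ are assumptions for illustration.

```python
def mm1_distribution(rho: float, n_max: int) -> list[float]:
    """P(0..n_max) built from P(n) = rho * P(n-1), starting at P(0) = 1 - rho."""
    probs = [1.0 - rho]
    for _ in range(n_max):
        probs.append(rho * probs[-1])
    return probs


rho = 0.5
probs = mm1_distribution(rho, n_max=60)  # the truncated tail is negligible here
total = sum(probs)
mean_n = sum(i * p for i, p in enumerate(probs))
print(f"sum P(i) = {total:.6f}, E(n) = {mean_n:.6f}, rho/(1-rho) = {rho / (1 - rho):.6f}")
```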

Page 37: Direct Derivation for M/M/1

$\text{Prob}(\text{items in queue + server} \le k) = \sum_{i=0}^{k} P(i) = \sum_{i=0}^{k} (1 - \rho) \rho^i = \frac{(1 - \rho)(1 - \rho^{k+1})}{1 - \rho} = 1 - \rho^{k+1}$

$\text{Prob}(\text{items in queue + server} > k) = \rho^{k+1}$