
Page 1: Cache management

Cache Management

Presented By:

Babar Shahzaad

14F-MS-CP-12

Department of Computer Engineering ,

Faculty of Telecommunication and Information Engineering ,

University of Engineering and Technology (UET), Taxila

Page 2: Cache management

Outline

What is the Problem
What is a Cache
What is Locality
Cache Hit/Miss
Types of Cache Miss
Memory Mapping Techniques
    Direct Mapping
    Fully-Associative Mapping
    Set-Associative Mapping
    Summary of Memory Mapping Techniques
Methods to Overcome Cache Misses
    Miss Cache
    Victim Caches
    Stream Buffers
    Summary of Cache Miss Techniques

Page 3: Cache management

What is the Problem

Page 4: Cache management

What is a Cache

Page 5: Cache management

What is a Cache

Small, fast storage used to improve average access time to slow memory

Exploits spatial and temporal locality

In computer architecture, almost everything is a cache!
    Registers: “a cache” on variables (software managed)
    First-level cache: “a cache” on the second-level cache
    Second-level cache: “a cache” on memory
    Memory: “a cache” on disk (virtual memory)
    Translation Lookaside Buffer (TLB): “a cache” on the page table
    Branch prediction: “a cache” on prediction information

Page 6: Cache management

What is a Cache (Cont…)

(Figure: the memory hierarchy, from Processor/Registers through L1 Cache, L2 Cache, and Memory down to Disk/Tape, etc. — capacity grows toward the bottom, speed toward the top)

Page 7: Cache management

What is Locality

Temporal Locality (locality in time): if an item is referenced, it will tend to be referenced again soon

Spatial Locality (locality in space/distance): if an item is referenced, its neighboring addresses will tend to be referenced soon
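To make the two kinds of locality concrete, here is a minimal Python sketch (illustrative only; the function and variable names are made up for this example):

```python
def sum_matrix(matrix):
    # Row-major traversal: consecutive elements of a row occupy
    # neighboring addresses, so this loop shows good spatial locality.
    total = 0  # 'total' is touched on every iteration: strong temporal locality
    for row in matrix:
        for value in row:
            total += value
    return total

print(sum_matrix([[1, 2, 3], [4, 5, 6]]))  # 21
```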

Page 8: Cache management

Cache Hit/Miss

Page 9: Cache management

Cache Hit/Miss

Cache Hit: the data is found in the cache. This results in a data transfer at maximum speed

Cache Miss: the data is not found in the cache. The processor loads the data from memory and copies it into the cache. This results in an extra delay, called the miss penalty
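The cost of misses is commonly summarized by the standard average memory access time formula, hit time + miss rate × miss penalty. A small sketch with hypothetical cycle counts:

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    # Every access pays the hit time; the fraction that misses
    # additionally pays the miss penalty.
    return hit_time + miss_rate * miss_penalty

# Hypothetical numbers: 1-cycle hit, 5% miss rate, 100-cycle penalty.
print(average_access_time(1, 0.05, 100))  # 6.0 cycles on average
```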

Page 10: Cache management

Types of Cache Miss

Page 11: Cache management

Types of Cache Miss

There are 4 C’s: Compulsory, Capacity, Conflict, and Coherence misses

Page 12: Cache management

Memory Mapping Techniques

Page 13: Cache management

Direct Mapping

Page 14: Cache management

Direct Mapping

Each block of main memory maps to only one cache line, i.e. if a block is in the cache, it must be in one specific place

The address is in two parts:
    The least significant w bits identify a unique word/byte within a block
    The most significant s bits specify one memory block

The most significant bits are split into a cache line field of r bits and a tag of s-r bits (the most significant part)
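A minimal sketch of this address split, assuming a hypothetical geometry of 4-byte blocks (w = 2) and 128 cache lines (r = 7):

```python
def split_direct_mapped(address, word_bits, line_bits):
    # Low w bits: word/byte within the block.
    word = address & ((1 << word_bits) - 1)
    # Next r bits: which cache line the block must occupy.
    line = (address >> word_bits) & ((1 << line_bits) - 1)
    # Remaining s - r bits: the tag stored alongside the line.
    tag = address >> (word_bits + line_bits)
    return tag, line, word

print(split_direct_mapped(0x1A2B, word_bits=2, line_bits=7))  # (13, 10, 3)
```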

Page 15: Cache management

Direct Mapping (Cont…)

Page 16: Cache management

Direct Mapping (Cont…)

Pros and Cons
    Simple and inexpensive: no need for an expensive associative search
    Fixed location for a given block, so the miss rate may go up due to a possible increase in mapping conflicts
    If a program repeatedly accesses 2 blocks that map to the same line, cache misses (conflict misses) are very high

(Figure: cache line table)

Page 17: Cache management

Fully-Associative Mapping

Page 18: Cache management

Fully-Associative Mapping

A main memory block can be loaded into any line of the cache

The memory address is interpreted as a tag and a word

The tag uniquely identifies a block of memory

Every line’s tag is examined for a match

Cache searching gets expensive, and power consumption grows, because all tags are compared in parallel
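A toy model of the lookup, assuming a simple (tag, data) representation; the sequential loop stands in for the parallel tag comparators that make the hardware expensive:

```python
class FullyAssociativeCache:
    def __init__(self, num_lines):
        self.lines = [None] * num_lines  # each entry: (tag, data) or None

    def lookup(self, tag):
        # In hardware all tags are compared at once; this loop models
        # that comparison, and its breadth is the cost of associativity.
        for entry in self.lines:
            if entry is not None and entry[0] == tag:
                return entry[1]  # hit
        return None  # miss
```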

Page 19: Cache management

Fully-Associative Mapping (Cont…)

Page 20: Cache management

Fully-Associative Mapping (Cont…)

Pros and Cons
    Flexibility as to which block to replace when a new block is read into the cache
    No restriction on mapping from memory to cache
    Associative search of tags is expensive, so it is feasible for very small caches only
    The complex circuitry required for parallel tag comparison is a major disadvantage

Page 21: Cache management

Set-Associative Mapping

Page 22: Cache management

Set-Associative Mapping

The cache is divided into a number of sets

Each set contains a number of lines

A given block maps to any line in a given set, e.g. block B can be in any line of set i

With 2 lines per set we have 2-way set-associative mapping: a given block can be in one of 2 lines in only one set
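The set selection itself is just a modulo on the block number; a sketch with a hypothetical 2-way cache of 8 lines (4 sets):

```python
def set_index(block_number, num_sets):
    # A block maps to exactly one set; within the set it may occupy
    # any of the k lines (k-way set associativity).
    return block_number % num_sets

print(set_index(13, num_sets=4))  # block 13 must live somewhere in set 1
```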

Page 23: Cache management

Set-Associative Mapping (Cont…)

Page 24: Cache management

Set-Associative Mapping (Cont…)

Pros and Cons
    Most commercial caches are 2-, 4-, or 8-way set-associative
    Cheaper than a fully-associative cache
    Lower miss ratio than a direct-mapped cache
    A direct-mapped cache is the fastest

Simulating the hit ratio for direct-mapped and (2-, 4-, 8-way) set-associative caches shows a significant difference in performance at least up to a cache size of 64KB, with set-associative being the better one

Beyond that, the complexity of the cache increases in proportion to the associativity, and both mappings give approximately similar hit ratios

Page 25: Cache management

Summary of Memory Mapping Techniques

Number of misses:
    Direct Mapping > Set-Associative Mapping > Fully-Associative Mapping

Access latency:
    Direct Mapping < Set-Associative Mapping < Fully-Associative Mapping

Page 26: Cache management

Methods to Overcome Cache Misses

Page 27: Cache management

Miss Cache

Page 28: Cache management

Miss Cache

Fully-associative cache inserted between L1 and L2

Contains between 2 and 5 cache lines of data

Aims to reduce conflict misses

Page 29: Cache management

Miss Cache Operation

On a miss in L1, we check the Miss Cache

If the block is there, we bring it into L1, so the penalty of a miss in L1 is just a few cycles, possibly as few as one

Otherwise, we fetch the block from the lower levels, but also store the retrieved value in the Miss Cache
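A toy sketch of that operation, assuming a small ordered buffer evicted oldest-first and a caller-supplied fetch_from_l2 function (both names are made up for this example):

```python
from collections import OrderedDict

class MissCache:
    def __init__(self, num_lines=4):
        self.lines = OrderedDict()  # tag -> block data, oldest first
        self.capacity = num_lines   # typically 2 to 5 lines

    def on_l1_miss(self, tag, fetch_from_l2):
        if tag in self.lines:
            return self.lines[tag]  # hit here: penalty is only a few cycles
        block = fetch_from_l2(tag)  # true miss: go to the lower level
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the oldest entry
        self.lines[tag] = block  # keep a copy alongside the one sent to L1
        return block
```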

Page 30: Cache management

Miss Cache Performance

What if we doubled the size of the L1 cache instead? That gives a 32% decrease in cache misses, so only about .13% decrease per added line

However, note that every block placed in the Miss Cache is also present in L1, so we are effectively storing it twice...

Page 31: Cache management

Victim Cache

Page 32: Cache management

Victim Cache

Motivation: Can we improve on our miss rates with miss caches by modifying the replacement policy (i.e. can we do something about the wasted space in the pure miss caching system)?

Fully-associative cache inserted between L1 and L2

Page 33: Cache management

Victim Cache Operation

On a miss in L1, we check the Victim Cache

If the block is there, we bring it into L1 and swap the ejected L1 block into the Victim Cache. Misses that are caught by the Victim Cache are still cheap, but better utilization of space is made

Otherwise, fetch the block from the lower levels
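A toy sketch of the swap, under the same assumptions as the Miss Cache sketch above; insert_victim and lookup are hypothetical names:

```python
from collections import OrderedDict

class VictimCache:
    def __init__(self, num_lines=4):
        self.lines = OrderedDict()  # tag -> block data, oldest first
        self.capacity = num_lines

    def insert_victim(self, tag, block):
        # Called when L1 ejects a block: the "victim" lands here, so no
        # space is wasted duplicating blocks that are still in L1.
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # drop the oldest victim
        self.lines[tag] = block

    def lookup(self, tag):
        # On an L1 miss: a hit returns the block (L1 then swaps its
        # ejected line in via insert_victim); a miss returns None and
        # the block is fetched from the lower levels instead.
        return self.lines.pop(tag, None)
```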

Page 34: Cache management

Victim Cache Performance

Even better than Miss Cache!

Smaller L1 caches benefit more from victim caches

Page 35: Cache management

Wait A Minute!

What about compulsory and capacity misses?

What about instruction misses?

The Victim Cache and Miss Cache are most helpful when temporal locality can be exploited

Pre-fetching techniques can help:
    Pre-fetch Always
    Pre-fetch on Miss
    Tagged Pre-fetch

Can we improve on these techniques?

Page 36: Cache management

Stream Buffers

Page 37: Cache management

Stream Buffers

A Stream Buffer is a FIFO queue placed between L1 and L2

Page 38: Cache management

Stream Buffers Operation

When a miss occurs in L1, say at address A, the Stream Buffer immediately starts to pre-fetch the lines starting at A+1

Subsequent accesses check the head of the Stream Buffer before going to L2

Note that non-sequential misses will cause the buffer to restart pre-fetching (i.e. misses at A+2 and then A+4 will each restart the pre-fetching process, even if A+4 was already in the stream)
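A toy sketch of the FIFO behavior, assuming a caller-supplied fetch function for the lower level (names are illustrative):

```python
from collections import deque

class StreamBuffer:
    def __init__(self, depth=4):
        self.queue = deque()  # FIFO of (address, block) pairs
        self.depth = depth

    def restart(self, miss_addr, fetch):
        # A miss the buffer cannot satisfy restarts sequential
        # pre-fetching at the lines following the miss address.
        self.queue.clear()
        for addr in range(miss_addr + 1, miss_addr + 1 + self.depth):
            self.queue.append((addr, fetch(addr)))

    def check_head(self, addr):
        # Only the head of the FIFO is checked; anything else is a miss,
        # even if the line is sitting deeper in the queue.
        if self.queue and self.queue[0][0] == addr:
            return self.queue.popleft()[1]
        return None
```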

Page 39: Cache management

Stream Buffer Performance

72% of instruction misses removed

25% of data misses removed

Page 40: Cache management

Summary of Cache Miss Techniques

Miss Caches: small caches between L1 and L2. Whenever a miss occurs in L1, they receive a copy of the data from the lower-level cache

Victim Caches: an enhancement of Miss Caches. Instead of copying data from the lower levels on a miss, they take whatever block has been evicted from L1

Stream Buffers: simple FIFO buffers that are immediately pre-fetched into on a miss
