
COEN 180

Main Memory Cache Architectures

Basics

Speed difference between cache and main memory is small.

Therefore: cache algorithms need to be implemented in hardware and must be simple.

MM is byte addressable, but only words are moved; typically, the last two bits of the address are not even transmitted.

Hit rate needs to be high.

Basics

Average Access Time = (Hit Rate) * (Cache Access Time) + (Miss Rate) * (MM Access Time)
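As a quick worked example, the formula can be evaluated directly; the hit rate and timings below are made up for illustration, not taken from the slides:

    #include <stdio.h>

    int main(void) {
        double hit_rate = 0.95;  /* assumed: 95% of accesses hit the cache */
        double t_cache  = 5.0;   /* assumed cache access time in ns        */
        double t_mm     = 60.0;  /* assumed main memory access time in ns  */

        /* Average Access Time = hit rate * cache time + miss rate * MM time */
        double t_avg = hit_rate * t_cache + (1.0 - hit_rate) * t_mm;
        printf("Average access time: %.2f ns\n", t_avg);  /* prints 7.75 ns */
        return 0;
    }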

Direct Mapped Cache

Each item in MM can be located in only one position in the cache.

Direct Mapped Cache

The address is split into: Tag (highest-order bits); Index; the 2 LSBs distinguish the byte in the word.

Direct Mapped Cache

The Tag serves to identify the data item in the cache.

The Index is the address of the word in the cache.

Direct Mapped Cache Example

Memory addresses are 8 bits long.

Cache with 8 words in it.

Byte located at 01101011. Tag is 011. Index is 010. The two least significant bits are 11.

If the tags match, the byte is in the cache.

If the byte is in the cache, it is byte 3 in the word stored in line 010 = 2.
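A minimal sketch of this field extraction in C, using the widths from the example (3-bit tag, 3-bit index, 2-bit byte offset):

    #include <stdio.h>

    int main(void) {
        unsigned addr = 0x6B;               /* 0110 1011 from the example  */
        unsigned byte  =  addr       & 0x3; /* 2 LSBs: byte in word -> 11  */
        unsigned index = (addr >> 2) & 0x7; /* next 3 bits: index   -> 010 */
        unsigned tag   = (addr >> 5) & 0x7; /* top 3 bits: tag      -> 011 */
        printf("tag=%u index=%u byte=%u\n", tag, index, byte); /* 3, 2, 3  */
        return 0;
    }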

Write Policies

Write-through: a write is performed both to the cache and to main memory.

Copy-back: a write is performed only to the cache. If an item is replaced by another item, the item being replaced is copied back to main memory.

Dirty Bit

To implement copy-back:

Add one bit that tells us whether the data item is dirty (has changed) or not.
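One plausible way to lay out such a cache line in C; the field names and widths here are illustrative, not prescribed by the slides:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* A direct-mapped, copy-back cache line (illustrative layout). */
    struct cache_line {
        uint32_t tag;    /* identifies which MM word is stored here  */
        uint32_t data;   /* the cached word itself                   */
        bool     valid;  /* line holds meaningful data at all        */
        bool     dirty;  /* set on write; checked before replacement */
    };

    int main(void) {
        struct cache_line line = { .tag = 0, .data = 0, .valid = true, .dirty = false };
        line.data  = 42;    /* a write to the cached word... */
        line.dirty = true;  /* ...sets the dirty bit         */
        printf("dirty=%d\n", line.dirty);
        return 0;
    }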

Cache Operations: Write-Through

READ:

1. Extract Tag and Index from the address.

2. Go to the cache line given by Index.

3. See whether the Tag matches the tag stored there.

4. If they match: hit. Satisfy the read from the cache.

5. If they do not match: miss. Satisfy the read from main memory; also store the item in the cache.
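The five steps map directly onto code. A sketch, reusing the 8-bit address layout from the earlier example; the toy array standing in for main memory and all sizes are assumptions for illustration:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NLINES 8
    struct line { uint8_t tag; uint32_t word; bool valid; };
    static struct line cache[NLINES];
    static uint32_t MM[64];                  /* toy 256-byte main memory (64 words) */

    uint8_t read_byte(uint8_t addr) {
        uint8_t byte  =  addr       & 0x3;   /* 1. extract the address fields   */
        uint8_t index = (addr >> 2) & 0x7;
        uint8_t tag   = (addr >> 5) & 0x7;
        struct line *l = &cache[index];      /* 2. go to the cache line         */
        if (!(l->valid && l->tag == tag)) {  /* 3. compare tags; no match: miss */
            l->word  = MM[addr >> 2];        /* 5. satisfy read from MM...      */
            l->tag   = tag;                  /*    ...and store item in cache   */
            l->valid = true;
        }                                    /* 4. match: hit, read from cache  */
        return (uint8_t)(l->word >> (8 * byte)); /* little-endian byte select   */
    }

    int main(void) {
        MM[0x6B >> 2] = 0xDDCCBBAA;          /* word containing address 0110 1011 */
        printf("0x%02X\n", read_byte(0x6B)); /* byte 3 of that word: DD           */
        return 0;
    }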

Cache Operations: Write-Through

WRITE:

1. Extract Tag and Index from the address.

2. Write the datum in the cache at the location given by Index.

3. Update the tag field in the cache line with Tag.

4. Write the datum to main memory.
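The write path is even simpler, since a write-through cache never needs to check the old tag. A sketch under the same assumptions as the read example above:

    #include <stdbool.h>
    #include <stdint.h>

    #define NLINES 8
    struct line { uint8_t tag; uint32_t word; bool valid; };
    static struct line cache[NLINES];
    static uint32_t MM[64];                 /* toy main memory, one word per entry */

    void write_word(uint8_t addr, uint32_t value) {
        uint8_t index = (addr >> 2) & 0x7;  /* 1. extract Tag and Index      */
        uint8_t tag   = (addr >> 5) & 0x7;
        cache[index].word  = value;         /* 2. write datum into the cache */
        cache[index].tag   = tag;           /* 3. update the tag field       */
        cache[index].valid = true;
        MM[addr >> 2] = value;              /* 4. write datum to main memory */
    }

    int main(void) { write_word(0x68, 42); return 0; }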

Cache Operations: Copy-Back

If there is a read miss, then before the current item in the cache is replaced, check the dirty bit. If the current item is dirty, it needs to be written to main memory before being replaced.

Writes go only to the cache and set the dirty bit.
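A sketch of the copy-back logic under the same toy setup. Note that this sketch applies the same dirty check on a write miss, since overwriting a dirty line would otherwise lose data; the slides spell out only the read-miss case:

    #include <stdbool.h>
    #include <stdint.h>

    #define NLINES 8
    struct line { uint8_t tag; uint32_t word; bool valid, dirty; };
    static struct line cache[NLINES];
    static uint32_t MM[64];

    /* Reconstruct the MM address of the word a line currently holds. */
    static uint8_t line_addr(uint8_t tag, uint8_t index) {
        return (uint8_t)((tag << 5) | (index << 2));
    }

    uint32_t read_word(uint8_t addr) {
        uint8_t index = (addr >> 2) & 0x7;
        uint8_t tag   = (addr >> 5) & 0x7;
        struct line *l = &cache[index];
        if (!(l->valid && l->tag == tag)) {                  /* read miss       */
            if (l->valid && l->dirty)                        /* old item dirty? */
                MM[line_addr(l->tag, index) >> 2] = l->word; /* copy back first */
            l->word  = MM[addr >> 2];                        /* then replace    */
            l->tag   = tag;
            l->valid = true;
            l->dirty = false;
        }
        return l->word;
    }

    void write_word(uint8_t addr, uint32_t value) {
        uint8_t index = (addr >> 2) & 0x7;
        uint8_t tag   = (addr >> 5) & 0x7;
        struct line *l = &cache[index];
        if (l->valid && l->dirty && l->tag != tag)           /* dirty write miss */
            MM[line_addr(l->tag, index) >> 2] = l->word;
        l->tag   = tag;
        l->valid = true;
        l->word  = value;   /* write only to the cache... */
        l->dirty = true;    /* ...and set the dirty bit   */
    }

    int main(void) { write_word(0x08, 7); return (int)read_word(0x88); }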

Cache Overhead

Assume a cache that contains 1 MB worth of data.

Hence, it contains 256K data words.

Hence, it has 2**18 cache lines.

Hence, the index is 18 bits long and (with 32-bit addresses) the tag is 12 bits long.

Hence, each cache line stores 12b + 32b (write-through).

Hence, the overhead is 12/32 = 37.5%.

The cache uses 1.375 MB of actual storage.
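The same arithmetic, checked in a few lines of C (a sanity check of the slide's numbers, nothing more):

    #include <stdio.h>

    int main(void) {
        long tag_bits  = 12;        /* 32-bit address - 18 index - 2 offset */
        long data_bits = 32;        /* one word of data per cache line     */
        long lines     = 1L << 18;  /* 2**18 cache lines                   */
        double overhead = 100.0 * tag_bits / data_bits;
        double total_mb = lines * (data_bits + tag_bits) / 8.0 / (1024 * 1024);
        printf("overhead = %.1f%%, total = %.3f MB\n", overhead, total_mb);
        return 0;                   /* prints: overhead = 37.5%, total = 1.375 MB */
    }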

Blocks

Moving only one word to the cache does not exploit spatial locality.

Move blocks of words into the cache. Split the address into:

Address = Tag : Index : Word in Block : 2 LS bits.

Blocks

If there is a miss on a read, bring all words in the block into the cache.

For a write operation:

– Write the word and bring the other words of the block into the cache.

Or

– Write directly to main memory and do not update the cache.

Example: Blocked Cache

Read access to the byte at address 0111 0101. Extract the components: 011:10:1:01. Go to cache line 2. Compare tags. No match, hence a miss. The dirty bit is set, so move the two words back to MM. Bring in the words starting at 0111 0000 and 0111 0100.

Read access to the byte at address 0100 0101. Extract the components: 010:00:1:01. Go to cache line 0. Compare tags. They match, hence a hit. The byte is the second byte in word 1 of the block (assuming little-endian).
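The component extraction used in both traces, sketched in C with the widths from this example (3-bit tag, 2-bit index, 1-bit word-in-block, 2-bit byte offset):

    #include <stdio.h>

    int main(void) {
        unsigned addr = 0x75;                 /* 0111 0101 from the first trace */
        unsigned byte  =  addr       & 0x3;   /* byte in word   -> 01  */
        unsigned word  = (addr >> 2) & 0x1;   /* word in block  -> 1   */
        unsigned index = (addr >> 3) & 0x3;   /* cache line     -> 10  */
        unsigned tag   = (addr >> 5) & 0x7;   /* tag            -> 011 */
        printf("tag=%u index=%u word=%u byte=%u\n", tag, index, word, byte);
        return 0;
    }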

Set Associative Cache

In a direct mapped cache, there is only one possible location for a datum in the cache.

Contention between two (or more) popular data items is possible, resulting in low hit rates.

Solution:

Place more than one item in a cache line.

Set Associative Cache

An n-way set associative cache has a cache line consisting of n pairs of Tag + Datum.

The larger the associativity n, the higher the hit rate.

The larger the associativity n, the more complex the read, since all n tags need to be compared in parallel.

Cache replacement is more difficult to implement.
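A sketch of an n-way lookup (here n = 2; sizes and names are illustrative). In software the tag comparisons are a loop; in hardware they would be n comparators operating in parallel:

    #include <stdbool.h>
    #include <stdint.h>

    #define NWAYS 2
    #define NSETS 4

    /* Each cache line (set) holds NWAYS tag+datum pairs. */
    struct way { uint8_t tag; uint32_t word; bool valid; };
    static struct way cache[NSETS][NWAYS];

    bool lookup(uint8_t tag, uint8_t index, uint32_t *out) {
        for (int w = 0; w < NWAYS; w++) {        /* one comparator per way */
            if (cache[index][w].valid && cache[index][w].tag == tag) {
                *out = cache[index][w].word;
                return true;                     /* hit  */
            }
        }
        return false;                            /* miss */
    }

    int main(void) {
        uint32_t v;
        return lookup(3, 0, &v) ? 0 : 1;         /* empty cache: miss */
    }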

Set Associative Cache Example

Popular words at addresses 0011 0000 and 1011 0000 both fit into the cache.

Split address 0011 0000 into 0011:00:00 and address 1011 0000 into 1011:00:00.

Set Associative Cache

Assume an n-way set associative cache. On a miss, we need to replace one of the n data items in the cache line.

We need to implement the choice in hardware.

Random: replaces one of the items at random.

LRU: if the associativity is two, we can store the priority in a single bit. If the set associativity is larger, we use an approximation.
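For two-way associativity, the single LRU bit per set can work as sketched below (a toy model; real hardware updates the bit as part of the access itself):

    #include <stdint.h>

    #define NSETS 4
    struct set {
        uint8_t lru;   /* single bit: which of the two ways was used least recently */
        /* tag/data/valid fields for each way omitted for brevity */
    };
    static struct set cache[NSETS];

    /* On every access to way w of a set, the other way becomes LRU. */
    void touch(struct set *s, uint8_t w) { s->lru = (uint8_t)(1 - w); }

    /* On a miss, victim selection is just reading the LRU bit. */
    uint8_t victim(const struct set *s) { return s->lru; }

    int main(void) {
        struct set *s = &cache[0];
        touch(s, 0);       /* way 0 used -> way 1 is now LRU    */
        touch(s, 1);       /* way 1 used -> way 0 is now LRU    */
        return victim(s);  /* 0: replace way 0 on the next miss */
    }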