CSC1016 Coursework Clarification

Derek Mortimer, March 2010

CSC1016 Coursework

Contents

• Clarification of the specification

• Example Output

• Cache Layouts

• Caching Schemes

– Direct Mapped

– Fully Associative

• Implementation Notes

Clarification

• You are not implementing a “complete” cache

• You are simulating the hit and miss rates for a cache using direct mapped and fully associative schemes:

– Example: If address 135 was used to access a cache, would it have been a hit or a miss? And so on…

• The hit and miss counts for a given sequence of addresses will always be the same if your schemes are implemented correctly

Clarification

• You do not need to deal with “real” data in the cache, but you will need to keep track of the meta-data for the cache blocks including:

– Cache Tags

– Validity Bits

– LRU information (for fully associative cache)

• For a cache with 32 blocks, you will need to keep track of 32 tags, 32 validity bits and the LRU information.

• You don’t need to model every byte in the cache; representing every block is enough.
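As a rough illustration of the bookkeeping described above, the sketch below keeps one entry per block rather than one per byte. Python is used purely for illustration, and the names `make_cache`, `tags`, `vbits` and `addresses` are assumptions, not part of the specification.

```python
def make_cache(num_blocks):
    """Return empty per-block metadata for a cache with num_blocks blocks."""
    return {
        "tags": [None] * num_blocks,       # tag stored in each block
        "vbits": [False] * num_blocks,     # validity bit: False means the block is empty
        "addresses": [None] * num_blocks,  # address that filled each block, for printing M(A)
    }

# A cache with 32 blocks needs 32 tags and 32 validity bits.
cache = make_cache(32)
print(len(cache["tags"]), len(cache["vbits"]))
```

LRU information for the fully associative scheme is left out here; it could be kept as an ordering of the blocks or as an extra per-block list.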

Example Output

• Specification states that for any memory address, A, you may represent the contents in the cache as M(A).

• Example: an 8 byte cache with 2 byte blocks has 4 blocks (0, 1, 2, 3).

• Suppose we access the cache using address 32, which maps to cache block 1, and then using address 24, which maps to block 3.

• At this point, blocks 1 and 3 have been accessed but 0 and 2 have not.

Example Output

• Following from the previous slide, your output would look similar to:

Cache{
[0] -> EMPTY
[1] -> M(32)
[2] -> EMPTY
[3] -> M(24)
}

• This is because 32 and 24 were the addresses used, and they caused cache data to be stored in blocks 1 and 3.
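One possible way to produce output in the shape shown above is sketched below. The function name `format_cache` and the convention of using `None` for an empty block are assumptions made for this example.

```python
def format_cache(addresses):
    """Format cache contents; addresses[i] is the address held in block i, or None if empty."""
    lines = ["Cache{"]
    for index, address in enumerate(addresses):
        contents = "EMPTY" if address is None else f"M({address})"
        lines.append(f"[{index}] -> {contents}")
    lines.append("}")
    return "\n".join(lines)

# Blocks 1 and 3 were filled using addresses 32 and 24; blocks 0 and 2 are untouched.
print(format_cache([None, 32, None, 24]))
```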

Cache Layouts

• There are three key pieces of information in describing a cache’s layout.

1. Size of the entire cache (in bytes)

2. Size of a single cache block (in bytes)

3. Number of blocks within a cache

• Number of Blocks = Size of Cache / Size of Block

• For example: 128 bytes / 32 bytes = 4 blocks

Cache Layouts

• When talking about cache size, block size and number of blocks, it is more useful to talk about them in terms of exponents of a given base.

• For a Decimal cache, the base is always 10

• For a Binary cache, the base is always 2

• The equations are exactly the same for working in decimal or binary, however…

• Computers work in binary; you should too.

Cache Layouts

• We use N to describe the size of the entire cache so:

– 2^N = cache size, in bytes (binary)

– 10^N = cache size, in bytes (decimal)

• We use M to describe the size of the blocks within the cache so:

– 2^M = block size, in bytes (binary)

– 10^M = block size, in bytes (decimal)

Cache Layouts

• We use I to describe the number of blocks within a cache so:

– 2^I = number of blocks (binary)

– 10^I = number of blocks (decimal)

• We know N and M, so we can always work out I:

– 2^N / 2^M = 2^I (binary)

– 10^N / 10^M = 10^I (decimal)

The laws of exponents say that a^x / a^y is the same as a^(x-y)

So we can say that I = N – M; this always holds

Cache Layouts

• To summarize, for any cache you want to simulate, you must specify:

– Whether you are working in binary or decimal, so you know whether to use 2 or 10 as a base

– N: 2^N or 10^N = how big your cache will be

– M: 2^M or 10^M = how big your blocks will be

– I: 2^I or 10^I = how many blocks you will have

• I = N – M
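The summary above can be turned into a small helper. This is only a sketch: the function name `cache_layout` and its argument order are assumptions, and the error checks are an addition for safety rather than something the slides require.

```python
def cache_layout(base, n, m):
    """Return (cache_size, block_size, num_blocks) for exponents n and m in the given base."""
    if base not in (2, 10):
        raise ValueError("base must be 2 (binary) or 10 (decimal)")
    if m > n:
        raise ValueError("a block cannot be larger than the cache")
    i = n - m  # I = N - M
    return base ** n, base ** m, base ** i

print(cache_layout(2, 5, 3))   # (32, 8, 4)
print(cache_layout(10, 3, 2))  # (1000, 100, 10)
```

Only the base changes between the binary and decimal cases; the calculation itself is identical.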

Example Layouts

• A Binary Cache, 32 bytes big with 8 byte blocks. Work out N, M and I

– N: 2^5 = 32, so N = 5

– M: 2^3 = 8, so M = 3

– I: N – M = 5 – 3, so I = 2

I is correct because 2^I = 2^2 = 4* and 2^N / 2^M = 2^5 / 2^3 = 32 / 8 = 4*

*NOTE: 4 is the number of blocks and thus the number of tags and validity bits we need to keep track of

Example Layouts

• A Decimal Cache, 1000 bytes big with 100 byte blocks. Work out N, M and I

– N: 10^3 = 1000, so N = 3

– M: 10^2 = 100, so M = 2

– I: N – M = 3 – 2, so I = 1

I is correct because 10^I = 10^1 = 10* and 10^N / 10^M = 10^3 / 10^2 = 1000 / 100 = 10*

*NOTE: 10 is the number of blocks and thus the number of tags and validity bits we need to keep track of

Block Sizes

• If you have a cache with blocks larger than 1 byte (meaning M is greater than 0), each block within the cache will represent more than 1 byte of information, meaning more than 1 address maps to each block

• When given some address, A, to access the cache with, if we know M, we can work out the Group Address (GA)

– The group address is a common prefix that all addresses referencing the same block will share

Block Sizes

• Example: A Binary cache, 32 bytes big with 8 byte blocks (so 4 blocks)

• N = 5, M = 3 and I = 2

• 2^5 = 32, 2^3 = 8 and 2^2 = 4

• If address 27 is used to access a block within the cache, 7 other addresses also point to that block, as the block contains 8 (2^3) bytes

Block Sizes

• How do we turn an address into a group address?

– We know that for N=5, M=3 and I=2, 8 addresses reference the same block.

• Because the example is binary, we will look at the addresses around 27 in binary.

24 = 11000
25 = 11001
26 = 11010
27 = 11011
28 = 11100
29 = 11101
30 = 11110
31 = 11111

Block Sizes

• You will notice that the first 2 digits are the same for all addresses. This means we need to eliminate the last 3 digits from each address.

– Remember what M equals? 3.

• So we know that to turn any address into a group address we want to remove the last M digits from it (in binary or decimal, of course).

Block Sizes

• How do we remove the last M digits from a number?

– Divide by 2^M (binary)

– Divide by 10^M (decimal)

• To turn any address into its group address you simply do:

– GA = A / 2^M (binary)

– GA = A / 10^M (decimal)

NOTE: You do not need to “convert” your numbers to binary if doing the binary cache; dividing by 2^M will work as if you had. I did this only to illustrate the common prefix.
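A minimal sketch of the group address calculation, assuming Python's `//` operator for integer division (it rounds down for the non-negative addresses used here). The function name `group_address` is an assumption for this example.

```python
def group_address(address, base, m):
    """Drop the last m digits (in the given base) of address using integer division."""
    return address // base ** m

# Binary cache with M = 3: addresses 24..31 all share the group address 3.
print([group_address(a, 2, 3) for a in range(24, 32)])

# Decimal cache with M = 2: the last two decimal digits are dropped.
print(group_address(1234, 10, 2))  # 12
```

No conversion to binary text is involved; the division alone removes the last M digits.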

Block Sizes

• Example: A Decimal cache, 1000 bytes big with 100 byte blocks (so 10 blocks)

• N = 3, M = 2 and I = 1

• 10^3 = 1000, 10^2 = 100 and 10^1 = 10

• If address 1234 is used to access a block within the cache, 99 other addresses also point to that block, as the block contains 100 (10^2) bytes

Block Sizes

• We know that M = 2, and 10^2 = 100, so to remove the last 2 digits from the address 1234, we divide it by 10^2, which is 100.

• 1234 / 100 = 12 (Note: Integer division, round down)

• This would turn every address from 1200 to 1299 (100 addresses) into the group address 12, the common prefix.

Block Sizes

• To summarize:

– If working in binary, to turn any address into its group address you divide it by 2^M

– If working in decimal, to turn any address into its group address you divide it by 10^M

– This must be integer division, so in dynamically typed languages (where division can return a fraction) always round the result down

– Don’t worry if you don’t fully understand it yet; just remember you have to do it

Cache Access

• For both schemes, the structure of your solution will be as follows:

– Given an address A, turn A into its corresponding GA

– Does the cache contain a block with GA’s data in it?

• Yes? Record as a Hit

• No? Record as a Miss

– Record that A was used to access the cache

– Repeat for another address A until done

– At the end, print out the contents of the cache, the number of hits and the number of misses
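The shared structure above might be sketched as a loop that delegates the scheme-specific work to a function. Everything named here (`simulate`, the `access` callback and the toy `one_block_access` scheme) is hypothetical and exists only to show the shape of the loop; it is not either of the two required schemes.

```python
def simulate(addresses, base, m, access):
    """Run every address through `access` and count hits and misses.

    `access(ga, address)` is a scheme-specific function that returns True on
    a hit and False on a miss, updating its own cache state as it goes.
    """
    hits = misses = 0
    for address in addresses:
        ga = address // base ** m  # turn A into its group address
        if access(ga, address):
            hits += 1
        else:
            misses += 1
    return hits, misses

# Toy scheme for demonstration only: a one-block cache that remembers the last GA.
last = {"ga": None}

def one_block_access(ga, address):
    hit = last["ga"] == ga
    last["ga"] = ga
    return hit

print(simulate([24, 27, 40, 24], 2, 3, one_block_access))  # (1, 3)
```

Printing the cache contents at the end is omitted here to keep the sketch short.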

Cache Access

• What differs between Direct Mapped and Fully Associative schemes?

1. How you check if the cache contains GA

2. How and where you record GA within the cache

Direct Mapped

• In a Direct Mapped scheme, an address, A, will always reference the same cache block

• This is achieved through a sequence of steps:

1. Calculate the Group Address (GA)

2. Calculate the cache Index and Tag from the GA

3. Use the Tag and Validity Bit at the calculated index to check for a hit or miss

4. Update the cache data and try the next address

NOTE: The Index you calculate refers to one of your cache blocks; it will always be between 0 and one less than the number of blocks you have within the cache (due to modulo arithmetic)

Direct Mapped

• How do you work out the index and tag?

– For a given GA, the last I digits are the index, everything before that is the tag

• This is why we need to know N and M, so we can work out I and then work out the index and tag for any address

– Where we used division to get rid of digits for the group address, we can use modulo to extract them, meaning:

• Cache Index = GA % 2^I (binary)

• Cache Tag = GA / 2^I (binary)

• Cache Index = GA % 10^I (decimal)

• Cache Tag = GA / 10^I (decimal)

NOTE: As with previous examples, you must round the result of the division down if using JavaScript.
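A short sketch of the index and tag split, again using Python's `//` so that the division rounds down. The function name `index_and_tag` is an assumption for this example.

```python
def index_and_tag(ga, base, i):
    """Split a group address into (index, tag) for a cache with base**i blocks."""
    num_blocks = base ** i
    index = ga % num_blocks   # the last i digits of the group address
    tag = ga // num_blocks    # everything before those digits
    return index, tag

print(index_and_tag(3, 2, 2))    # (3, 0)
print(index_and_tag(12, 10, 1))  # (2, 1)
```

The index always lies in the range 0 to `num_blocks - 1`, so it can be used directly to index a list of blocks.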

DM Example

• Binary cache, N = 5, M = 3, thus I = 2

• 32 byte cache, 8 byte blocks, gives 4 blocks

• 4 blocks means we need to keep track of 4 tags and 4 validity bits (vBits)

• The tags are initially empty, the vBits are initialised to false.

• We will ‘access’ the cache using a set of addresses and record if each caused a hit, or a miss

Initial output might look like this:

Cache{
[0] -> EMPTY
[1] -> EMPTY
[2] -> EMPTY
[3] -> EMPTY
}

Hits: 0, Misses: 0

DM Example

• Access with Address = 24

• GA = A / 2^M = 24 / 8 = 3

• Index = GA % 2^I = 3 % 4 = 3

• Tag = GA / 2^I = 3 / 4 = 0 (rounded down)

• Is the vBit set at the index?

– vBit[3] == false.

• If the vBit is set, do the tags match?

– vBit was false, do not check

• Hit if both were true, otherwise, Miss.

– This is a miss

• Log a Miss and update the data* to say 24 was used to access the cache

– vBit[3] = true, tag[3] = 0, address[3] = 24.

*Set the vBit at 3 to true, store the tag at 3 and remember A=24 was used to access block 3 last.

DM Example

• Following the previous slide, the output would now look like:

Cache{
[0] -> EMPTY
[1] -> EMPTY
[2] -> EMPTY
[3] -> M(24)
}

Hits: 0, Misses: 1

• This is because we used address 24, which resulted in a miss and new information being stored in block 3.

• Subsequent accesses may cause hits, misses and alter the contents of any of the blocks within the cache…

DM Example

• If any address from 24 to 31 (all the addresses within the group) were used to access the cache again, the result would be a hit, until another address mapped into the same block with a different tag

• Misses can happen because nothing is in the block whose index you have just calculated (the vBit would be false)

- OR -

• Because the tag currently stored at the index (from a previous access) does not match the tag you just calculated (from a new address and group address)

– This happens when your addresses can be larger than your cache size.

• E.g. In a cache 32 bytes big, addresses 0, 32, 64, 96 and 128 will all map to the same block (modulo arithmetic), so we need to check tags as well as vBits at the calculated index.
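Putting the direct mapped steps together, a compact simulation could look like the sketch below. It is one possible arrangement, not a reference solution: the function name `simulate_direct_mapped` and the `stored` list are assumptions, and it assumes the address shown for a block is the one that last filled it (a hit leaves it unchanged).

```python
def simulate_direct_mapped(addresses, base, n, m):
    """Return (hits, misses, stored) for a direct mapped cache."""
    num_blocks = base ** (n - m)
    block_size = base ** m
    tags = [None] * num_blocks
    vbits = [False] * num_blocks
    stored = [None] * num_blocks  # address that last filled each block
    hits = misses = 0
    for address in addresses:
        ga = address // block_size
        index, tag = ga % num_blocks, ga // num_blocks
        if vbits[index] and tags[index] == tag:
            hits += 1
        else:
            # Either the block was empty or it held a different tag.
            misses += 1
            vbits[index] = True
            tags[index] = tag
            stored[index] = address
    return hits, misses, stored

# 24 misses, 27 hits (same group), then 0, 32 and 0 evict each other from block 0.
print(simulate_direct_mapped([24, 27, 0, 32, 0], 2, 5, 3))
# (1, 4, [0, None, None, 24])
```

Addresses 0 and 32 share index 0 but have different tags, which is why checking the validity bit alone would not be enough.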

Fully Associative

• In a Fully Associative caching scheme, the steps are simpler, conceptually.

• The idea with FA is to store the information at the first available space.

– This means addresses do not always end up in the same block, so you do not need to calculate a cache index

Fully Associative

• Fully associative schemes work the following way:

1. Calculate the group address from the address

2. Use the entire group address as the tag

3. Scan the entire cache and see if the tag is already in the cache

4. If the tag is found, record a hit

5. If the tag is not found, record a miss, and store the tag in the first available block (the first block whose vBit is false)

6. If there are no spaces available, remove the LEAST recently used (LRU) block from the cache

7. Update the cache so the new tag is stored (if there was a miss) and remember this block is the MOST recently used.

• Set vBit to true, store tag in block, remember which address was used originally.
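The steps above could be sketched as follows. This version makes one particular bookkeeping choice, which is an assumption rather than a requirement: it keeps the blocks in a list ordered from most to least recently used, so the list position doubles as the LRU information and no separate validity bits are needed. The function name `simulate_fully_associative` is also an assumption.

```python
def simulate_fully_associative(addresses, base, n, m):
    """Return (hits, misses, blocks), with blocks ordered most to least recently used."""
    num_blocks = base ** (n - m)
    block_size = base ** m
    blocks = []  # list of (tag, address) pairs, most recently used first
    hits = misses = 0
    for address in addresses:
        tag = address // block_size  # the whole group address is the tag
        position = next((p for p, (t, _) in enumerate(blocks) if t == tag), None)
        if position is not None:
            hits += 1
            blocks.insert(0, blocks.pop(position))  # hit: now the most recently used
        else:
            misses += 1
            if len(blocks) == num_blocks:
                blocks.pop()  # cache full: evict the least recently used block
            blocks.insert(0, (tag, address))
    return hits, misses, blocks

print(simulate_fully_associative([0, 8, 16, 24, 0, 32, 8], 2, 5, 3))
# (1, 6, [(1, 8), (4, 32), (0, 0), (3, 24)])
```

In the run shown, the second access to address 0 is the only hit; address 32 then evicts the block filled by address 8, so the final access to 8 misses again.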

FA Example

• Same cache as the DM example

– The initial output would be the same, empty cache with 0 hits and 0 misses

• Access with A = 24

• Calculate GA = A / 2^M = 24 / 8 = 3

• Tag = GA

• Is 3 currently in the cache?

– No it is not

• Store 3 in the first available block (block 0, vBit is false) and set the vBit for block 0 to true

FA Example

• Following the previous slide, the output would now look like:

Cache{
[0] -> M(24)
[1] -> EMPTY
[2] -> EMPTY
[3] -> EMPTY
}

Hits: 0, Misses: 1

• This is because we used address 24, which resulted in a miss and new information being stored in the first available block, 0.

• Subsequent accesses may cause new tags to be placed into the cache; you will need to make sure that the least recently used blocks are the ones that are replaced

Cache Data

• For any cache with 2^I or 10^I blocks, you will need to keep track of 2^I or 10^I validity bits and tags

• 2^I or 10^I will change depending on the cache size and block size (I changes as N and M do)

• Collections (e.g. Arrays and Lists) are ideal constructs to use for storing this information

• In DM you will know the index by working it out from the group address

• For FA you will need to scan the entire collection looking for the tag, empty block or a block to replace

Least Recently Used

• How do you keep track of the least recently used block for fully associative caches?

• Two usual ways:

1. Keep the blocks in order, from most recently used to least recently used.

• When you need to replace a block, you remove the old one from the end and add the new one onto the beginning

• When you get a hit, move the hit block to the beginning of the cache

2. Store ‘time’ information for each block in the cache so you can find the least recently used one

• Requires you to store more information, easy to get wrong
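For the second approach, the ‘time’ information could be a simple access counter stored per block. The helper below is a sketch under that assumption; the name `choose_block` and the `last_used` list are hypothetical.

```python
def choose_block(vbits, last_used):
    """Return the index of the block to fill on a miss.

    Prefers the first empty block; otherwise picks the block with the
    smallest last_used value, which is the least recently used one.
    """
    for index, valid in enumerate(vbits):
        if not valid:
            return index
    return min(range(len(last_used)), key=lambda index: last_used[index])

# last_used[i] holds a counter value recorded at block i's most recent access.
print(choose_block([True, True, False, True], [4, 1, 0, 3]))  # 2 (first empty block)
print(choose_block([True, True, True, True], [4, 1, 5, 3]))   # 1 (oldest access)
```

With this approach the counter must be updated on every hit as well as every miss, which is the part that is easy to get wrong.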

Summary

• Use these slides in conjunction with the coursework specification and examples in your CSC1016 notes

– For DM and FA examples I did not cover the other things that can happen (hits, misses where tags do not match)

– You should do some pen and paper examples to understand how both schemes work

– Your resulting code will be quite small, but it will require you to understand what is going on

Final Notes

• Do NOT use String manipulation (including converting numbers to binary strings) to solve this coursework

– Hardware caches work using arithmetic; you need to as well

• It is possible to have your code deal with binary and decimal without changing your calculations (all that changes is the base, 10 or 2).
