iki10230 pengantar organisasi komputer bab 5.2: cache

of 29 /29
1 IKI10230 Pengantar Organisasi Komputer Bab 5.2: Cache 23 April 2003 Bobby Nazief ([email protected]) Qonita Shahab ([email protected]) bahan kuliah: http://www.cs.ui.ac.id/kuliah/iki10230/ Sumber : 1. Hamacher. Computer Organization, ed-5. 2. Materi kuliah CS152/1997, UCB.

Author: slade-dale

Post on 14-Mar-2016

46 views

Category:

Documents


0 download

Embed Size (px)

DESCRIPTION

IKI10230 Pengantar Organisasi Komputer Bab 5.2: Cache. Sumber : 1. Hamacher. Computer Organization , ed-5. 2. Materi kuliah CS152/1997, UCB. 23 April 2003 Bobby Nazief ([email protected]) Qonita Shahab ([email protected]) bahan kuliah: http://www.cs.ui.ac.id/kuliah/iki10230/. - PowerPoint PPT Presentation

TRANSCRIPT

AT90S8515*
sangat cepat waktu eksekusi dalam orde nanoseconds sampai dengan picoseconds
perlu mengakses kode dan data program!
Dimana program berada?
so how do we account for this gap?
Menggunakan teknologi memori!
Memory (DRAM)
Kapasitas jauh lebih besar dari registers, lebih kecil dari disk (tetap terbatas)
Access time ~50-100 nano-detik, jauh lebih cepat dari disk (mili-detik)
Mengandung subset data pada disk (basically portions of programs that are currently being run)
Fakta: memori dengan kapasitas besar (murah!) lambat, sedangkan memori dengan kapasitas kecil (mahal) cepat.
Solution: bagaimana menyediakan (ilusi) kapasitas besar dan akses cepat!
*
Increasing Distance from Proc.,
Lebih kecil,
Lebih cepat,
Subset semua data pada level lebih atas (mis. menyimpan data yang sering digunakan)
Efisien dalam pemilihan mana data yang akan disimpan, karena tempat terbatas
Lowest Level (usually disk) contains all available data
*
Memory Hierarchy Analogy: Library (1/2)
You’re writing a term paper (Processor) at a table in Doe
Doe Library is equivalent to disk
essentially limitless capacity
Table is memory
smaller capacity: means you must return book when table fills up
*
Open books on table are cache
smaller capacity: can have very few open books fit on table; again, when table fills up, you must close a book
much, much faster to retrieve data
Illusion created: whole library open on the tabletop
*
The Principle of Locality:
Program access a relatively small portion of the address space at any instant of time.
Address Space
of reference
The principle of locality states that programs access a relatively small portion of the address space at any instant of time.
This is kind of like in real life, we all have a lot of friends. But at any given time most of us can only keep in touch with a small group of them.
There are two different types of locality: Temporal and Spatial. Temporal locality is the locality in time which says if an item is referenced, it will tend to be referenced again soon.
This is like saying if you just talk to one of your friends, it is likely that you will talk to him or her again soon.
This makes sense. For example, if you just have lunch with a friend, you may say, let’s go to the ball game this Sunday. So you will talk to him again soon.
Spatial locality is the locality in space. It says if an item is referenced, items whose addresses are close by tend to be referenced soon.
Once again, using our analogy. We can usually divide our friends into groups. Like friends from high school, friends from work, friends from home.
Let’s say you just talk to one of your friends from high school and she may say something like: “So did you hear so and so just won the lottery.”
You probably will say NO, I better give him a call and find out more.
So this is an example of spatial locality. You just talked to a friend from your high school days. As a result, you end up talking to another high school friend. Or at least in this case, you hope he still remember you are his friend.
+3 = 10 min. (X:50)
Temporal Locality (Locality in Time):
Keep most recently accessed data items closer to the processor
Spatial Locality (Locality in Space):
Move blocks consists of contiguous words to the upper levels
Lower Level
To Processor
From Processor
Blk X
Blk Y
How does the memory hierarchy work? Well it is rather simple, at least in principle.
In order to take advantage of the temporal locality, that is the locality in time, the memory hierarchy will keep those more recently accessed data items closer to the processor because chances are (points to the principle), the processor will access them again soon.
In order to take advantage of the spatial locality, not ONLY do we move the item that has just been accessed to the upper level, but we ALSO move the data items that are adjacent to it.
+1 = 15 min. (X:55)
By taking advantage of the principle of locality:
Present the user with as much memory as is available in the cheapest technology.
Provide access at the speed offered by the fastest technology.
Control
Datapath
Secondary
Storage
(Disk)
Processor
Registers
Main
Memory
(DRAM)
Second
Level
Cache
(SRAM)
On-Chip
Cache
1s
10,000,000s
Ts
The design goal is to present the user with as much memory as is available in the cheapest technology (points to the disk).
While by taking advantage of the principle of locality, we like to provide the user an average access speed that is very close to the speed that is offered by the fastest technology.
(We will go over this slide in details in the next lecture on caches).
+1 = 16 min. (X:56)
Registers <-> Memory
by the programmer (files)
Disk contains everything.
When Processor needs something, bring it into to all lower levels of memory.
Cache contains copies of data in memory that are being used.
Memory contains copies of data on disk that are being used.
*
Kecil, storage cepat untuk meningkatkan “average access time” dibandingkan memori
Memanfaatkan spatial & temporal locality
Registers “a cache” on variables – software managed
First-level cache a cache on second-level cache
Second-level cache a cache on memory
Memory a cache on disk (virtual memory)
*
Bagaimana organisasi cache? (ingat: cache akan menerima alamat memori dari prosesor, tidak ada perubahan pada rancangan prosesor ada/tanpa cache).
Bagaimana melakukan mapping (relasi) antara alamat memori dengan cache (ingat: prosesor hanya melihat memory byte addressable)
*
Cache: Blok Memory
Transfer data antara cache dan memori dalam satuan blok (kelipatan words)
Mapping (penerjemahan) antara blok di cache dan di main memory (ingat: cache copy dari main memory)
Hit
Main
Memory
Cache
To Processor
From Processor
Blk X
Blk Y
How does the memory hierarchy work? Well it is rather simple, at least in principle.
In order to take advantage of the temporal locality, that is the locality in time, the memory hierarchy will keep those more recently accessed data items closer to the processor because chances are (points to the principle), the processor will access them again soon.
In order to take advantage of the spatial locality, not ONLY do we move the item that has just been accessed to the upper level, but we ALSO move the data items that are adjacent to it.
+1 = 15 min. (X:55)
data lokasi dari:
Lokasi memori:0, 4, 8, ...
Secara umum: lokasi memori yang merupakan kelipatan 4 (jumlah blok cache)
Memory
Memory
Address
0
1
2
3
4
5
6
7
8
9
A
B
C
D
E
F
Cache
Index
0
1
2
3
Let’s look at the simplest cache one can build. A direct mapped cache that only has 4 bytes.
In this direct mapped cache with only 4 bytes, location 0 of the cache can be occupied by data form memory location 0, 4, 8, C, ... and so on.
While location 1 of the cache can be occupied by data from memory location 1, 5, 9, ... etc.
So in general, the cache location where a memory location can map to is uniquely determined by the 2 least significant bits of the address (Cache Index).
For example here, any memory location whose two least significant bits of the address are 0s can go to cache location zero.
With so many memory locations to chose from, which one should we place in the cache?
Of course, the one we have read or write most recently because by the principle of temporal locality, the one we just touch is most likely to be the one we will need again soon.
Of all the possible memory locations that can be placed in cache Location 0, how can we tell which one is in the cache?
+2 = 22 min. (Y:02)
Cache
Index
0
1
2
3
Let’s look at the simplest cache one can build. A direct mapped cache that only has 4 bytes.
In this direct mapped cache with only 4 bytes, location 0 of the cache can be occupied by data form memory location 0, 4, 8, C, ... and so on.
While location 1 of the cache can be occupied by data from memory location 1, 5, 9, ... etc.
So in general, the cache location where a memory location can map to is uniquely determined by the 2 least significant bits of the address (Cache Index).
For example here, any memory location whose two least significant bits of the address are 0s can go to cache location zero.
With so many memory locations to chose from, which one should we place in the cache?
Of course, the one we have read or write most recently because by the principle of temporal locality, the one we just touch is most likely to be the one we will need again soon.
Of all the possible memory locations that can be placed in cache Location 0, how can we tell which one is in the cache?
+2 = 22 min. (Y:02)
Cache
Index
0
1
2
3
Cache
Set
0
1
Let’s look at the simplest cache one can build. A direct mapped cache that only has 4 bytes.
In this direct mapped cache with only 4 bytes, location 0 of the cache can be occupied by data form memory location 0, 4, 8, C, ... and so on.
While location 1 of the cache can be occupied by data from memory location 1, 5, 9, ... etc.
So in general, the cache location where a memory location can map to is uniquely determined by the 2 least significant bits of the address (Cache Index).
For example here, any memory location whose two least significant bits of the address are 0s can go to cache location zero.
With so many memory locations to chose from, which one should we place in the cache?
Of course, the one we have read or write most recently because by the principle of temporal locality, the one we just touch is most likely to be the one we will need again soon.
Of all the possible memory locations that can be placed in cache Location 0, how can we tell which one is in the cache?
+2 = 22 min. (Y:02)
Masalah: multiple alamat memori “map” ke indeks blok yang sama!
Bagaimana kita mengetahui apakah alamat blok yang diinginkan berada di cache?
Harus ada identifikasi blok memori dikaitkan dengan alamat
Bagaimana jika ukuran blok > 1 byte?
Solusi: bagi alamat dalam bentuk fields
ttttttttttttttttt iiiiiiiiii oooo
if have select within
correct block block block
Semua fields untuk penerjemahan dianggap sebagai bilangan positif integer.
*
Setiap blok terdiri dari 4 word
Tentukan ukuran field: tag, index, dan offset jika menggunakan komputer 32-bit.
Offset
blok mempunyai: 4 words 16 bytes 24 bytes
diperlukan 4 bit untuk menentukan alamat byte
*
Index: (~index into an “array of blocks”)
diperlukan untuk menentukan blok yang mana pada cache (rows yang mana)
cache mempunyai 16 KB = 214 bytes
blok mempunyai 24 bytes (4 words)
# rows/cache = # blocks/cache (sebab setiap rows terdiri dari satu blok)
= bytes/cache / bytes/row
diperlukan 10 bit untuk menentukan indeks pada rows dari cache
Tag: 32 – (4 + 10) = 18 bit
*
4 bits offset
10 bit index
18 bit tag
Akses dilakukan dengan mencari index blok/ rows
Bandingkan tag dari alamat dan tag yang disimpan pada row tersebut
Jika tag sama => HIT, ambil byte (offset)
Jika tidak sama => MISS, ambil blok dari memori.
*
Cache
Tag
000000000000000000
Cache
Index
0000000000
0000000001
0000000010
0000000011
ttttttttttttttttt iiiiiiiiii oooo
Let’s look at the simplest cache one can build. A direct mapped cache that only has 4 bytes.
In this direct mapped cache with only 4 bytes, location 0 of the cache can be occupied by data form memory location 0, 4, 8, C, ... and so on.
While location 1 of the cache can be occupied by data from memory location 1, 5, 9, ... etc.
So in general, the cache location where a memory location can map to is uniquely determined by the 2 least significant bits of the address (Cache Index).
For example here, any memory location whose two least significant bits of the address are 0s can go to cache location zero.
With so many memory locations to chose from, which one should we place in the cache?
Of course, the one we have read or write most recently because by the principle of temporal locality, the one we just touch is most likely to be the one we will need again soon.
Of all the possible memory locations that can be placed in cache Location 0, how can we tell which one is in the cache?
+2 = 22 min. (Y:02)
Replacement Algorithms
Berlaku bagi:
Associative Mapping
Set-Associative Mapping
Bagaimana menempatkan blok baru jika cache sudah penuh berpengaruh pada kinerja!
Beberapa algoritme: