chapter 2: memory hierarchy design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_a.pdf · memory...

27
1 CS359: Computer Architecture Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science and Engineering

Upload: others

Post on 20-Oct-2019

21 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

1

CS359: Computer Architecture

Memory Hierarchy Design(Appendix B and Chapter 2)

Yanyan ShenDepartment of Computer Science and Engineering

Page 2: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

2

Memory Hierarchy Design

Roadmap

2.1 Cache OrganizationVirtual MemorySix Basic Cache OptimizationsTen Advanced Optimizations of Cache

PerformanceMemory Technology and OptimizationsVirtual Memory and ProtectionProtection: Virtual Memory and Virtual

Machines

Page 3: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

3

Memory Hierarchy Design

Computer Memory Memory is a large linear array of bytes.

Each byte has a unique address(location).

Byte of data at address 0x100, and 0x101 Most computers support byte (8-bit)

addressing. Data may have to be aligned on word (4

bytes) or double word (8 bytes) boundary. int is 4 bytes double precision floating point is 8 bytes

32-bit vs. 64-bit addresses! we will assume 32-bit for rest of course,

unless otherwise stated!

Page 4: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

4

Memory Hierarchy Design

Our Naïve View of Memory

What issues do we need to worry about in implementing thememory system?

Page 5: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

6

Memory Hierarchy Design

Performance Gap between Memory and Processor

Page 6: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

7

Memory Hierarchy Design

SecondLevelCache

(SRAM)

A Typical Memory HierarchyTake advantage of the principle of locality to present the user with (1) as much memory as is available in the cheapest technology, (2) but at the speed offered by the fastest technology

Control

Datapath

SecondaryMemory(Disk)

On-Chip Components

RegFile

MainMemory(DRAM)D

ataC

acheInstr

Cache

ITLBD

TLB

Access time (ns): 0.1 1 10 100 10,000Size (bytes): 100 10K M G T

Cost: highest lowest

Page 7: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

8

Memory Hierarchy Design

SRAM and DRAM Caches use SRAM for

speed and technology compatibility Fast (typical access times

of 0.5 to 2.5 nsec) Low density (6 transistor

cells), higher power, expensive ($2000 to $5000 per GB in 2008)

Static: content will last “forever” (as long as power is left on)

Main memory uses DRAM for size (density) Slower (typical access times

of 50 to 70 nsec) High density (1 transistor

cells), lower power, cheaper ($20 to $75 per GB in 2008)

Dynamic: needs to be “refreshed” regularly (~ every 8 ms)

- consumes1% to 2% of the active cycles of the DRAM

Addresses divided into 2 halves (row and column)

- RAS or Row Access Strobe triggering the row decoder

- CAS or Column Access Strobe triggering the column selector

SRAM: 6 transistors DRAM: 1 transistor

Page 8: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

9

Memory Hierarchy Design

Locality

• Recently referenced items are likely to be referenced in the near future.

Temporal locality

• Items with nearby addresses tend to be referenced close together in time.

Spatial locality

Principle of Locality:Programs tend to reuse data and instructions near

those they have used recently, or those were recently referenced themselves.

Page 9: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

10

Memory Hierarchy Design

The Memory Hierarchy: How Does it Work? Temporal Locality (locality in time)

If a memory location is referenced then it will tend to be referenced again soon

⇒ Keep most recently accessed data items closer to the processor

Spatial Locality (locality in space) If a memory location is referenced, the locations

with nearby addresses will tend to be referenced soon

⇒ Move blocks consisting of contiguous words closer to the processor

Page 10: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

11

Memory Hierarchy Design

Processor

Control

Datapath

Memory

Devices

Input

Output

Cache

Main

Mem

ory

Secondary M

emory

(Disk)

Overview

Page 11: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

12

Memory Hierarchy Design

Real Organization of Hierarchy

Page 12: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

13

Memory Hierarchy Design

Characteristics of the Memory Hierarchy

Increasing distance from the processor in accesstime

L1$

L2$

Main Memory

Secondary Memory

Processor

(Relative) size of the memory at each level

Inclusive–what is in L1$ is a subset of what is in L2$ is a subset of what is in MM that is a subset of is in SM

4-8 bytes (word)

1 to 4 blocks

1,024+ bytes (disk sector = page)

8-32 bytes (block)

Page 13: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

14

Memory Hierarchy Design

Example

Page 14: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

15

Memory Hierarchy Design

Terminology Block (or line): the minimum unit of information that is present

(or not) in a cache Hit Rate: the fraction of memory accesses found in a level of the

memory hierarchy Hit Time: Time to access that level which consists of

(1) Time to access the block + (2) Time to determine hit/miss

Miss Rate: the fraction of memory accesses not found in a level of the memory hierarchy ⇒ 1 - (Hit Rate) Miss Penalty: Time to replace a block in that level with the

corresponding block from a lower level which consists of(1) Time to access the block in the lower level + (2) Time to transmit that block to the level that experienced the miss + (3) Time to insert the block in that level + (4) Time to pass the block to the requestor

Hit Time << Miss Penalty

Page 15: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

16

Memory Hierarchy Design

Page 16: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

17

Memory Hierarchy Design

How is the Hierarchy Managed?

registers ↔ memoryby compiler

cache ↔ main memoryby the cache controller hardware

main memory ↔ disksby the operating system (virtual memory)physical address mapping assisted by the

hardware (TLB)

Page 17: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

18

Memory Hierarchy Design

Four Memory Hierarchy Questions

Q1 (block placement): where can a block be placed in cache?

Q2 (block identification): how is a block found if it is in cache?

Q3 (block replacement): which block should be replaced on a miss?

Q4 (write strategy): what happens on a write?

Page 18: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

19

Memory Hierarchy Design

Q1: Block Placement

Cache:8 blocks

Memory:32 blocks

Page 19: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

20

Memory Hierarchy Design

Q2: Block Identification Caches have an address tag on each block

frame that gives the block address It is checked to see if it matches the block address

from the processor Each cache block has a valid bit

To indicate if the entry has a valid address

031

Page 20: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

21

Memory Hierarchy Design

Two questions to answer (in hardware):

Question 2: How do we know if a data item is in the cache?

Question 1: If it is, how do we find it?

Page 21: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

22

Memory Hierarchy Design

Direct mapped Each memory block is mapped to exactly one

block in the cache lots of blocks must share blocks in the cache

Address mapping (to answer Q1):(block address) modulo (# of blocks in the cache)

Have a tag associated with each cache block that contains the address information (the upper portion of the address) required to identify the block (to answer Q2)

Page 22: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

23

Memory Hierarchy Design

Caching: A Simple First Example

00011011

Cache

Main Memory

Q1: How do we find it?

Use next 2 low order memory address bits – the index – to determine which cache block (i.e., modulo the number of blocks in the cache)

Tag Data

Q2: Is it there?

Compare the cache tag to the high order 2 memory address bits to tell if the memory block is in the cache

Valid

0000xx0001xx0010xx0011xx0100xx0101xx0110xx0111xx1000xx1001xx1010xx1011xx1100xx1101xx1110xx1111xx

• One-word blocks

• Two low order bits define the byte in the word (32b words)

(block address) modulo (# of blocks in the cache)

Index

Byte addressable!

Page 23: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

24

Memory Hierarchy Design

Direct Mapped Cache Example

One word blocks, cache size = 1K words (or 4KB)

20Tag 10Index

DataIndex TagValid012...

102110221023

31 30 . . . 13 12 11 . . . 2 1 0Blockoffset

20

Data

32

Hit

Page 24: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

25

Memory Hierarchy Design

Multiword Block Direct Mapped Cache

Four words/block, cache size = 1K words

8Index

DataIndex TagValid012...

253254255

31 30 . . . 13 12 11 . . . 4 3 2 1 0 Byte offset

20

20Tag

Hit Data

32

Block offset

Page 25: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

26

Memory Hierarchy Design

Set Associative MappedAllow more flexible block placement

In a direct mapped cache a memory block maps to exactly one cache block

At the other extreme, could allow a memory block to be mapped to any cache block – fully associative cache

A compromise is to divide the cache into sets each of which consists of n “ways” (n-way set associative). A memory block maps to a unique set (specified by the index field) and can be placed in any way of that set (so there are n choices)

(block address) modulo (# sets in the cache)

Page 26: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

27

Memory Hierarchy Design

Set Associative Cache Example

0

Cache

Main Memory

Q1: How do we find it?

Use next 1 low order memory address bit to determine which cache set (i.e., modulo the number of sets in the cache)

Tag Data

Q2: Is it there?

Compare all the cache tags in the set to the high order 3 memory address bits to tell if the memory block is in the cache

V

0000xx0001xx0010xx0011xx0100xx0101xx0110xx0111xx1000xx1001xx1010xx1011xx1100xx1101xx1110xx1111xx

Set

101

Way

0

1

• One-word blocks• Two low order bits

define the byte in the word (32b words)

Page 27: Chapter 2: Memory Hierarchy Design - cs.sjtu.edu.cnshen-yy/cs359/slides/5_memory_A.pdf · Memory Hierarchy Design (Appendix B and Chapter 2) Yanyan Shen Department of Computer Science

28

Memory Hierarchy Design

Four-Way Set Associative Cache28 = 256 sets each with four ways (each with one block)

31 30 . . . 13 12 11 . . . 2 1 0 Byte offset

DataTagV012...

253254255

DataTagV012...

253254255

DataTagV012...

253254255

Index DataTagV012...

253254255

8Index

22Tag

Hit Data

32

4x1 select

Way 0 Way 1 Way 2 Way 3