04/21/23 PRASHASTI KANIKAR 1
MEMORY ORGANIZATION
Internal Memory
The Memory Hierarchy
Characteristics of Computer Memory
• Location: internal, external
• Capacity
• Modes of transfer: word transfer, block transfer
• Access time
• Reliability
• Cost
• Access method: serial, parallel
Paging
If a page is not in physical memory:
• find the page on disk
• find a free frame
• bring the page into memory
What if there is no free frame in memory?
Page Replacement
Basic idea:
• if there is a free frame in memory, use it
• if not, select a victim frame
• write the victim out to disk
• read the desired page into the new free frame
• update the page tables
• restart the process
Page Replacement
The main objective of a good replacement algorithm is to achieve a low page fault rate:
• ensure that heavily used pages stay in memory
• the replaced page should not be needed for some time
• efficient code
Reference String
Reference string is the sequence of pages being referenced
First-In, First-Out (FIFO)
The oldest page in physical memory is the one selected for replacement
Very simple to implement
FIFO
•Consider the following reference string: 0, 2, 1, 6, 4, 0, 1, 0, 3, 1, 2, 1
(four page frames; faults are marked x; the first reference to each page is a compulsory miss)

Ref:    0  2  1  6  4  0  1  0  3  1  2  1
F1:     0  0  0  0  4  4  4  4  4  4  2  2
F2:        2  2  2  2  0  0  0  0  0  0  0
F3:           1  1  1  1  1  1  3  3  3  3
F4:              6  6  6  6  6  6  1  1  1
Fault:  x  x  x  x  x  x        x  x  x

•Fault Rate = 9 / 12 = 0.75
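The FIFO count above can be checked with a short simulation. This is a sketch, not from the slides; the four-frame memory size is an assumption consistent with the 9/12 fault rate:

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults for FIFO replacement with `frames` frames."""
    memory = deque()               # oldest page sits at the left end
    faults = 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.popleft()   # evict the oldest resident page
            memory.append(page)
    return faults

refs = [0, 2, 1, 6, 4, 0, 1, 0, 3, 1, 2, 1]
print(fifo_faults(refs, 4))        # 9 faults -> 9 / 12 = 0.75
```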
FIFO Issues
• Poor replacement policy: FIFO evicts the oldest page in the system
• A heavily used variable should usually be around for a long time, but FIFO replaces the oldest page, perhaps the one with the heavily used variable
• FIFO does not consider page usage
Least Recently Used (LRU) Basic idea
replace the page in memory that has not been accessed for the longest time
LRU
•Consider the following reference string: 0, 2, 1, 6, 4, 0, 1, 0, 3, 1, 2, 1
(four page frames; faults are marked x; the first reference to each page is a compulsory miss)

Ref:    0  2  1  6  4  0  1  0  3  1  2  1
F1:     0  0  0  0  4  4  4  4  4  4  2  2
F2:        2  2  2  2  0  0  0  0  0  0  0
F3:           1  1  1  1  1  1  1  1  1  1
F4:              6  6  6  6  6  3  3  3  3
Fault:  x  x  x  x  x  x        x     x

•Fault Rate = 8 / 12 = 0.67
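The LRU count can be checked the same way. A sketch, assuming the same four-frame memory as in the FIFO example:

```python
def lru_faults(refs, frames):
    """Count page faults for LRU replacement with `frames` frames."""
    memory = []                    # least recently used page at index 0
    faults = 0
    for page in refs:
        if page in memory:
            memory.remove(page)    # hit: page becomes most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.pop(0)      # evict the least recently used page
        memory.append(page)
    return faults

refs = [0, 2, 1, 6, 4, 0, 1, 0, 3, 1, 2, 1]
print(lru_faults(refs, 4))         # 8 faults -> 8 / 12 = 0.67
```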
LRU Issues
• How to keep track of the last page access? This requires special hardware support
• Two major solutions:
  Counters: a hardware clock "ticks" on every memory reference; the page referenced is marked with this "time"; the page with the smallest "time" value is replaced
  Stack: keep a stack of references; on every reference to a page, move it to the top of the stack; the page at the bottom of the stack is the next one to be replaced
LRU Issues
• Both techniques just listed require additional hardware
• Remember, memory references are very common, so it is impractical to invoke software on every memory reference
• LRU is therefore not used very often; instead, we will try to approximate LRU
Page Replacement Example
• Page reference string: ABCABDADBCB
• Three pages of physical memory
• Question: which pages are in memory, and which accesses cause a page fault?
FIFO (faults marked x):
Ref:    A  B  C  A  B  D  A  D  B  C  B
1:      A  A  A  A  A  D  D  D  D  C  C
2:         B  B  B  B  B  A  A  A  A  A
3:            C  C  C  C  C  C  B  B  B
Fault:  x  x  x        x  x     x  x

LRU (faults marked x):
Ref:    A  B  C  A  B  D  A  D  B  C  B
1:      A  A  A  A  A  A  A  A  A  C  C
2:         B  B  B  B  B  B  B  B  B  B
3:            C  C  C  D  D  D  D  D  D
Fault:  x  x  x        x           x
FIFO: 7 page faults
LRU: 5 page faults
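Both counts can be reproduced with one small simulator (a sketch; `count_faults` is a name chosen here, not from the slides):

```python
def count_faults(refs, frames, policy):
    """Page faults for FIFO or LRU with the given number of frames."""
    memory, faults = [], 0
    for page in refs:
        if page in memory:
            if policy == "LRU":
                memory.remove(page)   # a hit refreshes recency under LRU
                memory.append(page)
        else:
            faults += 1
            if len(memory) == frames:
                memory.pop(0)         # evict oldest (FIFO) or LRU page
            memory.append(page)
    return faults

refs = list("ABCABDADBCB")
print(count_faults(refs, 3, "FIFO"))  # 7
print(count_faults(refs, 3, "LRU"))   # 5
```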
Semiconductor main memory
Semiconductor Memory Types
RAM
• Misnamed, as all semiconductor memories are random access
• Read/write
• Volatile
• Temporary storage
• Static or dynamic
Memory Cells
[Figure: a memory cell stores 0/1 and has three terminals: SELECT (select the cell), CONTROL (read or write), and DATA IN / SENSE (input or output)]
Dynamic RAM
• Bits stored as charge in capacitors
• Needs refreshing even when powered
• Simpler construction
• Less expensive
• Needs refresh circuits
• Slower
• Used for main memory
• Essentially analogue: the level of charge determines the value
Dynamic RAM Structure
The transistor acts like a switch that is closed (allowing current to flow) if voltage is applied to the address line, and open (no current flow) if no voltage is present on the address line.
DRAM Operation
• Address line is active when a bit is read or written
• Write:
  voltage on the bit line, high for 1 and low for 0
  then signal the address line, which transfers the charge to the capacitor
• Read:
  address line selected, transistor turns on
  charge from the capacitor is fed via the bit line to a sense amplifier, which compares it with a reference value to determine 0 or 1
  the capacitor charge must then be restored
Static RAM
• Bits stored as on/off switches (flip-flops)
• No charges to leak, so no refreshing is needed while powered
• More complex construction
• More expensive
• Does not need refresh circuits
• Faster
• Used for cache
• Digital
Static RAM Structure
• When the address line is active, T5 and T6 are switched on for read/write
• Read: the bit value is read from bit line B
• Write: the desired value is applied to bit line B and its complement to the other bit line (this forces T1, T2, T3, T4 into the proper state)
SRAM v DRAM
• Both are volatile: power is needed to preserve the data
• Dynamic cell: simpler to build and smaller; more dense; less expensive; needs refresh; favoured for larger memory units
• Static cell: faster; favoured for cache
Read Only Memory (ROM)
• Permanent storage; nonvolatile
• Uses: library subroutines, systems programs (BIOS)
Types of ROM
• ROM: written during manufacture; very expensive for small runs
• PROM: programmable (once); needs special equipment to program
• Read-"mostly" memories:
  Erasable Programmable (EPROM): erased by UV light
  Electrically Erasable (EEPROM): takes much longer to write than to read
  Flash memory: the whole memory is erased electrically
Chip logic
• A 16Mbit chip can be organised as 1M of 16-bit words
• A 16Mbit chip can be organised as a 2048 x 2048 x 4-bit array
• Address lines supply the address of the word to be selected
• Row address and column address are multiplexed over 11 pins (2^11 = 2048)
Typical 16 Mb DRAM (4M x 4)
4 bits are read/written at a time.
Logically organized as 4 square arrays of 2048x2048 elements.
• Half of the address lines come in at a time; they are passed to select logic which multiplexes in the other half
• Addresses are accompanied by CAS and RAS signals to provide timing
• Refresh is handled by disabling the chip while the memory cells are refreshed
• Result: because of multiplexed addressing and square arrays, memory size quadruples with each generation of chips
256kByte Module organisation
It is physically impossible for any data recording or transmission medium to be 100% perfect.
As more bits are packed onto a square centimeter of disk storage, and as communications transmission speeds increase, the likelihood of error increases, sometimes geometrically.
Thus, error detection and correction is critical to accurate data transmission, storage and retrieval.
Error Detection and Correction
Error Correction
• Hard failure: a permanent defect
• Soft error: random and non-destructive; no permanent damage to memory
• Detected using a Hamming error-correcting code
In a single-bit error, only one bit in the data unit has changed.
A burst error means that 2 or more bits in the data unit have changed.
Detection
Redundancy
Parity Check
Cyclic Redundancy Check (CRC)
Checksum
Error Correcting Code Function
Error Correction and Checking
• Add bits to a block to use for error discovery
• Detection only
• Detection and retransmission
• Detection and recovery
[Figure: a block consisting of a header, a body, and a check field]
Error Detection Only (Asynchronous Transmission)
[Figure: 7 data bits followed by a single parity bit]
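A single even-parity bit can be sketched as follows (`add_even_parity` and `parity_ok` are illustrative names, not from the slides):

```python
def add_even_parity(bits):
    """Append a parity bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """Any single flipped bit makes the overall count of 1s odd."""
    return sum(word) % 2 == 0

word = add_even_parity([1, 0, 1, 1, 0, 0, 1])   # 7 data bits
print(parity_ok(word))    # True
word[3] ^= 1              # introduce a single-bit error
print(parity_ok(word))    # False
```

Note that a lone parity bit can only detect an odd number of flipped bits; it cannot locate or correct them.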
Error Detection & Correction (Hamming Code: 4-bit word)
[Figure: 4 data bits plus 3 error-checking bits]
Error Detection & Correction (Hamming Code: 4-bit word)
[Figure: the 4-bit data word 1110 placed into the code word]
Error Detection
[Figure: even parity bits computed for the data word 1110; re-checking the parity bits detects an error]
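A standard Hamming (7,4) code with even parity can be sketched in full. This is the generic textbook construction and not necessarily the exact bit layout used on the slide:

```python
def hamming74_encode(d):
    """Encode 4 data bits; parity bits sit at positions 1, 2, 4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # even parity over positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4          # even parity over positions 2,3,6,7
    p4 = d2 ^ d3 ^ d4          # even parity over positions 4,5,6,7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(c):
    """Recompute the parities; the syndrome is the 1-based position
    of a single flipped bit (0 means no error)."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    c = c[:]
    if syndrome:
        c[syndrome - 1] ^= 1   # flip the erroneous bit back
    return c

code = hamming74_encode([1, 1, 1, 0])   # the slide's data word 1110
bad = code[:]
bad[4] ^= 1                             # corrupt one bit
print(hamming74_correct(bad) == code)   # True
```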
In two-dimensional parity check, a block of bits is divided into rows and a redundant row of bits is added to the
whole block.
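The two-dimensional scheme can be sketched by adding an even-parity bit to each row and then a parity row over the columns (a sketch with an arbitrary example block):

```python
def two_d_parity(block):
    """Append an even-parity bit to each row, then a parity row."""
    rows = [row + [sum(row) % 2] for row in block]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]

block = [[1, 1, 0, 0],
         [0, 1, 1, 1],
         [1, 0, 1, 0]]
coded = two_d_parity(block)
# A single flipped bit now violates exactly one row parity and one
# column parity, which together locate (and hence correct) the bit.
```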
Error Correction & Detection
• Error detection takes fewer bits than error correction
• Longer packets take a smaller percentage for correction but suffer more types of errors
• Hamming's scheme detects all single-bit errors, at a high overhead cost; other schemes with shorter check fields may correct only single-bit or double-bit errors
Advanced DRAM Organization
• Basic DRAM has stayed the same since the first RAM chips
• Enhanced DRAM: contains a small SRAM as well; the SRAM holds the last line read
• Cache DRAM: a larger SRAM component, used as a cache or a serial buffer
Synchronous DRAM (SDRAM)
• Access is synchronized with an external clock
• The address is presented to the RAM, and the RAM finds the data (the CPU waits in conventional DRAM)
• Since SDRAM moves data in time with the system clock, the CPU knows when the data will be ready, so it does not have to wait and can do something else
• Burst mode allows SDRAM to set up a stream of data and fire it out in a block
• DDR-SDRAM sends data twice per clock cycle (leading and trailing edge)
RAMBUS
• Adopted by Intel for Pentium and Itanium
• Main competitor to SDRAM
• Vertical package, with all pins on one side
• Data exchange over 28 wires, less than a cm long
• The bus addresses up to 320 RDRAM chips at 1.6 Gbps
• Asynchronous block protocol: 480 ns access time, then 1.6 Gbps
DDR SDRAM
• SDRAM can only send data once per clock cycle
• Double-data-rate SDRAM can send data twice per clock cycle, on the rising edge and the falling edge
Cache DRAM
• Mitsubishi
• Integrates a small SRAM cache (16 kb) onto a generic DRAM chip
• Used as a true cache: 64-bit lines; effective for ordinary random access
• Also supports serial access to a block of data (e.g. refreshing a bit-mapped screen): CDRAM can prefetch data from the DRAM into the SRAM buffer, and subsequent accesses are solely to the SRAM
CACHE MEMORY
Placement of Cache Memory in a Computer System
The locality principle: a recently referenced memory location is likely to be referenced again (temporal locality);
a neighbor of a recently referenced memory location is likely to be referenced (spatial locality).
Cache
[Figure: the CPU exchanges words with the cache; the cache exchanges blocks with main memory]
Cache
• High speed memory
• Implemented using SRAM
Hierarchy List
• Registers
• L1 cache
• L2 cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
Cache architectures/layout
• Look-through: the CPU first searches the cache; if the data is found, the transfer is performed between the CPU and the cache, but if not, main memory is searched
• Look-aside: the CPU searches the cache and main memory simultaneously; wherever the data is found, the transfer is performed from that memory
Cache Design
• Size
• Mapping function
• Replacement algorithm
• Write policy
• Block size
• Number of caches
Cache consistency
• When the data at a location in the cache differs from the data at the same location in main memory, the cache is said to be inconsistent
• Inconsistency may arise when the CPU writes into a location in the cache
• To avoid inconsistency: write through; write back (buffered/delayed write); snoopy writes
Write Policy
• Must not overwrite a cache block unless main memory is up to date
• Multiple CPUs may have individual caches
• I/O may address main memory directly
Write through
• Whenever the CPU writes into the cache, it writes to main memory as well
• High level of consistency
• Very simple
• Multiple CPUs can monitor main memory traffic to keep their local caches up to date
• Lots of traffic; slows down writes
Write back
• Updates are initially made in the cache only
• The update bit for the cache slot is set when an update occurs
• If a block is to be replaced, it is written back to main memory only if its update bit is set
• I/O must access main memory through the cache
• Faster writes
Snoopy writes
• The cache controller snoops (monitors) the activities of the other bus masters on the system bus
• If a bus master tries to read a location from main memory that has been modified in the cache, the controller first copies the updated data from the cache to main memory and then allows the main memory operation to continue
Cache organization/mapping techniques
The ways in which data from main memory can be mapped (brought) into the cache.
Direct Mapping
• Each block of main memory maps to only one cache line, i.e. if a block is in the cache, it must be in one specific place
• The address is in two parts: the least significant w bits identify a unique word, and the most significant s bits specify one memory block
• The MSBs are split into a cache line field of r bits and a tag of s-r bits (the most significant)
Direct Mapping Address Structure

Tag (s-r): 8 bits | Line or slot (r): 14 bits | Word (w): 2 bits

• 24-bit address
• 2-bit word identifier (4-byte block)
• 22-bit block identifier: an 8-bit tag (= 22 - 14) and a 14-bit slot or line
• No two blocks mapping to the same line have the same tag field
• Check the contents of the cache by finding the line and checking the tag
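The 8/14/2 split can be demonstrated with a little bit arithmetic (the address 0x16339C is an arbitrary example, not from the slide):

```python
def split_direct(addr):
    """Split a 24-bit address into (tag, line, word) = (8, 14, 2) bits."""
    word = addr & 0b11               # 2-bit word offset in the block
    line = (addr >> 2) & 0x3FFF      # 14-bit cache line number
    tag = addr >> 16                 # 8-bit tag
    return tag, line, word

tag, line, word = split_direct(0x16339C)
print(hex(tag), hex(line), word)     # 0x16 0xce7 0
```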
Direct Mapping Cache Line Table

Cache line    Main memory blocks held
0             0, m, 2m, 3m, ..., 2^s - m
1             1, m+1, 2m+1, ..., 2^s - m + 1
...           ...
m-1           m-1, 2m-1, 3m-1, ..., 2^s - 1
Direct Mapping Example
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for a given block
Associative Mapping
• A main memory block can load into any line of the cache
• The memory address is interpreted as tag and word
• The tag uniquely identifies a block of memory
• Every line's tag is examined for a match
• Cache searching gets expensive
Associative Mapping Example
Associative Mapping Address Structure

Tag: 22 bits | Word: 2 bits

• A 22-bit tag is stored with each 32-bit block of data
• Compare the tag field with the tag entries in the cache to check for a hit
• The least significant 2 bits of the address identify the required word within the 32-bit data block
• e.g. Address FFFFFC -> Tag 3FFFFF, Data 24682468, Cache line 3FFF
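The tag is simply the block number, i.e. the full address shifted right past the 2-bit word field. A small check of the FFFFFC example:

```python
def split_associative(addr):
    """Split a 24-bit address into a 22-bit tag and a 2-bit word field."""
    return addr >> 2, addr & 0b11

tag, word = split_associative(0xFFFFFC)
print(hex(tag), word)                # 0x3fffff 0
```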
Set Associative Mapping
• The cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given set, e.g. block B can be in any line of set i
• e.g. with 2 lines per set: 2-way associative mapping; a given block can be in one of 2 lines in only one set
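The address split for a set-associative cache follows the same pattern as the other mappings: tag, then set number, then word. A sketch; the 13-bit set field and 2-bit word field are an assumed example (field widths are not given on the slide):

```python
def split_set_associative(addr, set_bits=13, word_bits=2):
    """Split an address into (tag, set, word) fields."""
    word = addr & ((1 << word_bits) - 1)
    set_idx = (addr >> word_bits) & ((1 << set_bits) - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_idx, word

print(split_set_associative(0x16339C))   # (44, 3303, 0)
```

A block is looked up by indexing the set with `set_idx` and then comparing the stored tags of only the lines in that set, rather than every line in the cache.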
Set Associative Mapping Example
[Figure: groups of main memory blocks (block 0, block 1, ..., block 127) map onto cache sets set 0, set 1, ..., set 32, with each set holding several lines]
Two way set associative mapping
[Figure: main memory blocks (block 0, block 1, ..., block 63) map onto a two-way set associative cache]
Two Way Set Associative Cache Organization
Two Way Set Associative Mapping Example
Replacement Algorithms (1): Direct Mapping
• No choice: each block maps to only one line
• Replace that line
Replacement Algorithms (2): Associative & Set Associative
• Hardware implemented algorithm (for speed)
• Least recently used (LRU): e.g. in a 2-way set associative cache, which of the 2 blocks is LRU?
• First in first out (FIFO): replace the block that has been in the cache longest
• Least frequently used (LFU): replace the block which has had the fewest hits
• Random
Performance characteristics of two-level memories
[Figure: the CPU exchanges words with M1; M1 exchanges blocks with M2]
• M1: present on the motherboard
• M2: not present on the motherboard
• tA1 = time for M1 -> CPU; tA2 = time for (M2 -> M1) + (M1 -> CPU)
• Hit ratio: H = number of hits / total attempts
• Average access time: tA = H·tA1 + (1 - H)·tA2
• Average cost: C = (C1·S1 + C2·S2) / (S1 + S2)
• Access efficiency: E = tA1 / tA
• Speed ratio: R = tA2 / tA1
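A quick numerical check of the average-access-time and efficiency formulas, using hypothetical figures (95% hit ratio, 10 ns for M1, 110 ns for the total miss path):

```python
def avg_access_time(H, tA1, tA2):
    """tA = H*tA1 + (1 - H)*tA2, the average-access-time formula."""
    return H * tA1 + (1 - H) * tA2

tA = avg_access_time(0.95, 10, 110)
print(round(tA, 3))                  # 15.0 ns
print(round(10 / tA, 3))             # access efficiency E = tA1/tA = 0.667
print(110 / 10)                      # speed ratio R = tA2/tA1 = 11.0
```

Even a 5% miss rate here adds 50% to the average access time, which is why hit ratio dominates two-level memory performance.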
External memory
RAID
• Redundant Array of Independent Disks (originally Redundant Array of Inexpensive Disks)
• 6 levels in common use; not a hierarchy
• A set of physical disks viewed as a single logical drive by the OS
• Data distributed across the physical drives
• Redundant capacity can be used to store parity information
RAID 0
• No redundancy
• Data striped across all disks, round-robin striping
• Increases speed: multiple data requests are probably not on the same disk, so disks seek in parallel, and a set of data is likely to be striped across multiple disks
RAID 1
• Mirrored disks: data is striped across the disks, with 2 copies of each stripe on separate disks
• Read from either copy; write to both
• Recovery is simple: swap the faulty disk and re-mirror, with no down time
• Expensive
RAID 2
• Disks are synchronized
• Very small stripes, often a single byte or word
• Error correction is calculated across corresponding bits on the disks
• Multiple parity disks store Hamming-code error correction in corresponding positions
• Lots of redundancy: expensive, and not used
RAID 3
• Similar to RAID 2
• Only one redundant disk, no matter how large the array
• A simple parity bit for each set of corresponding bits
• Data on a failed drive can be reconstructed from the surviving data and the parity info
• Very high transfer rates
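The parity reconstruction can be sketched with XOR over byte strings (a sketch using hypothetical two-byte "blocks"):

```python
from functools import reduce

def parity_block(blocks):
    """XOR the corresponding bytes of the given blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three data blocks on three disks, plus one parity block (RAID 3 style)
d0, d1, d2 = b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"
p = parity_block([d0, d1, d2])

# If the disk holding d1 fails, its contents are recovered by XORing
# the surviving data blocks with the parity block.
rebuilt = parity_block([d0, d2, p])
print(rebuilt == d1)                 # True
```

The same XOR identity underlies RAID 4 and RAID 5; those levels differ only in where the parity blocks are placed.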
RAID 4
• Each disk operates independently
• Good for a high I/O request rate
• Large stripes
• Bit-by-bit parity calculated across the stripes on each disk
• Parity stored on a dedicated parity disk
RAID 5
• Like RAID 4, but the parity is striped across all disks
• Round-robin allocation for the parity stripe
• Avoids the RAID 4 bottleneck at the parity disk
• Commonly used in network servers
• N.B. does NOT mean 5 disks!
RAID 6
• Two parity calculations, stored in separate blocks on different disks
• A user requirement of N disks needs N+2
• High data availability: three disks must fail before data is lost
• Significant write penalty
RAID 0, 1, 2
RAID 3 & 4
RAID 5 & 6
Data Mapping For RAID 0