improving system performance and longevity with a new nand flash architecture
DESCRIPTION
Improving System Performance and Longevity with a New NAND Flash Architecture. Jin-Ki Kim Vice President, Research & Development MOSAID Technologies Inc. Agenda. Context Core Architecture Innovation High Performance Interface Summary. Computing Storage Hierarchy. - PowerPoint PPT PresentationTRANSCRIPT
Santa Clara, CA USAAugust 2009 1
Improving System Performance and Longevity with a New NAND Flash Architecture
Jin-Ki Kim
Vice President, Research & Development
MOSAID Technologies Inc.
Santa Clara, CA USAAugust 2009 2
Agenda
Context
Core Architecture Innovation
High Performance Interface
Summary
Santa Clara, CA USAAugust 2009 3
Computing Storage Hierarchy
* SCM: Storage Class MemoryR.Freitas and W.Wilcke, “Storage-class memory: The next storage system technology”, IBM J. RES. & DEV. VOL. 52 NO. 4/5 JULY/SEPTEMBER 2008.
New Storage Hierarchy
Register
Cache
DRAM (10.6 ~ 12.8GB/s per Channel)
HDD (20 ~ 70MB/s)
SCM(500MB/s ~ 2GB/s)
10x
10x
CPU Cycles
1
10
100
105 ~ 106
107 ~ 108
Historical Storage Hierarchy
Register
Cache
DRAM (5.3 ~ 6.4GB/s per Channel)
HDD (20 ~ 70MB/s)
Memory BWGap
100x >
CPU Cycles
107 ~ 108
1
10
100
Santa Clara, CA USAAugust 2009 4
The single-minded focus on bit density improvements has brought Flash technology to current multi-Gb NAND devices
120nm 90nm 73nm 65nm 51nm 35nm
Process Technology
Density
1Gb SLC
2Gb SLC
4Gb MLC
8Gb MLC
16Gb MLC
32Gb MLC
NAND Flash Progression
Santa Clara, CA USAAugust 2009 5
NAND Architecture Trend
Primary focus is density for cost and market adoption
512B 2KB 4KB 8KB 16KBD: Page Size
8 16 32 64C: # of Cell / NAND String
1 2 4B: # of Bitline / Page Buffer
1 2 4A: # of Bit / Cell
Block Size = (A * B * C) * DCell Array Mismatch Factor (CAMF)
Santa Clara, CA USAAugust 2009 6
NAND Block Size Trend
90 91 92 93 94 95 96 97 98 99 00 01 02 03 04 05 06 07 08 09 10
4KB
8KB
16KB
128KB
256KB
512KB
8Mb/16Mb32Mb/64Mb
128Mb/256Mb/512Mb
1Gb/2Gb/4Gb
4Gb/8Gb/16Gb
• 8 cells/NAND• 512B Page Buf.
• 16 cells/NAND• 512B Page Buf.
• 16 cells/NAND• 512B Page Buf.• 2 BLs/PB
• 32 cells/NAND• 2KB Page Buf.• 2 BLs/PB
2Gb/4Gb/8GbMLC
8Gb/16Gb/32GbMLC
• 32 cells/NAND• 4KB Page Buf.• 2 BLs/PB
• 64 cells/NAND• 8KB Page Buf.• 2 BLs/PB
90 91 92 93 94 95 96 97 98 99 00 01 02 03 04 05 06 07 08 09 10
4KB
8KB
16KB
128KB
256KB
512KB
8Mb/16Mb32Mb/64Mb
128Mb/256Mb/512Mb
1Gb/2Gb/4Gb
4Gb/8Gb/16Gb
• 8 cells/NAND• 512B Page Buf.
• 16 cells/NAND• 512B Page Buf.
• 16 cells/NAND• 512B Page Buf.• 2 BLs/PB
• 32 cells/NAND• 2KB Page Buf.• 2 BLs/PB
2Gb/4Gb/8GbMLC
8Gb/16Gb/32GbMLC
• 32 cells/NAND• 4KB Page Buf.• 2 BLs/PB
• 64 cells/NAND• 8KB Page Buf.• 2 BLs/PB
CAMF = 8
CAMF = 16
CAMF = 32
CAMF = 64 (SLC) & 128(MLC)
CAMF = 128 (SLC) & 256(MLC)
Cell Array Mismatch Factor (CAMF) is a key parameter to degrade write efficiency (i.e. increase write amplification factor)
Santa Clara, CA USAAugust 2009 7
NAND Challenges in Computing Apps
Endurance: 100K 10K 5K 3K 1K Retention: 10 years 5 years 2.5 years Current NAND architecture trend accelerates
reliability degradation
Block Size = (A * B * C) * D
Applications Multi MediaComputing
512B 2KB 4KB 8KB 16KB
8 16 32 64
1 2
1 2
4
4
D: Page Size
C: # of Cell / NAND String
B: # of Bitline / Page Buffer
A: # of Bit / Cell
Cell Array Mismatch Factor
Santa Clara, CA USAAugust 2009 8
Flash System Lifetime Metric
System lifetime heavily relies on NAND architecture and features, primarily cell array mismatch factor
Total Host Writes = NAND Endurance Cycle* System Capacity* Write Efficiency* Wear Leveling Efficiency
Write Efficiency = Total Data Written by Host Total Data Written to NAND
System Lifetime = Total Host Writes Host Writes per Day
Santa Clara, CA USAAugust 2009 9
NAND Architecture Innovation Current NAND Flash Architecture
Plane 1 Plane 2 SSL
W/L31
W/L30
W/L29
W/L28
W/L27
W/L26
GSL
W/L0
W/L1
W/L2
B/L0 B/L1 B/L(j*8-1) B/L(j*8)
CSL
B/L(j+k)*8-2
B/L(j+k)*8-1
BlockPageNAND Cell String
Page Buffer: 512B 2K 4KB 8KB
Santa Clara, CA USAAugust 2009 10
NAND Architecture Innovation FlexPlane with 2-tier Row Decoder Scheme
Smaller & Flexible Page Size (2KB/4KB/6KB/8KB) Smaller & Flexible Erase Size Lower power consumption due to segmented pp-well
2KB Page Buffer
NAND Cell Array on Sub PP-Well
Se
gm
en
t R
ow
De
cod
er
2KB Page Buffer
NAND Cell Array on Sub PP-Well
Se
gm
en
t R
ow
De
cod
er
2KB Page Buffer
NAND Cell Array on Sub PP-Well
Se
gm
en
t R
ow
De
cod
er
2KB Page Buffer
NAND Cell Array on Sub PP-Well
Se
gm
en
t R
ow
De
cod
er
Glo
ba
l Ro
w D
eco
de
r
FlexPlane Operations
FlexPlane
Santa Clara, CA USAAugust 2009 11
NAND Architecture Innovation 2-Dimensional Page Buffer Scheme
Upper Plane
Lower Plane
2-Dimensional Page Buffer(Shared Page Buffer)
Micron 32Gb NAND Flash in 34nm, 2009 ISSCC
0.5x Bitline Length compared to conventional NAND architecture
Santa Clara, CA USAAugust 2009 12
NAND Core Innovation Page-pair & Partial Block Erase
Minimize cell array mismatch factor Improve write efficiency
Block Erase Partial Block Erase Page-pair Erase
SSL
CSL
W/L31
W/L30
W/L29
W/L28
W/L27
W/L0
W/L1
W/L2
B/L
W/L26
GSL
SSL
CSL
W/L31
W/L30
W/L29
W/L28
W/L27
W/L0
W/L1
W/L2
B/L
W/L26
GSL
SSL
CSL
W/L31
W/L30
W/L29
W/L28
W/L27
W/L0
W/L1
W/L2
B/L
W/L26
GSL
• MOSAID’s Erase Scheme
Santa Clara, CA USAAugust 2009 13
NAND Core Innovation Lower Operating Voltage
H.V. Gen
Core and Peri.
I/O
2.7 ~ 3.3V
H.V. Gen
Core and Peri.
I/O
2.7 ~ 3.3V
1.8V
H.V. Gen
Core and Peri.
I/O
3.3V
1.8V
• MOSAID’s Low Stress Program Scheme
Santa Clara, CA USAAugust 2009 14
NAND Core Innovation Low Stress Program
Precharge NAND string to voltage higher than Vcc prior to wordline boost and bitline data load
Reduces program stress (Vpgm & Vpass stress) during programming
Eliminates Vcc dependency to achieve low Vcc operation
Minimizes background data dependency Eliminates W/L to SSL Coupling Addresses Gate Induced Drain Leakage (GIDL)
Santa Clara, CA USAAugust 2009 15
HyperLink (HLNAND™) Flash
MCP HLNAND
Monolithic HLNAND
NewDevice
Architecture
• Device interface independent of memory technology and density
• Packet protocol with low pin count
• Configurable data width link architecture
• Flexible modular command structure
• Simultaneous Read and Write
• EDC for command and ECC for registers
• Daisy-chain cascade with point-to-point connection of up to virtually Unlimited number of devices
• Fully independent bank operations
• Multiple, simultaneous data transactions
• Data throughput independent of core operations
• Device architecture for performance and longevity
FlexPlane Architecture
Two-dimensional Page Buffer
Two-tier Row Decoder
• NAND flash cell technology
• Page/multipage erase function
• Partial block erase function
• Low stress program scheme
• Random page program in SLC
• Low Vcc operations
Santa Clara, CA USAAugust 2009 16
HyperLink Interface
Point-to-point ring topology• Synchronous DDR signaling with source termination only• Up to 255 devices in a ring without speed degradation• Dynamically configurable bus width from 1-8 bits• HL1 parallel clock distribution to 266MB/s• HL2 source synchronous clocking to 800MB/s, backward
compatible to HL1
Controller
HLNANDHost
InterfaceHLNAND
HLNAND
HLNAND
Santa Clara, CA USAAugust 2009 17
32Gb SLC/64Gb MLC HLNAND MCP
HLNAND Bridge Chip
HLNAND MCP
8Gb SLC NAND/16Gb MLC NAND
HL1 Interface
x8b x8b
x8b x8b
8Gb SLC NAND/16Gb MLC NAND
8Gb SLC NAND/16Gb MLC NAND
8Gb SLC NAND/16Gb MLC NAND
HL1 (DDR-200/266) using NAND in MCP - Conform SLC/MLC HLNAND spec
HLNAND Bridge Chip developed by MOSAID and implemented in a MCP
8Gb SLC/16Gb MCL
Santa Clara, CA USAAugust 2009 18
32Gb SLC/64Gb MLC HLNAND MCPMCP Package – 12 x 18 100-Ball BGA
Cross sectional view (A-A)
Cross sectional view (B-B)
Santa Clara, CA USAAugust 2009 19
32Gb SLC/64Gb MLC HLNAND MCPPerformance
DDR-300 Operations151MHz (tCK = 6.6ns) @Vcc = 1.8V
DDR Output Timing (Oscilloscope Signals) – will be inserted
Santa Clara, CA USAAugust 2009 20
HLNAND Flash Module (HLDIMM)
Use cost effective DDR2 SDRAM 200-pin SO-DIMM form factor and sockets
8 x 64Gb or 8 x 128Gb HLNAND MCP (4 on each side)
Santa Clara, CA USAAugust 2009 21
HLDIMM Port Configuration
2 x HyperLink HL1 interfaces with 533MB/s read + 533MB/s write = 1066MB/s aggregate throughput
CH0 inHL1MCP
HL1MCP
CH0 outHL1MCP
HL1MCP
CH1 inHL1MCP
HL1MCP
CH1 outHL1MCP
HL1MCP
HLDIMM
CH0 out
CH0 in
CH1 out
CH1 in
to controller next HLDIMM
front back Pin 1
Pin 199
Pin 2
Pin 200
Santa Clara, CA USAAugust 2009 22
System Configurations with HLDIMM
CH0 inHL1
MCPsHL1
MCPs
CH0 outHL1
MCPsHL1
MCPs
CH1 inHL1
MCPsHL1
MCPs
CH1 outHL1
MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMMHLDIMM
Loop-around empty socket
CH0 inHL1
MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
CH0 outHL1
MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HLDIMM
CH0 inHL1
MCPs
CH0 outHL1
MCPs
CH1 inHL1
MCPs
CH1 outHL1
MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HL1MCPs
HLDIMM
CH2 in
CH2 out
CH3 in
CH3 out
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
CH0 inHL1
MCPsHL1
MCPs
CH0 outHL1
MCPsHL1
MCPs
CH1 inHL1
MCPsHL1
MCPs
CH1 outHL1
MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
CH2 in
CH2 out
CH3 in
CH3 out
2133MB/s Aggregate Throughput
CH0 inHL1
MCPsHL1
MCPs
CH0 outHL1
MCPsHL1
MCPs
CH1 inHL1
MCPsHL1
MCPs
CH1 outHL1
MCPsHL1
MCPs
HLDIMM
controller
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HL1MCPsHL1
MCPs
HLDIMM
1066MB/s Aggregate Throughput
Santa Clara, CA USAAugust 2009 23
Summary
The driving forces in future NVM are the memory architecture & feature innovation that will support emerging system architectures and applications
Using HLNAND Flash, Storage Class Memory is viable today using proven HLNAND flash technology
Santa Clara, CA USAAugust 2009 24
Resource for HLNAND Flashwww.HLNAND.com
Available• 64Gb MLC MCP sample• 64GB HLDIMM sample• Architectural Specification• Datasheets• White papers• Technical papers• Verilog Behavioral model