![Page 1: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/1.jpg)
Prof. Yong Ho Song
Department of Electronic Engineering, Hanyang University
![Page 2: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/2.jpg)
![Page 3: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/3.jpg)
3
Need an SSD platform
- to develop a new firmware algorithm
- to explore hardware architecture and organization
Use a commercial product as a platform?
- little information on HW/SW
- no way to change controller SoC
![Page 4: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/4.jpg)
4
Open source SSD design used for research and education
■ SSD Firmware
● Host Interface Firmware
● Flash Translation Layer
● Low-Level Driver
■ SSD Controller Hardware
● NAND Flash Controller
● Bus / DMAC
● ARM Processor
● Host Interface Controller
● ECC Engine
● Performance Monitor
![Page 5: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/5.jpg)
■ Open-source SSD platforms
● Jasmine OpenSSD (2011)
● Cosmos OpenSSD (2014)
● Cosmos+ OpenSSD (2016)
■ Cosmos/Cosmos+ OpenSSD: FPGA-based platform
● Could modify SSD controller and firmware
● Could add new hardware and software functionality
![Page 6: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/6.jpg)
6
■ Realistic research platform
● Solve your problem in a real system running host applications
● Design your own SSD controller (hardware and firmware), if possible
■ Information exchange
● Share your solution with people in society
■ Community contribution
● Open your own solution to public
■ Expensive custom-made storage system
● Unique
■ Play for fun
![Page 7: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/7.jpg)
7
■ Jasmine OpenSSD (2011)
● SSD controller: Indilinx Barefoot (SoC w/SATA2)
● Firmware: SKKU VLDB Lab
● Users from 10+ countries
[Figure labels: Barefoot controller SoC; SATA-2 interface; NAND flash memory (32 GB/module)]
![Page 8: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/8.jpg)
8
■ Cosmos OpenSSD (2014)
● SSD controller: HYU Tiger 3 (FPGA w/PCIe Gen2)
● Firmware: HYU ENC Lab
● Users from 5 countries (mostly in USA)
[Figure labels: SSD controller in FPGA; external PCIe interface; NAND flash module (128 GB)]
![Page 9: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/9.jpg)
9
■ Cosmos+ OpenSSD (2016)
● SSD controller: HYU Tiger 4 (FPGA w/NVMe over PCIe Gen2)
● Same main board with different memory modules
● Firmware: HYU ENC Lab
● Users from ?? countries
[Figure labels: same platform as Cosmos OpenSSD; NAND flash modules (1 TB/module)]
![Page 10: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/10.jpg)
10
| | Jasmine OpenSSD | Cosmos OpenSSD | Cosmos+ OpenSSD |
|---|---|---|---|
| Released in | 2011 | 2014 | 2016 |
| Main Board | | | |
| SSD Controller | Indilinx Barefoot (SoC) | HYU Tiger3 (FPGA) | HYU Tiger4 (FPGA) |
| Host Interface | SATA2 | PCIe Gen2 4-lane (AHCI) | PCIe Gen2 8-lane (NVMe) |
| Maximum Capacity | 128 GB (32 GB/module) | 256 GB (128 GB/module) | 2 TB (1 TB/module) |
| NAND Data Interface | SDR (Asynchronous) | NVDDR (Synchronous) | NVDDR2 (Toggle) |
| ECC Type and Strength | BCH, 16 bits/512 B | BCH, 32 bits/2 KB | BCH, 26 bits/512 B |
![Page 11: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/11.jpg)
11
http://www.openssd.io
![Page 12: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/12.jpg)
12
[Figure: Cosmos+ OpenSSD development setup]
■ Development PC
● Xilinx Vivado 2014.4: generates the bitstream
● Xilinx SDK 2014.4: builds the firmware
● UART terminal: communicates with Cosmos+
● Sends the bitstream and firmware over USB JTAG; receives process information over USB UART
■ Cosmos+ OpenSSD (Zynq-7000 FPGA)
● Processing System (firmware): CPU running FTL and NVMe management
● Programmable Logic (bitstream): NVMe controller and NAND flash controllers
● NAND flash modules
● External PCIe connector
■ Host PC
● Application, file system, block layer, NVMe driver
● PCIe chipset and external PCIe adapter
![Page 13: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/13.jpg)
13
■ 1 Development PC
● Downloading hardware/software design (JTAG)
● Monitoring Cosmos+ OpenSSD internals (UART)
■ 1 Host PC
● Executing applications such as a benchmark (PCIe)
■ 1 Platform board with 1+ NAND flash modules installed
● Working as a storage device to the host PC
[Figure: the development PC sends hardware and software binary files to the platform board over JTAG and receives internal information over UART; the host PC stores files (e.g. AVI, MP3) on the platform board's SSD controller and NAND flash modules using NVMe over PCIe]
![Page 14: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/14.jpg)
14
■ Cosmos+ OpenSSD platform board
● Consists of a Zynq FPGA and other peripherals
■ NAND flash modules
● Configured as multi-channel and multi-way flash array
● Inserted into Cosmos+ OpenSSD platform board
■ External PCIe adapter and cable
● Connected with host PC
■ USB cables for JTAG and UART
● Connected with development PC
■ Power cable and adapter
● 12V supply voltage
![Page 15: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/15.jpg)
15
[Figure labels, platform board: external PCIe; Ethernet; JTAG Digilent module; USB to UART; USB 2.0 ULTP; Zynq-7000 AP SoC; DDR3 DRAM; user-configurable SW; user-configurable LED; SD card connector; QSPI memory; configuration mode SW; SO-DIMM (x2); 6-pin PCIe power connector; board power SW; PMBus connector; SMA connector; fan connector; JTAG select SW; 20-pin JTAG; 7- and 14-pin JTAG; user-configurable GPIO pin; VCCO_ADJ select pin; I2C PMOD pin; VCCO_ADJ divide pin; PMOD pin; 20-pin ARM JTAG; 5.5 mm power connector]
![Page 16: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/16.jpg)
16
| Component | Item | Value |
|---|---|---|
| FPGA | | Xilinx Zynq-7000 AP SoC (XC7Z045-FFG900-3) |
| | Logic cells | 350K (~5.2M ASIC gates) |
| CPU | Type | Dual-Core ARM Cortex-A9 |
| | Clock frequency | Up to 1000 MHz |
| Storage | Total capacity | Up to 2 TB (MLC) |
| | Organization | Up to 8-channel 8-way |
| DRAM | Device interface | DDR3 1066 |
| | Total capacity | 1 GB |
| Bus | System | AXI-Lite (bus width: 32 bits) |
| | Storage data | AXI (bus width: 64 bits, burst length: 16) |
| SRAM | | 256 KB (FPGA internal) |
![Page 17: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/17.jpg)
17
■ Xilinx’s embedded SoC
■ Two regions
● Processing System (PS)
– Hardwired components
– Executes the firmware program
● Programmable Logic (PL)
– Programmable components (FPGA region)
– NAND flash controller (NFC) and NVMe controller reside in PL
■ Benefits of using Zynq
● The hard-wired CPU is faster than a soft core (such as MicroBlaze)
● No need to design the memory controller or other peripherals (such as UART) in hardware
● Xilinx provides a BSP (Board Support Package)
[Figure: Zynq-7000 architecture overview. Processing System (PS, hardwired components): Application Processor Unit with two ARM Cortex-A9 cores, Snoop Control Unit, L2 cache and cache controller, OCM interconnect, central interconnect, memory interface (DDR3 controller), peripherals (UART, I2C, …), and programmable-logic-to-memory interconnect. Programmable Logic (PL, programmable components): user-defined FPGA logic, connected through GP AXI master ports, GP AXI slave ports, and HP AXI slave ports.]
![Page 18: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/18.jpg)
18
■ Each module has 4 flash packages
● One flash package
– Capacity: 32 GB
– Page size: 8640 Bytes (spare area: 448 Bytes)
● Synchronous NAND
■ Used with Tiger3 Controller
Front side Rear side
Flash package
![Page 19: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/19.jpg)
19
■ Module configuration
● 4 channels/module and 4 ways/channel
■ Shared signals within a channel (a package)
● Dies in the same package share the I/O channel
● Dies in the same package share command signals except Chip Enable (CE)
● Each die has own Ready/Busy (R/B) signal
[Figure: Packages 0 to 3 are attached to Channels 0 to 3, one package per channel. Each package holds Die 0 to Die 3, one die per way (Way0 to Way3), and each way has its own CE and R/B pair (CE0, R/B0 through CE3, R/B3).]
![Page 20: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/20.jpg)
20
■ Each module has 8 flash packages
● One flash package
– Capacity: 128 GB
– Page size: 18048 Bytes (spare area: 1664 Bytes)
● Toggle NAND
■ Used with Tiger4 Controller
Front side Rear side
Flash package Flash package
![Page 21: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/21.jpg)
21
■ Module configuration
● 4 channels/module and 8 ways/channel
■ Shared signals within a channel (a package)
● Dies in the same package share the I/O channel
● Dies in the same package share command signals except Chip Enable (CE)
● Each die has own Ready/Busy (R/B) signal
[Figure: Channels 0 to 3 with Packages 0 to 7. Each package holds Die 0 to Die 3. The dies of Packages 0 to 3 form Way0 to Way3 (CE0, R/B0 through CE3, R/B3); the dies of Packages 4 to 7 form Way4 to Way7 (CE4, R/B4 through CE7, R/B7).]
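The module figures quoted on the two Cosmos+ module slides can be cross-checked with a little arithmetic. The sketch below uses only numbers from those slides; the constant names are made up for illustration, and it assumes one die per way, as the per-way CE and R/B signals suggest.

```python
# Geometry of one Cosmos+ NAND flash module, from the slide figures.
PACKAGES_PER_MODULE = 8
PACKAGE_CAPACITY_GB = 128
CHANNELS_PER_MODULE = 4
WAYS_PER_CHANNEL = 8
PAGE_BYTES_TOTAL = 18048   # page size including spare area
SPARE_BYTES = 1664

# Assumption: each way is exactly one die.
dies_per_module = CHANNELS_PER_MODULE * WAYS_PER_CHANNEL
module_capacity_gb = PACKAGES_PER_MODULE * PACKAGE_CAPACITY_GB
die_capacity_gb = module_capacity_gb // dies_per_module
data_bytes_per_page = PAGE_BYTES_TOTAL - SPARE_BYTES

print(dies_per_module, module_capacity_gb, die_capacity_gb, data_bytes_per_page)
```

The result is 32 dies and 1024 GB per module (matching the 1 TB/module figure), 32 GB per die, and 16384 data bytes per page, which agrees with the 16 KB page used by the firmware's page-level mapping.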
![Page 22: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/22.jpg)
22
■ Cosmos OpenSSD
● Supports only one flash module slot (J1)
■ Cosmos+ OpenSSD
● Supports both flash module slots (J1, J2)
SO-DIMM (J2) SO-DIMM (J1)
■ Caution
● Cosmos/Cosmos+ OpenSSD flash module slots have custom pin maps
● You should not insert any SDRAM module into this slot
![Page 23: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/23.jpg)
23
■ Expand PCIe Slot of host PC to connect external device
■ Adapter card
● Installed on host PC
● Provide a high-performance and low latency solution for expanding PCIe
■ External PCIe cable (8-lane)
■ External PCIe connector (8-lane) on platform board
● 2.5 GT/s for a Gen1, 5.0 GT/s for a Gen2
● Connected with high data rate serial transceiver in FPGA
External PCIe adapter External PCIe cable
![Page 24: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/24.jpg)
24
■ JTAG cable
● Used for downloading hardware and software binary files
● Available cable types
– USB type A to USB type micro B cable
– Emulator, JTAG N pin cable (N: 7, 14, 20)
■ UART cable
● Used for monitoring internal processes of Cosmos+ OpenSSD
● USB type A to USB type A cable
[Figure labels. Available JTAG cables: USB type A to USB type micro B cable; USB emulator with its USB cable; 7-pin cable; 14- or 20-pin cable. UART cable: USB type A to USB type A.]
![Page 25: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/25.jpg)
25
■ Single source of power to the platform board
● 6-pin power connector (J181) or 5.5 mm x 2.1 mm DC power plug (J182)
■ The 6-pin connector looks similar to the regular PC 6-pin PCIe connector
● Note: the pin assignment differs between the two connectors
■ Caution
● Do not plug a PC 6-pin PCIe power cable into the platform board 6-pin power connector (J181)

| Connector | Pin 1 | Pin 2 | Pin 3 | Pin 4 | Pin 5 | Pin 6 |
|---|---|---|---|---|---|---|
| Platform board 6-pin power | 12V | 12V | NC | NC | GND | GND |
| PC 6-pin PCIe power | GND | GND | GND | 12V | 12V | 12V |

[Figure labels: 6-pin power or 5.5 mm x 2.1 mm DC power]
![Page 26: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/26.jpg)
26
■ Xilinx Vivado
● Generates an FPGA bitstream
● Exports the generated FPGA bitstream to Xilinx SDK
■ Xilinx SDK
● Builds the SSD controller firmware
● Downloads the FPGA bitstream and the firmware to the Zynq FPGA
■ FPGA bitstream
● Used to configure the programmable logic side of Zynq FPGA
■ Firmware
● Manages the NAND flash array
● Handles NVMe commands
![Page 27: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/27.jpg)
27
[Figure: layered view. Host computer: application, operating system (file system, block layer, device driver). Cosmos+ OpenSSD software layer (executable): Cosmos+ OpenSSD firmware on top of the board support package. Cosmos+ OpenSSD hardware layer: ARM processor and cache (hard-wired); NAND flash controller and NVMe controller (bitstream).]
![Page 28: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/28.jpg)
29
FPGA Bitstream Build Flow (Xilinx Vivado)
1. Download the predefined project file (project and IP source files)
2. Generate the bitstream (FPGA bitstream, .bit)
Firmware Build Flow (Xilinx SDK)
1. Load the .hdf file into Xilinx SDK
2. Generate the board support package (library)
3. Compile and link the Cosmos+ OpenSSD firmware source files (.c) into the firmware executable (.elf)
4. Download to the FPGA
![Page 29: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/29.jpg)
30
[Figure: Cosmos+ OpenSSD hardware architecture on the Zynq-7000 FPGA (XC7Z045FFG900-3). Zynq Processing System: dual-core ARM Cortex-A9, central interconnect, FPGA-to-memory interconnect, and DDR3 memory controller with 1 GB of DDR3-1066 DRAM. Host interface: PCI Express Gen2 8-lane to the host PC through the Xilinx 7-Series Integrated Block for PCIe and the NVMe controller (NVMe DMA engine). Storage: two groups of NAND flash controllers (NFC, x4 each), each NFC with a dispatcher, command path, data path, and low-level NAND flash controller (Phy) attached to NAND dies. Bus widths shown: 32, 64, and 32x4.]
![Page 30: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/30.jpg)
31
[Figure: the NVMe host controller and the NAND flash controllers (x4 for Channel 0~3, x4 for Channel 4~7) attach to the GP AXI ports (master) and HP AXI ports (slave)]
■ General Purpose (GP) AXI4-Lite bus
● 32-bit interface
● Used for control
● Operates at 100 MHz
■ High Performance (HP) AXI4 bus
● 64-bit interface
● Used for Direct Memory Access (DMA)
● Operates at 250 MHz
![Page 31: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/31.jpg)
32
[Figure: NAND flash controller. Blocks: bus interfaces, dispatcher, command filters 0 to N, request completion marker, BCH ECC engine, data scrambler, low-level NAND flash controller (Phy). Bus widths shown: 32 and 8. Legend: command path; data path; not present, but possible.]
![Page 32: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/32.jpg)
33
■ Commands and data streams are encapsulated or decapsulated
throughout modules in a layer
■ Users can insert or remove modules more easily
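[Figure: data format at each layer boundary]
● Above the bus interface: AXI4 (Page + Spare)
● Bus interface to BCH ECC engine: Page + Spare
● BCH ECC engine to data scrambler: ED (Page) + ED (Spare)
● Below the data scrambler: SCRB (ED (Page) + ED (Spare))
SCRB (x): Scrambled data
ED (x): BCH-encoded data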
![Page 33: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/33.jpg)
34
■ Data transfers throughout a layer from DRAM to NAND flash or
from NAND flash to DRAM are all pipelined
■ Page buffer is not required in channel controller
[Figure: timeline in which Requests 0, 1, and 2 each pass through bus data transfer, BCH encoding, and data scrambling, with the stages of successive requests overlapping in time]
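The benefit of pipelining the three stages can be illustrated with a toy timing model. This is only a sketch: the function name is hypothetical, each stage is assumed to take one time unit per request, and requests are assumed to flow through the stages in order.

```python
def pipelined_finish_times(num_requests, stage_latencies):
    """Finish time of each request in an in-order pipeline.

    A stage can start a request once the previous stage has finished
    that request and the stage itself is free.
    """
    stage_free_at = [0] * len(stage_latencies)
    finish_times = []
    for _ in range(num_requests):
        t = 0
        for stage, latency in enumerate(stage_latencies):
            start = max(t, stage_free_at[stage])
            t = start + latency
            stage_free_at[stage] = t
        finish_times.append(t)
    return finish_times


# Stages: bus data transfer, BCH encoding, data scrambling (1 unit each, assumed).
stages = [1, 1, 1]
pipelined = pipelined_finish_times(3, stages)
serial_total = 3 * sum(stages)
print(pipelined, serial_total)  # [3, 4, 5] 9
```

With three requests the pipelined flow completes at time 5 instead of 9, because a new request enters the first stage as soon as the previous one leaves it.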
![Page 34: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/34.jpg)
35
■ Hardware-level way scheduler of NFC in Cosmos OpenSSD is removed
■ FTL is now responsible for channel and way scheduling
■ This enables more flexible scheduling policy
[Figure: NFC in Cosmos OpenSSD versus NFC in Cosmos+ OpenSSD. Cosmos OpenSSD: the FTL and low-level driver contain a channel scheduler; the NFC contains way queues, a way scheduler (round robin), and way controllers. Cosmos+ OpenSSD: the FTL and low-level driver contain the channel and way scheduler; the NFC contains a command queue (FCFS) and a multi-way controller.]
![Page 35: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/35.jpg)
36
■ The key equation solver (KES) accounts for at least 50 % of the logic cells, more than the syndrome calculator (SC) and Chien searcher (CS)
■ A shared KES saves 40 % of the logic cells used in a BCH ECC decoder
■ Short BCH code parallelization is applied for high utilization of hardware resources
[Figure: Channels #0 to #3 each have their own SC bundle and CS bundle and share one Shared-KES]
* (256 B x 2) bundle, (13 x 2) bit error correction, (8 bit x 2) parallel level
![Page 36: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/36.jpg)
37
[Figure: NVMe controller. PCI Express side: PCIe transceiver and Xilinx 7 Series PCI Express core. Internal blocks: PCIe DMA engine, PCIe write channel, AXI write channel, AXI read channel, NVMe CMD FIFO, DMA CMD FIFO, NVMe CMD status checker, DMA CMD status checker. Bus interfaces on the internal side. Bus widths shown: 32, 64, and 128. Legend: command path; data path.]
![Page 37: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/37.jpg)
38
■ The NVMe host interface completes NVMe IO commands
automatically
■ The FTL does not need to be involved in the completion process
[Figure labels: NVMe host interface; NVMe storage device firmware]
![Page 38: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/38.jpg)
39
■ NVMe specification 1.1/1.2 compliant
● Up to 8 IO submission/completion queues - 256 entries each
● 512B and 4KB sector size
● Physical region page (PRP) data transfer mechanism
● Native device driver for Windows 8/8.1 and Linux kernel>=3.3
● OpenFabrics Alliance (OFA) NVMe driver for Windows 7 and later
NVMe Interface Performance (DRAM Disk)

| Workload | Read | Write |
|---|---|---|
| Random 4KB | 300K IOPS | 300K IOPS |
| 128KB | 1.7 GB/s | 1.7 GB/s |
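To put the DRAM-disk numbers in context, the sketch below converts the 4 KB random IOPS figure to throughput and compares the 128 KB result with the raw payload rate of a PCIe Gen2 x8 link. The variable names are illustrative; the link calculation assumes 5 GT/s per lane with 8b/10b encoding and ignores packet and protocol overhead, so it is an upper bound rather than an achievable rate.

```python
# 4 KB random workload: IOPS to throughput (decimal GB/s).
iops = 300_000
io_bytes = 4 * 1024
random_4k_gb_per_s = iops * io_bytes / 1e9

# PCIe Gen2 x8 upper bound: 5 GT/s per lane, 8b/10b encoding, 8 bits per byte.
lanes = 8
gigatransfers_per_s = 5.0
pcie_gen2_x8_gb_per_s = gigatransfers_per_s * lanes * 8 / 10 / 8

# Share of the link bound reached by the measured 128 KB throughput.
measured_128k_gb_per_s = 1.7
link_utilization = measured_128k_gb_per_s / pcie_gen2_x8_gb_per_s

print(random_4k_gb_per_s, pcie_gen2_x8_gb_per_s, link_utilization)
```

Under these assumptions, 300K IOPS at 4 KB is about 1.23 GB/s, and 1.7 GB/s is roughly 42 % of the 4 GB/s link bound.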
![Page 39: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/39.jpg)
40
■ Pure page-level mapping (16 KB page)
● Static mapping
● Channel/way interleaving
■ Greedy garbage collection
● On-demand garbage collection
● Greedy selection of GC victims
■ Priority-based scheduling
● Predetermined priority between DMA commands and flash commands
● Out-of-order execution between commands accessing different flash dies
■ Command set
● Single-plane flash commands
● DMA commands for data transfer between host system and SSD
■ LRU data buffer management
● Data transfer between host system and NAND flash memory via data buffer
● Eviction of LRU buffer entry
![Page 40: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/40.jpg)
41
[Figure: firmware flowchart. Steps: FTL initializing; host command fetching; data buffer searching; address translation; garbage collection; push to command queue; command scheduling and issue (DMA command or flash command). Decisions (yes/no): Valid CMD? Buffer hit? Enough free block? Is command queue full?]
![Page 41: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/41.jpg)
42
■ Buffer entry eviction
● LRU buffer entry is evicted to allocate a buffer entry for a new request
[Figure: the host request "Read LPN 10" passes through the address translator and becomes the reformed request "Read LPN 10, buffer entry 3". Before the request, the buffer entries hold, from MRU to LRU: LPN 16, LPN 4, LPN 7, LPN 2. The LRU entry (LPN 2) is evicted, and afterwards the buffer entries hold, from MRU to LRU: LPN 10, LPN 16, LPN 4, LPN 7.]
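The eviction in the figure can be reproduced with a small LRU model. This is a sketch only: the class and method names are hypothetical and are not taken from the Cosmos+ firmware, and the buffer tracks LPNs without holding any page data.

```python
from collections import OrderedDict


class LruDataBuffer:
    """Fixed-size buffer keyed by LPN; the least recently used entry is evicted."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = OrderedDict()  # ordered from LRU (first) to MRU (last)

    def access(self, lpn):
        """Touch an LPN. Returns (hit, evicted_lpn)."""
        if lpn in self.entries:
            self.entries.move_to_end(lpn)  # hit: becomes MRU
            return True, None
        evicted = None
        if len(self.entries) >= self.num_entries:
            evicted, _ = self.entries.popitem(last=False)  # drop the LRU entry
        self.entries[lpn] = None  # placeholder for the page data
        return False, evicted

    def mru_to_lru(self):
        return list(reversed(self.entries))


buf = LruDataBuffer(4)
for lpn in (2, 7, 4, 16):      # leaves LPN 16 as MRU and LPN 2 as LRU
    buf.access(lpn)
print(buf.access(10))           # (False, 2): miss, LPN 2 evicted
print(buf.mru_to_lru())         # [10, 16, 4, 7]
```

An `OrderedDict` keeps the recency order and the lookup in one structure; firmware in C would more likely use a hash table plus a doubly linked list for the same effect.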
![Page 42: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/42.jpg)
43
■ Main Idea
● Every logical page is mapped to a corresponding physical page
■ Advantage
● Better performance over random write than block-level mapping
■ Disadvantage
● Huge amount of memory space requirement for the mapping table
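[Figure: flash memory with Blocks 0 to 3 holding ppn 0 to ppn 15, each page with a data area and a spare area, next to a mapping table; the example operation is "write(5, a)"]
lpn: logical page number
ppn: physical page number

Mapping table

| lpn | ppn |
|---|---|
| 0 | 12 |
| 1 | 11 |
| 2 | 10 |
| 3 | 9 |
| 4 | 8 |
| 5 | 7 |
| 6 | 6 |
| 7 | 5 |
| 8 | 4 |
| 9 | 3 |
| 10 | 2 |
| 11 | 1 |
| 12 | 0 |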
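A page-level mapping table can be sketched in a few lines. The class below is illustrative only and its names are hypothetical: it assumes free pages are handed out sequentially and that an overwrite goes to a new physical page while the old one is marked invalid, which is the usual way NAND is handled because pages cannot be rewritten in place.

```python
class PageMapFtl:
    """Toy page-level FTL: every logical page maps to one physical page."""

    def __init__(self, num_physical_pages):
        self.l2p = {}                                 # lpn -> ppn
        self.flash = [None] * num_physical_pages      # page contents
        self.valid = [False] * num_physical_pages     # valid flag per ppn
        self.next_free = 0                            # sequential allocation (assumed)

    def write(self, lpn, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no free page left; garbage collection needed")
        old_ppn = self.l2p.get(lpn)
        if old_ppn is not None:
            self.valid[old_ppn] = False               # old copy becomes invalid
        ppn = self.next_free
        self.next_free += 1
        self.flash[ppn] = data
        self.valid[ppn] = True
        self.l2p[lpn] = ppn
        return ppn

    def read(self, lpn):
        return self.flash[self.l2p[lpn]]


ftl = PageMapFtl(16)
ftl.write(5, "a")          # first write of lpn 5
ftl.write(5, "b")          # overwrite lands on a new physical page
print(ftl.read(5), ftl.l2p[5], ftl.valid[0])
```

Each logical page needs its own table entry, which is where the memory cost named under "Disadvantage" comes from.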
![Page 43: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/43.jpg)
44
■ Mapping tables are managed within a die
● Simple channel/way interleaving for sequential logical access
[Figure: each LPN is deterministically mapped to a specific die (example: 2-channel, 2-way). LPN: Logical Page Number.]

| Die | Channel | Way | LPN (binary) |
|---|---|---|---|
| Die 0 | Channel 0 | Way 0 | x···x00 |
| Die 1 | Channel 1 | Way 0 | x···x01 |
| Die 2 | Channel 0 | Way 1 | x···x10 |
| Die 3 | Channel 1 | Way 1 | x···x11 |
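The static mapping in the 2-channel, 2-way example amounts to reading the low bits of the LPN. The helper below is a sketch with a hypothetical name; it assumes the channel is taken from the lowest bits and the way from the next bits, which matches the die numbers in the figure.

```python
def lpn_to_die(lpn, num_channels=2, num_ways=2):
    """Map an LPN to (channel, way, die) by simple interleaving."""
    channel = lpn % num_channels
    way = (lpn // num_channels) % num_ways
    die = way * num_channels + channel
    return channel, way, die


for lpn in range(6):
    print(lpn, lpn_to_die(lpn))
```

Consecutive LPNs land on different channels first and different ways second, so a sequential logical access is spread across all four dies before any die is used twice.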
![Page 44: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/44.jpg)
45
■ Why is garbage collection needed
● To reclaim new free blocks for future write requests
– Invalid data occupy storage space before GC
■ What is garbage collection
● Copies the valid data into a new free block and erases the original invalid data
● Basic operations involved in GC are the following
– 1. The victim blocks meeting the conditions are selected for erasure
– 2. The valid physical pages are copied into a free block
– 3. The selected physical blocks are erased
■ What is important in GC
● Victim block selection
– GC time depends on the status of victim block
![Page 45: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/45.jpg)
46
■ GC Trigger
● Each GC is triggered independently of other dies
● GC is triggered when there is no free user block of each die
■ Blocks in GC
● One block per die is overprovisioned
● Single victim block is a target of GC
[Figure: before GC, a victim block and a free block; after GC, a free block and a returned block]
Valid pages in the victim block are copied to the free block, and the roles of the two blocks are swapped
Source file: pagemap.c
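The per-die GC described here can be sketched as follows. This is not the code in pagemap.c: the function names and data layout are hypothetical, every non-reserved block is assumed to be fully written, and the mapping-table update that real firmware must perform after copying is left out.

```python
def select_victim(valid_counts, free_block):
    """Greedy policy: pick the block with the fewest valid pages."""
    candidates = {b: n for b, n in valid_counts.items() if b != free_block}
    return min(candidates, key=candidates.get)


def garbage_collect(blocks, free_block):
    """blocks maps block id -> list of pages (data, or None if invalid).

    Copies the victim's valid pages into the reserved free block, erases
    the victim, and returns (victim, copied_pages). The erased victim
    becomes the new reserved free block for this die.
    """
    valid_counts = {b: sum(p is not None for p in pages)
                    for b, pages in blocks.items()}
    victim = select_victim(valid_counts, free_block)
    pages_per_block = len(blocks[victim])
    survivors = [p for p in blocks[victim] if p is not None]
    blocks[free_block] = survivors + [None] * (pages_per_block - len(survivors))
    blocks[victim] = [None] * pages_per_block   # block erase
    return victim, len(survivors)


die_blocks = {
    0: ["a", None, "b", "c"],
    1: [None, "d", None, None],
    2: ["e", "f", "g", None],
    3: [None, None, None, None],   # the overprovisioned free block
}
print(garbage_collect(die_blocks, free_block=3))   # (1, 1)
```

Picking the block with the fewest valid pages minimizes the copy work per reclaimed block, which is why GC time depends on the state of the victim.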
![Page 46: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/46.jpg)
47
■ Commands for NAND flash controller
● V2FCommand_ReadPageTrigger
– Read data of a flash page
– Store data to the register of the flash die
● V2FCommand_ReadPageTransfer
– Transfer data from a flash die to the data buffer
– Report bit error information to the FTL
● V2FCommand_ProgramPage
– Transfer data from the data buffer to a flash die
– Program data to a flash page
● V2FCommand_BlockErase
– Erase a flash block
● V2FCommand_StatusCheck
– Check the execution result of a previous command
■ Commands for NVMe DMA engine
● LLSCommand_RxDMA
– Transfer data from host system to data buffer
● LLSCommand_TxDMA
– Transfer data from data buffer to host system
![Page 47: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/47.jpg)
48
| Command | Priority |
|---|---|
| LLSCommand_RxDMA | 0 |
| LLSCommand_TxDMA | 0 |
| V2FCommand_StatusCheck | 1 |
| V2FCommand_ReadPageTrigger | 2 |
| V2FCommand_BlockErase | 3 |
| V2FCommand_ProgramPage | 4 |
| V2FCommand_ReadPageTransfer | 5 |
■ Waiting commands are issued by the scheduler
● The scheduler checks the state of the flash memory controller and the host interface controller
● Prioritizing flash commands improves multi-channel and multi-way parallelism
[Figure: the scheduler issues commands from the command queues (Channel 0 to Channel X, Way 0 to Way Y) to the NVMe DMA engine and the NAND flash controller]
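A priority-based pick among waiting commands might look like the sketch below. The priority values come from the table on this slide, but the rest is assumed: that a lower number means higher priority, that ties go to the oldest command, and that a command is skipped while its target (a die or the host interface) is busy. The function name and queue layout are hypothetical.

```python
PRIORITY = {
    "LLSCommand_RxDMA": 0,
    "LLSCommand_TxDMA": 0,
    "V2FCommand_StatusCheck": 1,
    "V2FCommand_ReadPageTrigger": 2,
    "V2FCommand_BlockErase": 3,
    "V2FCommand_ProgramPage": 4,
    "V2FCommand_ReadPageTransfer": 5,
}


def pick_next(waiting, is_ready):
    """Remove and return the best ready command, or None.

    waiting: list of (command_name, target) in arrival order.
    is_ready: callable taking a target and returning True if it is idle.
    Assumption: lower priority number wins; ties go to the older command.
    """
    best_key, best_index = None, None
    for index, (name, target) in enumerate(waiting):
        if not is_ready(target):
            continue                      # target busy: try other commands
        key = (PRIORITY[name], index)
        if best_key is None or key < best_key:
            best_key, best_index = key, index
    return None if best_index is None else waiting.pop(best_index)


busy = {(1, 0), "host"}                   # (channel, way) targets or the host side
queue = [
    ("V2FCommand_ProgramPage", (0, 0)),
    ("V2FCommand_ReadPageTrigger", (0, 1)),
    ("V2FCommand_StatusCheck", (1, 0)),
    ("LLSCommand_TxDMA", "host"),
]
print(pick_next(queue, lambda target: target not in busy))
```

Skipping commands whose target is busy is what allows commands for different dies to run out of order while each target still sees its commands in priority order.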
![Page 48: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/48.jpg)
49
■ Firmware
● Supports
– Buffer management (LRU)
– Static page mapping
– Garbage collection (On-demand)
● Not supported
– Meta flush
– Wear leveling
● Notice
– I/O performance can degrade while garbage collection is running
– The number of usable blocks is limited when the MLC NAND array is used in the 8-channel 8-way structure
– The latest firmware runs in SLC mode and accesses only the LSB pages of MLC NAND
– Accessing MSB pages may cause data errors that ECC cannot correct
![Page 49: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/49.jpg)
50
■ The bit error rate increases if MSB pages of NAND flash are accessed
■ The increased bit errors might not be correctable by the BCH error correction engine in the current version of the NAND flash controller
■ For now, the firmware therefore runs in SLC mode to reduce the error rate
![Page 50: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/50.jpg)
51
■ Currently, the MLC-to-SLC mode transition command of NAND flash is not supported
■ Accessing only LSB pages gives characteristics similar to real SLC NAND flash

Paired page address

| LSB pages | MSB pages |
|---|---|
| 00h | 02h |
| 01h | 04h |
| … | … |
| FDh | FFh |
![Page 51: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/51.jpg)
52
■ PCIe-NVMe
● Supports
– Up to PCIe Gen2.0 x8 lanes
– Mandatory NVMe commands
– PRP data transfer mechanism and out-of-order data transfer in PRP list
– 1 namespace (can be extended by updating firmware)
– Up to 8 NVMe IO submission queues and 8 NVMe IO completion queues, each with a depth of 256
– Internal NVMe command table with a depth of up to 256
– MSI interrupt with 8 interrupt vectors
– x86/x64 Ubuntu 14.04 and Windows 8.1
● Not supported
– 4-byte addressing (still being debugged)
– Optional NVMe commands (can be supported by updating firmware)
– SGL data transfer mechanism
– Power management (can be supported by updating firmware)
– MSI-X interrupt
– Virtualization and sharing features
![Page 52: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/52.jpg)
53
■ NAND flash controller
● Supports
– Up to 8 channels can be configured
– Maximum NAND flash bus bandwidth of 200 MT/s
● Not supported
– Additional advanced commands (e.g. multi-plane operation)
![Page 53: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/53.jpg)
![Page 54: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/54.jpg)
55
■ Preparing development environment
● Host computer
● Platform board
● Development tools
■ Building materials
● FPGA bitstream
● Firmware
■ Operating Cosmos+ OpenSSD
● Bitstream and firmware download to the FPGA
● Host computer boot and SSD recognition check
● SSD format
● SSD performance evaluation and analysis
![Page 55: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/55.jpg)
56
| Mainboard | BIOS Ver. | Result | Comment |
|---|---|---|---|
| ASRock Z77 Extreme 6 | P2.40 | Working | |
| ASUS H87-Pro | 0806x64 | Working | |
| Gigabyte H97-Gaming 3 | F5 | Working | |
| Gigabyte Z97X-UD5H | F8 | Working | |
| | F10c | Not working | 4-byte addressing problems in Cosmos+ PCIe DMA engine |
![Page 56: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/56.jpg)
57
| OS | x86/x64 | Result | Comment |
|---|---|---|---|
| Windows 7 | x64 | Working | with OFA driver |
| Windows 8.1 | x64 | Working | |
| Windows 10 | x64 | Not working | 4-byte addressing problems in Cosmos+ PCIe DMA engine |
| Ubuntu 14.04 LTS or above | x64 | Working | Kernel version 3.13 or above |
![Page 57: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/57.jpg)
58
■ Check jumper pins of the platform board
■ Insert NAND flash module(s)
■ Connect the external PCIe cable
■ Connect the USB cable for JTAG
■ Connect the USB cable for UART
■ Connect the power cable
![Page 58: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/58.jpg)
59
■ Make sure that jumper pins on board are set as default below
J79 J75 J76 J77 J78
![Page 59: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/59.jpg)
60
■ Make sure that jumper pins on board are set as default below
J177
![Page 60: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/60.jpg)
61
■ Make sure that jumper pins on board are set as default below
J30 J29 J28 J31
J27
![Page 61: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/61.jpg)
62
■ Make sure that jumper pins on board are set as default below
J85 J87 J86 J88
J89
J188 J187
J184
J185
J186
J35
J36
![Page 62: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/62.jpg)
63
■ Make sure that jumper pins on board are set as default below
J89 J80 J74
![Page 63: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/63.jpg)
64
■ A single NAND flash module can support up to 4-channel configuration
● For prebuild 3.0.0, two NAND flash modules are required
● For predefined project 1.0.0, one NAND flash module is required
Push first
Push next
![Page 64: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/64.jpg)
65
■ Hold external PCIe connector and push the cable in it
Hold here
Push
![Page 65: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/65.jpg)
66
■ Make sure that the cable is fixed tightly
![Page 66: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/66.jpg)
67
■ USBJTAG requires a micro-USB type B (male) to USB type A (male)
cable
USB type AUSB type micro B
Push
USBJTAG
![Page 67: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/67.jpg)
68
■ USBUART requires a USB type A (male) to USB type A (male) cable
USB type A
Push
USBUART (Connector)
![Page 68: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/68.jpg)
69
■ Connect the power cable to the 5.5 mm power connector
Push
![Page 69: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/69.jpg)
70
■ Download materials
● Prebuilt FPGA bitstream
● Pre-defined Vivado project for manual FPGA bitstream generation
● Firmware source code
■ Install Xilinx Vivado Design Suite: System Edition 2014.4
● Xilinx Vivado 2014.4
● Xilinx SDK 2014.4
![Page 70: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/70.jpg)
71
■ Go to the OpenSSD project site, and click “Resources”
http://www.openssd.io
![Page 71: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/71.jpg)
72
■ Click “Source”
![Page 72: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/72.jpg)
73
■ Click “Clone or download” -> “Download ZIP”
![Page 73: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/73.jpg)
74
■ Materials include a prebuilt bitstream, a pre-defined project, and firmware source code
Prebuild-3.0.0
Pre-defined project-1.0.0
![Page 74: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/74.jpg)
75
Bitstream
Type | Ver. | Channel | Way | Bits / cell | Capacity
Prebuild | 3.0.0 | 8 | 8 | SLC / MLC | 1 TB / 2 TB
Predefined | 1.0.0 | 2 | 8 | SLC / MLC | 256 GB / 512 GB
Firmware
Type | Ver. | Channel | Way | Bits / cell | Capacity
GreedyFTL | 2.5.0, 2.6.0, 2.7.0 | 8 | 8 | SLC | 1 TB
GreedyFTL | 2.7.1 | 2 | 8 | SLC | 256 GB
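The capacities in the bitstream table can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes that capacity scales linearly with channels × ways and that 1 TB and 2 TB mean 1024 GB and 2048 GB. The name `capacity_per_way` is hypothetical and does not come from the OpenSSD sources.

```python
# Rows transcribed from the bitstream table:
# (type, channels, ways, SLC capacity in GB, MLC capacity in GB).
# Assumption: 1 TB = 1024 GB and 2 TB = 2048 GB.
BITSTREAMS = [
    ("Prebuild", 8, 8, 1024, 2048),
    ("Predefined", 2, 8, 256, 512),
]


def capacity_per_way(channels, ways, capacity_gb):
    """Capacity contributed by one (channel, way) position, in GB."""
    return capacity_gb / (channels * ways)


for name, channels, ways, slc_gb, mlc_gb in BITSTREAMS:
    slc = capacity_per_way(channels, ways, slc_gb)
    mlc = capacity_per_way(channels, ways, mlc_gb)
    print(f"{name}: {slc} GB per way (SLC), {mlc} GB per way (MLC)")
```

Under these assumptions both configurations work out to the same capacity per way, which is consistent with the two bitstreams differing only in the number of channels.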
![Page 75: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/75.jpg)
76
■ Prebuild type
● A prebuilt bitstream is included, so you can skip the bitstream generation steps
● The prebuild type is distributed as a hardware description file (.hdf), which consists of an
FPGA bitstream, bitstream information, and initialization code for the CPU in the Zynq
FPGA
■ Pre-defined type
● A bitstream is not included, so you must follow the bitstream generation steps
● The pre-defined type is distributed as a Vivado project file with register transfer level
(RTL) source code for intellectual property (IP) cores such as the NVMe controller
![Page 76: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/76.jpg)
77
■ Make sure that the Vivado edition is System Edition and that “Software
Development Kit” and “Zynq-7000” are checked
![Page 77: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/77.jpg)
78
1. Run synthesis
2. Run implementation
3. Generate bitstream
4. Export hardware
![Page 78: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/78.jpg)
79
■ Open the predefined project included in “OpenSSD2_2Ch8Way-1.0.0”
![Page 79: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/79.jpg)
80
■ Click “Run Synthesis”
![Page 80: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/80.jpg)
81
■ Synthesis is running…
![Page 81: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/81.jpg)
■ Select “Run Implementation” and click OK
● If you want to see the synthesized results, choose “Open Synthesized Design” or
“View Reports”
82
![Page 82: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/82.jpg)
83
■ Implementation is running…
![Page 83: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/83.jpg)
84
■ The following critical messages appear while implementation is
running, but you can ignore them
![Page 84: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/84.jpg)
85
■ Check the status of synthesis and implementation
![Page 85: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/85.jpg)
86
■ Click “Generate Bitstream”
![Page 86: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/86.jpg)
87
■ Bitstream generation is running…
![Page 87: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/87.jpg)
88
■ If you want to see the implemented design, select “Open Implemented
Design” and click the OK button
![Page 88: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/88.jpg)
89
■ Go to File -> Export and click “Export Hardware”
![Page 89: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/89.jpg)
90
■ Select “Include bitstream” and click OK
![Page 90: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/90.jpg)
91
■ Go to File -> Launch SDK
![Page 91: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/91.jpg)
92
■ Click the OK button
![Page 92: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/92.jpg)
93
■ Then, SDK is launched
![Page 93: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/93.jpg)
94
■ As shown below, the exported hardware platform is set as the target hardware
![Page 94: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/94.jpg)
95
1. Create a new application project
2. Add source code
3. Build firmware source code
![Page 95: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/95.jpg)
96
■ Go to File -> New -> Application Project
![Page 96: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/96.jpg)
97
■ Fill in the project name and click “Next”
![Page 97: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/97.jpg)
98
■ Select an empty application and finish this template wizard
![Page 98: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/98.jpg)
GreedyFTL-2.7.1
99
■ Copy the GreedyFTL source files to the “src” folder in the project explorer
Copy
![Page 99: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/99.jpg)
100
■ If everything goes well, the automatic build process should finish
successfully
![Page 100: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/100.jpg)
101
■ Click “Build All” to make both debug and release executables
![Page 101: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/101.jpg)
102
1. Create a workspace directory and a new application project
2. Set a hardware platform
3. Add source code
4. Build firmware source code
![Page 102: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/102.jpg)
103
■ Launch Xilinx SDK and designate the workspace
![Page 103: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/103.jpg)
104
■ Go to File -> New -> Application Project
![Page 104: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/104.jpg)
105
■ Press “New” to register the hardware description file (HDF)
![Page 105: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/105.jpg)
106
■ Name the hardware project and specify the path of the HDF
![Page 106: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/106.jpg)
107
■ Name the application project and finish this project wizard
![Page 107: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/107.jpg)
108
■ Copy the GreedyFTL source files to the “src” folder in the project explorer
Copy
GreedyFTL-2.7.0
![Page 108: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/108.jpg)
109
■ If everything goes well, the automatic build process should finish
successfully
![Page 109: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/109.jpg)
110
■ Click “Build All” to make both debug and release executables
![Page 110: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/110.jpg)
111
1. Power on the platform board
2. Configure UART
3. Program FPGA
4. Execute firmware
![Page 111: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/111.jpg)
112
■ Before you power on the board, make sure that your host computer is
powered off
Slide
![Page 112: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/112.jpg)
113
■ In SDK, go to Terminal -> New Terminal Connection as shown below
![Page 113: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/113.jpg)
114
■ Set “Connection Type” and “Baud Rate” to serial and 115200,
respectively
![Page 114: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/114.jpg)
115
■ The UART is then connected, as shown below
![Page 115: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/115.jpg)
116
■ Click “Xilinx Tools” -> click “Program FPGA”
![Page 116: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/116.jpg)
117
■ Click “Program” to program FPGA
![Page 117: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/117.jpg)
118
■ Hang on a second
![Page 118: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/118.jpg)
119
■ Check that FPGA programming completed successfully
![Page 119: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/119.jpg)
120
■ Right click on the application project -> “Run As” -> click “1 Launch
on Hardware (GDB)”
![Page 120: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/120.jpg)
121
■ Click the firmware to execute -> click “OK” -> wait for the UART message
![Page 121: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/121.jpg)
122
■ Press ‘n’ to maintain the bad block table
![Page 122: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/122.jpg)
123
■ Choose whether to remake the bad block table in the FTL initialization step
● If you want to remake the bad block table, press “X” on the UART terminal
– The bad block table format of GreedyFTL v2.7.0 is different from that of previous versions
– A damaged bad block table can be recovered
“X” erases all blocks, including the metadata blocks
Any other key keeps the bad block table
![Page 123: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/123.jpg)
124
■ Bad blocks are detected in the FTL initialization step
● Firmware flow: Firmware Start → FTL Initialization → Host Request Fetch → Host Request Processing
● Bad block management: each way (Way 0 … Way Y of a channel) has a metadata block that stores a bad block table
● If a bad block table exists: read the bad block table → update mapping data
● If a bad block table does not exist: read the bad mark of all blocks → distinguish bad blocks → save a bad block table
● “Block” number means mapped block number
● “phyBlock” number means physical block number
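The bad block handling on this slide can be sketched as follows. This is a simplified model and not the GreedyFTL implementation: the function names, the `read_bad_mark` callback, and the rule that mapped block numbers are assigned to good physical blocks in ascending order are all assumptions made for illustration.

```python
def init_bad_block_table(stored_table, read_bad_mark, num_blocks):
    """Return (bad_blocks, rebuilt).

    stored_table: bad block numbers read from the metadata block,
                  or None if no table exists yet.
    read_bad_mark: hypothetical callback, True if a physical block is marked bad.
    """
    if stored_table is not None:
        # A table exists: use it as is.
        return set(stored_table), False
    # No table: read the bad mark of every block and collect the bad ones.
    bad_blocks = {blk for blk in range(num_blocks) if read_bad_mark(blk)}
    return bad_blocks, True  # the caller would now save the new table


def build_block_map(num_blocks, bad_blocks):
    """Assumed mapping: index = mapped block number, value = physical block number."""
    return [phy for phy in range(num_blocks) if phy not in bad_blocks]


factory_bad = {3, 7}  # made-up example data
table, rebuilt = init_bad_block_table(None, lambda blk: blk in factory_bad, 10)
print(sorted(table), rebuilt)
print(build_block_map(10, table))
```

In this model the mapped block numbers stay contiguous while bad physical blocks are skipped, which is one way the distinction between “Block” and “phyBlock” numbers could arise.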
![Page 124: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/124.jpg)
125
■ Turn on the host PC when the firmware reset is done
![Page 125: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/125.jpg)
126
■ NVMe SSD initialization steps are ongoing
![Page 126: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/126.jpg)
127
1. Check device recognition
2. Create a partition
3. Check the created partition
4. Format the partition
5. Create a mount point
6. Mount the partition
7. Check the mounted partition
![Page 127: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/127.jpg)
128
■ Click the pointed icon
![Page 128: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/128.jpg)
129
■ Click the terminal icon
![Page 129: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/129.jpg)
130
■ Type “lspci” -> press ENTER -> check “Non-Volatile memory controller:
Xilinx Corporation Device 7028” on the PCI device list
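If you want to script this check, a minimal sketch is to filter saved `lspci` output for the controller class string. The bus addresses and the SATA line in the sample text below are made up for illustration; only the NVMe description string comes from the slide.

```python
def find_nvme_controllers(lspci_output):
    """Return the lspci lines that describe an NVMe controller."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "non-volatile memory controller" in line.lower()
    ]


# Hypothetical sample output; the bus addresses are placeholders.
sample = """\
00:1f.2 SATA controller: Example Vendor AHCI Controller
01:00.0 Non-Volatile memory controller: Xilinx Corporation Device 7028
"""

for line in find_nvme_controllers(sample):
    print(line)
```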
![Page 130: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/130.jpg)
131
■ Type “ls /dev” -> press ENTER -> check “nvme0nxxxx” on the device
list
![Page 131: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/131.jpg)
132
■ Type “sudo fdisk /dev/nvme0nxxxx”, press ENTER -> type your password,
press ENTER -> type “n”, press ENTER -> type “p”, press ENTER
-> type “1”, press ENTER -> type “4096”, press ENTER
![Page 132: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/132.jpg)
133
■ Type “ls /dev” -> press ENTER -> check “nvme0nxxxxp1” on the device
list
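The names checked in the “ls /dev” steps follow the Linux NVMe naming pattern nvme<controller>n<namespace>, with p<partition> appended for a partition. The sketch below classifies such names with a regular expression; the helper name `classify` is hypothetical, and the concrete names used are examples rather than values from the slides.

```python
import re

# nvme<controller>n<namespace>, optionally followed by p<partition>
NVME_NODE = re.compile(r"^nvme(\d+)n(\d+)(?:p(\d+))?$")


def classify(name):
    """Return (kind, controller, namespace, partition) or None if not an NVMe block node."""
    match = NVME_NODE.match(name)
    if match is None:
        return None
    controller, namespace, partition = match.groups()
    kind = "partition" if partition else "namespace"
    return (kind, int(controller), int(namespace),
            int(partition) if partition else None)


for name in ["nvme0", "nvme0n1", "nvme0n1p1", "sda1"]:
    print(name, classify(name))
```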
![Page 133: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/133.jpg)
134
■ Type “mkfs -t ext4 /dev/nvme0nxxxxp1”, press ENTER
![Page 134: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/134.jpg)
135
■ Type “sudo mkdir /media/nvme”, press ENTER
![Page 135: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/135.jpg)
136
■ Type “sudo mount /dev/nvme0nxxxxp1 /media/nvme”, press ENTER
![Page 136: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/136.jpg)
137
■ Type “lsblk”, press ENTER -> check the mounted partition on the block
device list
![Page 137: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/137.jpg)
138
■ Type “df -h”, press ENTER -> check the mounted partition on the storage
list
![Page 138: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/138.jpg)
139
1. Check device recognition
2. Create a partition
3. Format the partition
![Page 139: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/139.jpg)
140
■ This PC → click left mouse button → click “Properties”
![Page 140: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/140.jpg)
141
■ System → click “Device Manager”
![Page 141: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/141.jpg)
142
■ Disk drives → double-click “NVMe Cosmos+ OpenSSD”
![Page 142: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/142.jpg)
143
■ Control panel → click “Administrative Tools”
![Page 143: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/143.jpg)
144
■ Administrative tools → double-click “Computer Management”
![Page 144: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/144.jpg)
145
■ Computer management → click “Disk Management” → click “OK” to
confirm disk initialization
![Page 145: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/145.jpg)
146
■ Right-click “Disk 2”, which was shown in the 3rd step →
click “Properties”
![Page 146: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/146.jpg)
147
■ Make sure that “Disk 2” is the Cosmos+ OpenSSD before you proceed
to the next step
![Page 147: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/147.jpg)
148
■ Click right mouse button on the right part of “Disk 2” → click “New
Simple Volume”
![Page 148: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/148.jpg)
149
■ Click “Next”
![Page 149: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/149.jpg)
150
■ Click “Next”
![Page 150: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/150.jpg)
151
■ Select desired drive letter → Click “Next”
![Page 151: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/151.jpg)
152
■ Type desired volume label → Click “Next”
![Page 152: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/152.jpg)
153
■ Click “Finish”
![Page 153: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/153.jpg)
154
■ Formatting is now finished
![Page 154: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/154.jpg)
155
■ Now you can find the formatted Cosmos+ OpenSSD at “This PC”
![Page 155: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/155.jpg)
156
1. Install benchmark application (Iometer)
2. Disconnect all workers except one
3. Generate an access specification
4. Set a sufficient number of outstanding I/Os
5. Assign an access specification
6. Run an evaluation
7. Check evaluation results
![Page 156: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/156.jpg)
157
■ Iometer 1.1.0 (http://www.iometer.org/doc/downloads.html)
● Cosmos+ OpenSSD is recognized as an NVMe storage device
![Page 157: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/157.jpg)
158
■ Avoid workers that share the same access specification
● Such workers can access the same logical address at almost the same time
– This increases the data buffer hit ratio
● The measured performance can be higher than the real performance
![Page 158: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/158.jpg)
159
■ The user can define an access specification
![Page 159: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/159.jpg)
160
■ Select a desired access specification and click the “Add” button
![Page 160: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/160.jpg)
161
■ An X channel – Y way flash array needs at least “X * Y” outstanding flash requests
to utilize multi-channel/way parallelism
● For a Cosmos+ OpenSSD configuration (8 channel – 8 way, 16KB page size), the
“128KB sequential write” access specification needs at least 8 outstanding I/Os,
which generate 64 (128KB/16KB * 8) outstanding flash requests
● An environment that generates 2 * X * Y outstanding flash requests is recommended
(16 outstanding I/Os in this example)
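The outstanding I/O arithmetic on this slide can be written out as a short calculation. This is a sketch under the slide's own assumption that one I/O is split into page-sized flash requests; the function names and the `factor` parameter are hypothetical.

```python
import math


def flash_requests_per_io(io_size_kb, page_size_kb):
    """Number of page-sized flash requests generated by one host I/O."""
    return math.ceil(io_size_kb / page_size_kb)


def min_outstanding_ios(channels, ways, io_size_kb, page_size_kb, factor=1):
    """Outstanding I/Os needed to keep factor * channels * ways flash requests in flight."""
    needed_flash_requests = factor * channels * ways
    per_io = flash_requests_per_io(io_size_kb, page_size_kb)
    return math.ceil(needed_flash_requests / per_io)


# 8 channel - 8 way, 16KB page, 128KB sequential write
print(min_outstanding_ios(8, 8, 128, 16))            # minimum (X * Y)
print(min_outstanding_ios(8, 8, 128, 16, factor=2))  # recommended (2 * X * Y)
```

With the slide's numbers, the first call gives 8 outstanding I/Os and the second gives 16.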
![Page 161: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/161.jpg)
162
■ Set the update frequency and click the “Run” button
Run
![Page 162: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/162.jpg)
163
■ “Results display” tab shows the performance evaluation results
● IOPs, throughput, average/maximum response time
![Page 163: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/163.jpg)
164
■ Perform a pre-fill process before the read performance evaluation
● There is no mapping information for unwritten data
■ Set the number of outstanding I/Os equal to or less than 256
● Reason: an unknown problem in the host interface
■ Set the write request size equal to or larger than the page size
● The read-modify-write process can degrade performance
– In the case of “4KB random write”, IOPs can decrease as the experiment progresses
![Page 164: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/164.jpg)
165
■ Maximum throughput/channel ≒ 173 MB/s
● 100 MHz DDR flash bus (bit width: 8) → 200 MB/s
● 16,384 + 1,664 (spare) byte page → 90% (16,384/18,048) of 200 MB/s = 181 MB/s
● Overhead of the flash memory controller → 173 MB/s
■ Measured throughput/channel of the 8 channel – 8 way configuration
● Sequential read: 99% of maximum throughput
● Sequential write: 45~90% of maximum throughput
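The per-channel ceiling above can be reproduced step by step. The sketch below only restates the slide's arithmetic; the 173 MB/s controller limit is taken from the slide as given and is not derived, and the variable names are illustrative.

```python
BUS_CLOCK_MHZ = 100          # flash bus clock from the slide
TRANSFERS_PER_CLOCK = 2      # DDR: two transfers per clock cycle
BUS_WIDTH_BYTES = 1          # 8-bit bus
PAGE_DATA_BYTES = 16_384
PAGE_SPARE_BYTES = 1_664
CONTROLLER_LIMIT_MB_S = 173  # quoted on the slide, not derived here

# Raw bus rate: 100 MHz * 2 transfers * 1 byte = 200 MB/s
raw_mb_s = BUS_CLOCK_MHZ * TRANSFERS_PER_CLOCK * BUS_WIDTH_BYTES

# Only the data part of each page counts as user throughput.
data_fraction = PAGE_DATA_BYTES / (PAGE_DATA_BYTES + PAGE_SPARE_BYTES)
payload_mb_s = raw_mb_s * data_fraction

# Share of the payload rate lost to flash memory controller overhead.
controller_overhead = 1 - CONTROLLER_LIMIT_MB_S / payload_mb_s

print(f"raw bus rate: {raw_mb_s} MB/s")
print(f"data fraction of a page: {data_fraction:.1%}")
print(f"payload rate: {payload_mb_s:.1f} MB/s")
print(f"controller overhead: {controller_overhead:.1%}")
```

The payload rate comes out slightly above the slide's 181 MB/s because the slide truncates the result.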
[Charts: 128KB sequential read and 128KB sequential write throughput (MB/s), SLC vs. MLC]
![Page 165: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/165.jpg)
166
■ Maximum 4KB IOPs/channel ≒ 10,812 IOPs
● Page mapping → a whole page is accessed in order to access 4KB of data
● 173 MB/s (maximum throughput/channel) ÷ 16KB (page size) = 10,812 IOPs
■ Measured throughput/channel
● 1 channel – 8 way configuration
– Random 4KB read: 96% of maximum 4KB IOPs
– Random 4KB write: 38~88% of maximum 4KB IOPs
● 8 channel – 8 way configuration
– Software-based scheduling has a larger latency in configurations with many channels and ways
– Scheduling latency can increase the idle time of the hardware controllers
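The IOPs ceiling follows from the throughput ceiling because, with page mapping, every 4KB access moves a whole page. The sketch below reproduces the slide's figure; it assumes decimal units (1 MB = 1000 KB and a 16 KB page), which is what yields 10,812. Dividing 173,000,000 bytes/s by a 16,384-byte page would give roughly 10,559 instead. The function name is hypothetical.

```python
def max_4kb_iops(throughput_mb_s, page_size_kb):
    """Upper bound on 4KB IOPs per channel when each access transfers a whole page."""
    return throughput_mb_s * 1000 / page_size_kb  # decimal units, as on the slide


ceiling = max_4kb_iops(173, 16)
print(int(ceiling))          # per-channel ceiling
print(int(ceiling * 0.96))   # random 4KB read, 1 channel - 8 way (96% of ceiling)
```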
[Charts: 4KB random read and 4KB random write (w/o read-modify-write) performance (IOPs), SLC vs. MLC]
![Page 166: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/166.jpg)
167
■ Performance degradation by on-demand garbage collection
● After all available blocks are used, garbage collection is triggered steadily
● The degree of performance degradation depends on the copy operation overhead
– Copy operation overhead depends on the number of valid pages that belong to the victim blocks
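Why valid pages matter can be illustrated with a textbook-style cost model. This is not taken from the GreedyFTL sources, and it is not expected to reproduce the measured figures, because it ignores read and erase time and controller scheduling. The block size of 256 pages is a hypothetical value chosen for the example.

```python
def gc_cost(pages_per_block, valid_pages):
    """Return (copies_per_freed_page, write_amplification) for one victim block.

    Reclaiming a victim with `valid_pages` valid pages requires copying those
    pages elsewhere and frees pages_per_block - valid_pages pages for new data.
    """
    if not 0 <= valid_pages < pages_per_block:
        raise ValueError("valid_pages must be in [0, pages_per_block)")
    freed = pages_per_block - valid_pages
    copies_per_freed_page = valid_pages / freed
    write_amplification = pages_per_block / freed
    return copies_per_freed_page, write_amplification


PAGES_PER_BLOCK = 256  # hypothetical block size

print(gc_cost(PAGES_PER_BLOCK, 0))                     # victim with no valid pages
print(gc_cost(PAGES_PER_BLOCK, PAGES_PER_BLOCK // 2))  # victim with half valid pages
```

In this model a victim with no valid pages costs no copies, while a half-valid victim needs one copy for every page it frees, which matches the direction of the two cases compared on this slide.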
[Chart: throughput (MB/s), victim blocks with no valid pages, SLC 8 channel – 8 way configuration: -3.6%]
[Chart: throughput (MB/s), victim blocks with valid pages (half of total pages), SLC 8 channel – 8 way configuration: -67.4%]
![Page 167: Prof. Yong Ho Song](https://reader031.vdocuments.mx/reader031/viewer/2022013015/61cfb695dbf0f01703246225/html5/thumbnails/167.jpg)