DDR4 Memory Compliance Testing, Barbara Aichinger, FuturePlus Systems


Post on 22-Feb-2017


Page 1: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

FuturePlus Systems Corporation

15 Constitution Drive

Bedford NH 03110 USA

Barbara P. Aichinger Vice President New Business Development

DDR4 Memory Compliance Testing

Page 2: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Agenda

• DDR Memory Standards for Compliance Testing

• Memory problems continue to plague the industry

– Recent Published Papers

– Row Hammer Failures

– Security Issues

• The concept of an Audit for Compliance Testing

– Electrical

– Protocol

– Row Hammer

– SPD/MRS

– Performance/Margin

• Summary

Page 3: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Compliance Testing Documents

• Not yet…getting closer…

• FuturePlus Systems is sponsoring a Protocol Checks Document

– The Task Group has several industry members and several T&M vendors

– Several ballots have passed and a document is expected in 2017

Page 4: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Memory Errors continue to plague the industry

Page 5: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Memory Errors in Modern Systems

Page 6: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

This is called Thresholding

Page 7: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Average ~2%

Page 8: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems
Page 9: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Errors in Facebook’s Fleet of Servers

Page 10: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

If FB has 100K Servers

• ~2% have a memory failure every month

• 46% of those failures result in a DIMM swap

• Doing the math: 2% of 100K is 2,000

• 46% of 2,000 = 920 DIMM swaps a month!

• 30 days a month, 24 hours a day = 720 hours in a month

• 920 swaps over 720 hours is about 1.3 swaps per hour

Facebook is swapping out DIMMs every hour of every day of every month, all year long!
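
The arithmetic on this slide can be reproduced in a few lines. This is only a sketch of the slide's own back-of-the-envelope estimate: the 100K fleet size is the slide's hypothetical, and the variable names are chosen here for illustration.

```python
# Inputs taken from the slide; the fleet size is the slide's own "if".
servers = 100_000
monthly_failure_rate = 0.02   # ~2% of servers see a memory failure each month
dimm_swap_fraction = 0.46     # 46% of those failures end in a DIMM swap
hours_per_month = 30 * 24     # 720 hours

failing_servers = servers * monthly_failure_rate
dimm_swaps = failing_servers * dimm_swap_fraction
swaps_per_hour = dimm_swaps / hours_per_month

print(f"{failing_servers:.0f} servers with a memory failure per month")
print(f"{dimm_swaps:.0f} DIMM swaps per month")
print(f"{swaps_per_hour:.2f} DIMM swaps per hour")
```

With these inputs the rate comes out slightly above one swap per hour, which is what the closing line of the slide is pointing at.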

Page 11: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

An Update on Row Hammer Failures

• Seen on DDR4

– Passmark Blog

• Several reports of DDR4 failing the Row Hammer test

– ThirdIO paper

• http://www.thirdio.com/rowhammer.pdf

– Usenix

– Blackhat

– SGI seeing DDR4 RH failures in HPC

Page 12: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Row Hammer

A quick review!

[Figure: DRAM array diagram with labels: Activate Command, Columns, Rows (pages), Victim Row]
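
One way to look for Row Hammer conditions in a captured command trace is to count Activate commands per row over a time window and flag rows that are activated unusually often. The sketch below is an illustration only: the trace format `(time_ns, command, bank, row)`, the function names, and the `act_limit` value are all assumptions made here, and treating rows `row - 1` and `row + 1` as the victims assumes that logical row numbers map to physically adjacent rows, which is not guaranteed on a real DRAM.

```python
from collections import Counter

def count_activations(trace, window_start_ns, window_end_ns):
    """Count ACT commands per (bank, row) inside one time window."""
    return Counter(
        (bank, row)
        for time_ns, command, bank, row in trace
        if command == "ACT" and window_start_ns <= time_ns < window_end_ns
    )

def suspected_aggressors(trace, window_start_ns, window_end_ns, act_limit):
    """Return {(bank, row): count} for rows activated more than act_limit times."""
    counts = count_activations(trace, window_start_ns, window_end_ns)
    return {key: n for key, n in counts.items() if n > act_limit}

def neighbor_rows(row, rows_per_bank):
    """Candidate victim rows, assuming logical adjacency equals physical adjacency."""
    return [r for r in (row - 1, row + 1) if 0 <= r < rows_per_bank]

# Hypothetical trace: row 100 of bank 0 is activated ten times, row 5 once.
trace = [(t * 50, "ACT", 0, 100) for t in range(10)] + [(7, "ACT", 0, 5)]
hot = suspected_aggressors(trace, 0, 1000, act_limit=5)
for (bank, row), n in hot.items():
    print(f"bank {bank} row {row}: {n} activates, check rows {neighbor_rows(row, 65536)}")
```

In practice the window would be tied to the refresh period and the limit to whatever activation count the part under test is expected to tolerate; neither number is given in the slides.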

Page 13: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

USENIX Security Symposium

August 2016

Page 14: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

ECC will not save you!

Page 15: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Row Hammer Failures on DDR4

https://www.sgi.com/pdfs/4567.pdf

Page 16: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Introducing: The concept of an AUDIT for JEDEC Compliance Testing

• Not a repeat of Design Verification

• A check to make sure the JEDEC specification is being met

Page 17: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

For the System and DIMM

• Audit the signal integrity of the memory channel

• Monitor the system for Protocol Violations

– BIOS programming errors

– SPD programmed incorrectly

– Memory Controller Issues

• SPD Check

• Row Hammer Testing

• Performance/Margin Testing

Page 18: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Using a Scan from a Logic Analyzer instead of a Scope

• Allows for an easy and quick check of:

– Signal Alignment

– Relative Data Valid Eye

– Signal Swing

Page 19: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

To see all signals at once, a slot interposer is used

Page 20: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

DIMM Slot Interposer allows the system to operate up to 4200MT/s and run any application

Page 21: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Audit: Signal Swing

Slide Courtesy of

Overdriving DDR4 DRAM to 1.4V could cause damage.

Potential ODT setting issue: the first bit in the burst has less swing than the remainder of the burst. This could also be ISI (inter-symbol interference).

Page 22: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Audit: Signal Alignment

For READs the Strobe is edge aligned to the Data

For WRITEs the Strobe is center aligned to the Data

Page 23: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Signal Alignment

All the Data signals in a Byte should be aligned

Page 24: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Relative Data Eye

DQ Write Eye overlay on Byte 5, 5000 cycles (2400MT/s)

Eye threshold: centered at 790mV to 838mV

Eye size: Avg. of 272mV x 205ps

Observations: All eyes are consistent in size and alignment.
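
To put the 205ps eye width in context, it helps to compare it with the bit time (unit interval) at 2400MT/s. The short calculation below is a sketch added for illustration; the function name is made up here, and the slide itself does not state a pass/fail limit for the eye.

```python
def unit_interval_ps(data_rate_mts):
    """Duration of one bit in picoseconds for a data rate given in MT/s."""
    # 1 / (rate * 1e6 transfers per second) seconds = 1e6 / rate picoseconds
    return 1e6 / data_rate_mts

ui_ps = unit_interval_ps(2400)        # about 416.7 ps per bit
eye_width_ps = 205                    # average eye width from the slide
eye_fraction = eye_width_ps / ui_ps   # share of the bit time that is open

print(f"UI = {ui_ps:.1f} ps, eye width = {eye_fraction:.0%} of UI")
```

So the measured eye is open for roughly half of each bit time at this data rate.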

Page 25: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Address Signals

Page 26: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Easy to check even at higher speeds (3200MT/s)

Read data with Strobe

Write data with Strobe

Page 27: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Next Check for JEDEC Protocol Violations by the memory controller

• The DDR4 JEDEC spec contains rules on event ordering

• Examples

– Do not ACTIVATE a bank that is already open

– Do not PRECHARGE a bank that is already closed

– Do not RD/WR a non-open page
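
The three example rules above amount to tracking, per bank, whether a row is currently open. The following is a minimal sketch of such a check over a recorded command list. The trace format `(command, bank, row)` and the function name are assumptions made for this illustration; it is not the checker used in the presentation, and a real tool would also have to handle bank groups, ranks, precharge-all, and auto-precharge.

```python
def check_bank_protocol(commands):
    """Return a list of (index, message) for ordering violations.

    commands: iterable of (command, bank, row); row is only used for ACT.
    """
    open_rows = {}      # bank -> row currently open in that bank
    violations = []
    for index, (command, bank, row) in enumerate(commands):
        if command == "ACT":
            if bank in open_rows:
                violations.append((index, "ACT to a bank that is already open"))
            else:
                open_rows[bank] = row
        elif command == "PRE":
            if bank not in open_rows:
                violations.append((index, "PRE to a bank that is already closed"))
            else:
                del open_rows[bank]
        elif command in ("RD", "WR"):
            if bank not in open_rows:
                violations.append((index, f"{command} to a bank with no open page"))
    return violations

# Hypothetical trace with one violation of each rule.
example = [
    ("ACT", 0, 10),
    ("RD", 0, None),
    ("ACT", 0, 11),    # bank 0 is still open
    ("PRE", 0, None),
    ("PRE", 0, None),  # bank 0 is already closed
    ("WR", 1, None),   # bank 1 was never activated
]
for index, message in check_bank_protocol(example):
    print(index, message)
```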

Page 28: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Memory Controller Timing Violations

• Clock edge boundary

– Commands cannot be too close together or too far apart

– Examples

• tREFI - Average refresh interval

• tRC - ACT to ACT or REF

• tMOD - MRS to PDE

• tCCD_L - RD to RD, same Bank Group
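
A "too close together" check such as tCCD_L can be sketched as a comparison of the clock distance between consecutive reads to the same bank group against a minimum. The trace format `(clock, command, bank_group)`, the function name, and the value of 6 clocks used in the example are assumptions for illustration; the actual minimum depends on the speed bin and should be taken from the JEDEC specification.

```python
def check_rd_to_rd_same_group(trace, tccd_l_clocks):
    """Return (previous_clock, clock, bank_group) for each RD-to-RD pair
    in the same bank group that is closer than tccd_l_clocks."""
    last_read = {}      # bank_group -> clock of the most recent RD
    violations = []
    for clock, command, bank_group in trace:
        if command != "RD":
            continue
        previous = last_read.get(bank_group)
        if previous is not None and clock - previous < tccd_l_clocks:
            violations.append((previous, clock, bank_group))
        last_read[bank_group] = clock
    return violations

# Hypothetical trace: the reads at clocks 0 and 5 hit the same bank group.
trace = [(0, "RD", 0), (4, "RD", 1), (5, "RD", 0), (12, "RD", 0), (14, "WR", 0)]
print(check_rd_to_rd_same_group(trace, tccd_l_clocks=6))
```

A "too far apart" parameter such as tREFI would need the opposite comparison, against a maximum rather than a minimum, and is not covered by this sketch.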

Page 29: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

65 violations identified with more than 1,000 simultaneous checks

Page 30: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Protocol and Timing Compliance ‘in the wild’

Page 31: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

JEDEC Specification Violation

Page 32: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

The SPD has to be checked!

Serial Presence Detect Device

Mistakes in the SPD can lead to the BIOS not programming the Memory Controller correctly
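
A first-pass SPD sanity check can be sketched as follows. The byte offsets used here are assumptions based on the commonly documented DDR4 SPD layout (byte 2 holding the device type with 0x0C for DDR4, and a CRC-16 over bytes 0 to 125 stored low byte first in bytes 126 and 127); they should be confirmed against the JEDEC SPD document before being relied on. The function names are invented for this example, and this is not the check performed by the tools in the presentation.

```python
def spd_crc16(data):
    """CRC-16 with polynomial 0x1021, initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def check_ddr4_spd_base_block(spd):
    """Return a list of problems found in the first 128 bytes of an SPD image."""
    if len(spd) < 128:
        return ["SPD image is shorter than 128 bytes"]
    problems = []
    if spd[2] != 0x0C:  # assumed DDR4 device type code
        problems.append(f"device type byte is 0x{spd[2]:02X}, expected 0x0C")
    stored_crc = spd[126] | (spd[127] << 8)  # assumed: low byte first
    computed_crc = spd_crc16(spd[:126])
    if stored_crc != computed_crc:
        problems.append(
            f"CRC mismatch: stored 0x{stored_crc:04X}, computed 0x{computed_crc:04X}"
        )
    return problems

# Build a synthetic (otherwise empty) image with a valid type byte and CRC.
image = bytearray(128)
image[2] = 0x0C
crc = spd_crc16(image[:126])
image[126], image[127] = crc & 0xFF, crc >> 8
print(check_ddr4_spd_base_block(image))
```

A structural check like this only catches corruption and gross mistakes; timing fields that are well formed but wrong for the DRAM on the module still have to be compared against the device data sheet.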

Page 33: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Mode Register Settings

Page 34: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Performance Metrics

Not necessary for JEDEC compliance, but nice to know!

• Which power management features are implemented

– Is Self Refresh ever being used?

– Is Max Power Down implemented?

• Can we look to see if any timing parameters can be improved?

Page 35: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Increasing Performance by looking at timing margins

RD to WR same Rank: Spec says 7, system operating at 10

Operating right at Specification

Not happening! No Power Management
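
The margin in the RD to WR example is simply the observed minimum spacing minus the specified minimum. The snippet below is a small sketch of that comparison; the function name is made up here, and treating the slide's 7 and 10 as clock counts is an assumption, since the slide does not state the unit.

```python
def timing_margin(observed_min, spec_min):
    """Margin between the tightest observed spacing and the specified minimum.

    0 means operating right at the specification; a negative value is a violation.
    """
    return observed_min - spec_min

# Values from the slide: spec says 7, system operating at 10 (unit assumed to be clocks).
margin = timing_margin(observed_min=10, spec_min=7)
print(f"RD to WR same Rank: margin = {margin}")
```

A positive margin like this one marks a parameter where the controller could be tuned more aggressively, which is the performance opportunity the slide points out.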

Page 36: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Making the Measurement

Photos Courtesy of Keysight Technologies

Photos Courtesy of FuturePlus Systems

Page 37: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Summary

• Memory Errors in the Field are pervasive!

• DDR Memory Compliance Testing can be achieved using the method outlined

• Tools are available

– Purchase or Rent

• Companies needing help can hire industry experts to perform the testing for them

Page 38: DDR4 Memory Compliance Testing   Barbara Aichinger FuturePlus Systems

Contact Information

Barbara P. Aichinger

FuturePlus Systems

[email protected]

603-472-5905

www.FuturePlus.com

www.DDRDetective.com