Regularities Considered Harmful: Forcing Randomness to Memory Accesses to Reduce Row Buffer Conflicts for Multi-Core, Multi-Bank Systems
Embedded Lab. Park Yeongseong
ACM ASPLOS’13 Heekwon Park, Computer Science Department, University of Pittsburgh


Page 1: Embedded Lab. Park  Yeongseong

Regularities Considered Harmful: Forcing Randomness to Memory Accesses to Reduce Row Buffer Conflicts for Multi-Core, Multi-Bank Systems

Embedded Lab. Park Yeongseong

ACM ASPLOS’13 Heekwon Park, Computer Science Department University of Pittsburgh

Page 2: Embedded Lab. Park  Yeongseong

Contents

Introduction
Background
Regularity Considered Harmful
Design and Implementation
Performance Evaluation
Conclusions
Q&A

Page 3: Embedded Lab. Park  Yeongseong

Introduction

Recent computer architecture (multi-core)
A vast amount of main memory
Need to re-examine
◦ Internal policies, mechanisms
Rethinking the memory allocation issue

Page 4: Embedded Lab. Park  Yeongseong

Background

Problem
◦ Row buffer conflict
Approach
◦ Memory container
◦ Randomize memory access

< Conceptual memory organization >

Page 5: Embedded Lab. Park  Yeongseong

Background

Row-buffer conflict
◦ Precharging
◦ Activating operation
Overhead
◦ Delay
◦ Energy consumption

< Row-buffer hit and conflict overhead >
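The hit/conflict distinction on this slide can be illustrated with a minimal sketch of a single bank that keeps one row open. The class name `Bank`, the outcome labels, and the counters are illustrative assumptions, not taken from the paper; the model only counts precharge and activate operations and says nothing about real timings.

```python
from typing import Optional


class Bank:
    """Simplified model of one DRAM bank with a single row buffer (assumed)."""

    def __init__(self) -> None:
        self.open_row: Optional[int] = None  # row currently held in the row buffer
        self.precharges = 0
        self.activates = 0

    def access(self, row: int) -> str:
        if self.open_row == row:
            return "hit"  # data is already in the row buffer
        # "miss": the bank was idle; "conflict": another row was open
        outcome = "miss" if self.open_row is None else "conflict"
        if outcome == "conflict":
            self.precharges += 1  # close the currently open row first
        self.activates += 1       # load the requested row into the row buffer
        self.open_row = row
        return outcome


bank = Bank()
trace = [bank.access(row) for row in (3, 3, 7, 3)]
print(trace, bank.precharges, bank.activates)
```

In this toy trace, alternating between rows 3 and 7 in the same bank forces a precharge and an activate on every switch, which is the extra delay and energy the slide refers to.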

Page 6: Embedded Lab. Park  Yeongseong

Background

< Conflict does not occur > < Conflict occurs >

Kernel-level memory allocator
◦ Mapping between virtual pages and physical page frames
Memory controller
◦ Banks

Page 7: Embedded Lab. Park  Yeongseong

Memory Organization Analysis

CPU cache mode
◦ Uncacheable
Access variables numerous times
◦ The two variables are mutually dependent
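A rough simulation of this kind of probing is sketched below: two addresses are accessed alternately many times, and offsets that keep colliding in one bank are flagged. The address mapping (`NUM_BANKS`, `ROW_BYTES`, `bank_and_row`) is a hypothetical example chosen for illustration, and conflicts are counted directly rather than timed; the real analysis runs on hardware with caching disabled.

```python
# Hypothetical address mapping, for illustration only; real controllers differ.
NUM_BANKS = 8
ROW_BYTES = 8192


def bank_and_row(addr):
    """Map a physical address to (bank, row) under the assumed layout."""
    chunk = addr // ROW_BYTES
    return chunk % NUM_BANKS, chunk // NUM_BANKS


def conflicts_when_alternating(addr_a, addr_b, rounds=100):
    """Count row-buffer conflicts when addr_a and addr_b are accessed in turn."""
    open_rows = {}  # bank -> row currently open in that bank
    conflicts = 0
    for i in range(2 * rounds):
        bank, row = bank_and_row(addr_a if i % 2 == 0 else addr_b)
        if bank in open_rows and open_rows[bank] != row:
            conflicts += 1
        open_rows[bank] = row
    return conflicts


base = 0
slow_offsets = [
    k * ROW_BYTES
    for k in range(1, 33)
    if conflicts_when_alternating(base, base + k * ROW_BYTES) > 0
]
print(slow_offsets)
```

Under this assumed mapping the conflicting offsets recur at a fixed stride, which is the sort of regular structure a zoomed-in plot of measured access times would expose.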

Page 8: Embedded Lab. Park  Yeongseong

Memory Organization Analysis

Figure (d) ranges from 0 to 2,000,000 (roughly 128 MB in size)
Figure (c) zooms in on the 590,000 ~ 640,000 portion of Figure (d)
Figure (b) zooms in on a portion of the iterations of Figure (c)
Figure (a) zooms in on a portion of the iterations of Figure (b)

< Analysis result >

Page 9: Embedded Lab. Park  Yeongseong

Regularity Considered Harmful

< Sequential access pattern >

Modified algorithm
◦ The two variables are located in the same cache line
◦ Different starting physical addresses

Average elapsed time
◦ 2052 μsec

Page 10: Embedded Lab. Park  Yeongseong

Regularity Considered Harmful

< Random access pattern >

Average elapsed time
◦ 1925 μsec

“1/total number of banks”.
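One reading of the quoted fraction is that, when two accesses land on banks chosen independently and uniformly at random, they share a bank with probability 1/(number of banks). The sketch below checks this by exact enumeration; the function name `same_bank_probability` and the uniform-independence assumption are illustrative, not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def same_bank_probability(num_banks):
    """Exact probability that two independent, uniform bank choices coincide."""
    pairs = list(product(range(num_banks), repeat=2))
    same = sum(1 for a, b in pairs if a == b)
    return Fraction(same, len(pairs))


probabilities = {b: same_bank_probability(b) for b in (2, 8, 16)}
for banks, p in probabilities.items():
    print(banks, p)
```

Sharing a bank is necessary but not sufficient for a row buffer conflict (the two accesses could also target the same row), so under these assumptions 1/(number of banks) is an upper bound on the conflict probability for a pair of accesses.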

Page 11: Embedded Lab. Park  Yeongseong

Design and Implementation

< Memory container design >

The minimum memory unit of page frame

Page 12: Embedded Lab. Park  Yeongseong

Design and Implementation

< Comparison between buddy and randomized algorithm>

Individual page frame management
Downward search
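The idea of managing page frames individually and handing them out in random order can be sketched as follows. This is a user-space toy under stated assumptions, not the kernel allocator from the paper: the class name `RandomizedFrameAllocator`, the flat free list, and the `seed` parameter are all hypothetical.

```python
import random


class RandomizedFrameAllocator:
    """Toy allocator: every free page frame is tracked individually (assumed)."""

    def __init__(self, num_frames, seed=None):
        self._free = list(range(num_frames))  # free page frame numbers
        self._rng = random.Random(seed)

    def alloc(self):
        if not self._free:
            raise MemoryError("no free page frames")
        # Pick a random free frame, swap it to the end, and pop it in O(1).
        i = self._rng.randrange(len(self._free))
        self._free[i], self._free[-1] = self._free[-1], self._free[i]
        return self._free.pop()

    def free(self, frame):
        # No double-free check; this sketch trusts the caller.
        self._free.append(frame)


allocator = RandomizedFrameAllocator(num_frames=64, seed=1)
frames = [allocator.alloc() for _ in range(16)]
print(frames)
```

A buddy-style allocator tends to return neighbouring frames for back-to-back requests; drawing frames at random removes that regularity, at the cost of giving up cheap contiguous multi-page blocks in this simplified form.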

Page 13: Embedded Lab. Park  Yeongseong

Performance Evaluation

Experiment environment
◦ IBM x3650 M2 Server
◦ Intel Xeon X5570 quad-core processors
◦ 32GB DDR3 Memory
◦ 450GB SAS Disk 8
◦ Linux kernel version 2.6.32

Page 14: Embedded Lab. Park  Yeongseong

Performance Evaluation

Benchmark category
◦ Group 1: Memory-intensive benchmarks (Stream, Sysbench-memory, Ramspeed)
◦ Group 2: CPU- or I/O-intensive benchmarks (Kernel Compile, Dbench, Unixbench)
◦ Group 3: To represent diverse application domains (PARSEC)

Page 15: Embedded Lab. Park  Yeongseong

Performance Evaluation

< Memory intensive benchmark results >

< CPU or I/O intensive benchmark results >

Page 16: Embedded Lab. Park  Yeongseong

Performance Evaluation

< PARSEC benchmark result >

Page 17: Embedded Lab. Park  Yeongseong

Conclusions

Memory container, randomizing memory allocation algorithm: kernel-level memory allocator
◦ Multi-core, multi-bank systems
Dedicate multiple banks to a core
◦ Maximize memory parallelism
Reduce accesses to the same bank

Page 18: Embedded Lab. Park  Yeongseong

References

http://people.cs.pitt.edu/~parkhk/publications.html
Lee Sang-yeop (이상엽), "Performance Analysis According to Memory Reference Patterns on Multi-Core, Multi-Bank Systems", Master's thesis

Page 19: Embedded Lab. Park  Yeongseong
Page 20: Embedded Lab. Park  Yeongseong

Q&A