locality of reference

Locality of Reference 2017.10.28 NAGOYA.BIN #1 KOUJI MATSUI (@KEKYO2)

Upload: kouji-matsui

Post on 21-Jan-2018


TRANSCRIPT

Page 1: Locality of Reference

Locality of Reference 2017.10.28 NAGOYA.BIN #1 KOUJI MATSUI (@KEKYO2)

Page 2: Locality of Reference

Kouji Matsui - kekyo

• Nagoya city, Aichi pref., JP

• Twitter – @kekyo2 / Facebook

• ux-spiral corporation

• Microsoft Most Valuable Professional VS and DevTech 2015-

• Certified Scrum master / Scrum product owner

• Center CLR organizer.

• .NET/C#/F#/IL/metaprogramming and the like…

• Bike rider

Page 3: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 4: Locality of Reference

Physical side scales

Page 5: Locality of Reference

Physical side scales

[Diagram: three processors (Processor #1 to #3), each holding four physical cores (Physical Core #1 to #12 in total). Each physical core holds a stack of logical cores; the visible labels start at Logical Core #1 on Processor #1, #17 on Processor #2, and #33 on Processor #3.]
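As a small illustration of the physical scale above, a program can ask the operating system how many logical cores it sees. This is a minimal sketch, not part of the talk: the function name `logical_core_count` is hypothetical, and it assumes a Unix-like system where `sysconf` accepts `_SC_NPROCESSORS_ONLN` (Linux, macOS, the BSDs).

```c
#include <unistd.h>

/* Returns the number of logical cores currently online.
   Assumption: sysconf(_SC_NPROCESSORS_ONLN) is available on this
   platform. Falls back to 1 if the value cannot be determined. */
long logical_core_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? n : 1;
}
```

The value counts logical cores, not physical cores or processors, so it corresponds to the innermost boxes of the diagram.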

Page 6: Locality of Reference

Physical side scales

Memory and I/O are bound to a fixed CPU/core (not configurable).

Page 7: Locality of Reference

Physical side scales

The “shared cache memory” is bound to a fixed CPU/core (not configurable).

Page 8: Locality of Reference

Physical side scales

The “cache memory” is bound to a fixed CPU/core (not configurable).

The “shared cache memory” is bound to a fixed CPU/core (not configurable).

Page 9: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 10: Locality of Reference

Logical side scales

[Diagram: five processes (Process #1 to #5), each with its own virtual memory space and a stack of threads; the visible thread labels are Thread #1, #11, #21, #31, and #41.]

Page 11: Locality of Reference

Logical side scales

[Diagram: five stacks of threads (visible labels Thread #1, #11, #21, #31, #41) above four logical cores (Logical Core #1 to #4). Captions: "This is a true story" and "Execution context".]

Page 12: Locality of Reference

Logical side scales

[Diagram: the same five stacks of threads above the four logical cores (Logical Core #1 to #4). Caption: "Switch execution context".]

Page 13: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 14: Locality of Reference

Data stream between physicals and logicals

[Diagram: the five stacks of threads and the four logical cores (Logical Core #1 to #4), now with one L1/L2 cache per logical core (L1/L2 cache #1 to #4).]

Page 15: Locality of Reference

[Diagram: the same threads, logical cores, and L1/L2 caches #1 to #4, with two L3 caches added (L3 cache #1 and #2).]

Page 16: Locality of Reference

[Diagram: the same threads, logical cores, L1/L2 caches #1 to #4, and L3 caches #1 and #2, with "NUMA node bound memory" added at the bottom of the hierarchy.]

Page 17: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 18: Locality of Reference

[Diagram: Logical Core #4 with L1/L2 cache #4, L3 cache #2, and NUMA node bound memory. The memory holds declaredType, currentType, stopType, field, and FieldInfo[]. Labels: "Thread #33 context" and "Load/Preload".]

Page 19: Locality of Reference

[Diagram: Logical Core #4 with L1/L2 cache #4, L3 cache #2, and NUMA node bound memory. Label "Thread #42 context" with the items __stack0_0, __stack0_1, __stack0_2, __stack1_0, declaredType, currentType, local0, local1, and field. Labels: "Load/Preload" and "Switch".]

Page 20: Locality of Reference

[Diagram: Logical Core #3 with L1/L2 cache #3 and Logical Core #4 with L1/L2 cache #4, both over L3 cache #2 and NUMA node bound memory. The memory holds declaredType, currentType, stopType, field, and FieldInfo[].]

Page 21: Locality of Reference

[Diagram: Logical Core #3 with L1/L2 cache #3 and Logical Core #4 with L1/L2 cache #4, both over L3 cache #2 and NUMA node bound memory. The memory holds declaredType, currentType, stopType, field, and FieldInfo[]; copies of stopType, field, and FieldInfo[] also appear higher in the hierarchy. Labels: "Load/Preload" and "Switch".]
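The diagrams above show data moving from memory through the L3 and L1/L2 caches before a core can use it. A minimal sketch of what locality of reference means for code, assuming a matrix stored row by row in one flat array (the function names and the layout are illustrative assumptions, not taken from the talk):

```c
#include <stddef.h>

/* Sums a rows x cols matrix stored row by row in one flat array.
   The inner loop moves to the adjacent element, so every cache line
   that gets loaded is fully used before the next one is needed. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Computes the same sum, but the inner loop jumps cols elements ahead
   on every step. For a large matrix each access tends to touch a
   different cache line, so far more lines are loaded per useful value. */
long sum_column_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

Both functions return the same value; only the order of memory accesses differs. How large the difference in run time is depends on the matrix size and the cache sizes of the machine, so it should be measured rather than assumed.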

Page 22: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 23: Locality of Reference

[Diagram: Logical Core #3 with L1/L2 cache #3 and Logical Core #4 with L1/L2 cache #4, both over L3 cache #2 and NUMA node bound memory. "Common value" appears three times in the hierarchy, with two "Load/Preload" arrows.]

These threads access a common value.

Page 24: Locality of Reference

[Diagram: Logical Core #3 with L1/L2 cache #3 and Logical Core #4 with L1/L2 cache #4, both over L3 cache #2 and NUMA node bound memory. "Common value" appears three times in the hierarchy, with two "Write back" arrows.]

Race condition (receives a coherence penalty)

STRATEGY:

• Turn to immutable

• Hashed indexer
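The slide does not spell out what "hashed indexer" means. One possible reading is that, instead of all threads writing to one common value, each writer is hashed to its own slot, and the slots are padded so that no two of them share a cache line. The sketch below follows that reading; the names `StripedCounter`, `striped_add`, and `striped_sum`, the 64-byte line size, and the stripe count are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE  64                  /* assumed cache line size in bytes */
#define STRIPE_BITS 3
#define STRIPES     (1u << STRIPE_BITS) /* 8 slots, hypothetical choice */

/* One slot per stripe. The alignment pads each slot to a full cache
   line, so a write to one slot does not invalidate its neighbours. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_long value;
} Slot;

typedef struct {
    Slot slots[STRIPES];
} StripedCounter;

void striped_init(StripedCounter *c)
{
    for (size_t i = 0; i < STRIPES; i++)
        atomic_init(&c->slots[i].value, 0);
}

/* Maps a key (for example a thread number) to a slot index using a
   multiplicative hash and keeping the top STRIPE_BITS bits. */
static size_t stripe_of(uint32_t key)
{
    uint32_t h = key * 2654435761u;
    return (size_t)(h >> (32 - STRIPE_BITS));
}

/* Adds delta to the slot selected by key. */
void striped_add(StripedCounter *c, uint32_t key, long delta)
{
    atomic_fetch_add_explicit(&c->slots[stripe_of(key)].value,
                              delta, memory_order_relaxed);
}

/* Reads every slot and returns the combined total. */
long striped_sum(StripedCounter *c)
{
    long total = 0;
    for (size_t i = 0; i < STRIPES; i++)
        total += atomic_load_explicit(&c->slots[i].value,
                                      memory_order_relaxed);
    return total;
}
```

Writers that hash to different slots touch different cache lines, so their writes do not force each other's caches to give up the line. The trade-off is that reading the total has to visit every slot, which suits values that are written often and read rarely.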

Page 25: Locality of Reference

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Page 26: Locality of Reference

Conclusion

The execution context is bound to the CPU core, not to the thread: the CPU core is what executes code.

CPU cores have a structured, nested cache hierarchy.

The cache miss penalty is large.

The cache coherency penalty is large.

The same is true of I/O systems.

Important cache-related design principles:

◦ Locality of reference

◦ Immutability
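A minimal sketch of the immutability principle, assuming a small settings record shared between threads (the `Settings` type and `settings_with_width` are hypothetical names used only for illustration): instead of writing into the shared value, an update produces a new value and leaves the old one untouched.

```c
/* A small value that several threads may read. */
typedef struct {
    int width;
    int height;
} Settings;

/* Returns a modified copy. The original is only read, never written,
   so cores that hold it in their caches are not forced to reload it. */
Settings settings_with_width(const Settings *old, int width)
{
    Settings next = *old;
    next.width = width;
    return next;
}
```

Because readers of the old value never see a write to it, its cache lines can stay valid in every core's cache at the same time. Publishing the new value to other threads still needs a synchronized hand-off, which this sketch leaves out.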

Page 27: Locality of Reference

Thanks for joining!

My blog

◦ http://www.kekyo.net/

Current active project:

◦ IL2C - A translator implementation of .NET intermediate language to C language.

◦ YouTube: http://bit.ly/2xtu4MH

◦ GitHub: https://github.com/kekyo/IL2C