Constructing Virtual Architectures
on Tiled Processors
David Wentzlaff
Anant Agarwal
MIT
Emulators and JITs
for Multi-Core
Why Multi-Core?
Future architectures will be on-chip parallel machines
Moore’s Law provides more parallel silicon resources
Diminishing sequential returns
Growth applications are parallel
Future architectures will be optimized for parallel applications
Hardware compatibility will be broken
Why Emulators and JITs on Multi-Core?
Future architectures will be on-chip parallel machines
Moore’s Law provides more parallel silicon resources
Diminishing sequential returns
Growth applications are parallel
Future architectures will be optimized for parallel applications
Hardware compatibility will be broken
Future architectures will need to run legacy applications
Market forces will require future chips to run 1983 “Frogger” for DOS
Software re-verification on new architectures too costly
Are Emulators and JITs for Multi-Core Different?
Yes
Bountiful parallel resources
Example: Code optimization cost is reduced
“hot spot” analysis may miss sequential performance
Parallelize client application
Parameters for Multi-Core are different
Core-to-Core latencies reduced
Road Map
Exploit on-chip parallel resources to accelerate emulation
Focused on performance
Acceleration Mechanisms
1. Pipelining Virtual Architectures
2. Speculative Parallel Translation
3. Static & Dynamic Architecture Reconfiguration
Proof of concept system
All-software parallel translator: x86 on Raw
Background: Translation
[Figure: translation flow diagram. Labeled blocks: Disk, Data RAM, x86 Binary, Runtime -- Execution, Code Cache, Code Cache Tags. Translator stages: x86 Parser & High Level Translator; High Level Optimization; Low Level Code Generation; Low Level Optimization and Scheduling.]
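The translation flow on this slide can be sketched as a dispatch loop: look up the guest program counter in the code cache tags, call the translator on a miss, then run the cached block. This is a minimal illustration and not the authors' implementation; the toy guest instruction set, `translate_block`, and `run` are hypothetical names invented for the example.

```python
# Toy dynamic binary translator with a code cache.
# The guest "ISA" below is hypothetical: address -> (opcode, operand).
GUEST = {
    0: ("add", 5),      # acc += 5
    1: ("add", 7),      # acc += 7
    2: ("jnz_dec", 0),  # counter -= 1; jump to 0 while counter != 0
    3: ("halt", None),
}

def translate_block(pc):
    """Translate one basic block starting at pc into a Python closure."""
    ops = []
    while True:
        op, arg = GUEST[pc]
        ops.append((op, arg))
        pc += 1
        if op in ("jnz_dec", "halt"):   # a branch or halt ends the block
            break
    fallthrough = pc

    def block(state):
        for op, arg in ops:
            if op == "add":
                state["acc"] += arg
            elif op == "jnz_dec":
                state["counter"] -= 1
                return arg if state["counter"] != 0 else fallthrough
            elif op == "halt":
                return None
        return fallthrough
    return block

def run(counter):
    code_cache = {}                      # tag (guest pc) -> translated block
    stats = {"hits": 0, "misses": 0}
    state = {"acc": 0, "counter": counter}
    pc = 0
    while pc is not None:
        block = code_cache.get(pc)
        if block is None:                # miss: invoke the translator
            stats["misses"] += 1
            block = code_cache[pc] = translate_block(pc)
        else:
            stats["hits"] += 1
        pc = block(state)                # execute the translated code
    return state["acc"], stats
```

Each block is translated once and reused on later visits, which is why translation cost matters most for code that misses in the code cache.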
Background: “Old” Parallel Translation
1. Pipelining Virtual Architectures
Utilize a Tiled Processor as fabric to construct virtual processor
Coarse grain pipelining to exploit parallelism
[Figure: pipeline of tiles: Translator Tile, Code Cache Tile, Execution Tile, MMU Tile, Data Cache Tile]
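Coarse-grain pipelining can be illustrated in software with one worker per stage connected by queues. In this sketch each thread stands in for a tile and each queue for an on-chip network link; the stage functions `translate`, `cache`, and `execute` are hypothetical placeholders, not the system's real components.

```python
import queue
import threading

def stage(work, inbox, outbox):
    """Run one pipeline stage: apply `work` to each item until a None sentinel."""
    while True:
        item = inbox.get()
        if item is None:          # forward the shutdown signal downstream
            outbox.put(None)
            return
        outbox.put(work(item))

def run_pipeline(items, stages):
    """Feed items through the stages; each stage runs on its own thread."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(None)           # end of input
    results = []
    while (out := queues[-1].get()) is not None:
        results.append(out)
    for t in threads:
        t.join()
    return results

# Hypothetical stand-ins for the work done on each tile.
def translate(block):
    return ("translated", block)

def cache(entry):
    return ("cached",) + entry

def execute(entry):
    return ("executed", entry[-1])
```

While one block is executing, the next can already be in the cache stage and a third in translation, which is the parallelism the pipeline exposes.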
Sequential Translation
[Figure: timeline (axis: Time) showing Execution Time, Translation Time, and Optimization Time]
2. Speculative Parallel Translation
[Figure: timeline (axis: Time) showing Execution Time, Translation Time, and Optimization Time]
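The slide gives only a timeline, so the following is one plausible reading of speculative parallel translation rather than the authors' algorithm: worker threads translate blocks reachable from the current one before execution asks for them. The control-flow graph `CFG`, the `translate` stub, and the `speculate` function are all assumptions made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical control-flow graph: block address -> possible successor addresses.
CFG = {0x100: [0x120, 0x140], 0x120: [0x160], 0x140: [0x160], 0x160: []}

def translate(addr):
    """Stand-in for the real translator: returns a label for the block."""
    return f"native@{addr:#x}"

def speculate(entry, cfg, workers=4, max_blocks=64):
    """Translate blocks reachable from `entry` in parallel, ahead of execution."""
    futures = {}                       # addr -> Future; doubles as the seen-set
    frontier = [entry]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(futures) < max_blocks:
            addr = frontier.pop(0)
            if addr in futures:
                continue               # already handed to a worker
            futures[addr] = pool.submit(translate, addr)
            frontier.extend(cfg.get(addr, []))   # follow every branch target
    # Leaving the `with` block waits for all submitted translations.
    return {addr: f.result() for addr, f in futures.items()}
```

The work is speculative because some translated blocks may never run; the bet is that spare parallel resources make that waste cheaper than stalling execution on a code cache miss.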
3. Reconfiguration
Different programs have different characteristics
Processor architect uses benchmarks to choose a “compromise” processor
Static Reconfiguration
Choose a different virtual machine configuration based on the application
Dynamic Reconfiguration
Detect the dynamic needs of phases and programs, and reconfigure at runtime
Reconfiguration has a cost
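A small sketch can make the static/dynamic distinction concrete. The two tile mixes (six translation slaves with four L2 data banks, or nine with one) are read off the reconfiguration diagrams in this deck; the selection rule, the miss-rate inputs, and the `patience` hysteresis are assumptions for illustration only, not the policy the system uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    translation_slaves: int
    l2_data_banks: int

DATA_HEAVY = Config(translation_slaves=6, l2_data_banks=4)
CODE_HEAVY = Config(translation_slaves=9, l2_data_banks=1)

def choose_static(code_miss_rate, data_miss_rate):
    """Static: pick one configuration per application from an offline profile.
    Comparing the two rates directly is a deliberately crude, assumed rule."""
    return CODE_HEAVY if code_miss_rate > data_miss_rate else DATA_HEAVY

def run_dynamic(samples, start=DATA_HEAVY, patience=2):
    """Dynamic: switch at runtime, but only after `patience` consecutive
    intervals favor the other configuration, because each switch has a cost."""
    current, streak, history = start, 0, []
    for code_miss, data_miss in samples:
        wanted = choose_static(code_miss, data_miss)
        streak = streak + 1 if wanted != current else 0
        if streak >= patience:
            current, streak = wanted, 0
        history.append(current)
    return history
```

The hysteresis keeps a single noisy interval from triggering a switch whose cost would outweigh its benefit.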
Reconfiguration
[Figure: photo courtesy Intel Corp.]
Prototype System and Evaluation
Background: Architectures

| | x86 (Pentium III) | Raw |
|---|---|---|
| ISA | CISC instruction set | RISC instruction set |
| ISA | Hardware Virtual Memory (VM) | No VM |
| ISA | Hardware Memory Protection | No Memory Protection |
| ISA | Condition Codes used for branching | No Condition Codes |
| Implementation | Hardware instruction cache | Software managed instruction memory |
| Implementation | 1 superscalar processor core | 16 processors arranged in 4x4 mesh |
| Implementation | 3-way parallelism | 4 low latency networks |
| Implementation | Out-of-order processor | In-order processors |

* Photo courtesy Intel Corp.
System Design
[Figure: system block diagram. Blocks: Manager; L2 Code Cache; six Translation Slaves; Banked L1.5 Code Cache; Runtime -- Execution (L1 I-$, L1 D-$); MMU; TLB; System Functionality & Loader; L2 Data $ Banks 0-3]
Methodology
Cycle comparison of Raw vs. Pentium III
All results collected on real hardware
No hardware added
Same binaries (unmodified)
Metric: Slowdown
Raw executing x86 code compared by cycle against Pentium III
Baseline Performance
2. Speculative Parallel Translation
[Figure: L2 Code Cache Miss Rate (axis label: Code Cache)]
3. Static Reconfiguration
[Figure: two system configurations. The first matches the baseline system design: six Translation Slaves and L2 Data $ Banks 0-3. The second has nine Translation Slaves and a single L2 Data $ bank (Bank 0). Both keep the Manager, L2 Code Cache, Banked L1.5 Code Cache, Runtime-Exec (L1 I-$, L1 D-$), MMU, TLB, and System Functionality & Loader.]
3. Dynamic Reconfiguration
[Figure: system design diagram showing L2 Data $ Banks 0-3 together with three additional Translation Slaves (nine in total), alongside the Manager, L2 Code Cache, Banked L1.5 Code Cache, Runtime-Exec (L1 I-$, L1 D-$), MMU, TLB, and System Functionality & Loader]
3. Reconfiguration Zoom
Baseline Performance Analysis
| Intrinsic | Raw Emulator latency | Raw Emulator occupancy | Pentium III latency | Pentium III occupancy |
|---|---|---|---|---|
| L1 Cache Hit | 6 | 4 | 3 | 1 |
| L2 Cache Hit | 87 | 87 | 7 | 1 |
| L2 Cache Miss | 151 | 87 | 79 | 1 |
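$$
\begin{aligned}
\text{CPI} ={}& \text{memory\_access\_rate} \times \Big[ (1 - \text{L1\_miss\_rate}) \times \text{L1\_hit\_occupancy} \\
&\quad + \text{L1\_miss\_rate} \times \big( (1 - \text{L2\_miss\_rate}) \times \text{L2\_hit\_occupancy} + \text{L2\_miss\_rate} \times \text{L2\_miss\_occupancy} \big) \Big] \\
&+ (1 - \text{memory\_access\_rate}) \times \text{non\_memory\_CPI}
\end{aligned}
$$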
Memory CPI: 3.9 (Raw Emulator), 1 (Pentium III)
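The CPI model can be written as a short function. This sketch assumes the L2 term expands into a hit part and a miss part, mirroring the L1 term, and it uses the occupancy values from the table on this slide. The slides do not give miss rates or the memory access rate, so the rates in the usage note are made-up inputs and the result is not meant to reproduce the reported 3.9.

```python
def memory_cpi(l1_miss_rate, l2_miss_rate, l1_hit_occ, l2_hit_occ, l2_miss_occ):
    """Average occupancy, in cycles, of one memory access."""
    l2_term = (1 - l2_miss_rate) * l2_hit_occ + l2_miss_rate * l2_miss_occ
    return (1 - l1_miss_rate) * l1_hit_occ + l1_miss_rate * l2_term

def cpi(memory_access_rate, non_memory_cpi, **mem):
    """Blend memory and non-memory instructions by their share of the mix."""
    return (memory_access_rate * memory_cpi(**mem)
            + (1 - memory_access_rate) * non_memory_cpi)

# Occupancies from the slide's table.
RAW = dict(l1_hit_occ=4, l2_hit_occ=87, l2_miss_occ=87)
P3 = dict(l1_hit_occ=1, l2_hit_occ=1, l2_miss_occ=1)

# Hypothetical workload parameters (not from the slides).
RATES = dict(l1_miss_rate=0.02, l2_miss_rate=0.1)
raw_cpi = cpi(0.3, 1.0, **RATES, **RAW)
p3_cpi = cpi(0.3, 1.0, **RATES, **P3)
```

Because every Pentium III occupancy in the table is 1, the model yields a CPI of 1 for it whatever the miss rates are, whenever `non_memory_cpi` is 1. The Raw figure depends strongly on the L1 miss rate, since each L1 miss occupies the emulator for 87 cycles.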
Baseline Performance Analysis
Memory System: 3.9x slowdown
Lack of ILP: 1.3x slowdown
Condition Codes (Flags): 1.1x slowdown
Combined (factors multiplied): 5.5x slowdown
Code Cache Misses: 1 – 20x slowdown
Future Work
Hardware additions to facilitate parallel emulation
x86 Server farm on a chip
Inter-Virtual Processor dynamic load balancing
Differing forms of Dynamic Reconfiguration
Vary number of functional units
Questions?
Extras