
Page 1: HLL VM Implementation

HLL VM Implementation

Page 2: HLL VM Implementation

Contents

- Typical JVM implementation
- Dynamic class loading
- Basic Emulation
- High-performance Emulation
- Optimization Framework
- Optimizations

Page 3: HLL VM Implementation

Typical JVM implementation

[Diagram: components of a typical JVM implementation. The class loader subsystem reads binary classes into the memory system, which comprises the method area, heap, Java stacks, and native method stacks; a garbage collector manages the heap. The emulation engine, with its PCs and implied registers, exchanges addresses, data, and instructions with memory and reaches the native method libraries through the native method interface.]

Page 4: HLL VM Implementation

Typical JVM Major Components

- Class loader subsystem
- Memory system, including the garbage-collected heap
- Emulation engine

Page 5: HLL VM Implementation

Class loader subsystem

- Convert the class file into an implementation-dependent memory image
- Find binary classes
- Verify correctness and consistency of binary classes (part of the security system)

Page 6: HLL VM Implementation

Dynamic class loading

- Locate the requested binary class
- Check its integrity
  - Check the class file format
  - Make sure the stack values can be tracked statically: static type checking, static branch checking
  - Check arguments passed between caller and callee
- Resolve fully qualified references
- Perform any translation of code and metadata

(A minimal class-loader sketch follows.)
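As a rough illustration of the "locate and load" step, here is a minimal sketch of a user-defined Java class loader; the class name, directory layout, and error handling are assumptions for the example, and the VM performs the format check and verification when defineClass is called:

    // Illustrative sketch only: a minimal user-defined class loader.
    // The class name and directory layout are assumptions, not from the slides.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class DiskClassLoader extends ClassLoader {
        private final Path classDir;   // hypothetical directory holding .class files

        public DiskClassLoader(Path classDir, ClassLoader parent) {
            super(parent);
            this.classDir = classDir;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Locate the requested binary class on disk.
            Path file = classDir.resolve(name.replace('.', '/') + ".class");
            try {
                byte[] bytes = Files.readAllBytes(file);
                // defineClass hands the bytes to the VM, which checks the class
                // file format and verifies the class before it becomes usable.
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }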

Page 7: HLL VM Implementation

Garbage Collection

- Garbage: objects that are no longer accessible
- Collection: reclaiming memory for reuse by new objects
- Root set: the set of references that point to objects held in the heap
- Garbage cannot be reached through any sequence of references beginning with the root set
- When GC occurs, the collector must trace all objects accessible from the root set and reclaim the rest as garbage

Page 8: HLL VM Implementation

Root Set and the Heap

[Diagram: a root set of references pointing into the global heap, which holds objects A, B, C, D, and E linked by references; any object not reachable from the root set through those links is garbage.]

Page 9: HLL VM Implementation

Garbage Collection Algorithms

- Mark-and-Sweep (sketched below)
- Compacting
- Copying
- Generational
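As a rough illustration of the mark-and-sweep idea, the sketch below runs over a toy object graph rather than a real heap; all names (ToyObject, ToyHeap) are assumptions for the example:

    // Illustrative sketch only: mark-and-sweep over a toy object graph,
    // not the JVM's actual collector.
    import java.util.ArrayList;
    import java.util.List;

    class ToyObject {
        final List<ToyObject> refs = new ArrayList<>(); // outgoing references
        boolean marked;                                  // mark bit used during GC
    }

    class ToyHeap {
        final List<ToyObject> allObjects = new ArrayList<>(); // every allocated object
        final List<ToyObject> rootSet = new ArrayList<>();    // references held outside the heap

        ToyObject allocate() {
            ToyObject obj = new ToyObject();
            allObjects.add(obj);
            return obj;
        }

        void collect() {
            // Mark phase: trace everything reachable from the root set.
            for (ToyObject root : rootSet) {
                mark(root);
            }
            // Sweep phase: anything left unmarked is garbage and is reclaimed.
            allObjects.removeIf(obj -> !obj.marked);
            // Clear mark bits for the next collection.
            allObjects.forEach(obj -> obj.marked = false);
        }

        private void mark(ToyObject obj) {
            if (obj == null || obj.marked) return;
            obj.marked = true;
            for (ToyObject child : obj.refs) {
                mark(child);
            }
        }
    }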

Page 10: HLL VM Implementation

Basic Emulation

The emulation engine in a JVM can be implemented in a number of ways:
- Interpretation
- Just-in-time (JIT) compilation

A more efficient strategy applies optimizations selectively to hot spots.

Examples:
- Starting from interpretation: Sun HotSpot, IBM DK
- Starting from compilation: Jikes RVM

(A minimal interpreter loop is sketched below.)
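For intuition, here is a minimal sketch of an interpreter dispatch loop for a made-up stack-machine instruction set (not real JVM bytecodes):

    // Illustrative sketch only: a tiny stack-machine interpreter loop,
    // using a made-up instruction set rather than real JVM bytecodes.
    public class ToyInterpreter {
        // Hypothetical opcodes.
        static final int PUSH = 0, ADD = 1, PRINT = 2, HALT = 3;

        public static void run(int[] code) {
            int[] stack = new int[64];
            int sp = 0;          // operand stack pointer
            int pc = 0;          // program counter (an "implied register")
            while (true) {
                int opcode = code[pc++];
                switch (opcode) {
                    case PUSH  -> stack[sp++] = code[pc++];   // push immediate operand
                    case ADD   -> { int b = stack[--sp]; int a = stack[--sp]; stack[sp++] = a + b; }
                    case PRINT -> System.out.println(stack[--sp]);
                    case HALT  -> { return; }
                    default    -> throw new IllegalStateException("bad opcode " + opcode);
                }
            }
        }

        public static void main(String[] args) {
            // Computes and prints 2 + 3.
            run(new int[] { PUSH, 2, PUSH, 3, ADD, PRINT, HALT });
        }
    }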

Page 11: HLL VM Implementation

Optimization Framework

[Diagram: the optimization framework on the host platform. Bytecodes enter the interpreter, which gathers profile data. A simple compiler translates frequently executed methods into compiled code, and an optimizing compiler later regenerates them as optimized code, guided by the accumulated translated code and profile data.]
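As a rough sketch of the counter-driven tier-up decision implied by this framework, the code below uses a per-method invocation count and hypothetical thresholds; it is an assumption for illustration, not the slides' design:

    // Illustrative sketch only: an invocation-counter-driven tier-up policy.
    // Thresholds and class names are assumptions, not values from the slides.
    import java.util.HashMap;
    import java.util.Map;

    public class TieredDispatcher {
        static final int COMPILE_THRESHOLD  = 1_000;   // hypothetical: interpret -> simple compile
        static final int OPTIMIZE_THRESHOLD = 10_000;  // hypothetical: simple -> optimizing compile

        private final Map<String, Integer> invocationCounts = new HashMap<>();

        /** Decide how to execute a method, bumping its profile counter. */
        public String dispatch(String methodName) {
            int count = invocationCounts.merge(methodName, 1, Integer::sum);
            if (count >= OPTIMIZE_THRESHOLD) {
                return "run optimized code for " + methodName;
            } else if (count >= COMPILE_THRESHOLD) {
                return "run simply compiled code for " + methodName;
            } else {
                return "interpret " + methodName;
            }
        }
    }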

Page 12: HLL VM Implementation

High-performance Emulation

- Code Relayout
- Method Inlining
- Optimizing Virtual Method Calls
- Multiversioning and Specialization
- On-Stack Replacement
- Optimization of Heap-Allocated Objects
- Low-Level Optimizations
- Optimizing Garbage Collection

Page 13: HLL VM Implementation

Optimization: Code Relayout

Place the most commonly followed control-flow paths in contiguous locations in memory. This improves instruction locality and conditional branch predictability. (A source-level analogy is sketched below.)
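As a source-level analogy only (the real transformation reorders compiled basic blocks, not Java source), the sketch below moves the hot path onto the fall-through side of a branch; the profile percentages are assumptions:

    // Illustrative analogy only: putting the hot path on the straight-line
    // (fall-through) side of a branch. Profile numbers are assumptions.
    public class RelayoutAnalogy {
        // Before: the rare case sits on the fall-through path, so the common
        // case always branches around it.
        static int beforeRelayout(int value) {
            if (value < 0) {            // taken ~1% of the time (assumed profile)
                return handleError(value);
            }
            return value * 2;           // hot path, reached only after a branch
        }

        // After: the hot path is the straight-line code; the rare case is moved
        // out of the way, improving locality and branch predictability.
        static int afterRelayout(int value) {
            if (value >= 0) {           // common case falls through
                return value * 2;       // hot path, contiguous with the test
            }
            return handleError(value);  // cold path placed after the hot code
        }

        static int handleError(int value) {
            return 0;
        }
    }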

Page 14: HLL VM Implementation

Flashback: Code Relayout

[Diagram: a control-flow graph of basic blocks A through G with edge profile counts (97, 70, 68, 30, 29, and a few edges taken only once or twice). After relayout, the hot path is placed contiguously: A (br cond1 == false), D (br cond3 == true), F (br uncond), G (br cond2 == false), E (br uncond), followed by the cold blocks B and C (br cond4 == true, br uncond).]

Page 15: HLL VM Implementation

Optimization: Method Inlining

Two main effects:
- Calling overhead decreases: passing parameters, managing the stack frame, and transferring control are eliminated.
- The code analysis scope expands, so more optimizations become applicable.

The effect differs with the method's size:
- Small methods: beneficial in most cases.
- Large methods: a sophisticated cost-benefit analysis is needed; code explosion can occur, causing poor cache behavior and performance losses.

(A small before/after sketch follows.)
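The sketch below shows the before/after effect of inlining a small method at source level; the JIT performs this on compiled code, so the example is only for intuition:

    // Illustrative sketch only: the effect of inlining a small callee.
    public class InliningExample {
        static int square(int x) {
            return x * x;
        }

        // Before inlining: each call pays for parameter passing, a new stack
        // frame, and two control transfers.
        static int sumOfSquaresCall(int[] values) {
            int sum = 0;
            for (int v : values) {
                sum += square(v);
            }
            return sum;
        }

        // After inlining: the callee's body is substituted at the call site,
        // the call overhead disappears, and the whole loop body is visible to
        // further optimization as one region.
        static int sumOfSquaresInlined(int[] values) {
            int sum = 0;
            for (int v : values) {
                sum += v * v;
            }
            return sum;
        }
    }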

Page 16: HLL VM Implementation

Optimization: Method Inlining (cont’d)

General processing sequence:
1. Profile by instrumenting the code.
2. Construct a call graph at certain intervals.
3. If a call count exceeds the threshold, invoke the dynamic optimization system.

To reduce analysis overhead, profile counters are kept in each method's stack frame; once a counter meets the threshold, the system "walks" backward through the stack to reconstruct the relevant part of the call graph (a stack-walking sketch follows).
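As an illustration of the stack-walking step, the sketch below uses java.lang.StackWalker from the standard library to recover the caller chain once a hypothetical counter reaches its threshold; a real VM walks its own internal frames instead:

    // Illustrative sketch only: approximating "walk backward through the stack"
    // with java.lang.StackWalker. Counter and threshold are assumptions.
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.stream.Collectors;

    public class HotCallSampler {
        static final int THRESHOLD = 1_000;                 // hypothetical hotness threshold
        static final AtomicInteger counter = new AtomicInteger();

        /** Called from an instrumented method on every invocation. */
        static void profileHit() {
            if (counter.incrementAndGet() == THRESHOLD) {
                // Threshold met: walk the caller chain to recover the call path
                // that made this method hot, then hand it to the optimizer.
                List<String> callPath = StackWalker.getInstance().walk(frames ->
                        frames.map(f -> f.getClassName() + "." + f.getMethodName())
                              .collect(Collectors.toList()));
                System.out.println("hot call path: " + callPath);
            }
        }
    }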

Page 17: HLL VM Implementation

Optimization: Method Inlining (cont’d)

[Diagram: a full call graph rooted at MAIN, with callees A and X and further callees B, C, and Y, annotated with call counts (900, 100, 1500, 100, 1000, 25). Constructing the call graph by walking stack frames recovers only the hot chain MAIN -> A -> C, whose edge counts (900 and 1500) have exceeded the threshold.]

Page 18: HLL VM Implementation

Optimization: Optimizing Virtual Method Calls

What if the method code actually called can change? Which code should be inlined?
- Inline the most common case, guarded by a run-time type test.
- Normally, determining which code to use is done at run time via a dynamic method table lookup.

The virtual call
    invokevirtual <perimeter>
becomes a guarded, inlined version (a Java-level sketch follows):
    if (a.isInstanceof(Square)) {
        ... inlined Square perimeter code ...
    } else {
        invokevirtual <perimeter>
    }
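A Java-level sketch of the guarded inlining above, with assumed Shape/Square/Circle classes standing in for the slide's example:

    // Illustrative sketch only: guarded devirtualization/inlining at source level.
    interface Shape {
        int perimeter();
    }

    final class Square implements Shape {
        final int side;
        Square(int side) { this.side = side; }
        public int perimeter() { return 4 * side; }
    }

    final class Circle implements Shape {
        final int radius;
        Circle(int radius) { this.radius = radius; }
        public int perimeter() { return (int) Math.round(2 * Math.PI * radius); }
    }

    public class GuardedInlining {
        // What the JIT conceptually produces when Square is the common receiver:
        // a cheap type test guards the inlined body; anything else falls back to
        // the normal virtual dispatch (method table lookup).
        static int perimeterOf(Shape a) {
            if (a instanceof Square sq) {
                return 4 * sq.side;      // inlined Square.perimeter()
            }
            return a.perimeter();        // fallback: virtual call
        }
    }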

Page 19: HLL VM Implementation

Optimization: Optimizing Virtual Method Calls (cont’d)

If inlining is not useful, just removing the method table lookup is also helpful: Polymorphic Inline Caching.

The call site
    ... invokevirtual <perimeter> ...
is rewritten to
    ... call PIC stub ...
where the polymorphic inline cache stub dispatches directly to the cached cases:
    if type == circle:      jump to circle perimeter code
    else if type == square: jump to square perimeter code
    else:                   call lookup   (method table lookup code, which also updates the PIC stub)

Page 20: HLL VM Implementation

Optimization: Multiversioning and Specialization

Multiversioning by specialization: if some variables or references are always assigned data values or types known to be constant (or drawn from a limited range), then simplified, specialized code can sometimes be used in place of the more complex, general code.

General code:
    for (int i = 0; i < 1000; i++) {
        if (A[i] < 0)
            B[i] = -A[i] * C[i];
        else
            B[i] = A[i] * C[i];
    }

Specialized code (the common case A[i] == 0 is handled with a simple assignment; other values fall through to the general code):
    for (int i = 0; i < 1000; i++) {
        if (A[i] == 0) {
            B[i] = 0;
        } else {
            if (A[i] < 0)
                B[i] = -A[i] * C[i];
            else
                B[i] = A[i] * C[i];
        }
    }

Page 21: HLL VM Implementation

Optimization: Multiversioning and Specialization (cont’d)

An alternative is to compile only the single specialized code version (the general code is unchanged from the previous slide) and defer compilation of the general case until it is actually needed:

    for (int i = 0; i < 1000; i++) {
        if (A[i] == 0) {
            B[i] = 0;
        } else {
            // jump to the dynamic compiler for deferred compilation of the general case
        }
    }

Page 22: HLL VM Implementation

Optimization: On-Stack Replacement

When is on-stack replacement needed?
- After inlining, when we want to start executing the inlined version right away
- When the currently executing method is detected as a hot spot
- When deferred compilation occurs
- When debugging needs the un-optimized version of the code

Implementation: the stack must be modified on the fly to match dynamically changing optimizations.
- E.g., inlining: merge the caller's and callee's stacks into a single stack frame
- E.g., JIT compilation: change the stack layout and register map

A complicated, but sometimes useful, optimization.

Page 23: HLL VM Implementation

Optimization: On-Stack Replacement (cont’d)

[Diagram: implementation stack frame A for method code at optimization level x is converted, through the architected frame, into implementation stack frame B for the same method's code at optimization level y, while the method code itself is optimized or de-optimized.]

1. Extract the architected state from the current implementation frame.
2. Generate a new implementation frame for the new version of the code.
3. Replace the current implementation stack frame with the new one.

Page 24: HLL VM Implementation

Optimization: Optimization of Heap-Allocated Objects

- The code for heap allocation and object initialization can be inlined for frequently allocated objects.
- Scalar replacement: replace an object's fields with ordinary scalar variables. This relies on escape analysis, that is, an analysis to make sure all references to the object stay within the region of code containing the optimization.

Before scalar replacement:
    class square {
        int side;
        int area;
    }
    void calculate() {
        a = new square();
        a.side = 3;
        a.area = a.side * a.side;
        System.out.println(a.area);
    }

After scalar replacement:
    void calculate() {
        int t1 = 3;
        int t2 = t1 * t1;
        System.out.println(t2);
    }

Page 25: HLL VM Implementation

Optimization: Optimization of Heap-Allocated Objects (cont’d)

- Field ordering for data usage patterns, to improve D-cache performance
- Removing redundant object accesses

Redundant getfield (load) removal: since c aliases a, the load of c.side can reuse the value just stored through a.side.

Before:
    a = new square;
    b = new square;
    c = a;
    ...
    a.side = 5;
    b.side = 10;
    z = c.side;

After:
    a = new square;
    b = new square;
    c = a;
    ...
    t1 = 5;
    a.side = t1;
    b.side = 10;
    z = t1;

Page 26: HLL VM Implementation

Optimization: Low-Level Optimizations

Array range checking and null reference checking incur two drawbacks:
- The checking overhead itself
- Some optimizations must be disabled because of the potential exception throw

Removing Redundant Null Checks: after the first check of p, later accesses through p (and through r, which aliases p) need no further checks.

Before:
    p = new Z
    q = new Z
    r = p
    ...
    p.x = ...    <null check p>
    ... = p.x    <null check p>
    ...
    q.x = ...    <null check q>
    ...
    r.x = ...    <null check r (= p)>

After:
    p = new Z
    q = new Z
    r = p
    ...
    p.x = ...    <null check p>
    ... = p.x
    r.x = ...
    q.x = ...    <null check q>

Page 27: HLL VM Implementation

Optimization: Low-Level Optimizations (cont’d)

Hoisting an Invariant Check: checking can be hoisted outside the loop.

Before:
    for (int i = 0; i < j; i++) {
        sum += A[i];    <range check A>
    }

After:
    if (j < A.length) then
        for (int i = 0; i < j; i++) {
            sum += A[i];
        }
    else
        for (int i = 0; i < j; i++) {
            sum += A[i];    <range check A>
        }

Page 28: HLL VM Implementation

Optimization: Low-Level Optimizations (cont’d)

Loop Peeling: after the first iteration is peeled off, the null check is not needed for the remaining loop iterations.

Before:
    for (int i = 0; i < 100; i++) {
        r = A[i];
        B[i] = r*2;
        p.x += A[i];    <null check p>
    }

After peeling the first iteration:
    r = A[0];
    B[0] = r*2;
    p.x += A[0];    <null check p>
    for (int i = 1; i < 100; i++) {
        r = A[i];
        p.x += A[i];
        B[i] = r*2;
    }

Page 29: HLL VM Implementation

Optimization: Optimizing Garbage Collection

Compiler support: the compiler provides the garbage collector with "yield points" at regular intervals in the code. At these points a thread can guarantee a consistent heap state, so control can be yielded to the garbage collector. These are called GC-points in Sun's CDC VM. (A minimal yield-point sketch follows.)
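A minimal sketch of what a hand-written yield point might look like; the flag, method names, and per-iteration placement are assumptions, and real VMs typically poll a flag or page in generated machine code:

    // Illustrative sketch only: a compiler-inserted yield point written by hand.
    public class YieldPointExample {
        // Set by the runtime when a garbage collection has been requested.
        static volatile boolean gcRequested;

        static long sumArray(long[] data) {
            long sum = 0;
            for (int i = 0; i < data.length; i++) {
                sum += data[i];
                yieldPoint();   // inserted at a regular interval (here: every iteration)
            }
            return sum;
        }

        // At a yield point the thread's heap references are in a known, consistent
        // state, so it is safe to pause here and let the collector run.
        static void yieldPoint() {
            if (gcRequested) {
                parkUntilCollectionFinishes();   // hypothetical runtime call
            }
        }

        static void parkUntilCollectionFinishes() {
            // Placeholder for the runtime's thread-suspension mechanism.
        }
    }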