g22.3033-002 scripting languages: performance of dynamic … · 2012-08-03 · enterprise...

G22.3033-002 Scripting Languages: Performance of Dynamic Scripting Languages (main focus: Python)

Priya Nagpurkar, IBM Research
August 2, 2012




Popularity of Scripting Languages

• High productivity, quick and simple prototyping

– Language features like dynamic typing, high level data structures

– Frameworks that simplify development and deployment: Rails (Ruby), Django and Zope (Python)

• Popular even in emerging server application domains

– Cloud: Google AppEngine

– Web 2.0: Facebook (PHP/Python), YouTube (Python), Twitter (Ruby)

• Python application domains:

– Web applications, frameworks, servers

– Systems management

– Scientific computing

– Enterprise applications


The Performance Gap

• Focus primarily on ease of programming, flexibility, and features, not on performance

• The original, and often the most commonly used, implementations are interpreters

– Easy to add new features, portable, lightweight

– Also inefficient and very slow!

[Figure: execution time normalized to Java (log scale, 0.01 to 10000.00) for Python, Ruby 1.8, JavaScript, and Lua. Data labels shown in the chart: 26, 79, 110, 24. Other labels on the slide: Performance, Productivity.]


Dynamic Scripting Languages: Implementation Highlights

• Popular and widely used implementations are interpreters written in C

– CPython 2.x, Ruby 1.9

• Interpretation and dynamic typing introduce lots of:

– Indirect branches

– Indirect memory accesses

– Indirect function calls

• Fat bytecodes make interpreter dispatch (a big switch statement) less significant than in interpreted Java; threaded dispatch improves it

– 36x on Java vs. 2x on Python, for n-body benchmark

• Frequent use of shared libraries

[Diagram labels: bytecode compiler; Python bytecode; fetch, dispatch, execute; runtime libraries]

Python source:

    def foo(x):
        a = x + 1
        ...

Python bytecode:

    0: LOAD_FAST 1(x)
    3: LOAD_CONST 1
    6: BINARY_ADD
    7: STORE_FAST 2(a)
    ...
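The fetch, dispatch, execute cycle named on this slide can be sketched as a toy stack machine. This is only an illustration under simplifying assumptions: it is not CPython's actual loop, the `run` function and the tuple encoding of instructions are hypothetical, and only the four opcodes listed on the slide are handled.

```python
# Toy stack interpreter (hypothetical, not CPython's real loop).
# Each instruction is an (opname, argument) pair.

def run(code, consts, local_vars):
    stack = []
    pc = 0
    while pc < len(code):
        opname, arg = code[pc]          # fetch
        pc += 1
        # dispatch: the chain of tests below plays the role of the big switch
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            # execute: "+" is resolved from the runtime types of the operands
            stack.append(left + right)
        elif opname == "STORE_FAST":
            local_vars[arg] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opname}")
    return local_vars


# a = x + 1, mirroring the bytecode listed on the slide
program = [
    ("LOAD_FAST", "x"),
    ("LOAD_CONST", 0),
    ("BINARY_ADD", None),
    ("STORE_FAST", "a"),
]
result = run(program, consts=[1], local_vars={"x": 41})
print(result["a"])
```

Even in this sketch, one source-level addition costs four trips through the loop, and each trip pays for a fetch and a dispatch before any useful work happens.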


Where is the time spent? (Unladen-Swallow benchmarks)

[Figure: stacked bar chart, scale 0 to 100. Legend: Others, Builtin operators, Control Flow, Stack Manipulation, Generic Numeric, CMP and JMP, Method Call and Return, Locals Load/Store, Type Resolution and Access.]


Where is the time spent? (Python Web Application)

[Figure: stacked bar of % cycles spent (0 to 100). Legend: OS, other, _json.so, libz.so, ld-2.11.1.so, datetime.so, itertools.so, _mysql.so, libpthread.so, libmysqlclient_r.so, libc.so, libpython.so.]

[Diagram labels: client; httpd; mod_wsgi; Python (three instances, codespeed/django); DB (MySQL).]

• Codespeed: Python web application used to track performance over software revisions for multiple projects (used by the PyPy and Twisted projects)

• Three-tier application that uses the Python Django web-application framework and a backend database, deployed using Apache httpd

• Request-level parallelism is achieved using multiple httpd processes/threads, each with an embedded Python interpreter instance


At which level should we optimize?

• Improve interpreter performance through runtime techniques

• Add JIT compilation

– Method based vs. trace based compilation

– Speculative optimization

• Perform dynamic binary optimization of interpreted code

• Evaluate and optimize microarchitecture for interpreters

– How to run poorly optimized code well

• Provide architectural support for:

– Speculative optimization

– Frequent and expensive operations, e.g. type checking (Checked Load: Architectural support for JavaScript type-checking on mobile processors, O. Anderson, E. Fortuna, L. Ceze, S. Eggers, HPCA 2011)


Some existing approaches

• Improving existing interpreters

– Zend: PHP

• New interpreter and JIT

– LuaJIT

– PyPy

• Mapping to existing managed runtime

– On top of CLR (.NET)

– Jython/JRuby on Java Virtual Machine (JVM)

– Oracle (BEA): PHP on JVM

• Extending existing managed runtime for Dynamic Scripting Languages

– Microsoft DLR: IronPython, IronRuby, IronJscript

– Mono: open source CLR implementation

– Sun DaVinci (JVM): multi-language

• Add JIT to existing interpreters

– Unladen-swallow (Google): CPython + open source LLVM JIT

– Rubinius: Ruby + open source LLVM JIT

– Psyco: CPython + mini-JIT

• Convert to static language (C/C++)

– Cython/Pyrex: Python

– HipHop (Facebook): PHP


Improving interpreter performance: example

• Global variable and Attribute Caching

– Addresses the overhead associated with loading global variables and object attributes (fields and methods)

– Global variables and object attributes are stored in dictionaries and looked up by name

    • A single read can involve complex control flow, indirect function calls, and multiple dictionary lookups

– Global variable caching caches the address of the dictionary entry for a global variable along with a version number

    • Change in value does not invalidate cached address

    • Change in shape of dictionary is tracked using version numbers

– Global variable caching and attribute caching together lead to a 7% improvement in performance (on average across the unladen-swallow benchmark suite)

More details in UCSB tech report: Understanding the Potential of Interpreter-based Optimizations for Python
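The global variable caching scheme described on this slide can be modeled in a few lines of Python. This is a minimal sketch, not the implementation from the tech report (which works inside the interpreter): the names `Cell`, `VersionedGlobals`, and `CachedGlobalLoad` are hypothetical, and a cell object stands in for "the address of the dictionary entry".

```python
class Cell:
    """Stand-in for a dictionary entry whose address can be cached."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class VersionedGlobals:
    """Globals dictionary that bumps a version number when its shape changes."""

    def __init__(self):
        self._cells = {}
        self.version = 0

    def store(self, name, value):
        cell = self._cells.get(name)
        if cell is None:
            self._cells[name] = Cell(value)
            self.version += 1        # new key: shape changed
        else:
            cell.value = value       # value change only: caches stay valid

    def delete(self, name):
        del self._cells[name]
        self.version += 1            # removed key: shape changed

    def lookup_cell(self, name):
        return self._cells[name]     # the slow, by-name lookup


class CachedGlobalLoad:
    """Cache for one load site; assumes it always sees the same globals."""

    def __init__(self, name):
        self.name = name
        self.cell = None
        self.version = -1
        self.misses = 0

    def load(self, globals_):
        if self.version != globals_.version:
            # shape changed since the cell was cached: redo the lookup
            self.misses += 1
            self.cell = globals_.lookup_cell(self.name)
            self.version = globals_.version
        return self.cell.value       # fast path: no dictionary lookup


g = VersionedGlobals()
g.store("limit", 10)
site = CachedGlobalLoad("limit")
first = site.load(g)                 # miss: fills the cache
g.store("limit", 20)                 # value change, same shape
second = site.load(g)                # hit: reads the new value via the cell
g.store("other", 1)                  # shape change
third = site.load(g)                 # miss: version mismatch
```

The fast path is a single integer comparison followed by a field read, which is the kind of saving the slide attributes to the technique.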


Improving Performance through Compilation: Challenges

• Dynamic typing and metaprogramming make static analysis and optimization difficult

• Large and potentially changing variety of languages makes per-language approaches non-scalable

• Moving target:

– Many of these languages continue to evolve

– Deployment practices and popular components (web servers, frameworks) are also evolving

• Compatibility with the significant amount of existing code, in the form of reusable modules/libraries and applications, is important
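A short example may make the first challenge concrete. It is only an illustration, and the `Account` class and `total` function are made up for it. The expression `account.balance + account.fee()` never changes, yet what it calls and what type it returns depend on runtime state that a static compiler cannot see.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def fee(self):
        return 1


def total(account):
    # Which fee() runs, and which "+" applies, is only known at run time.
    return account.balance + account.fee()


a = Account(10)
first = total(a)                    # int + int

# Metaprogramming: replace the method on the class while the program runs.
Account.fee = lambda self: 2.5
second = total(a)                   # int + float

# An attribute set on the instance shadows the class method.
setattr(a, "fee", lambda: 0)
third = total(a)                    # int + int again, via a different callee
```

A compiler that wants to inline `fee` or specialize the addition therefore has to guard its assumptions and be ready to fall back when they stop holding.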


© 2011 IBM Corporation

Adaptive Just in Time Compilation

• IBM Research Project: Fiorano JIT Compiler for Dynamic Scripting Languages

• Enable Dynamic scripting language support on top of IBM’s Testarossa compiler (TR)

– Leverage TR as the optimization framework and backend for native code generation

– Handle multiple languages with minimal engineering effort by creating a generic Dynamic Scripting Languages optimization framework

– Attach Fiorano JIT compiler to open source language virtual machines

    • Preserve original language semantics and compatibility to deal with rapid language evolution

• Current performance: 63% improvement over CPython

[Diagram: Testarossa. Frontends: Python, Ruby, Java, C/C++. Optimizers: DSL optimizer, Java optimizer, C/C++ optimizer, common optimizer. Code generators (codegen) producing native code for x86, System z, and Power.]


Improving Performance through Compilation: Effective techniques

• Hidden Classes (Self, V8)

– Optimize class offsets and accesses

• Tracing (TraceMonkey, LuaJIT, PyPy)

– Method + Tracing (JaegerMonkey)

• Profiling Feedback (Unladen Swallow, PyPy)

– Type

– Value

– Global state (dictionary, …)

• Unboxing

• DynamicSites (DLR), InvokeDynamic (Java7/DaVinci)

– Cache method lookup, specialized implementation

– Polymorphic Inline Cache

– Type inference and propagation
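The method-lookup caching and polymorphic inline cache items above can be sketched as a per-call-site cache. This is a simplified model under stated assumptions: the `CallSite` class is hypothetical, the receiver's Python type stands in for a hidden class, and there is no invalidation when a class is modified, which a real runtime would need.

```python
class CallSite:
    """Polymorphic inline cache for one call site (toy model)."""

    MAX_ENTRIES = 4                      # stay small; beyond this, go generic

    def __init__(self, method_name):
        self.method_name = method_name
        self.cache = {}                  # receiver type -> resolved function
        self.hits = 0
        self.misses = 0

    def call(self, receiver, *args):
        cls = type(receiver)
        target = self.cache.get(cls)
        if target is None:
            self.misses += 1
            # slow path: generic lookup through the class hierarchy
            target = getattr(cls, self.method_name)
            if len(self.cache) < self.MAX_ENTRIES:
                self.cache[cls] = target
        else:
            self.hits += 1
        return target(receiver, *args)


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3 * self.radius * self.radius   # integer approximation of pi


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


site = CallSite("area")
shapes = [Circle(1), Square(2), Circle(2), Square(3)]
areas = [site.call(shape) for shape in shapes]
```

After the first `Circle` and the first `Square`, every later call at this site skips the name-based lookup, so the site settles at two cache entries.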


PyPy

• A Python implementation written in RPython

– Implementations for other languages also exist

• RPython is a restricted version of Python

– Well-typed according to type inference rules of RPython

– Class definitions do not change

– Object model implementation exposes runtime constants

– Various hints to trace selection engine to capture user program scope

• Tracing JIT through both user program and runtime

– A trace is a single-entry-multiple-exit code sequence (like a long extended basic block)

– Tracing automatically incorporates runtime feedback and guards into the trace

• The optimizer fully exploits the simple topology of a trace to do very powerful data-flow based redundancy elimination
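The single-entry-multiple-exit shape of a trace can be shown with a toy model. This is a sketch only and does not reflect PyPy's actual trace format: the operation names (`guard_type`, `guard_lt`, `add`) and the `run_trace` function are hypothetical. The trace below corresponds to the body of `while i < n: total += i; i += 1`, recorded while all values were ints.

```python
def run_trace(trace, env):
    """Run a linear trace in a loop; return the index of the guard that exits."""
    while True:                               # single entry: always the top
        for index, op in enumerate(trace):
            kind = op[0]
            if kind == "guard_type":
                _, name, expected = op
                if type(env[name]) is not expected:
                    return index              # side exit: back to the interpreter
            elif kind == "guard_lt":
                _, left, right = op
                if not env[left] < env[right]:
                    return index              # side exit: loop condition failed
            elif kind == "add":
                _, dest, left, right = op
                env[dest] = env[left] + env[right]
            else:
                raise ValueError(f"unknown trace operation: {kind}")


trace = [
    ("guard_type", "i", int),        # runtime feedback recorded as guards
    ("guard_type", "total", int),
    ("guard_lt", "i", "n"),
    ("add", "total", "total", "i"),
    ("add", "i", "i", "one"),
]

env = {"i": 0, "n": 5, "total": 0, "one": 1}
exit_index = run_trace(trace, env)   # leaves through the guard_lt at index 2

float_env = {"i": 0, "n": 5, "total": 0.5, "one": 1}
float_exit = run_trace(trace, float_env)   # leaves through the type guard at index 1
```

Because the trace has no internal branches, everything between two guards is straight-line code, which is what makes the data-flow redundancy elimination mentioned above simple to apply.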


PyPy Performance

• Source: http://speed.pypy.org/


Performance of JavaScript Implementations

[Figure: bar chart of speedup for TraceMonkey, V8, and Rhino on binarytrees, fasta, mandelbrot, nbody, spectralnorm, and geomean. Y-axis: "Speedup (relative to Javascript)", 0 to 20. Data labels shown in the chart: 2, 56, 19, 4.]


Thank You!

Questions?
