Pyjion - a JIT extension system for CPython
Pyjion
What are we going to discuss
When you develop a Python app, what happens when you execute it?
Why does that happen and how does it work?
How can it be improved?
What is a JIT?
What is Pyjion?
Abstract Syntax Trees
Clone https://github.com/quantifiedcode/python-ast-visualizer
pip install -r requirements.txt
cat example.py | python astvisualizer.py
Uses the built-in module ast and the function compile
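To see a tree without the visualizer, the built-in ast module can be used on its own. This is a minimal sketch; the one-line source string is only an illustration and is not taken from the visualizer project.

```python
import ast

# A tiny, made-up program to parse.
source = "total = 1 + 2"

# Parse the source text into an abstract syntax tree.
tree = ast.parse(source)

# ast.dump gives a textual view of the nodes in the tree.
print(ast.dump(tree))

# ast.walk visits every node, so the node types can be listed.
node_types = [type(node).__name__ for node in ast.walk(tree)]
print(node_types)
```

The node names printed here (Module, Assign, BinOp and so on) are the same kind of nodes the visualizer draws as a graph.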
What is ByteCode
Machine code is the lowest level, but it is not portable between processors
Assembly is shorthand for machine code
C is compiled and linked into machine code, which is efficient but not portable once compiled
Contrast Jython with CPython: Jython compiles the AST to JVM bytecode, CPython compiles it to CPython bytecode (.pyc files)
CPython bytecode
The CPython compiler reads the text (command, file etc.)
It prepares an Abstract Syntax Tree
The CPython compiler then converts the AST into bytecode
Bytecode operations are low-level operations that the interpreter in turn carries out as OS/machine-level instructions.
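The three steps above can be followed from Python itself. This is a rough sketch using only built-ins; the source string and the "<example>" filename are placeholders chosen for illustration.

```python
import ast

source = 'print("Hello world!")'

# Step 1 and 2: read the text and prepare the Abstract Syntax Tree.
tree = ast.parse(source)

# Step 3: convert the AST into a code object that holds the bytecode.
code = compile(tree, "<example>", "exec")

# co_code is the raw bytecode as a bytes object.
raw_bytecode = code.co_code
print(type(raw_bytecode), len(raw_bytecode))
```

The exact bytes differ between CPython versions, which is one reason bytecode is treated as an implementation detail.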
Using dis to inspect byte-code
import dis

def helloworld():
    print("Hello world!")

dis.dis(helloworld)

  2           0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 ('Hello world!')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
Using the standard library ‘dis’ module, you can disassemble the byte-code of a Python object into readable instructions in the Python CLI.
https://github.com/python/cpython/blob/ab8b3ff250a13d506f572abd66c37bce0759ab7d/Lib/dis.py#L31-L65
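The same information is also available as objects rather than printed text, which is handy when a tool needs to walk the opcodes. A small sketch using dis.get_instructions; the opcode names and offsets it prints depend on the CPython version, so they may not match the listing on this slide.

```python
import dis


def helloworld():
    print("Hello world!")


# Each Instruction carries the offset, the opcode name and its resolved argument.
instructions = list(dis.get_instructions(helloworld))
for instruction in instructions:
    print(instruction.offset, instruction.opname, instruction.argrepr)

opnames = [instruction.opname for instruction in instructions]
```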
Python threads and the Global Interpreter Lock
Python has native support for threading
Python threads are real system threads (either POSIX or Windows)
Those threads are managed by the host operating system
Python threads are simple callable objects
The GIL ensures that each thread gets exclusive access to the interpreter internals when it's running (and that call-outs to C extensions play nice)
See: http://www.dabeaz.com/python/GIL.pdf
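One way to see that Python threads are real system threads is to ask each one for the ID the operating system gave it. This is a minimal sketch, assuming Python 3.8 or later for threading.get_native_id; the barrier is only there to keep all three threads alive at once so their IDs cannot be reused.

```python
import threading

native_ids = []
barrier = threading.Barrier(3)


def worker():
    # Record the thread ID assigned by the host operating system.
    native_ids.append(threading.get_native_id())
    # Wait until all three workers have recorded their ID.
    barrier.wait()


# A thread is created from a plain callable.
threads = [threading.Thread(target=worker) for _ in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(native_ids)
```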
How a Python thread executes
Thread created
Python creates data structure (PyThreadState) for thread
The pthread is launched
The thread calls PyEval_CallObject (run the callable)
Just-in-time compilation
Just-in-time compilation requires an intermediate language (IL) to allow the code to be split into chunks (or frames)
Ahead-of-time (AOT) compilers are designed to ensure that the CPU can understand every line of the code before any interaction takes place.
In the case of Python, the code is converted to byte-code when it is called. The byte-code of imported modules is cached in .pyc files.
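The .pyc cache can be produced and located by hand with the standard library. A short sketch, assuming a throwaway file named example.py in a temporary directory (both are made up for the illustration).

```python
import importlib.util
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_path = pathlib.Path(workdir) / "example.py"
    source_path.write_text('print("Hello world!")\n')

    # Where CPython would cache the byte-code for this source file.
    expected_cache = importlib.util.cache_from_source(str(source_path))

    # Compile the source to byte-code and write the .pyc file.
    written_cache = py_compile.compile(str(source_path), doraise=True)

    # Check while the temporary directory still exists.
    cache_exists = pathlib.Path(written_cache).exists()
    print(written_cache)
```

The file name includes the interpreter's cache tag, so different CPython versions keep separate caches side by side.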
This diagram shows the .NET MSIL to JIT compiler flow
.NET Core
Microsoft .NET is a JIT framework, with a series of languages supported by Microsoft and interoperability through the common MSIL intermediate language. The .NET runtime is the MSIL JIT compilation module, but the executable and installer include hundreds of modules specific to Windows and Windows application development.
Microsoft have redeveloped .NET into a new lightweight framework focusing on the core elements (JIT and compilation) as well as the core data structures and common classes. This is developed in C++ and is cross-platform.
The JIT module itself in .NET Core can be used without MSIL, purely for executing byte-code operators on the local CPU.
This is what Pyjion does.
Python CodeObjects
The goal of the JIT is to convert Python Code Objects (either loaded via imports, on the CLI or in a thread) to machine instructions.
The low-level data structure is called PyCodeObject, declared in CPython.
Pyjion introduces a new abstract type, IPythonCompiler, which takes a pointer to a PyCodeObject and executes (compiles) the code against a target ILGenerator.
The ILGenerator generates DotNetCore IL instructions for each Python opcode.
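From Python, a PyCodeObject is visible as the __code__ attribute of a function. The sketch below only inspects it; it does not involve Pyjion, and the helloworld function is reused purely as an example.

```python
def helloworld():
    print("Hello world!")


# The code object is the Python-level view of the C PyCodeObject struct.
code = helloworld.__code__

print(code.co_name)    # name of the function the code belongs to
print(code.co_names)   # global and attribute names used by the byte-code
print(code.co_consts)  # constants referenced by the byte-code
print(code.co_code)    # the raw byte-code itself
```

These fields are the kind of input a compiler such as the one described here would need in order to translate the opcodes.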
End to end workflow
Python creates PyCodeObject from code file/input
PythonCompiler takes PyCodeObject and converts Python OpCodes into intermediate instructions (Pyjion IL?)
ILGenerator converts abstract IL operations to send to the ICorJitCompiler
ICorJitCompiler compiles stacks to machine code
ICorJitInfo is the main interface that the JIT uses to call back to the execution engine (EE) and get information.
[Diagram: Python, Pyjion, DotNetCore. print('hello world!') is turned into 010101110011]
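The opcode-to-instruction step of the workflow can be imitated in a few lines of Python. This is a toy illustration only: the function lower_to_pseudo_il and the "call py_..." text it emits are invented for this sketch and are not Pyjion's compiler or real .NET IL.

```python
import dis


def helloworld():
    print("Hello world!")


def lower_to_pseudo_il(function):
    """Walk the byte-code and emit one made-up IL line per Python opcode."""
    lines = []
    for instruction in dis.get_instructions(function):
        # "py_" + opcode name is an invented naming scheme, not Pyjion's.
        helper = "py_" + instruction.opname.lower()
        lines.append(f"call {helper}({instruction.argrepr})")
    return lines


pseudo_il = lower_to_pseudo_il(helloworld)
for line in pseudo_il:
    print(line)
```

A real JIT would hand the generated instructions to a code generator instead of printing them; here the list of strings just makes the one-opcode-at-a-time translation visible.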
Proposed extensions to CPython
In order to support Pyjion (or another JIT engine), some changes are proposed to CPython
Load pyjit.dll and run ‘JitInit’ (pylifecycle.cpp)
JitInitFunction jitinit = (JitInitFunction)GetProcAddress(pyjit, "JitInit");
jitinit();
Ceval (the code that evaluates the .pyc opcode cache) can now ask the JIT module to evaluate frames
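The idea of the evaluator asking an optional JIT first can be modelled in plain Python. This is a conceptual sketch only: FrameEvaluator, its jit attribute and fake_jit are hypothetical names made up here, and the real change lives in CPython's C code, not in Python.

```python
class FrameEvaluator:
    """Toy model of an evaluator with an optional, pluggable JIT."""

    def __init__(self):
        # Hypothetical hook; None means no JIT module has been loaded.
        self.jit = None

    def evaluate(self, code, namespace):
        # Ask the JIT first; it returns True if it handled the code.
        if self.jit is not None and self.jit(code, namespace):
            return "jit"
        # Otherwise fall back to the normal interpreter.
        exec(code, namespace)
        return "interpreter"


def fake_jit(code, namespace):
    # Stand-in for running compiled machine code; it just delegates to exec().
    exec(code, namespace)
    return True


evaluator = FrameEvaluator()
code = compile("answer = 6 * 7", "<example>", "exec")

first = {}
path_without_jit = evaluator.evaluate(code, first)

evaluator.jit = fake_jit
second = {}
path_with_jit = evaluator.evaluate(code, second)

print(path_without_jit, path_with_jit, first["answer"], second["answer"])
```

The design point is that the interpreter stays the default: if no JIT is registered, or the JIT declines, evaluation proceeds exactly as before.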
Does this help me run .NET from Python or vice versa?
No. .NET is not only an intermediate language and execution engine, but also a huge collection of standard libraries (similar to Python).
Pyjion takes the low-level work that CPython performs, executing byte-code operations on the local CPU, and replaces that last step with a 3rd party execution engine.
Building the project
You will need Git, SVN, CMake and Visual Studio (well, MSBuild)
1. Download Pyjion (github.com/Microsoft/Pyjion)
2. Install submodules
3. Apply patches to CoreCLR and Python
4. Compile Pyjion
5. Copy build output to the CPython distribution output (pyjit.dll)
About me
@anthonypjshaw
Head of Innovation at Dimension Data