DYNAMIC COMPILER OPTIMIZATION TECHNIQUES FOR ?· I am grateful to Matthieu Dubet and Kamal Zellag for…

Download DYNAMIC COMPILER OPTIMIZATION TECHNIQUES FOR ?· I am grateful to Matthieu Dubet and Kamal Zellag for…

Post on 16-Oct-2018

212 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

  • DYNAMIC COMPILER OPTIMIZATION TECHNIQUES FORMATLAB

    by

    Nurudeen Abiodun Lameed

    School of Computer Science

    McGill University, Montreal

    April 2013

    A THESIS SUBMITTED TO MCGILL UNIVERSITY

    IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE DEGREE OF

    DOCTOR OF PHILOSOPHY

    Copyright 2013 by Nurudeen Abiodun Lameed

  • Abstract

    MATLAB has gained widespread acceptance among engineers and scientists. Several

    aspects of the language such as dynamic loading and typing, safe updates, copy semantics

    for arrays, and support for higher-order functions contribute to its appeal, but at the same

    time provide many challenges to the compiler and virtual machine. MATLAB is a dynamic

    language. Traditional implementations of the language use interpreters and have been found

    to be too slow for large computations. More recently, researchers and software developers

    have been developing JIT compilers for MATLAB and other dynamic languages. This the-

    sis is about the development of new compiler analyses and transformations for a MATLAB

    JIT compiler, McJIT, which is based on the LLVM JIT compiler toolkit.

    The new contributions include a collection of novel analyses for optimizing copying

    of arrays, which are performed when a function is first compiled. We designed and imple-

    mented four analyses to support an efficient implementation of array copy semantics in a

    MATLAB JIT compiler. Experimental results show that copy optimization is essential for

    performance improvement in a compiler for the MATLAB language.

    We also developed a variety of new dynamic analyses and code transformations for

    optimizing running code on-the-fly according to the current conditions of the runtime en-

    vironment. LLVM does not currently support on-the-fly code transformation. So, we first

    developed a new on-stack replacement approach for LLVM. This capability allows the run-

    time stack to be modified during the execution of a function, thus enabling a continuation

    of the execution at a higher optimization level. We then used the on-stack replacement

    implementation to support selective inlining of function calls in long-running loops. Our

    experimental results show that function calls in long-running loops can result in high run-

    time overhead, and that selective dynamic inlining can be used to drastically reduce this

    i

  • overhead.

    The built-in function feval is an important MATLAB feature for certain classes of

    numerical programs and solvers which benefit from having functions as parameters. Pro-

    grammers may pass a function name or function handle to the solver and then the solver

    uses feval to indirectly call the function. In this thesis, we show that although feval

    provides an acceptable abstraction mechanism for these types of applications, there are

    significant performance overheads for function calls via feval, in both MATLAB inter-

    preters and JITs. The thesis then proposes, implements and compares two on-the-fly mech-

    anisms for specialization of feval calls. The first approach uses our on-stack replacement

    technology. The second approach specializes calls of functions with feval using a combi-

    nation of runtime input argument types and values. Experimental results on seven numerical

    solvers show that the techniques provide good performance improvements.

    The implementation of all the analyses and code transformations presented in this thesis

    has been done within the McLab virtual machine, McVM, and is available to the public as

    open source software.

    ii

  • Resume

    MATLAB est devenu reconnu parmi les ingenieurs et les scientifiques. Plusieurs as-pects du langage comme le chargement et le typage dynamique, la mise a jour sur, lasemantique de copie pour les tableaux, et le support des fonctions dordre superieur con-tribuent a son attrait, mais induisent de nombreuses difficultes pour les compilateurs et lesmachines virtuelles. MATLAB est un langage dynamique. Les implementations classiquesdu langage fonctionnent grace a des interpreteurs et sont generalement trop lentes pour deslarges calculs. Plus recemment, les chercheurs ainsi que les programmeurs ont developpedes compilateurs JIT pour MATLAB et dautres langages dynamiques. Cette these traitele developpement de nouvelles analyses et transformations pour un compilateur JIT MAT-LAB, McJIT, qui est base sur loutil LLVM.

    Ces nouvelles contributions comprennent plusieurs analyses novatrices pour optimiserla copie de tableaux, qui sont executees quand une fonction est compilee pour la premierefois. Nous avons implemente quatre analyses pour permettre une implementation efficacede la semantique de copie de tableaux dans un compilateur JIT MATLAB. Les resultatsexperimentaux montrent que loptimisation de la copie est essentielle pour ameliorer lesperformances dans un compilateur pour le langage MATLAB.

    Nous avons aussi developpe une variete danalyses dynamiques novatrices et des trans-formations de code pour optimiser du code a la volee en fonction de lenvironnementdexecution. Actuellement, LLVM ne supporte pas les transformations de code a la volee.En consequence, nous avons dabord developpe une nouvelle approche pour faire du rem-placement sur la pile avec LLVM. Cette fonctionnalite permet a la pile dexecution detremodifiee pendant lexecution de la fonction, ce qui permet de continuer lexecution a unniveau superieur doptimisation. Nous avons ensuite utilise cette implementation du rem-placement sur la pile pour permettre len line des appels de fonctions dans les boucles. Nosresultats experimentaux montrent que les appels de fonctions dans les boucles a long tempsdexecution peuvent induire un cout important en termes de performances, et que len line

    iii

  • dynamique et selectif peut etre utilise pour reduire drastiquement ce cout.La fonction feval est une fonctionnalite importante de MATLAB pour certains pro-

    grammes de calcul numerique qui profitent de la possibilite de passer des fonctions commeparametres. Les programmeurs peuvent passer le nom dune fonction ou un pointeur defonctions a un programme qui utilisera ensuite feval pour appeler indirectement cette fonc-tion. Dans cette these, nous montrons que malgre le fait que feval soit un mecanismedabstraction appreciable pour certaines applications, il induit un cout significatif, a lafois pour les interpreteurs et pour les compilateurs JIT. Cette these propose, implementeet compare deux mecanismes a la volee pour la specialisation des appels utilisant feval.La premiere methode utilise notre mecanisme de remplacement sur la pile. La secondemethode specialise les appels de fonctions utilisant feval en combinant le type et la valeurdes arguments a lexecution. Les resultats experimentaux sur sept programmes differentsmontrent que ces techniques permettent une bonne amelioration des performances.

    Limplementation de toute les analyses et transformations de code presentees dans cette

    these a ete effectue dans la machine virtuelle McLab, appelee McVM, et est disponible au

    public en tant que logiciel libre.

    iv

  • Acknowledgements

    First, I would like to thank my supervisor, Professor Laurie Hendren, for her support

    and constant encouragement. I benefited greatly from her intelligence and wealth of expe-

    rience. Her insightful comments and suggestions helped a lot to improve this thesis.

    I thank Professor Clark Verbrugge and all the members of my PhD committee for their

    useful comments and suggestions on both the proposal and the final version of this thesis. I

    also thank Professor Jose Nelson Amaral for his suggestions for improving the final version

    of the thesis.

    I thank all the members of the Sable Group, in particular, the members of the McLAB

    team for their contributions to the McLAB project. I would like to thank Maxime Chevalier-

    Boisvert for developing the first version of McVM.

    I am grateful to Matthieu Dubet and Kamal Zellag for their help in translating the

    abstract of this thesis into French language.

    I wish to thank the School of Computer Science Systems staff and the administrative

    staff for their support in my role as the system administrator for Sable Lab.

    Many friends have helped me throughout my programme at McGill. I thank you all.

    I am grateful to FQRNT for their financial support. This thesis was partly supported by

    NSERC as well.

    Finally, I would like to thank my family, my beloved wife, Aderonke and my children,

    Hanifah, Azizah and Ibrahim, for their support, encouragement and understanding without

    which this thesis would have been impossible to undertake.

    v

  • vi

  • Contents

    Abstract i

    Resume iii

    Acknowledgements v

    Contents vii

    Contents xiii

    List of Figures xv

    List of Tables xvii

    List of Abbreviations xix

    1 Introduction 1

    1.1 Virtual Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    1.1.1 JIT Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.2.1 Characteristics of MATLAB Programs . . . . . . . . . . . . . . . . 5

    1.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    1.3.1 Challenge 1: Copy Semantics . . . . . . . . . . . . . . . . . . . . 6

    1.3.2 Challenge 2: Function Calls in Loops . . . . . . . . . . . . . . . . 7

    1.3.3 Challenge 3: Dynamic Function Evaluation (feval) . . . . . . . . . 8

    vii

  • 1.4 Solution Overview . .