fast polymorphic math parser

Upload: dinh-truong-cong

Post on 10-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/8/2019 Fast Polymorphic Math Parser

    1/9

    Introduction

    The article is a proposal for a mathematical parser implementation which uses polymorphic methodsfor fast evaluation of given expressions. Although this paper talks about parsing input text, itfocuses on methods for fast calculation of a given mathematical formula. Very good performancecan be achieved by generating opcode using basic assembler level instructions. The describedlibrary interprets a math formula and emits, in run-time, machine code (using standard 8087directives) which calculates a given expression. This approach is a combination of lexicalanalysis algorithms with opcode manipulation techniques, widely used in compilers and virtualmachines (platforms like .NET or Java).

    Further tests have shown that the presented approach is very efficient and is almost as fast asmath code generated during program compilation! It is also up to 30 times faster than otherpopular math parsers.

    It's assumed that the reader is familiar with C/C++ programming and the basics of the Assemblerx86 language.

    Background

    How Compilers Generate Code

    Let's see how programs are performed inside a system, how compilers generate code, and how theyoperate on the computer's memory. The execution of each process in a system demands generallyfour kinds of memory areas:

    y code area - This part of memory contains machine code - a series of instructions executedby the CPU.

    y data segment - This segment contains static data and constants defined in the program.y stack - The stack is responsible for storing program execution data and local variables.y heap - Segments of memory that can be allocated dynamically (this part of memory is used

    by the malloc or new operator).

    The program executes a series of instructions from the code area using data from the datasegment. When a program calls a function (in a so-called cdecl manner), it pushes on the stackall parameters defined in the function, a pointer to a place in the code area where it is alreadyin, and performs a jump to the code responsible for executing the given function. The stack isalso a place where local variables are stored - when they are needed, the proper amount of memoryis reserved on the top of the stack.

    Let's analyze how compilers generate code which is responsible for mathematical calculations on asimple function that calculates a math expression, (x+1)*(x+2):

    Collapsefloat testFun( float x ){

    return ( x + 1.0f ) * ( x + 2.0f );}

    The compiler for the given function generates these assembler instructions:

    Collapse; float testFun( float x ); {PUSH ebpMOV ebp,esp; return ( x + 1.0f ) * ( x + 2.0f );FLD dword ptr [ebp+8]FADD qword ptr [__real@3ff0000000000000 (1D0D20h)]FLD dword ptr [ebp+8]FADD qword ptr [__real@4000000000000000 (1D0D18h)]FMULP st(1),st

  • 8/8/2019 Fast Polymorphic Math Parser

    2/9

    ; }POP ebpRET

    When the function is called, the caller pushes on the top of the stack a list of parameters and apointer to the return code. In our example, the function has only one parameter (of type float),so at the moment of the function's execution, on the top of the stack, there are (in order):

    1. 4-byte float variable2. 4-byte return address

    At the beginning of the function, the ebp register is saved on the stack (instruction PUSH ebp),and subsequently, the stack pointer (the esp register) is copied to ebp (MOV ebp,esp). After thisoperation, ebp points to a place on the stack where an old value of ebp is stored, ebp+4 pointsto a return address, and ebp+8 points to the parameter of the function (variable x of typefloat).

    The next part of the code performs strictly mathematical calculations (8087 instructions) using amathematical cooprocessor's internal stack. This stack contains eight registers symbolized byST(0), ST(1), ..., ST(6). The top of the stack is ST(0). The compiler interprets the math formula(x+1.0f)*(x+2.0f) and generates the code which performs the calculations in sequence (with theusage of the mentioned stack):

    1. load variable x on the top of the stack - ST(0) (instruction FLD)2. add constant to ST(0) and store the result in ST(0)3. load variable x on the top of the stack - ST(0) becomes ST(1), x is loaded to ST(0)4. add constant to ST(0) and store the result in ST(0)5. multiply ST(0) by ST(1), pop the stack, and store the result in ST(0)

    After these series of operations, ST(0) (the top of the floating point stack) contains the resultof the given math formula. Subsequently, the program recovers the original value of the ebpregister (POP ebp) and returns to the caller (RET). The output result is passed to the callerthrough the top of the math stack - register ST(0).

    The basic idea of creating a fast mathematic parser is to create the code which will generateanother code for calculation of the given math formula. The generated opcode should have thestructure of the function presented above. When the program needs to use a math parser, firstly,it should perform the analysis of the mathematical formula and generate the opcode. After this,

  • 8/8/2019 Fast Polymorphic Math Parser

    3/9

    every time calculations are needed, the program should execute the generated code, giving thedesired result. The main advantage of this approach is that the process of calculation is veryfast and highly efficient - it is an imitation of the math executing the code which was generatedduring the program compilation.

    Lexical Analysis

    The basic issue in writing a mathematical parser is lexical analysis - it gives the structure ofa given expression, and tells how a formula should be interpreted. Simple math expressions (withoperators +, -, *, /, and parenthesis) can be described by a set of so-called productions:

    CollapseEXPR -> TERM + TERMEXPR -> TERM - TERMEXPR -> TERMTERM -> FACT * FACTTERM -> FACT / FACTTERM -> FACTFACT -> numberFACT -> xFACT -> GROUPGROUP -> ( EXPR )

    These operations describe how given expressions are built - how the set of input chars (non-terminals) can be brought out into smaller parts (terminals - like numbers, math operators, orsymbols).

    The production EXPR -> TERM + TERM means that the non-terminal EXPR can be divided into asequence of non-terminal TERMs, char '+', and another non-terminal TERM. The non-terminal FACT,for example, can be brought out into number, a symbol x, or another non-terminal - GROUP.

    These productions for a given mathematical expression define a parse tree (presented below) thatrepresents the syntactic structure of the input string. Let's have a look at the example formula,(x+1)*(x+2). This expression can be derived from the non-terminal EXPR in this series oftransformations:

    CollapseEXPR -> TERM ->

    -> FACT ->-> FACT * FACT ->-> GROUP * GROUP ->-> ( EXPR ) * ( EXPR ) ->-> ( TERM + TERM ) * ( TERM + TERM ) ->-> ( FACT + FACT ) * ( FACT + FACT ) ->-> ( x + 1 ) * ( x + 2 )

  • 8/8/2019 Fast Polymorphic Math Parser

    4/9

    Productions (defined earlier) were used in each of these transformations. The knowledge of how agiven production is used in the derivation process is useful in generating machine instructionsfor the calculation of a math formula.

    Implementation

    Using Spirit as a Syntactic Analyzer

    The algorithm for lexical analysis is not an issue in this article - the paper describes only thebasic knowledge which is needed to parse math formulas. Syntactic analysis can be performed bymany (freely available) libraries. In this article, the Spirit library from the Boost package wasused.

    Spirit is a very useful and easy to use C++ tool, which takes advantage of modern meta-programming methods. The code below defines a grammar which is used to interpret simple mathexpressions:

    Collapseclass MathGrammar : public grammar{public:

    template struct definition{

    definition( MathGrammar const& self ){

    expr = term >> *( ( '+' >> term )[ self._doAdd ] |( '-' >> term )[ self._doSub ] )

    ;term = fact >> *( ( '*' >> fact )[ self._doMul ] |

    ( '/' >> fact )[ self._doDiv ] );

    fact = real_p[ self._doConst ] |ch_p('x')[ self._doArg ] |

  • 8/8/2019 Fast Polymorphic Math Parser

    5/9

    group;

    group = '(' >> expr >> ')';

    }

    rule expr, term, fact, group;rule const& start() const { return expr; }

    };

    protected:

    private:AddFunctor _doAdd;SubFunctor _doSub;MulFunctor _doMul;DivFunctor _doDiv;ArgFunctor _doArg;ConstFunctor _doConst;

    };

    The class MathGrammar derives from the grammar generic spirit class. It uses fournon-terminals (expr, term, fact, and group). In the constructor of the internal class definition,productions for the grammar are defined. Productions are developed by using overloaded operators:

    y >> is an operator which merges nonterminals and terminals. For example, a>>b matches theexpression ab.

    y * is the Kleene star operator - the multiple concatenation of symbols inside parentheses.For example, *(a>>b) matches the expressions: ab, abab, ababab, etc.

    y | is an OR operator. For example ,(a|b) matches expressions a or b.y [] these brackets contain functors which will be called in case of usage of the given

    production.

    y real_p is the terminal representing the constants - real numbers.y ch_p('x') is a terminal representing the char x.

    The usage of the presented class is very simple - during the process of expression parsing, whena given production is detected, a defined functor is called. The analysis is always performedbottom-up - this means that at first, productions are called from the bottom of the parse tree.

    Let's analyze the formula (x+1)*(x+2) with the code given below:

    Collapse// instance of grammar classMathGrammar calc;

    // perform string parsingif ( false == parse( "(x+1)*(x+2)", calc ).full ){

    // parse error handling}

    During the parse procedure execution, the defined functors are called in order:

    1. x argument (functor ArgFunctor _doArg)2. 1 constant (functor ConstFunctor _doConst)3. + operator (functor AddFunctor _doAdd)4. x argument (functor ArgFunctor _doArg)5. 2 constant (functor ConstFunctor _doConst)6. + operator (functor AddFunctor _doAdd)7. * operator (functor MulFunctor _doMul)

    This order of operations can be directly used during the generation of the proper machine codefor the calculation of the given formula. The obtained notation:

  • 8/8/2019 Fast Polymorphic Math Parser

    6/9

    Collapsex 1 + x 2 + *

    is a so-called Reverse Polish Notation (RPN) or postfixnotation - operators follow theiroperands and no parentheses are used. The interpreters of RPN are often stack-based; that is,operands are pushed onto the stack, and when the operation is performed, its operands are poppedfrom the stack, and its result is pushed back on.

    The Machine Code Emission

    The idea of the math parser described in this article is based on dynamic machine code generationduring program run-time. The program interprets the input text (given by a user) using thesyntactic analyzer - the input formula is transformed to a more suitable postfix notation, whichcan be calculated with the usage of the processor's math stack. After this analysis, the programgenerates the machine code which executes the process of evaluation of the math expression.

    How do we generate the machine code in Windows dynamically? Firstly, the program must allocate anew memory segment with read/write access using the function VirtualAlloc (in the code below, itallocates 1024 bytes of memory).

    Collapse#include

    // allocate memory segment with read/write accessBYTE *codeMem = (BYTE*)::VirtualAlloc( NULL, 1024,

    MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );if ( NULL == codeMem ){

    // error handling}

    This memory will contain the body of the new function that will be generated. The code emittedinto the allocated memory should have this structure:

    1. PUSH ebp - save the base register ebp on the stack.2. MOV ebp,esp - put the current stack pointer esp into the base register ebp.3. The series of mathematical calculations on the function parameter (the parameter is

    placed on the stack at the address ebp+8).4. POP ebp - restore the saved base register ebp value.5. RET - return from the function instruction.

    The table below describes the opcodes of instructions which should be emitted when the syntacticanalyzer detects a mathematical operation:

    Assembler code Machine opcode Operation

    PUSH ebp 0x55 Save the base pointer on the stack.

    MOV ebp, esp 0x8B, 0xEC Move the stack pointer to the base pointer.

    FLD dword ptr[ebp+8]

    0xD9, 0x45, 0x08 Load a float parameter (from ebp+8) to ST(0).

    FLD [mem32]0xD9, 0x05, xx,xx, xx, xx

    Load the constant from a given address to ST(0) (the lastfour bytes are the address of the constant in memory).

    FADDP st(1), st(0) 0xDE, 0xC1 Add ST(1) to ST(0), store the result in ST(1), and popST(0).

    FSUBP st(1), st(0) 0xDE, 0xE9Subtract ST(1) from ST(0), store the result in ST(1), andpop ST(0).

    FMULP st(1), st(0) 0xDE, 0xE9Multiply ST(1) by ST(0), store the result in ST(1), and popST(0).

    FDIVP st(1), st(0) 0xDE, 0xE9Divide ST(1) by ST(0), store the result in ST(1), and popST(0).

    POP ebp 0x5D Restore the base pointer from stack.

    RET 0xC3 Return from the function.

  • 8/8/2019 Fast Polymorphic Math Parser

    7/9

    The described instructions (which begin with Fxxx) operate on the processor's internal mathstack, and always use at most two registers - ST(0) and ST(1). With their usage, the program caneasily calculate every simple math expression. Of course, this list of mathematical operationscan be extended to more sophisticated functions (like logarithms, trigonometric functions, root,power, etc.), but this article focuses only on the idea of machine code generation - forsimplicity, it describes only simple math operators.

    The instruction FLDS is responsible for loading mathematical values onto the stack, from memory.

    This operation will be used to load the parameter x or constants from the given math formula. Thelatter situation demands a special table (inside memory) with saved numeric values.

    During the expression's parsing, the program creates a table with constants, which will be usedduring an evaluation process. When the syntactic analyzer detects a constant in a formula (forexample, 1 in (x+1)) - it writes its value into the table, and emits the opcode FLD [mem32] witha pointer to a place in memory where the constant was saved. When the generated code is executed,this instruction loads the value of the given constant (read from memory) on the top of thestack.

    After the process of opcode emission, the program should change the access rights of theallocated memory (given by the pointer codeMem) from read/write to executiononly:

    Collapse// change access protection for memory segment to execute only

    DWORD oldProtect;BOOL res = ::VirtualProtect( codeMem, 1024, PAGE_EXECUTE, &oldProtect );if ( FALSE == res ){

    // error handling}

    After this operation, each process that tries to write into this memory will crash. In Windowssystems, this approach insures protection from malware code injections. Execution of emitted codecan be performed by using a simple code as shown below:

    Collapsefloat ( *fun )( float );fun = ( float ( __cdecl * )( float ) )codeMem;fun( x );

    Of course, when the generated formula evaluator is no longer needed, the allocated memory shouldbe released:

    Collapseif ( NULL != codeMem ){

    // deallocate memory segmentBOOL res = ::VirtualFree( codeMem , 0, MEM_RELEASE );if ( FALSE == res ){

    // error handling}

    }

    Using the Code

    The project attached to this article contains an example implementation of the described fastpolymorphic math parser. The library is based on these classes:

    y MathFormula - Class containing code emission functionality. It encapsulates the handlingof memory segments with the opcode to execute.

    y MathGrammar - Class defining the mathematical expression grammar.y AddFunctor, SubFunctor, MulFunctor, DivFunctor, ArgFunctor, ConstFunctor - The set of

    functors for handling math grammar productions.

  • 8/8/2019 Fast Polymorphic Math Parser

    8/9

    The usage of the library is very simple:

    Collapse#include "MathFormula.h"

    // create MathFormula instance for given expressionMathFormula fun( "(x+1)*(x+2)" );

    // perform evaluation of expression for given argument xfloat result = fun( 2.0f );

    The constructor of the MathFormula object performs a syntactic analysis of the input string byusing the MathGrammar class. During the expression interpretation, proper functors are calledwhen the given production rules are satisfied. Each functor invokes the emission methods in theMathFormula object - generating the machine code. After the initial analysis, math formulas canbe evaluated with the usage of the overloaded parentheses operator.

    The attached project can be compiled under Visual Studio 2008 with the usage of the Boostlibrary. Information about the build process can be found in the ReadMe.txt file.

    Summary

    As we can see, the implementation of the math expression evaluator with the usage of dynamicallygenerated machine code is quite simple. It is a combination of a syntactic analyzer (in thisarticle, we used the Spirit library), a few WinAPI functions for memory allocation, and somebasic knowledge about programming in Assembler.

    The described implementation of the math parser has a very important advantage - it is almost asfast as the code generated statically during the program compilation! The table below shows theaverage math formula evaluation times for the expression(x+1)*(x+2)*(x+3)*(x+4)*(x+5)*(x+6)*(x+7)*(x+8)*(x+9)*(x+10)*(x+11)*(x+12), for different typesof parsers:

    Math parser Execution time [ns]

    C++ 20.7 [ns]

    MathParser 22.2 [ns]

    muParser 700.69 [ns]

    uCalc 762.36 [ns]

    fParser 907.07 [ns]

    The presented execution times are the average from 100 000 000 executions, and they all wereperformed on the processor: AMD Turion X2 2.0 GHz.

    C++ stands for code generated by compiler during compilation. Our math parser (MathParser)execution time is only 7% worst than C++ code! In comparison, the table also contains executiontimes for mathematical parsers which are freely available on the internet - they are at least 30-35 times slower than MathParser!

    Future Work

    This article describes only an idea of a polymorphic math parser - of course, the presented codecan be extended to more advanced solutions:

    y more sophisticated mathematical functions could be used,y the program could use more than one input function parameter,y for better performance, optimizations on the generated opcode could be performed (for

    example: mathematical expression reduction),

    y SIMD (Single Instruction Multiple Data) assembler instructions could be considered formuch faster calculations on the whole vectors of values - the usage of the SSEprocessor's extensions.

  • 8/8/2019 Fast Polymorphic Math Parser

    9/9