compiler 1

Upload: vasuki1964

Post on 29-May-2018


  • 8/9/2019 Compiler 1

    1/35

    Cse321, Programming Languages and Compilers

    8/21/2010

    Lecture #1, Jan. 9, 2007

    Course Mechanics

    Text Book

    Down-loading SML

    Syllabus - Course Overview

    Entrance Exam

    Standard ML

    This week's assignment

    Top to bottom example

    Lexical issues

    Parsing and syntax issues

    Translation issues


    Acknowledgements

    The material taught in this course was made possible by many people. Here is a partial list:

    Andrew Tolmach

    Nathan Linger

    Harry Porter

    Jinke Lee


    Today's Assignments

    Reading

    Engineering a Compiler. Available in the PSU bookstore.

    Chapter 1, pp 1-26

    There will be a 5 minute quiz on the reading Wednesday.

    Search

    Find the class webpage

    1 page programming assignment

    Due Wednesday, Jan 10, 2007. In just 2 days!!

    Log in to some SML system. See how the system operates. Type in solutions (in a file) to the programming problems (In Class Exercises 1 and 2 in this handout) and load them into SML. Get them running, print them out, and turn them in on Wednesday. What matters here is that you try out the SML system, not that you get them perfect.


    Course Information

    CS321 - Languages and Compiler Design

    Time: Monday & Wednesday 18:00-19:50

    Place: PCAT 138

    Instructor: Tim Sheard

    office: room 115, CS Dept, 4th Ave Building, Portland State Univ.

    phone: 503-725-2410 (work) 503-649-7242 (home)

    office hours: Before class in my office (5:00-5:50), or by Appt.

    Assignments:

    Reading from text and handouts (quizzes on reading)

    Daily, 1 page programming assignments

    3 part programming project

    Grading:

    Midterm exam (25%)

    3 parts of project (30%)

    Daily 1 page assignments and quizzes (15%)

    Final exam (30%)


    Examinations

    Entrance Exam. Do you know your REs and CFGs?

    Quizzes on Reading Material. There is a possible quiz on every reading assignment

    There will be a quiz on Wednesday!

    Mid Term exam Wed. Feb 14, 2007. Time: in class.

    Final exam Monday, Mar. 19, 2007. Time: 6:00-7:50.


    Text Book

    Text: Engineering a Compiler, by Keith D. Cooper and Linda Torczon

    Other Reference Materials (Auxiliary Material)

    Elements of Functional Programming (SML book), by Chris Reade, Addison Wesley, ISBN 0-201-12915-9

    Using the SML/NJ System: http://www.cs.cmu.edu/~petel/smlguide/smlnj.htm

    Class Handouts

    Each class, a copy of that day's slides will be available as a handout.

    I will post files that contain the example programs used in each lecture on the class web page:

    www.cs.pdx.edu/~sheard/course/Cs321

    I will post Assignments there as well.


    Labs

    Whenever you learn a new language, it's great to have someone looking over your shoulder.

    In this spirit I have scheduled some lab times where people can work on learning ML while I am there to help.

    FAB INTEL Lab (FAB 55-17), downstairs by the Engineering and Technology Management departmental offices

    Friday, Jan. 12, 2007, 4:00 - 5:30 PM

    Tuesday, Jan. 16, 2007, 4:00 - 5:30 PM

    Friday, Jan. 19, 2007, 4:00 - 5:30 PM

    Labs are not required, but attending at least one is highly recommended!


    Installing SML

    Software can be obtained at: http://www.smlnj.org/

    I am using the most recent version, 110.60, but it displays version 110.57 when it runs.

    Browse the Documentation and Literature section of the SML web page. Find some resources that you can use.

    SML also runs on the PSU Linux and Intel labs.

    Linux:

    usepkg sml

    then log out, or start a new shell

    type: sml

    Intel:

    In a command window

    p:\programs\smlnj\addpkg.cmd

    then log out, or start a new command window

    then just type:

    N:\>sml


    Entrance Exam

    CS321 has some pretty serious prerequisites.

    1. Write a regular expression for the set of strings that begin with an "a", followed by an arbitrary number of "b"s or "c"s, and ended by a "d", e.g. "ad", "abbbd", "abcbcbcd", etc.

    2. Transform your regular expression into a DFA.

    3. Write a context free grammar that recognizes the same set of strings as your RE.

    4. Transform your CFG into a CFG that is left-recursion free.


    Academic Integrity

    Students are expected to be honest in their academic dealings. Dishonesty is dealt with severely.

    Homework. Pass in only your own work.

    Program assignments. Program independently.

    Examinations. Notes and such, only as each instructor allows.

    OK to discuss how to solve problems with other students, but each student should write up, debug, and turn in his own solution.


    Course Thesis

    This course is about programming languages. We study languages in two ways:

    From the perspective of the user

    From the perspective of the implementer (compiler writer)

    We will learn about some languages you may never have heard of. We will learn to program in one of them (Standard ML). It's good to learn a new language in depth.

    This course is also about programming. There will be extensive programming assignments in SML. If you don't do them, you won't learn. You're deluding yourself if you think you can learn the material without doing the exercises!

    We will write a compiler for a Java subset. It's good to understand the implementation details of a language you already know.


    This course is all about programming

    What makes a good program?

    Write at least 3 things on a piece of paper.


    Standard ML

    In this course we will use an implementation of the language Standard ML.

    The SML/NJ Homepage has lots of useful information: http://www.smlnj.org/

    You can get a version to install on your own machine there.

    I will use version 110.57 or 110.60 of SML. Earlier versions probably will work as well. I don't foresee any problems with other versions, but if you want to use the identical version that I use in class then this is the one.


    Characteristics of SML

    Applicative style: input-output description of the problem.

    First class functions:

    pass as parameters

    return as value of a function

    store in data-structures

    Less importantly:

    Automatic memory management (G.C., no new or malloc)

    Use of a strong type system which uses type inference, i.e. no declarations but still strongly typed.
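A minimal sketch of these characteristics, using names (twice, adder, table) that are made up for illustration and do not come from the course files. No types are written, yet SML infers a type for every definition.

```sml
(* Functions as parameters: apply f two times. *)
fun twice f x = f (f x)

(* Functions as results: adder n returns a new function. *)
fun adder n = fn x => x + n

(* Functions stored in a data structure. *)
val table = [("inc", adder 1), ("add10", adder 10)]

(* Inferred types, with no declarations written by hand:
   twice : ('a -> 'a) -> 'a -> 'a
   adder : int -> int -> int
   table : (string * (int -> int)) list *)
val seven = twice (adder 2) 3
val results = map (fn (_, f) => f 5) table
```

Here seven is 7 and results is [6,15].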


    Syntactic Elements

    Identifiers start with a letter followed by digits or other letters or primes or underscores.

    Valid Examples: a a3 ab aF

    Invalid Examples: 12A

    Identifiers can also be constructed with a sequence of operators like: !@#$%^&*+~

    Reserved words include: fun val datatype if then else of let in end type
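A short sketch of both kinds of identifier. The operator name +++ and its precedence level 6 are chosen only for illustration.

```sml
(* Alphanumeric identifiers: a letter, then letters, digits, primes, underscores. *)
val a = 1
val a3 = 2
val a' = 3
val a_F = 4

(* A symbolic identifier built from operator characters. *)
infix 6 +++
fun x +++ y = x + y + 1

val r = a +++ a3      (* 1 + 2 + 1 *)
```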


    Interacting

    The normal style for interaction is to start SML, and then type definitions into the window.

    Types of commands:

    4 + 5;

    val x = 34;

    fun f x = x + 1;

    Here are two commands you might find useful.

    val pwd = OS.FileSys.getDir;

    val cd = OS.FileSys.chDir;

    To load a file that has an SML program, type:

    use "file.sml";


    The SML Read-Typecheck-Eval-Print Loop

    Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]

    - 3+5;
    val it = 8 : int
    - print "Hi there\n";
    Hi there
    val it = () : unit
    - val x = 22;
    val x = 22 : int
    - x+ 5;
    val it = 27 : int
    - val pwd = OS.FileSys.getDir;
    val pwd = fn : unit -> string
    - val cd = OS.FileSys.chDir;
    val cd = fn : string -> unit
    -

    Note the semicolon when you're ready to evaluate. Otherwise commands can spread across several lines.


    In Class Exercise 1

    Define prefix and lastone in terms of head, tail, and reverse.

    fun lastone x = hd (rev x)
    fun prefix x = rev (tl (rev x))

    First make a file S01code.sml

    Start sml

    Change directory to where the file resides

    Load the file ( use "S01code.html" )

    Test the function

    Standard ML of New Jersey v110.57 - K;
    - val cd = OS.FileSys.chDir;
    val cd = fn : string -> unit
    - cd "D:/work/sheard/courses/PsuCs321/web/notes";
    - use "S01code.html";
    [opening S01code.html]
    val lastone = fn : 'a list -> 'a
    val prefix = fn : 'a list -> 'a list
    val it = () : unit
    - lastone [1,2,3,4];
    val it = 4 : int
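A few more calls to try, as a sketch. Both functions are partial: on the empty list the call to hd or tl raises the Empty exception. safeLast is a hypothetical variant, not part of the exercise, that returns an option instead.

```sml
fun lastone x = hd (rev x)
fun prefix x = rev (tl (rev x))

val p = prefix [1,2,3,4]       (* everything but the last element *)
val l = lastone ["a","b"]      (* the last element *)

(* Hypothetical total version: NONE on the empty list. *)
fun safeLast [] = NONE
  | safeLast xs = SOME (lastone xs)
```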


    In Class Exercise 2

    Define map and filter functions:

    mymap f [1,2,3] = [f 1, f 2, f 3]

    filter even [1,2,3,4,5] = [2,4]

    fun mymap f [] = []
      | mymap f (x::xs) = (f x)::(mymap f xs);

    fun filter p [] = []
      | filter p (x::xs) =
          if (p x) then x::(filter p xs) else (filter p xs);

    Sample Session

    - mymap plusone [2,3,4]

    [3, 4, 5]

    - filter even [1,2,3,4,5,6]

    [2, 4, 6]
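The sample session uses plusone and even without defining them. A self-contained sketch, assuming the obvious definitions for those two helpers:

```sml
fun mymap f [] = []
  | mymap f (x::xs) = (f x)::(mymap f xs)

fun filter p [] = []
  | filter p (x::xs) =
      if (p x) then x::(filter p xs) else (filter p xs)

(* Assumed helpers; the slide does not show their definitions. *)
fun plusone x = x + 1
fun even x = (x mod 2 = 0)

val a = mymap plusone [2,3,4]
val b = filter even [1,2,3,4,5,6]
```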


    Course topics

    Programming Languages

    Types of languages

    Data types and languages

    Types and languages

    Compilers

    Lexical analysis

    Parsing

    Translation to abstract syntax using modern parser generator technology

    Type checking

    Identifiers and symbol table organization

    Next Quarter, in the second class of the sequence

    Intermediate representations

    Backend analysis

    Transformations and optimizations for a number of different kinds of languages


    Multi Pass Compilers

    Passes

    text

    tokens

    syntax trees

    intermediate forms (three address code, CPS code, etc.)

    assembly code

    machine code

    Each phase goes from one form to another, OR from one form to the same form, which is often called a source to source transformation.
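One way to picture passes is as functions that compose. The sketch below uses toy stand-ins: the types and the passes lex, parse, and rename are hypothetical placeholders, far simpler than anything in a real compiler.

```sml
(* Placeholder representations for three of the forms. *)
type text   = string
type tokens = string list
datatype tree = Leaf of string | Node of string * tree list

(* Toy passes, each a function from one form to another. *)
fun lex (s : text) : tokens = String.tokens Char.isSpace s
fun parse (ts : tokens) : tree = Node ("program", map Leaf ts)

(* A source to source transformation: tree -> tree. *)
fun rename (Leaf s) = Leaf (String.map Char.toUpper s)
  | rename (Node (label, kids)) = Node (label, map rename kids)

(* The front end is the composition of its passes. *)
val frontEnd = rename o parse o lex
```

The type checker enforces that each pass consumes exactly the form the previous pass produces.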


    The Top to Bottom Example

    text:

    z = x + pi * 12.0

    tokens:

    id(z) eql id(x) plus id(pi) times float(12.0)

    syntax tree:

              =
             / \
        Id(z)   +
               / \
          Id(x)   *
                 / \
           Id(pi)   float(12.0)
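The token list and the syntax tree for this example can be written down as SML values. The datatypes below are an assumed encoding for illustration; only the token names echo the slide.

```sml
datatype token = Id of string | Eql | Plus | Times | Float of real

datatype exp
  = Var of string
  | Num of real
  | Add of exp * exp
  | Mul of exp * exp

datatype stmt = Assign of string * exp

(* tokens for: z = x + pi * 12.0 *)
val tokens = [Id "z", Eql, Id "x", Plus, Id "pi", Times, Float 12.0]

(* "*" binds tighter than "+", so it sits lower in the tree. *)
val tree = Assign ("z", Add (Var "x", Mul (Var "pi", Num 12.0)))
```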


    Passes (cont)

    Three address code:

    temp1 := pi * 12.0
    z := x + temp1

    Assembly level code:

    ld r1,pi
    ldi r2,12.0
    mul r1,r2
    ld r2,x
    add r1,r2
    st r1,z
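A minimal sketch of how three address code could be generated from an expression tree. The exp datatype and the functions fresh, gen, and assign are assumptions made for this example. It always introduces a temporary per operator, so its output has one more instruction than a hand-written version.

```sml
datatype exp
  = Var of string
  | Num of real
  | Add of exp * exp
  | Mul of exp * exp

(* Counter for fresh temporaries temp1, temp2, ... *)
val counter = ref 0
fun fresh () = (counter := !counter + 1; "temp" ^ Int.toString (!counter))

(* gen e returns (instructions, name that holds the value of e). *)
fun gen (Var v) = ([], v)
  | gen (Num n) = ([], Real.toString n)
  | gen (Add (a, b)) = binop "+" a b
  | gen (Mul (a, b)) = binop "*" a b
and binop sym a b =
  let val (codeA, ra) = gen a
      val (codeB, rb) = gen b
      val t = fresh ()
  in (codeA @ codeB @ [t ^ " := " ^ ra ^ " " ^ sym ^ " " ^ rb], t) end

(* Copy the final result into the destination variable. *)
fun assign (dest, e) =
  let val (code, r) = gen e
  in code @ [dest ^ " := " ^ r] end

val code = assign ("z", Add (Var "x", Mul (Var "pi", Num 12.0)))
```

For z = x + pi * 12.0 this yields temp1 := pi * 12.0, then temp2 := x + temp1, then z := temp2.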


    Lexical Analysis

    Produces tokens and deals with:

    white space

    comments

    reserved word identification

    symbol table interface

    Tokens are the terminals of grammars.

    Lexical analysis reads the whole program, character by character, thus it needs to be efficient. This implies fancy buffering techniques etc. Modern lexical generators handle these problems, so we will ignore them.


    Tokens, Patterns & Lexemes

    Many strings from the input may produce the same TOKEN, e.g. identifiers, integer constants, floats.

    A PATTERN is a rule which describes which strings are assigned to a token.

    A LEXEME is the exact sequence of input characters matched by a PATTERN.


    Examples

    lexeme    pattern    token
    x         *          Id "x"
    abc       *          Id "abc"
    152       +          Constant(152)
    then      then       ThenKeyword

    Many lexemes map to the same token, e.g. x and abc.

    Note, some lexemes might match many patterns, e.g. "then" above. Need to resolve ambiguity.

    Since tokens are terminals, they must be "produced" by the lexical phase with synthesized attributes in place (e.g. the name of an identifier), e.g. id(x) and constant(152).
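A toy classifier for the lexemes in the table, as a sketch. It assumes lexemes are separated by white space and that only these three kinds of token exist; classify and tokenize are hypothetical names. The ambiguity for "then" is resolved by testing reserved words before the identifier pattern.

```sml
datatype token
  = Id of string
  | Constant of int
  | ThenKeyword

(* Reserved words take priority over the identifier pattern. *)
fun classify "then" = ThenKeyword
  | classify lexeme =
      if List.all Char.isDigit (String.explode lexeme)
      then Constant (valOf (Int.fromString lexeme))   (* attribute: the value *)
      else Id lexeme                                  (* attribute: the name *)

(* Split on white space, then classify each lexeme. *)
fun tokenize text = map classify (String.tokens Char.isSpace text)

val ts = tokenize "x abc 152 then"
```

Each token carries its attribute with it, so later phases never need to look at the original characters again.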


    Syntax, Parse Trees & Grammars

    Syntax (the physical layout of the program)

    Grammars describe precisely the syntax of a language. Two kinds of grammars which compiler writers use a lot are: regular, and context free.

    Informal definitions of:

    Regular: concatenation, union, star

    Context Free: only one symbol on the lhs of a production


    Example Grammar

    Sentence    ::= Subject Verb Object
    Subject     ::= Proper-noun
    Object      ::= Article Adjective Noun
    Verb        ::= ate | saw | called
    Noun        ::= cat | ball | dish
    Article     ::= the | a
    Adjective   ::= big | bad | pretty
    Proper-noun ::= tim | mary

    Start Symbol = Sentence

    Example sentence: tim ate the big ball
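Because this grammar has no recursion, every sentence has exactly five words, and a recognizer can check them position by position. A sketch, with helper names (member, isSentence, recognize) invented for this example:

```sml
fun member w ws = List.exists (fn x => x = w) ws

fun isProperNoun w = member w ["tim", "mary"]
fun isVerb w       = member w ["ate", "saw", "called"]
fun isArticle w    = member w ["the", "a"]
fun isAdjective w  = member w ["big", "bad", "pretty"]
fun isNoun w       = member w ["cat", "ball", "dish"]

(* Sentence ::= Subject Verb Object
   Subject  ::= Proper-noun
   Object   ::= Article Adjective Noun *)
fun isSentence [subject, verb, article, adjective, noun] =
      isProperNoun subject andalso isVerb verb andalso
      isArticle article andalso isAdjective adjective andalso isNoun noun
  | isSentence _ = false

fun recognize s = isSentence (String.tokens Char.isSpace s)
```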


    Recursive Grammar Examples

    Recursive grammars describe infinite languages.

    list    ::= [ num morenum ]
    morenum ::= , num morenum
              | (empty)

    derives [2], [2,4], [2,4,6] ...

    Exp ::= id
          | Exp + Exp
          | Exp * Exp
          | ( Exp )

    derives x, x+x, x+x+x, ...
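The list grammar maps directly onto recursive functions, one per nonterminal. A sketch over an assumed token type, where the recursion in morenum mirrors the recursion in the grammar:

```sml
datatype token = LBrack | RBrack | Comma | Num of int

(* morenum ::= , num morenum | (empty)
   Returns the tokens left over after the longest match. *)
fun morenum (Comma :: Num _ :: rest) = morenum rest
  | morenum rest = rest

(* list ::= [ num morenum ] *)
fun isList (LBrack :: Num _ :: rest) =
      (case morenum rest of
           [RBrack] => true
         | _ => false)
  | isList _ = false
```

The same approach does not work directly for the Exp grammar, since a function for Exp ::= Exp + Exp would call itself without consuming any input.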


    Parse Trees

    Each nonterminal on the lhs of a production "roots" a tree:

           Exp
          / | \
       Exp  +  Exp
        |       |
        Id      Id

    Each node in a tree with all its immediate children is derived from a single production of the grammar.

    We desire a program which constructs a parse tree from a string. Such programs are different for every grammar; we sometimes use tools to construct such programs (yacc).


    Syntax Directed Translations

    A syntax directed translation traverses a syntax tree and builds a translation in the process.

    Considerations

    Tree traversal orders

    Left to right?

    Right to left?

    In-order, pre-order, or post-order?

    Where does the information about what to do in the traversal come from?

    Attribute grammars

    Inherited attributes

    Synthesized attributes
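The three traversal orders, sketched as left-to-right walks over a small expression tree. The exp datatype is an assumed encoding. Each function builds its result from the results for the children, which is the flavor of a synthesized attribute.

```sml
datatype exp = Id of string | Plus of exp * exp | Times of exp * exp

(* Pre-order: operator before its children (prefix notation). *)
fun preorder (Id x) = [x]
  | preorder (Plus (a, b)) = "+" :: preorder a @ preorder b
  | preorder (Times (a, b)) = "*" :: preorder a @ preorder b

(* In-order: operator between its children. *)
fun inorder (Id x) = [x]
  | inorder (Plus (a, b)) = inorder a @ ["+"] @ inorder b
  | inorder (Times (a, b)) = inorder a @ ["*"] @ inorder b

(* Post-order: operator after its children (postfix notation). *)
fun postorder (Id x) = [x]
  | postorder (Plus (a, b)) = postorder a @ postorder b @ ["+"]
  | postorder (Times (a, b)) = postorder a @ postorder b @ ["*"]

val tree = Plus (Id "x", Times (Id "y", Id "z"))
val pre  = String.concatWith " " (preorder tree)
val mid  = String.concatWith " " (inorder tree)
val post = String.concatWith " " (postorder tree)
```

For this tree the results are "+ x * y z", "x + y * z", and "x y z * +".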


    Example Translation Process

    Translation as an abstract syntax to abstract syntax transformer.

    We represent this as a grammar with actions { ... }. The action is performed when that production is reduced.

    Exp     ::= Term terms
    terms   ::= + Term { print "+" } terms
              | (empty)
    Term    ::= Factor factors
    factors ::= * Factor { print "*" } factors
              | (empty)
    Factor  ::= id { print id.name }
              | ( Exp )
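A recursive-descent sketch of this grammar, with one function per nonterminal. It is an illustration under assumptions, not the course's implementation: the token type is invented, emit collects output in a list where the grammar says print, and each action runs at the point where it appears in the production.

```sml
datatype token = Id of string | Plus | Times | LParen | RParen

exception SyntaxError

val out : string list ref = ref []
fun emit s = out := s :: !out        (* stands in for print *)

(* Each function consumes a prefix of the tokens and returns the rest. *)
fun exp ts = terms (term ts)
and terms (Plus :: ts) = let val rest = term ts in emit "+"; terms rest end
  | terms ts = ts
and term ts = factors (factor ts)
and factors (Times :: ts) = let val rest = factor ts in emit "*"; factors rest end
  | factors ts = ts
and factor (Id name :: ts) = (emit name; ts)
  | factor (LParen :: ts) =
      (case exp ts of
           RParen :: rest => rest
         | _ => raise SyntaxError)
  | factor _ = raise SyntaxError

(* Translate a whole token list; all input must be consumed. *)
fun translate ts =
  (out := [];
   case exp ts of
       [] => String.concatWith " " (rev (!out))
     | _ => raise SyntaxError)
```

The effect is a translation from infix to postfix: the tokens for x + y * z produce "x y z * +".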


    Semantics

    How do we know what to translate the syntax tree into? How do we know if it is correct?

    Semantics

    denotational semantics

    operational semantics

    interpreters

    Very useful in writing compilers since they give a reference when trying to decide what the compiler should do in particular cases.
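An interpreter is the easiest of these to write down. A minimal sketch for arithmetic expressions, where the datatype, the environment representation, and the names lookup and eval are all assumptions for this example:

```sml
datatype exp
  = Num of real
  | Var of string
  | Add of exp * exp
  | Mul of exp * exp

(* An environment maps variable names to values. *)
type env = (string * real) list

exception Unbound of string

fun lookup ([] : env) name = raise Unbound name
  | lookup ((n, v) :: rest) name = if n = name then v else lookup rest name

(* The meaning of an expression is the number it evaluates to. *)
fun eval env (Num n) = n
  | eval env (Var x) = lookup env x
  | eval env (Add (a, b)) = eval env a + eval env b
  | eval env (Mul (a, b)) = eval env a * eval env b

val v = eval [("x", 1.0), ("pi", 3.0)]
             (Add (Var "x", Mul (Var "pi", Num 12.0)))
```

Code produced by a compiler for the same expression should compute the same value, 37.0 in this environment, which is what makes the interpreter usable as a reference.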


    Overview

    Compilation is a large process.

    It is often broken into stages.

    The theories of computer science guide us in writing programs at each stage.

    We must understand what a program means if we are to translate it correctly.

    Many phases of the compiler try to optimize by translating one form into a better (more efficient?) form.

    Most of compiling is about pattern matching. Languages and tools that support pattern matching are very useful.