compiler 3.2

Upload: jerod

Post on 19-Jan-2016

Page 1: Compiler 3.2

Compiler 3.2

A Level computing

Page 2: Compiler 3.2

So, Sir, what is a compiler?

A compiler is one which converts a source program into an object program.

Source Program -> Compiler -> Object Program

Page 3: Compiler 3.2

The Structure of a Compiler

Source -> [Lexical analysis] -> Tokens -> [Parsing] -> Intermediate Language -> [Code Generation] -> Machine Code

Optimization works on the intermediate language.

Today we start with lexical analysis.

Page 4: Compiler 3.2

Lexical Analysis (scanner)

• Scanner reads characters from the source program
• Scanner groups the characters into lexemes
• The scanner is called by the parser
• Each lexeme corresponds to a token
• Example:

    if (i == j)
        z = 0;
    else
        z = 1;

• The input is just a sequence of characters:

    \tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;

• The scanner cuts this sequence into lexemes such as if, (, i, ==, j and )
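The grouping step above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular compiler does it: the function name split_into_lexemes is made up for this example, and the rules (runs of whitespace, runs of letters/digits, the two-character operator ==, and single characters for everything else) are assumptions that happen to fit the if/else example.

```python
def split_into_lexemes(source):
    """Group raw characters into lexemes, reading one character at a time."""
    lexemes = []
    i = 0
    while i < len(source):
        ch = source[i]
        j = i
        if ch.isspace():
            # A run of whitespace characters forms one lexeme.
            while j < len(source) and source[j].isspace():
                j += 1
        elif ch.isalnum() or ch == "_":
            # Letters, digits and '_' run together (names, keywords, numbers).
            while j < len(source) and (source[j].isalnum() or source[j] == "_"):
                j += 1
        elif source[i:i + 2] == "==":
            # The only two-character operator this sketch knows about.
            j = i + 2
        else:
            # Any other character is a lexeme on its own.
            j = i + 1
        lexemes.append(source[i:j])
        i = j
    return lexemes


source = "\tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;"
lexemes = split_into_lexemes(source)
visible = [lexeme for lexeme in lexemes if not lexeme.isspace()]
print(visible)
```

Joining the lexemes back together gives the original character sequence, so nothing is lost at this stage. The whitespace lexemes are simply ignored later.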

Page 5: Compiler 3.2

Sir, what the hell do you mean by Token and Lexeme?

• A token is a syntactic category
  – In English: noun, verb, adjective, …
  – In a programming language: Identifier, Integer, Keyword, Whitespace, …
• A lexeme is the actual run of characters in the source that falls into such a category, e.g. the lexeme rate is an Identifier
• Given the source code

    I = I + rate * 60;

  a Java scanner would return the following sequence of tokens:

    IDENT ASSIGN IDENT PLUS IDENT TIMES INT-LIT SEMI-COLON
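To make the lexeme-to-token step concrete, here is a small Python sketch that classifies each lexeme of I = I + rate * 60; with regular expressions. The token names come from the slide; the table TOKEN_SPEC, the function tokenize and the patterns themselves are assumptions for illustration, not the rules of a real Java scanner.

```python
import re

# (token name, pattern) pairs, tried in order at each position.
TOKEN_SPEC = [
    ("INT-LIT", re.compile(r"\d+")),
    ("IDENT", re.compile(r"[A-Za-z_]\w*")),
    ("ASSIGN", re.compile(r"=")),
    ("PLUS", re.compile(r"\+")),
    ("TIMES", re.compile(r"\*")),
    ("SEMI-COLON", re.compile(r";")),
    ("WHITESPACE", re.compile(r"\s+")),
]


def tokenize(source):
    """Return a list of (token, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in TOKEN_SPEC:
            match = pattern.match(source, pos)
            if match:
                if name != "WHITESPACE":
                    tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No pattern matched at this position.
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
    return tokens


pairs = tokenize("I = I + rate * 60;")
print(" ".join(name for name, _ in pairs))
```

Each pair keeps both pieces of information: the token (the category) and the lexeme (the text), for example ("IDENT", "rate").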

Page 6: Compiler 3.2

The Parser

• Groups tokens into “grammatical phrases”, discovering the underlying structure of the source program
• Finds syntax errors. For example, the Java source code

    I = * 5;

  corresponds to the following sequence of tokens:

    IDENT ASSIGN TIMES INT-LIT SEMI-COLON

  All are legal tokens, but the sequence of tokens is erroneous.
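How a parser notices that a sequence of legal tokens is still wrong can be sketched as follows. The grammar here is an assumption chosen to fit the two example statements (an assignment whose right side is operands separated by PLUS or TIMES); the names ParseError and check_statement are hypothetical.

```python
class ParseError(Exception):
    """Raised when the token sequence does not fit the grammar."""


def check_statement(tokens):
    """Return None if tokens form a valid assignment, else an error message.

    Assumed grammar:
        statement -> IDENT ASSIGN operand { (PLUS | TIMES) operand } SEMI-COLON
        operand   -> IDENT | INT-LIT
    """
    pos = 0

    def expect(*kinds):
        nonlocal pos
        found = tokens[pos] if pos < len(tokens) else "end of input"
        if found not in kinds:
            raise ParseError(
                f"expected {' or '.join(kinds)} but found {found} at token {pos}"
            )
        pos += 1

    try:
        expect("IDENT")
        expect("ASSIGN")
        expect("IDENT", "INT-LIT")
        while pos < len(tokens) and tokens[pos] in ("PLUS", "TIMES"):
            pos += 1
            expect("IDENT", "INT-LIT")
        expect("SEMI-COLON")
    except ParseError as error:
        return str(error)
    return None


good = ["IDENT", "ASSIGN", "IDENT", "PLUS", "IDENT", "TIMES", "INT-LIT", "SEMI-COLON"]
bad = ["IDENT", "ASSIGN", "TIMES", "INT-LIT", "SEMI-COLON"]  # I = * 5;
print(check_statement(good))
print(check_statement(bad))
```

For I = * 5; the check fails at the third token, because an operand was expected where TIMES was found.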

Page 7: Compiler 3.2

Once again Sir, what the hell does the parser finally do?

• Might find some “static semantic” errors, e.g., use of an undeclared variable, or operators applied to operands of the wrong type.
• Might generate code, or build some intermediate representation of the program such as an abstract-syntax tree.

Page 8: Compiler 3.2

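Syntax Analyzer (Parser)

Input tokens: num ‘*’ ‘(’ num ‘+’ num ‘)’

Parse tree:

    <expr>
    |-- <expr>
    |   `-- num
    |-- <op>
    |   `-- '*'
    `-- <expr>
        |-- '('
        |-- <expr>
        |   |-- <expr>
        |   |   `-- num
        |   |-- <op>
        |   |   `-- '+'
        |   `-- <expr>
        |       `-- num
        `-- ')'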
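A parse tree like the one for num ‘*’ ‘(’ num ‘+’ num ‘)’ can be built by a small recursive-descent parser. The sketch below is one possible way to do it, under stated assumptions: trees are nested Python tuples, the function names are made up, and the grammar is rewritten as "primary followed by any number of op primary" so that it can be parsed without left recursion. Operator precedence is deliberately ignored.

```python
def parse_expr(tokens, pos=0):
    """Parse primary { op primary } and return (tree, next_position)."""
    tree, pos = parse_primary(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "*"):
        op = ("op", tokens[pos])
        right, pos = parse_primary(tokens, pos + 1)
        # Combine what we have so far with the new right-hand side.
        tree = ("expr", tree, op, right)
    return tree, pos


def parse_primary(tokens, pos):
    """Parse a single num or a parenthesised expression."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    if tokens[pos] == "num":
        return ("expr", "num"), pos + 1
    if tokens[pos] == "(":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return ("expr", "(", inner, ")"), pos + 1
    raise ValueError(f"unexpected token {tokens[pos]!r}")


tokens = ["num", "*", "(", "num", "+", "num", ")"]
tree, end = parse_expr(tokens)
print(tree)
```

The resulting tuple has six "expr" nodes and two "op" nodes, one for each phrase the parser recognised in the input.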

Page 9: Compiler 3.2

Semantic Analyzer

Consider the source code: I = I + rate * 60;

Abstract-syntax tree annotated with types:

    =
    |-- I (float)
    `-- +
        |-- I (float)
        `-- *
            |-- rate (float)
            `-- 60 (int)

The semantic analyzer checks for more “static semantic” errors. It may also annotate and/or change the abstract-syntax tree.
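One way the semantic analyzer might "annotate and change" the tree for I = I + rate * 60; is to work out the type of every node and wrap the int literal in a conversion where it meets a float. This is a sketch under assumptions: the tuple-based tree, the SYMBOLS table, the node name int_to_float and the function annotate are all hypothetical.

```python
# Assumed declarations: both variables are floats, as in the slide's tree.
SYMBOLS = {"I": "float", "rate": "float"}


def annotate(node):
    """Return (typed_node, type), converting an int operand that meets a float."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        if node not in SYMBOLS:
            raise NameError(f"undeclared variable: {node}")
        return node, SYMBOLS[node]
    op, left, right = node
    left, left_type = annotate(left)
    right, right_type = annotate(right)
    if op == "=" and left_type == "int" and right_type == "float":
        raise TypeError("cannot assign a float to an int variable")
    if left_type != right_type:
        # Exactly one side is an int: wrap it in an explicit conversion node.
        if left_type == "int":
            left = ("int_to_float", left)
        else:
            right = ("int_to_float", right)
    result_type = "float" if "float" in (left_type, right_type) else "int"
    return (op, left, right), result_type


tree = ("=", "I", ("+", "I", ("*", "rate", 60)))
typed_tree, tree_type = annotate(tree)
print(typed_tree)
```

The only change to the tree is that 60 becomes ("int_to_float", 60); a name missing from SYMBOLS is reported as an undeclared variable.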

Page 10: Compiler 3.2

The Intermediate Code Generator

• The ICG translates from the abstract-syntax tree to intermediate code. One possibility is 3-address code.
• Example:

    temp1 = 60
    temp2 = rate * temp1
    temp3 = I + temp2
    I = temp3
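The translation from tree to 3-address code can be sketched as a walk over the tree that gives every constant and every operation its own temporary. The tuple representation of the tree and the function name generate_three_address are assumptions for this example.

```python
import itertools


def generate_three_address(ast):
    """Translate ('=', target, expression) into a list of 3-address lines."""
    code = []
    temp_ids = itertools.count(1)

    def visit(node):
        if isinstance(node, str):
            # A variable can be used directly as an operand.
            return node
        if isinstance(node, int):
            # A constant is first loaded into a temporary.
            temp = f"temp{next(temp_ids)}"
            code.append(f"{temp} = {node}")
            return temp
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        temp = f"temp{next(temp_ids)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    op, target, value = ast
    if op != "=":
        raise ValueError("expected an assignment at the top of the tree")
    code.append(f"{target} = {visit(value)}")
    return code


ast = ("=", "I", ("+", "I", ("*", "rate", 60)))
lines = generate_three_address(ast)
print("\n".join(lines))
```

Every generated line has at most one operator and at most three names, which is where "3-address" comes from.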

Page 11: Compiler 3.2

The Optimizer

• The Optimizer tries to improve the code generated by the intermediate code generator. The optimizer may also try to make the code smaller.
• So the previous intermediate code becomes:

    temp2 = rate * 60
    I = I + temp2
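The two simplifications used on this slide can be sketched as a single pass over the 3-address code: a temporary that only holds a constant is replaced by the constant, and a final copy out of a temporary is folded into the instruction that computed it. Instructions are written here as (destination, operator, left, right) tuples, which is an assumption of this sketch, as is the rule that every temporary is used exactly once.

```python
def optimize(code):
    """Apply constant propagation and fold trailing copies out of temporaries.

    Each instruction is (dest, op, left, right); a plain copy has op None.
    Assumes every temporary is used exactly once after it is defined.
    """
    constants = {}
    result = []
    for dest, op, left, right in code:
        # Replace temporaries that are known constants by their value.
        left = constants.get(left, left)
        right = constants.get(right, right)
        if op is None and isinstance(left, int) and dest.startswith("temp"):
            # 'tempN = constant': remember the value and drop the instruction.
            constants[dest] = left
            continue
        if (op is None and result and isinstance(left, str)
                and left.startswith("temp") and result[-1][0] == left):
            # 'x = tempN' right after 'tempN = ...': compute straight into x.
            _, prev_op, prev_left, prev_right = result.pop()
            result.append((dest, prev_op, prev_left, prev_right))
            continue
        result.append((dest, op, left, right))
    return result


code = [
    ("temp1", None, 60, None),
    ("temp2", "*", "rate", "temp1"),
    ("temp3", "+", "I", "temp2"),
    ("I", None, "temp3", None),
]
optimized = optimize(code)
for dest, op, left, right in optimized:
    print(f"{dest} = {left} {op} {right}")
```

Four instructions shrink to two, and temp1 and temp3 disappear entirely.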

Page 12: Compiler 3.2

The Code Generator finally generates object code from the (optimized) intermediate code. Example: for the C function

    int sumcalc(int a, int b, int N)
    {
        int i;
        int x, t, u, v;
        x = 0;
        u = ((a<<2)/b);
        v = 0;
        for(i = 0; i <= N; i++) {
            t = i+1;
            x = x + v + t*t;
            v = v + u;
        }
        return x;
    }

the generated code is:

    sumcalc:
        xorl    %r8d, %r8d
        xorl    %ecx, %ecx
        movl    %edx, %r9d
        cmpl    %edx, %r8d
        jg      .L7
        sall    $2, %edi
    .L5:
        movl    %edi, %eax
        cltd
        idivl   %esi
        leal    1(%rcx), %edx
        movl    %eax, %r10d
        imull   %ecx, %r10d
        movl    %edx, %ecx
        imull   %edx, %ecx
        leal    (%r10,%rcx), %eax
        movl    %edx, %ecx
        addl    %eax, %r8d
        cmpl    %r9d, %edx
        jle     .L5
    .L7:
        movl    %r8d, %eax
        ret
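Real code generation for a machine like the one above is a big job, but the basic idea can be sketched with a toy target. The instruction set below (LOAD, ADD, MUL, STORE on a single register R0, with # marking a constant) is invented purely for illustration and is not any real assembly language; the tuple form of the 3-address code is also an assumption.

```python
# Hypothetical opcodes of an invented one-register machine.
OPCODES = {"+": "ADD", "*": "MUL"}


def operand(value):
    """Write constants as #n and variables by name."""
    return f"#{value}" if isinstance(value, int) else value


def generate_assembly(code):
    """Turn (dest, op, left, right) instructions into toy assembly lines."""
    lines = []
    for dest, op, left, right in code:
        lines.append(f"LOAD  R0, {operand(left)}")             # R0 = left
        lines.append(f"{OPCODES[op]:<5} R0, {operand(right)}")  # R0 = R0 op right
        lines.append(f"STORE R0, {dest}")                       # dest = R0
    return lines


code = [("temp2", "*", "rate", 60), ("I", "+", "I", "temp2")]
assembly = generate_assembly(code)
print("\n".join(assembly))
```

A real code generator also has to choose among many registers and instructions, which is why the x86-64 output above looks so different from this sketch.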

Page 13: Compiler 3.2

So what?

Source Program -> Compiler -> Object Program

So the compiler converts the source program into an object program. To do this, it passes the source code through several stages. Lexical analysis reads the source code character by character and groups the characters into lexemes; each lexeme is classified as a token, and the parser calls the scanner to get these tokens. The parser groups the tokens into meaningful “grammatical phrases”, finds any syntax errors, and builds an intermediate representation of the program such as an abstract-syntax tree. The semantic analyzer then checks this tree and further annotates and augments it. The Intermediate Code Generator translates the abstract-syntax tree into intermediate code, which the Optimizer improves and simplifies before passing it to the Code Generator, which generates the desired object code. Finally, the object code is linked by the linker and loaded into memory for execution by the loader.

Page 14: Compiler 3.2

Anatomy of a Compiler, Step by Step (we shall see what happens)

Program (character stream)
  -> Lexical Analyzer (Scanner) -> Token Stream

Page 15: Compiler 3.2

Lexical Analyzer (Scanner)

Input characters:

    2 3 4 * ( 1 1 + - 2 2 )

Token stream:

    Num(234) mul_op lpar_op Num(11) add_op Num(-22) rpar_op

Page 16: Compiler 3.2

Lexical Analyzer (Scanner)

The scanner also reports lexical errors. For the input

    18..23 + val#ue

• 18..23: not a number
• val#ue: variable names cannot have the ‘#’ character
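Reporting these two errors can be sketched by checking each piece of the input against the patterns for numbers and names. This is a simplification: it assumes the pieces are separated by spaces, which a real scanner does not need, and the patterns, the token name Id(...) and the functions classify and scan are assumptions of this example.

```python
import re

NUMBER = re.compile(r"\d+(\.\d+)?")
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
OPERATORS = {"+": "add_op", "*": "mul_op", "(": "lpar_op", ")": "rpar_op"}


def classify(chunk):
    """Return the token for one space-separated chunk, or an error message."""
    if NUMBER.fullmatch(chunk):
        return f"Num({chunk})"
    if IDENTIFIER.fullmatch(chunk):
        return f"Id({chunk})"
    if chunk in OPERATORS:
        return OPERATORS[chunk]
    if chunk[0].isdigit():
        # Starts like a number but does not match the number pattern.
        return f"error: {chunk!r} is not a number"
    return f"error: {chunk!r} is not a valid name"


def scan(source):
    return [classify(chunk) for chunk in source.split()]


report = scan("18..23 + val#ue")
print(report)
```

Only the + in the middle becomes a token; the pieces on either side are rejected before the parser ever sees them.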

Page 17: Compiler 3.2

Anatomy of a Compiler

Program (character stream)
  -> Lexical Analyzer (Scanner) -> Token Stream
  -> Syntax Analyzer (Parser) -> Parse Tree

Page 18: Compiler 3.2

Syntax Analyzer (Parser)

[Figure: parse tree for num ‘*’ ‘(’ num ‘+’ num ‘)’, built from <expr> and <op> nodes, the same tree as on Page 8]

Page 19: Compiler 3.2

Syntax Analyzer (Parser)

The parser reports syntax errors such as these:

    int * foo(i, j, k))        <- extra parenthesis
    int i;
    int j;
    {
        for(i=0; i j) {        <- "i j" is not an expression; missing increment
            fi(i>j)            <- "fi" is not a keyword
                return j;
    }

Page 20: Compiler 3.2

Anatomy of a Compiler

Program (character stream)
  -> Lexical Analyzer (Scanner) -> Token Stream
  -> Syntax Analyzer (Parser) -> Parse Tree
  -> Semantic Analyzer -> Intermediate Representation

Page 21: Compiler 3.2

Semantic Analyzer

The semantic analyzer reports errors such as these:

    int * foo(i, j, k)         <- type of k not declared
    int i;
    int j;
    {
        int x;
        x = x + j + N;         <- uninitialized variable x used; undeclared variable N
        return j;              <- mismatched return type (int returned, int * expected)
    }
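Three of the four checks on this slide can be sketched with nothing more than a few sets of names. The description of the function foo as Python lists and dictionaries, and the function name check_function, are assumptions of this sketch; the return-type check is left out because it needs type information for expressions.

```python
def check_function(params, param_types, local_types, statements):
    """Return a list of semantic error messages.

    statements is a list of (target, operands) pairs, one per assignment.
    """
    errors = []
    for name in params:
        if name not in param_types:
            errors.append(f"type not declared: {name}")
    declared = set(params) | set(param_types) | set(local_types)
    initialized = set(params)  # parameters get their values from the caller
    for target, operands in statements:
        for name in operands:
            if name not in declared:
                errors.append(f"undeclared variable: {name}")
            elif name not in initialized:
                errors.append(f"uninitialized variable used: {name}")
        initialized.add(target)
    return errors


errors = check_function(
    params=["i", "j", "k"],
    param_types={"i": "int", "j": "int"},
    local_types={"x": "int"},
    statements=[("x", ["x", "j", "N"])],  # x = x + j + N;
)
print("\n".join(errors))
```

The sets play the role of a very small symbol table: one records what has been declared, the other what has been given a value so far.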

Page 22: Compiler 3.2

Anatomy of a Compiler

Program (character stream)
  -> Lexical Analyzer (Scanner) -> Token Stream
  -> Syntax Analyzer (Parser) -> Parse Tree
  -> Semantic Analyzer -> Intermediate Representation
  -> Code Optimizer -> Optimized Intermediate Representation

Page 23: Compiler 3.2

Optimizer

Before optimization:

    int sumcalc(int a, int b, int N)
    {
        int i;
        int x, y;
        x = 0;
        y = 0;
        for(i = 0; i <= N; i++) {
            x = x+4*a/b*i+(i+1)*(i+1);
        }
        return x;
    }

Partly optimized, with the repeated work named and moved out of the loop:

    int sumcalc(int a, int b, int N)
    {
        int i;
        int x, t, u, v;
        x = 0;
        u = ((4*a)/b);
        v = 0;
        for(i = 0; i <= N; i++) {
            t = i+1;
            x = x + u*i + t*t;
        }
        return x;
    }
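An optimizer must never change what the program computes. As a quick check of that idea, the sketch below ports the unoptimized sumcalc and the fully optimized version shown with the generated assembly into Python and compares their results. It assumes a is not negative and b is positive, so that Python's // agrees with C's integer division, and it ignores integer overflow; the function names are made up for this example.

```python
def sumcalc_original(a, b, n):
    """Unoptimized version: everything is recomputed in each iteration."""
    x = 0
    for i in range(n + 1):
        x = x + 4 * a // b * i + (i + 1) * (i + 1)
    return x


def sumcalc_optimized(a, b, n):
    """Optimized version: u is computed once, and u*i is kept up to date in v."""
    x = 0
    u = (a << 2) // b  # 4*a/b does not change inside the loop
    v = 0              # always equal to u * i
    for i in range(n + 1):
        t = i + 1
        x = x + v + t * t
        v = v + u
    return x


same = all(
    sumcalc_original(a, b, n) == sumcalc_optimized(a, b, n)
    for a in range(6)
    for b in range(1, 5)
    for n in range(10)
)
print(same)
```

The loop in the optimized version contains only additions and one multiplication, while the original repeats a multiplication by 4, a division and two more multiplications every time round.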

Page 24: Compiler 3.2

Anatomy of a Compiler

Program (character stream)
  -> Lexical Analyzer (Scanner) -> Token Stream
  -> Syntax Analyzer (Parser) -> Parse Tree
  -> Semantic Analyzer -> Intermediate Representation
  -> Code Optimizer -> Optimized Intermediate Representation
  -> Code Generator -> Assembly code

Page 25: Compiler 3.2

Code Generator

The code generator turns the optimized sumcalc function into the assembly code for the sumcalc: label; both listings are the same as on Page 12.