compiler design introduction

19
COMPILER DESIGN INTRODUCTION Richa Sharma 1

Upload: richa-sharma

Post on 14-Feb-2017

105 views

Category:

Engineering


1 download

TRANSCRIPT

Page 1: Compiler Design Introduction

COMPILER DESIGNINTRODUCTION

Richa Sharma 1

Page 2: Compiler Design Introduction

MAIN REFERENCE BOOK:

• COMPILERS – PRINCIPLES, TECHNIQUES AND TOOLS, SECOND EDITION BY

ALFRED V. AHO, RAVI SETHI, JEFFERY D. ULLMAN

• PRINCIPLES OF COMPILER DESIGN BY V R RAGHAVAN .

Richa Sharma 2

Page 3: Compiler Design Introduction

COMPILER

• IT’S A SOFTWARE UTILITY THAT TRANSLATED HIGH LANGUAGE CODE INTO TARGET LANGUAGE

CODE ,AS COMPUTER DOESN’T UNDERSTAND HIGH LANGUAGE.

DATA

OUTPUT

• IMPORTANT ROLE OF COMPILER IS TO REPORT THE ERROR IN THE SOURCE PROGRAM BY

TRANSLATING THE PROGRAM IN ONE GO.

• STRUCTURE OF THE COMPILER IS OFFLINE, MEANING THAT WE PRE-PROCESS THE PROGRAM FIRST

AND CREATES AND EXECUTABLE CODE AND THIS CODE CAN RUN ON DIFFERENT INPUTS OR DATA .

Richa Sharma 3

COMPILERSource

ProgramTarget

Program

Page 4: Compiler Design Introduction

INTERPRETER

• IT’S ANOTHER SOFTWARE UTILITY THAT TRANSLATES HIGH LANGUAGE CODE INTO TARGET

LANGUAGE CODE LINE BY LINE UNLIKE COMPILER .

OUTPUT

• INTERPRETER IS ONLINE MODE, IN WHICH DATA AND SOURCE PROGRAM ARE EXECUTED

SIMULTANEOUSLY GIVING THE OUTPUT .NO PRE-PROCESSING IS DONE EARLIER.

Richa Sharma 4

COMPILERSource Program

Data

Page 5: Compiler Design Introduction

EXAMPLES

MOST LANGUAGES ARE USUALLY THOUGHT OF AS USING EITHERONE OR THE OTHER:

• COMPILERS: FORTRAN, COBOL, C, C++, PASCAL, PL/1

• INTERPRETERS: LISP, SCHEME, BASIC, APL, PERL, PYTHON,SMALLTALK

Richa Sharma 5

Page 6: Compiler Design Introduction

Preprocessor

Compiler

Assembler

Linker/Loader

Source program

Modified Source program

Target assembly program

Relocatable machine code

Target machine code

Library filesRicha Sharma

STEPS FOR LANGUAGE PROCESSING SYSTEM.

Page 7: Compiler Design Introduction

PHASES OF COMPILER

THE TRANSLATION OF INPUT FILE INTO TARGET CODE IS DIVIDED INTO 2 STAGES :

1. FRONT END (ANALYSIS): TRANSFORM SOURCE CODE INTO INTERMEDIATE CODE ALSO CALLED INTERMEDIATE REPRESENTATION (IR) . IT’S A MACHINE-INDEPENDENT REPRESENTATION.

2. BACK END (SYNTHESIS): IT TAKES IR AND GENERATES THE TARGET ASSEMBLY LANGUAGE PROGRAM.

FRONT END BACK END

1) LEXICAL ANALYZER (SCANNING) 5) CODE OPTIMIZATION

2) SYNTAX ANALYZER (PARSING) 6) MACHINE CODE GENERATION

3) SEMANTIC ANALYZER

4) INTERMEDIATE CODE GENERATION

Richa Sharma 7

Page 8: Compiler Design Introduction

PHASES OF COMPILER

Richa Sharma 8

Symbol table Error handler

Front End

Back End

Page 9: Compiler Design Introduction

LEXICAL ANALYSIS/SCANNING

READS THE STREAM OF CHARACTERS MAKING UP THE SOURCE PROGRAM AND GROUP THE

CHARACTERS INTO MEANINGFUL SEQUENCES CALLED LEXEMES.

LEXEME ---- TOKEN

< TOKEN-NAME, ATTRIBUTE-VALUE>

TOKEN NAME – IS AN ABSTRACT SYMBOL USED DURING SYNTAX ANALYSIS.

ATTRIBUTE VALUE – POINTS TO AN ENTRY IN THE SYMBOL TABLE FOR THIS TOKEN.

POSITION = INITIAL + RATE * 60

<ID,1> < = > <ID,2> < +> <ID,3> < *> <60>

Richa Sharma 9

Page 10: Compiler Design Introduction

SYNTAX ANALYSIS/PARSING

• RECOGNIZES “SENTENCES” IN THE PROGRAM USING THE SYNTAX OF THE LANGUAGE

• CREATES TREE LIKE STRUCTURE FROM TOKENS (SYNTAX TREE)

• NODE REPRESENTS OPERATION

• CHILDREN REPRESENTS ARGUMENTS

• REPRESENTS THE SYNTACTIC STRUCTURE OF THE PROGRAM, HIDING A FEW DETAILS THAT ARE

IRRELEVENT TO LATER PHASES OF COMPILATION.

Richa Sharma 10

Page 11: Compiler Design Introduction

SEMANTIC ANALYSIS

• INFERS INFORMATION ABOUT THE PROGRAM USING THE SEMANTICS OF THE LANGUAGE

• USES SYNTAX TREE AND INFO. IN SYMBOL TABLE TO CHECK FOR SEMANTIC CONSISTENCY.

• GATHERS TYPE INFO. AND SAVES IT IN EITHER THE SYNTAX TREE OR SYMBOL TABLE FOR USE IN

ICG

• TYPE CHECKING – CHECKS THAT EACH OPERATOR HAS MATCHING OPERANDS. E.G ARRAY

INDEX SHOULD BE INTEGER.

• TYPE CONVERSIONS CALLED COERCIONS

• BINARY ARITHMETIC OPERATOR (INT OR FLOAT)

• IF 6+7.5, THEN CONVERT 6 TO 6.5

Richa Sharma 11

Page 12: Compiler Design Introduction

INTERMEDIATE CODE GENERATION

• GENERATES “ABSTRACT” CODE BASED ON THE SYNTACTIC STRUCTURE OF THE PROGRAM AND THE SEMANTIC

INFORMATION FROM PHASE 2.

SYNTAX TREE ARE ALSO IR

COMPILER MAY PRODUCE EXPLICIT IR

IR HAS TWO PROPERTIES:

EASY TO PRODUCE

EASY TO TRANSLATE INTO TARGET MACHINE

E.G IR – THREE ADDRESS CODE

T1 = INTTOFLOAT(60)

T2 = ID3 * T1

T3 = ID2 + T3

ID1 = T3

Richa Sharma 12

Page 13: Compiler Design Introduction

CODE OPTIMIZATION

• REFINES THE GENERATED CODE USING A SERIES OF OPTIMIZING TRANSFORMATIONS.

• Eg: REMOVING DEAD CODE .

REDUCING ITERATIONS AND LOOPS ETC..

• APPLY A SERIES OF TRANSFORMATIONS TO IMPROVE THE TIME AND SPACE EFFICIENCY OF THE

GENERATED CODE.

• PEEPHOLE OPTIMIZATIONS: GENERATE NEW INSTRUCTIONS BY COMBINING/EXPANDING ON

A SMALL NUMBER OF CONSECUTIVE INSTRUCTIONS.

• GLOBAL OPTIMIZATIONS: REORDER, REMOVE OR ADD INSTRUCTIONS TO CHANGE THE

STRUCTURE OF GENERATED CODE.

Richa Sharma 13

Page 14: Compiler Design Introduction

CODE GENEARTION• MAP INSTRUCTIONS IN THE INTERMEDIATE CODE TO SPECIFIC MACHINE INSTRUCTIONS.

• SUPPORTS STANDARD OBJECT FILE FORMATS.

• GENERATES SUFFICIENT INFORMATION TO ENABLE SYMBOLIC DEBUGGING.

IR -> CG -> TARGET LANGUAGE (E.G MACHINE CODE)

REGISTERS AND MEMORY LOCATIONS ARE SELECTED FOR EACH VARIABLE USED BY THE PROGRAM.

LDF R2, ID3

MULF R2, R2, #60.0

LDF R1, ID2

ADDF R1, R1, R2

STF ID1, R1

R1,R2 – REGISTERS F – FLOATING POINT NUMBERS # - IMMEDIATE CONST.

Richa Sharma

14

Page 15: Compiler Design Introduction

SYMBOL TABLE

• SYMBOL TABLE – DATA STRUCTURE WITH A RECORD FOR EACH IDENTIFIER AND ITS ATTRIBUTES

• ALL THE PHASES ARE CONNECTED TO THE SYMBOL TABLE.

• ATTRIBUTES INCLUDE STORAGE ALLOCATION, TYPE, SCOPE, ETC

• ALL THE COMPILER PHASES INSERT AND MODIFY THE SYMBOL TABLE

Richa Sharma 15

Page 16: Compiler Design Introduction

Richa Sharma

16

OVERALL WORKING OF COMPILER PHASES

Page 17: Compiler Design Introduction

COMPILER CONSTRUCTION TOOLS

• DEVELOPER MAY USE MODERN SOFTWARE DEVELOPMENT ENVIRONMENT CONTAINING

TOOLS Ex: LANGUAGE EDITORS, VERSION MANAGERS, TEC.

• SOME SPECIAL TOOLS CAN ALSO BE USED. THESE TOOLS ARE THOSE WHICH HIDE THE DETAILS

OF GENERATION ALGORITHM AND PRODUCE COMPONENTS THAT CAN BE EASILY INTEGRATED

TO REMAINDER OF THE COMPILER.

• PARSER GENERATOR: TAKES GRAMMAR DESCRIPTION AND PRODUCES SYNTAX ANALYSER.

• SCANNER GENERATOR: TAKES REGULAR EXPRESSION AND PRODUCES LEXICAL ANALYSER.

• AUTOMATIC CODE- GENERATOR : TAKES INTERMEDIATE AND CONVERT TO MACHINE

LANGUAGE)

• DATA FLOW ANALYSIS ENGINES (FOR OPTIMIZATION)

• COMPILER CONSTRUCTION TOOLKIT.

Richa Sharma 17

Page 18: Compiler Design Introduction

APPLICATION OF COMPILER TECHNOLOGY• IMPLEMENTATION OF HIGH LEVEL PROGRAMMING LANGUAGES

- CONCEPTS OF OOPS

• OPTIMIZATION FOR COMPUTER ARCHITECTURE

- PARALLELISM

- MEMORY HIERARCHIES OF MACHINES(REG, ARRAYS ETC)

• DESIGN OF NEW COMPUTER ARCHITECURES

- RISC

- CISC

- SIMD ETC

• PROGRAM TRANSLATIONS

• SOFTWARE PRODUCTIVITY TOOLS

Richa Sharma18

Page 19: Compiler Design Introduction

Richa Sharma 19