Compiler Construction
Composed By,
Muhammad Bilal Qureshi
Introduction
Book: Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman
References: Introduction to Computer Theory (automata theory) by Daniel I. A. Cohen
Prerequisite: Theory of Automata
Course Information
Slides are only a brief discussion of the topic. Consult the book and lecture notes for complete details. Don't depend on slides only.
No extension of assignment deadlines will be allowed. Strictly follow the timeline to get good marks.
Seniors (if any) and juniors are treated equally. No credit will be given on the basis of seniority.
Internal marks will be awarded as earned. No annoying queries about marks will be entertained at the end of the course.
Attend the lectures to avoid an attendance shortage.
Golden Policy: No class missed – No course headache
Class and Evaluation Policy
Computer Program: A self-contained set of instructions used to operate a computer to produce a specific result.
Programming Language: A set of instructions that can be used to construct a program.
Low-Level Languages use instructions that are directly tied to one type of computer. Programs written in such languages are typically the fastest in execution. E.g., assembly language, machine language, ...
High-Level Languages use instructions that resemble written languages (e.g., English). Examples are C, C++, Fortran, ...
Source Code/Source Program: A program written in a computer language (high or low level).
Once a program is written in a high-level language, it must be translated into the machine language of the computer on which it will run.
This translation can be accomplished in two ways: interpretation or compilation.
Interpreter: When each statement in a high-level source program is translated individually and executed immediately upon translation, the program doing the translation is called an interpreter.
Compiler: When all of the statements in a high-level source program are translated as a complete unit before any one statement is executed, the program doing the translation is called a compiler.
A source program may be converted into an executable file (executable program) by a compiler and later executed by a CPU.
A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source program) into another computer language (the target language, often a binary form known as object code or machine code).
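The difference between the two translation strategies can be sketched with a toy language. This is only an illustrative sketch, not how a real interpreter or compiler is built: the two statement forms (`name = number` and `print name`), the `STORE`/`PRINT` target instructions, and the function names `interpret`, `compile_program`, and `run` are all assumptions made for this example.

```python
def interpret(lines):
    """Translate and execute each statement immediately, one at a time."""
    env, output = {}, []
    for line in lines:
        parts = line.split()
        if parts[0] == "print":            # statement form: print name
            output.append(env[parts[1]])
        else:                              # statement form: name = number
            env[parts[0]] = int(parts[2])
    return output


def compile_program(lines):
    """Translate ALL statements into target instructions before running any."""
    target = []
    for line in lines:
        parts = line.split()
        if parts[0] == "print":
            target.append(("PRINT", parts[1]))
        else:
            target.append(("STORE", parts[0], int(parts[2])))
    return target


def run(target):
    """Execute previously compiled instructions (this plays the CPU's role)."""
    env, output = {}, []
    for instr in target:
        if instr[0] == "STORE":
            env[instr[1]] = instr[2]
        else:
            output.append(env[instr[1]])
    return output


program = ["x = 5", "print x"]
print(interpret(program))                 # statements run as they are translated
print(run(compile_program(program)))      # whole program translated, then run
```

Both paths produce the same result; what differs is when translation happens. The compiled `target` list could be saved and run many times without translating the source again, which is the role an executable file plays.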
Simple taxonomy of the compilation process:
Source Code -> Object Code
One important role of the compiler is to report any errors in the source program that it detects during the compilation process (e.g., a missing semicolon at the end of a statement).
Compilation Process
Source Code (human readable; extensions are .c, .cpp, etc.) -> Group of compilation phases -> Object Code (non-executable machine code; extension is .obj) -> Linked with different libraries (.h, .lib, etc.) -> Executable Code (extension is .exe)
The compilation process operates as a sequence of phases, each of which transforms one representation of the source program into another.
Two-Pass Compiler:
Source Code -> Front End -> Intermediate Representation (IR) -> Back End -> Machine Code (errors are reported, if any)
Front End: Recognizes legal and illegal programs presented to it. Consists of two modules: Scanner and Parser.
Scanner: Maps the character stream of the source program into words. It produces pairs (tokens), each consisting of a word and its part of speech.
Example: z = x - y
It will be converted to:
<id, z>
<assign, =>
<id, x>
<op, ->
<id, y>
Tokens may be identifiers, numbers, keywords (new, while, switch, if), operators (+, -, /), etc.
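The scanner's job can be sketched with regular expressions. This is a minimal sketch under stated assumptions: the token names `id`, `assign`, and `op` come from the example above, but the `number` token, the patterns in `TOKEN_SPEC`, and the function name `scan` are chosen here only for illustration, and keywords are not handled.

```python
import re

# Token names follow the slide's example; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("id",     r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("assign", r"="),
    ("op",     r"[+\-*/]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Map a character stream to a list of (token name, attribute value) pairs."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:                  # no pattern fits: report a lexical error
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":      # whitespace separates words but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("z = x - y"))
```

Each pair holds the token name first and the attribute value second, matching the `<id, z>` notation used above.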
Source Code -> Scanner -> Tokens -> Parser -> Intermediate Representation (errors are reported, if any)
Token: <Token Name, Attribute Value>
Parser: Takes tokens as input, recognizes context-free syntax, reports errors, and builds the intermediate representation (IR) for the source program. It also analyzes semantics, for example by type checking. It builds a parse, hence the name parser. A parse can be represented by a tree called a parse tree.
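How a parser turns tokens into a tree can be sketched for a single assignment statement. This is a simplified sketch, not a full parser: the grammar `id = operand (op operand)*`, the nested-tuple tree shape, and the function name `parse_assignment` are assumptions for illustration, operator precedence is ignored, and no type checking is done. The token list is written out by hand.

```python
def parse_assignment(tokens):
    """Parse `id = operand (op operand)*` and return a tree of nested tuples."""
    pos = 0

    def expect(kind):
        """Consume the next token if it has the given name, else report an error."""
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        token = tokens[pos]
        pos += 1
        return token

    def operand():
        if pos < len(tokens) and tokens[pos][0] == "number":
            return expect("number")
        return expect("id")

    target = expect("id")
    expect("assign")
    tree = operand()
    while pos < len(tokens):               # left-associative chain of operators
        operator = expect("op")
        tree = (operator, tree, operand())
    return (("assign", "="), target, tree)


tokens = [("id", "z"), ("assign", "="), ("id", "x"), ("op", "-"), ("id", "y")]
print(parse_assignment(tokens))
```

The result has the assignment at the root, the target `z` as its left child, and the subtraction of `x` and `y` as its right child. A token sequence that does not fit the grammar raises `SyntaxError`, which corresponds to the parser reporting an error.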
Back End: Translates the IR into target machine code. It selects instructions, allocates registers, and then schedules the instructions for execution.
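A back end can be sketched as a walk over a tree-shaped IR that emits one instruction per node. Everything specific here is an assumption for illustration: the nested-tuple IR, the mnemonics `LOAD`, `ADD`, `SUB`, `MUL`, `DIV`, and `STORE`, the register names `r0`, `r1`, ..., and the function name `generate` do not belong to any real machine. Register allocation is naive (a fresh register per leaf, never reused), and instruction scheduling is not shown.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}   # hypothetical mnemonics


def generate(tree):
    """Emit pseudo-assembly for an assignment tree of nested tuples."""
    code = []
    next_register = 0

    def emit_expr(node):
        nonlocal next_register
        if node[0] in ("id", "number"):        # leaf: load the value into a fresh register
            register = f"r{next_register}"
            next_register += 1
            code.append(f"LOAD {register}, {node[1]}")
            return register
        (_, symbol), left, right = node        # interior node: (op token, left, right)
        left_reg = emit_expr(left)
        right_reg = emit_expr(right)
        # Instruction selection: pick the mnemonic that matches the operator.
        code.append(f"{OPCODES[symbol]} {left_reg}, {left_reg}, {right_reg}")
        return left_reg

    _, (_, name), expr = tree                  # root: (assign token, target id, expression)
    result = emit_expr(expr)
    code.append(f"STORE {name}, {result}")
    return code


ir = (("assign", "="), ("id", "z"), (("op", "-"), ("id", "x"), ("id", "y")))
for line in generate(ir):
    print(line)
```

For `z = x - y` this emits two loads, one subtraction, and one store. A real back end would also reuse registers once their values are no longer needed and reorder instructions to suit the target processor.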