CS 332 Programming Language Concepts · mercury.pr.erau.edu/.../Spr-16/Lecture-Week-2.pdf

January 18, 2016 Sam Siewert CS 332 Programming Language Concepts Lecture 2 – Compilers and Binary Utilities for Code Generation



Scaffolding Assignment – Option #1
Problem – Proliferation of PLs and Selection
Hypothesis for Alternative PL Selection
Collaboration (Alternate PL, Primary PL You Know Well)
– Declarative
    Functional – Lisp, Haskell, Scheme
    Dataflow – Verilog, VHDL, Halide, OpenCL, CUDA
    Logic – Prolog
    Multi-Paradigm – C#, Python (Must Use Functional, Declarative Features)
    Template – XSLT & XML, HTML5+CSS+JavaScript [HTML is not a programming language]
– Imperative
    Scripted (Interpreted) – SQL (Relational Algebra), R, MATLAB
    Procedural – C, Fortran
    Strong Typed OO – Java, Smalltalk, Ada 83 or 95
    OO – C++
– Other (Propose to Instructor)
Literature Review of Alternate PL and Applications
Application Design and Prototype
A-PL, P-PL Evaluation and Experiment Design
Presentation of Comparative Results


Scaffolding Assignment – Option #2
Problem – Build a Small Application Specific Interpreter
Hypothesis for Custom Interpreter vs. Traditional Application
Collaboration (Alternate Custom PL, Primary PL You Know Well)
– Custom Interpreter (App Specific Operations and Features)
    Language Designed for Class of Application(s) – e.g. Scheduling
    Custom Syntax (Command Line or Script)
    Custom Semantics (Execution of Interpreted Command or Script)
    Example in Custom Interpreted Language
– Imperative Interpreted
    Use Favorite Interpreted or Scripted Language (Java, Python, C#)
    Implement Same Example
    Compare to Custom Interpreter
– Other (Propose to Instructor)
Literature Review of Alternate PL and Applications
Application Design and Prototype
Custom-PL, P-PL Evaluation and Experiment Design
Presentation of Comparative Results


Minute Paper
In Your Own Words, Define What a Programming Language Is
Once Defined Informally, Can You Define It More Formally, So We Can Say A Is a PL, but B Is Not?
Or A > B (More Expressive, Able to Compute More)?


What Constitutes a PL?
A programming language is a formal constructed language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs to control the behavior of a machine or to express algorithms.
– Formal vs. Operational Definition [https://en.wikipedia.org/wiki/Programming_language]
What about a Formal Computer Science Definition?
– Denotational Semantics (Lambda Calculus, Alonzo Church) - exposition of the denotational (or `mathematical' or `functional') approach to the formal semantics of programming languages (in contrast to `operational' and `axiomatic' approaches) [1]
– Turing Machine – Church-Turing Thesis [2]
– Predicate Calculus - Sentences in first-order predicate logic can be usefully interpreted as programs. [3]
– Axiomatic – Build Up Programming [4]


PL Must Have - Syntax and Semantics
Syntax
– Informal - set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language [Wikipedia – Syntax (programming languages)]
– Formal – Figure 1.3, pp 26-28: Parsing with a context-free grammar, which is a set of potentially recursive rules that are used to form a parse tree (constructs including statements, expressions, subroutines, …)
Semantics
– Informal - the field concerned with the rigorous mathematical study of the meaning of programming languages. It does so by evaluating the meaning of syntactically legal strings defined by a specific programming language [Wikipedia - Semantics (computer science)]
– Formal – Code generation to implement the meaning of a program (construct), pp. 29-35


Defining a Language
Lex and Yacc
– http://flex.sourceforge.net/
– http://www.gnu.org/software/bison/
Lexical analysis
Parsing
ASM and Machine Code Generation and Interpretation - http://www.gnu.org/software/binutils/
Specifying a Programming Language
– Reference Implementation – An example compiler or interpreter
– Denotational Semantics - http://www.cs.cmu.edu/~crary/819-f09/Scott71.pdf
– Detailed Written Language Specifications – E.g. C99 - http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf

1. You will learn how to build an interpreter or simple lexer/parser/code-generator (compiler) piece by piece – A reference Implementation
2. We will tear apart and analyze compilers and code generation, and implement key pieces of a compiler
3. We’ll compare and contrast languages and language features


Anatomy of a Compiler
Interpreter: Source program + Input → Interpreter → Output
Compiler: Source program → Compiler → Executable Program
– Input → Executable Program → Output
– Front End
    Syntax: Lexical Analysis and Parsing, Errors
    Semantics: Intermediate Language (e.g. GCC RTL), Errors
    Implementation: Machine Code Generation
– Front End → Back End → Executable Program
Generally Compiled Programs Run Faster, but Runtime Error Diagnostics Are Limited
Compilers Normally Link Libraries and Object Code (Incomplete Machine Code) into an Executable Program
Language Runtime Object Code (e.g. crt.o) and the Linker/Loader on the Operating System Run an Executable


View of Compilation from PLP
From a File to Object Code Ready for Linking and Loading


Quick Primer on State Machines
States are Circles
Transitions Occur Due to an Input and Produce an Output
Transitions Cause One State to be Left and a New State to be Entered
Input is External to the State Machine
Output is Produced External to the State Machine
E.g. Switch with LED to Indicate Power State
– Off → On: Power-on / LED-on
– On → Off: Power-off / LED-off
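The switch-with-LED machine can be sketched in C as a state variable plus one transition function. This is a minimal illustration, not code from the course: the names `power_state_t`, `power_input_t`, and `power_step` are invented here, and the choice to ignore an input that does not apply to the current state is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* States and inputs of the switch example (names are illustrative). */
typedef enum { STATE_OFF, STATE_ON } power_state_t;
typedef enum { INPUT_POWER_ON, INPUT_POWER_OFF } power_input_t;

/* Performs one transition: consumes an external input, updates the state
 * in place, and returns the external output (true = LED on). An input that
 * does not apply to the current state leaves the state unchanged; that
 * behavior is an assumption of this sketch. */
bool power_step(power_state_t *state, power_input_t input)
{
    assert(state != NULL);

    if (*state == STATE_OFF && input == INPUT_POWER_ON)
        *state = STATE_ON;    /* Off -> On:  Power-on / LED-on   */
    else if (*state == STATE_ON && input == INPUT_POWER_OFF)
        *state = STATE_OFF;   /* On -> Off:  Power-off / LED-off */

    return *state == STATE_ON;
}
```

Starting from `STATE_OFF`, calling `power_step(&s, INPUT_POWER_ON)` moves the machine to `STATE_ON` and reports the LED as lit.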


Example Instruction Format
For the Example …
– Operator (8 bits) – 256 unique instructions
– Input Operand Registers (8 bits each)
– Output Register (8 bits)
– Add Contents of R0 and R1 and Write Back to R3

Field      Operator      Operand-1    Operand-2    Operand-3
Width      8-bit         8-bit        8-bit        8-bit
Example    add (0xD7)    R0 (0x00)    R1 (0x01)    R3 (0x03)
Binary     1101_0111     0000_0000    0000_0001    0000_0011
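The four 8-bit fields can be packed into and unpacked from one 32-bit word with shifts and masks. The sketch below assumes the operator occupies the most significant byte, which matches the left-to-right order of the binary row but is not stated on the slide. It takes R3's register number as 0x03, matching the binary field 0000_0011. The names `instr_t`, `instr_encode`, `instr_decode`, and `instr_example` are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One decoded instruction: operator plus three 8-bit register numbers. */
typedef struct {
    uint8_t opcode;  /* operator, e.g. 0xD7 for add   */
    uint8_t src1;    /* operand-1: first input        */
    uint8_t src2;    /* operand-2: second input       */
    uint8_t dst;     /* operand-3: output register    */
} instr_t;

/* Packs the fields into a 32-bit word, operator in the top byte (assumed). */
uint32_t instr_encode(instr_t in)
{
    return ((uint32_t)in.opcode << 24) |
           ((uint32_t)in.src1   << 16) |
           ((uint32_t)in.src2   <<  8) |
            (uint32_t)in.dst;
}

/* Splits a 32-bit word back into its four 8-bit fields. */
instr_t instr_decode(uint32_t word)
{
    instr_t out;
    out.opcode = (uint8_t)(word >> 24);
    out.src1   = (uint8_t)(word >> 16);
    out.src2   = (uint8_t)(word >>  8);
    out.dst    = (uint8_t)word;
    return out;
}

/* Usage: encode "add R0, R1 -> R3" and check the round trip. */
void instr_example(void)
{
    instr_t add = { 0xD7, 0x00, 0x01, 0x03 };
    uint32_t word = instr_encode(add);

    assert(word == 0xD7000103u);  /* 1101_0111 0000_0000 0000_0001 0000_0011 */
    assert(instr_decode(word).dst == 0x03);
}
```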


The ALU as a State Machine
4 State Machine – 4 Stages of Execution for each ASM Instruction
– Start → Ready (Written Back): Power-on / Ready
– Ready (Written Back) → Fetched: {Clk-1, IP} / {Instruction}
– Fetched → Decoded: {Clk-2, Instruction} / {Opcode, R0, R1, R3}
– Decoded → Executed: {Clk-3, EU-Select} / Result
– Executed → Ready (Written Back): Clk-4 / R3
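The four stages can be simulated in C as one transition per clock tick. This is a sketch under stated assumptions rather than a model of real hardware: the 32-bit instruction layout (operator in the top byte) follows the example format, only the `add` opcode 0xD7 from the slides is handled, an unknown opcode is treated as producing 0, and the names `cpu_t` and `cpu_tick` are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_ADD 0xD7u  /* opcode value taken from the example instruction */

typedef enum { ST_READY, ST_FETCHED, ST_DECODED, ST_EXECUTED } cpu_state_t;

typedef struct {
    cpu_state_t state;
    uint32_t ip;                      /* index of the next instruction   */
    uint32_t instruction;             /* latched by the fetch stage      */
    uint8_t  opcode, src1, src2, dst; /* latched by the decode stage     */
    uint32_t result;                  /* latched by the execute stage    */
    uint32_t regs[256];               /* 8-bit register numbers          */
} cpu_t;

/* Advances the machine by one clock tick (one state transition). */
void cpu_tick(cpu_t *cpu, const uint32_t *program, size_t length)
{
    assert(cpu != NULL && program != NULL);

    switch (cpu->state) {
    case ST_READY:      /* Clk-1: {IP} / {Instruction} */
        assert(cpu->ip < length);
        cpu->instruction = program[cpu->ip++];
        cpu->state = ST_FETCHED;
        break;
    case ST_FETCHED:    /* Clk-2: {Instruction} / {Opcode, R0, R1, R3} */
        cpu->opcode = (uint8_t)(cpu->instruction >> 24);
        cpu->src1   = (uint8_t)(cpu->instruction >> 16);
        cpu->src2   = (uint8_t)(cpu->instruction >> 8);
        cpu->dst    = (uint8_t)cpu->instruction;
        cpu->state = ST_DECODED;
        break;
    case ST_DECODED:    /* Clk-3: {EU-Select} / Result */
        if (cpu->opcode == OP_ADD)
            cpu->result = cpu->regs[cpu->src1] + cpu->regs[cpu->src2];
        else
            cpu->result = 0;  /* unknown opcode: assumption of this sketch */
        cpu->state = ST_EXECUTED;
        break;
    case ST_EXECUTED:   /* Clk-4: write the result back to the output register */
        cpu->regs[cpu->dst] = cpu->result;
        cpu->state = ST_READY;
        break;
    }
}
```

Four calls to `cpu_tick` on the word 0xD7000103 with R0 = 2 and R1 = 3 leave 5 in R3 and return the machine to the ready state.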


ALU States
[Figure: ALU block diagram with labels IP, Decode, Ctl, Clk, R0, R1, EU, WB, and R3, and stage numbers 1 to 4]


Execution Unit
Arithmetic or Logical Operation (Combinational Logic)
– Applied to Latched Operand Registers
– Provides Output to Write-Back Unit
– E.g. Add Unsigned Numbers with Carry and Overflow
E.g. Add Two Unsigned 32-Bit Integers, With Carry, No Overflow

  0101_0000_1111_0101_1000_1000_0000_0000 (1,358,268,416)
+ 0101_0000_1111_0101_1000_1000_0000_0000 (1,358,268,416)
-----------------------------------------
  1010_0001_1110_1011_0001_0000_0000_0000 (2,716,536,832)
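The addition above can be checked in C, where unsigned 32-bit arithmetic wraps modulo 2^32 and a carry out of bit 31 shows up as a result smaller than either operand. The operand is 0x50F58800 in hexadecimal and the sum is 0xA1EB1000; carries ripple between bit positions, but the sum still fits in 32 bits, so nothing carries out of the top bit. The function name `add_u32_carry` is invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds two unsigned 32-bit operands. Stores the 32-bit sum in *sum and
 * returns true when the addition produced a carry out of bit 31. */
bool add_u32_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
    assert(sum != NULL);

    *sum = a + b;      /* stored value is the sum modulo 2^32      */
    return *sum < a;   /* a wrapped sum is smaller than an operand */
}
```

With both operands equal to 0x50F58800 the function returns false and stores 0xA1EB1000; adding 1 to 0xFFFFFFFF returns true and stores 0.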


Programming Language Concepts
ALU and Machine Code
Assembly Mnemonics for Machine Code
– opcode, operands
– E.g. sub r1, r1, r0
Machine (assembly) code generation for C statement
– a = a - b;

        ldr     r3, [fp, #-16]   /* r3=a */
        ldr     r2, [fp, #-20]   /* r2=b */
        rsb     r3, r2, r3       /* r3=r3-r2 */
        str     r3, [fp, #-16]   /* a=r3 (a=a-b) */

Basic Formula Translation to basic ASM block
– FORTRAN
– ALGOL


A More Detailed Look
http://mercury.pr.erau.edu/~siewerts/cs332/code/cs332_code/lcm/
– lcmarm.s – Hand written assembly code for ARM
– lcmx86.s – GCC machine generated assembly code
– objdump -d lcmc.o – disassembly of object code using binutils
Find ASM Basic Blocks that Map to C Statements (gcc -S to kick out ASM)

while(a != b)
{
    if(a > b)
        a = a - b;
    else
        b = b - a;
}

        b       WHLCHK
LOOPBDY:
        ldr     r2, [fp, #-16]   /* r2=a */
        ldr     r3, [fp, #-20]   /* r3=b */
        cmp     r2, r3
        ble     ELSBR            /* a <= b: take the else branch */
        ldr     r3, [fp, #-16]   /* r3=a */
        ldr     r2, [fp, #-20]   /* r2=b */
        rsb     r3, r2, r3       /* r3=r3-r2 */
        str     r3, [fp, #-16]   /* a=r3 (a=a-b) */
        b       WHLCHK
ELSBR:
        ldr     r3, [fp, #-20]   /* r3=b */
        ldr     r2, [fp, #-16]   /* r2=a */
        rsb     r3, r2, r3       /* r3=r3-r2 */
        str     r3, [fp, #-20]   /* b=r3 (b=b-a) */
WHLCHK:
        ldr     r2, [fp, #-16]   /* r2=a */
        ldr     r3, [fp, #-20]   /* r3=b */
        cmp     r2, r3           /* is a == b? */
        bne     LOOPBDY
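For reference, the loop can be wrapped in a complete C function. The loop body is the one shown on the slide; the function name `gcd_subtract`, the return statement, and the precondition check are additions for this sketch. The loop repeatedly subtracts the smaller value from the larger, which computes a greatest common divisor of two positive integers.

```c
#include <assert.h>

/* Greatest common divisor by repeated subtraction.
 * Requires a > 0 and b > 0; otherwise the loop would not terminate. */
int gcd_subtract(int a, int b)
{
    assert(a > 0 && b > 0);

    while (a != b) {
        if (a > b)
            a = a - b;   /* maps to the block before ELSBR */
        else
            b = b - a;   /* maps to the ELSBR block        */
    }
    return a;
}
```

Compiling a file containing this function with `gcc -S` should produce assembly whose basic blocks can be matched against the loop test, the two branches, and the return.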


ARM Register Notes
User Mode (Common), FIQ/IRQ (Interrupts), Supervisor
EABI Defines Use for Function Calls (Nested)


ARM Procedure Call Standard (EABI – Important for Linking and Debugging)
BL Instruction Does a Branch and Link
Key for Calling a Subroutine and Returning
SP is a Pointer for Local Temporary Data (as Expected)
IP is a Veneer or Scratch Register – to Offset in Address Space
FP is a Frame Pointer for the Call Stack (Context of Subroutine) that Forms a Back-trace Linked List


Why the FP and SP?
FP Linked List Forms a Back-trace
In Each FP Structure We Store Context of Calling Procedure [PC, LR, SP, FP]
Locals Go Below the FP
http://www.heyrick.co.uk/assembler/apcsintro.html
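The back-trace idea can be illustrated with a simplified stand-in for the frame records. This sketch does not read a real ARM stack: the `frame_t` structure keeps only the saved FP link plus a label added for illustration, and `backtrace_names` is an invented helper that follows the links from the innermost frame outward.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified frame record. A real APCS frame saves [PC, LR, SP, FP];
 * only the saved FP link is modeled here, and the name field exists
 * purely for this sketch. */
typedef struct frame {
    const struct frame *caller_fp;  /* saved FP: link to the caller's frame */
    const char *name;               /* label of the procedure (illustrative) */
} frame_t;

/* Walks the FP chain from the innermost frame outward, writing at most
 * max procedure names to out. Returns the number of frames visited. */
size_t backtrace_names(const frame_t *fp, const char **out, size_t max)
{
    size_t depth = 0;

    assert(out != NULL || max == 0);
    while (fp != NULL && depth < max) {
        out[depth++] = fp->name;
        fp = fp->caller_fp;         /* follow the link to the caller */
    }
    return depth;
}
```

With a frame for `gcda` linked to a frame for `main`, the walk visits `gcda` first and `main` second, which is the order a debugger prints a back-trace.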


ASM Observations
Based on ARM Register Use Conventions for Procedure Calls
– In ARM ASM, This is APCS (ARM Procedure Call Standard) and ABI (Application Binary Interface)
– Also From Machine Generated Labels
Allows C code to call Hand Written ASM
How to Use Registers (Convention)

int gcda(int a, int b) { }

gcda:
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}   /* store multiple to save calling context (return addresses) */
        sub     fp, ip, #4
        sub     sp, sp, #8
        str     r0, [fp, #-16]          /* first two arguments stored */
        str     r1, [fp, #-20]


Compiler Code Generation
See Abstract Syntax Tree for GCD (p. 32 of text)
Defines Organization of Basic Blocks and Scope
Annotations of Types, Symbols (Attributes)
This is an Intermediate Representation Between Front-End and Back-End
Could We Generate Blocks of ASM from This?
Would This Give Us Basic Formula Translation? (FORTRAN)
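One way to explore the question is a post-order walk over a tiny syntax tree that emits one instruction per node. Everything here is a hypothetical sketch: the node layout is invented rather than taken from the textbook's tree, and the output uses made-up stack-machine mnemonics (`push`, `sub`, `pop`) instead of ARM instructions. The names `node_t`, `append`, and `gen` are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { NODE_VAR, NODE_SUB, NODE_ASSIGN } node_kind_t;

typedef struct node {
    node_kind_t kind;
    const char *name;          /* NODE_VAR: variable; NODE_ASSIGN: target */
    const struct node *left;   /* NODE_SUB: left operand; NODE_ASSIGN: value */
    const struct node *right;  /* NODE_SUB: right operand */
} node_t;

/* Appends "op arg\n" (or "op\n" when arg is NULL) at offset used.
 * Returns the new offset. Truncation is treated as a programming error. */
static size_t append(char *buf, size_t size, size_t used,
                     const char *op, const char *arg)
{
    assert(buf != NULL && used < size);
    int n = (arg != NULL)
        ? snprintf(buf + used, size - used, "%s %s\n", op, arg)
        : snprintf(buf + used, size - used, "%s\n", op);
    assert(n >= 0 && (size_t)n < size - used);
    return used + (size_t)n;
}

/* Post-order walk: operands are emitted before the operator that uses them.
 * Returns the number of characters written to buf so far. */
size_t gen(const node_t *n, char *buf, size_t size, size_t used)
{
    assert(n != NULL);
    switch (n->kind) {
    case NODE_VAR:
        return append(buf, size, used, "push", n->name);
    case NODE_SUB:
        used = gen(n->left, buf, size, used);
        used = gen(n->right, buf, size, used);
        return append(buf, size, used, "sub", NULL);
    case NODE_ASSIGN:
        used = gen(n->left, buf, size, used);
        return append(buf, size, used, "pop", n->name);
    }
    return used;
}
```

For the tree of `a = a - b` the walk emits `push a`, `push b`, `sub`, `pop a`, one per line. A register-based target such as ARM would additionally need register allocation, which this sketch leaves out.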


Programming Language Syntax

Parsers (Regular Expressions)


Regular Expressions
A regular expression is one of the following:
– A character
– The empty string, denoted by ε
– Two regular expressions concatenated
– Two regular expressions separated by | (“OR”)
– A regular expression followed by the Kleene star (concatenation of zero or more strings)
Use, for Example, to Define Simple Mathematical Expressions Allowed in a Language
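As a small worked case, the regular expression `digit digit* ( "." digit digit* | ε )` uses all of the constructs above: characters, concatenation, alternation with the empty string, and the Kleene star. The sketch below recognizes it with a hand-written state machine. The expression itself, the state names, and the function name `is_number` are chosen for illustration and do not come from the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Recognizes: digit digit* ( "." digit digit* | ε )
 * Returns true only when the whole string matches. */
bool is_number(const char *s)
{
    enum { START, INT_PART, DOT, FRAC_PART } state = START;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        bool digit = isdigit((unsigned char)*s) != 0;

        switch (state) {
        case START:                 /* need the first digit */
            if (!digit) return false;
            state = INT_PART;
            break;
        case INT_PART:              /* digit* then optionally "." */
            if (digit) break;
            if (*s != '.') return false;
            state = DOT;
            break;
        case DOT:                   /* need a digit after "." */
            if (!digit) return false;
            state = FRAC_PART;
            break;
        case FRAC_PART:             /* digit* */
            if (!digit) return false;
            break;
        }
    }
    /* Accepting states: the ε alternative ends in INT_PART. */
    return state == INT_PART || state == FRAC_PART;
}
```

The strings `42` and `3.14` are accepted, while the empty string, `3.`, and `.5` are rejected. Tools such as flex generate this kind of state machine from the regular expression automatically.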


Context Free Grammar
CFG Productions
– Expression grammar with precedence and associativity
– Rules make use of other rules on RHS
– Can Generate a Parse Tree using Rules


Example 1
Parse tree for expression grammar (with precedence) for 3 + 4 * 5


Example 2
Parse tree for expression grammar (with left associativity) for 10 - 4 - 3
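Both examples can be reproduced with a small recursive-descent evaluator, written here as a sketch rather than as the grammar from the textbook. It assumes the productions expr → term { (+ | -) term }, term → factor { (* | /) factor }, and factor → number | ( expr ). Giving * and / their own lower-level rule yields precedence, and the left-to-right loops yield left associativity. All names (`parser_t`, `parse_expr`, `eval`) are invented for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct { const char *pos; int ok; } parser_t;

static long parse_expr(parser_t *p);

static void skip_spaces(parser_t *p) { while (*p->pos == ' ') p->pos++; }

/* factor -> number | '(' expr ')' */
static long parse_factor(parser_t *p)
{
    skip_spaces(p);
    if (*p->pos == '(') {
        p->pos++;
        long inner = parse_expr(p);
        skip_spaces(p);
        if (*p->pos == ')') p->pos++; else p->ok = 0;
        return inner;
    }
    if (!isdigit((unsigned char)*p->pos)) { p->ok = 0; return 0; }
    long v = 0;
    while (isdigit((unsigned char)*p->pos))
        v = v * 10 + (*p->pos++ - '0');
    return v;
}

/* term -> factor { ('*' | '/') factor }   (binds tighter than + and -) */
static long parse_term(parser_t *p)
{
    long v = parse_factor(p);
    for (;;) {
        skip_spaces(p);
        char op = *p->pos;
        if (op != '*' && op != '/') return v;
        p->pos++;
        long rhs = parse_factor(p);
        if (op == '*') v *= rhs;
        else if (rhs != 0) v /= rhs;
        else p->ok = 0;                 /* division by zero is an error */
    }
}

/* expr -> term { ('+' | '-') term }   (loop folds left to right) */
static long parse_expr(parser_t *p)
{
    long v = parse_term(p);
    for (;;) {
        skip_spaces(p);
        char op = *p->pos;
        if (op != '+' && op != '-') return v;
        p->pos++;
        long rhs = parse_term(p);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Evaluates text; returns 1 and stores the result on success, 0 on error. */
int eval(const char *text, long *value)
{
    assert(text != NULL && value != NULL);
    parser_t p = { text, 1 };
    long v = parse_expr(&p);
    skip_spaces(&p);
    if (!p.ok || *p.pos != '\0') return 0;
    *value = v;
    return 1;
}
```

Evaluating `3 + 4 * 5` gives 23 because the multiplication is grouped first, and `10 - 4 - 3` gives 3 because the subtractions group from the left, as (10 - 4) - 3.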


State Machine Example
A Simple Calculator (FSA – Finite State Automaton)
Ted Hoff, faced with changing requirements, is considered, along with Stanley Mazor, an inventor of programmable ICs, or the microprocessor! (Robert Noyce Biography)


Take Away
Software, and in Particular Interpreters, Give Us Flexible Syntax and Semantics
Simpler to Change Front-end Syntax and Back-end Semantics Compared to a State Machine
Realized on a Project at Intel to Build a Calculator for a Japanese Customer
– Intel Really Wanted to Sell the Japanese Customer Memory Chips (at the time)
– Needed to Keep up with Requirements Changes on Logic
– Inspired the Intel 4004, Considered the First Microprocessor


References
1. Stoy, Joseph E. Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory. MIT Press, 1977.
2. Copeland, B. Jack. "The Church-Turing Thesis." Stanford Encyclopedia of Philosophy (2002).
3. Van Emden, Maarten H., and Robert A. Kowalski. "The semantics of predicate logic as a programming language." Journal of the ACM (JACM) 23.4 (1976): 733-742.
4. Hoare, Charles Antony Richard. "An axiomatic basis for computer programming." Communications of the ACM 12.10 (1969): 576-580.
