January 18, 2016 Sam Siewert
CS 332 Programming Language Concepts
Lecture 2 – Compilers and Binary Utilities for Code Generation
Scaffolding Assignment – Option #1
Problem – Proliferation of PLs and Selection
Hypothesis for Alternative PL Selection
Collaboration (Alternate PL, Primary PL You Know Well)
– Declarative
    Functional – Lisp, Haskell, Scheme
    Dataflow – Verilog, VHDL, Halide, OpenCL, CUDA
    Logic – Prolog
    Multi-Paradigm – C#, Python (Must Use Functional, Declarative Features)
    Template – XSLT & XML, HTML5+CSS+JavaScript [HTML is not a programming language]
– Imperative
    Scripted (Interpreted) – SQL (Relational Algebra), R, MATLAB
    Procedural – C, Fortran
    Strong Typed OO – Java, Smalltalk, Ada 83 or 95
    OO – C++
– Other (Propose to Instructor)
Literature Review of Alternate PL and Applications
Application Design and Prototype A-PL, P-PL Evaluation and Experiment Design
Presentation of Comparative Results
Scaffolding Assignment – Option #2
Problem – Build a Small Application Specific Interpreter
Hypothesis for Custom Interpreter vs. Traditional Application
Collaboration (Alternate Custom PL, Primary PL You Know Well)
– Custom Interpreter (App Specific Operations and Features)
    Language Designed for Class of Application(s) – e.g. Scheduling
    Custom Syntax (Command Line or Script)
    Custom Semantics (Execution of Interpreted Command or Script)
    Example in Custom Interpreted Language
– Imperative Interpreted
    Use Favorite Interpreted or Scripted Language (Java, Python, C#)
    Implement Same Example
    Compare to Custom Interpreter
– Other (Propose to Instructor)
Literature Review of Alternate PL and Applications
Application Design and Prototype Custom-PL, P-PL Evaluation and Experiment Design
Presentation of Comparative Results
Minute Paper
In Your Own Words, Define What a Programming Language Is.
Once Defined Informally, Can You Define It More Formally, So We Can Say A is a PL, but B is Not? Or A > B (More Expressive, Able to Compute More)?
What Constitutes a PL?
A programming language is - a formal constructed language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs to control the behavior of a machine or to express algorithms.
– Formal vs. Operational Definition [https://en.wikipedia.org/wiki/Programming_language]
What about a Formal Computer Science Definition?
– Denotational Semantics (Lambda Calculus, Alonzo Church) - exposition of the denotational (or `mathematical' or `functional') approach to the formal semantics of programming languages (in contrast to `operational' and `axiomatic' approaches) [1]
– Turing Machine – Church-Turing Thesis [2]
– Predicate Calculus - Sentences in first-order predicate logic can be usefully interpreted as programs. [3]
– Axiomatic – Build Up Programming [4]
PL Must Have - Syntax and Semantics
Syntax
– Informal - set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language [Wikipedia – Syntax (programming languages)]
– Formal – Figure 1.3, pp 26-28: Parsing with a context-free grammar, which is a set of potentially recursive rules that are used to form a parse tree (constructs including statements, expressions, subroutines, …)
Semantics
– Informal - the field concerned with the rigorous mathematical study of the meaning of programming languages. It does so by evaluating the meaning of syntactically legal strings defined by a specific programming language [Wikipedia - Semantics (computer science)]
– Formal – Code generation to implement the meaning of a program (construct), pp. 29-35
Defining a Language
Lex and Yacc
– http://flex.sourceforge.net/
– http://www.gnu.org/software/bison/
Lexical analysis
Parsing
ASM and Machine Code Generation and Interpretation - http://www.gnu.org/software/binutils/
Specifying a Programming Language
– Reference Implementation – An example compiler or interpreter
– Denotational Semantics - http://www.cs.cmu.edu/~crary/819-f09/Scott71.pdf
– Detailed Written Language Specifications – E.g. C99 - http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
1. You will learn how to build an interpreter or simple lexer/parser/code-generator (compiler) piece by piece – A Reference Implementation
2. We will tear apart and analyze compilers and code generation, and implement key pieces of a compiler
3. We'll compare and contrast languages and language features
Anatomy of a Compiler
Interpreter: Source program + Input -> Interpreter -> Output
Compiler: Source program -> Compiler -> Executable Program
– Input -> Executable Program -> Output
– Front End
    Syntax: Lexical Analysis and Parsing, Errors
    Semantics: Intermediate Language (e.g. GCC RTL), Errors
    Implementation: Machine Code Generation
[Figure: Front End -> Back End -> Executable Program]
Generally Compiled Programs Run Faster, but Runtime Error Diagnostics Limited
Compilers Normally Link Libraries and Object Code (Incomplete Machine Code) into an Executable Program
Language Runtime Object Code (e.g. crt.o) and Linker/Loader on Operating System Run an Executable
View of Compilation from PLP
From a file to Object code Ready for Linking and Loading
Quick Primer on State Machines
States are Circles
Transitions Occur Due to an Input and Produce an Output
Transitions Cause One State to be Left and a New State to be Entered
Input is External to the State Machine
Output is Produced External to the State Machine
E.g. Switch with LED to Indicate Power State
    Off -> On: Power-on / LED-on
    On -> Off: Power-off / LED-off
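The Off/On switch with its LED can be sketched in C as a transition function that takes the current state and an input and returns the output. This is only an illustration: the type and function names are hypothetical, and leaving the state unchanged on an input with no matching transition is an assumption the slide does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the slide only shows states Off/On and the
   transitions Power-on / LED-on and Power-off / LED-off. */
typedef enum { STATE_OFF, STATE_ON } power_state_t;
typedef enum { INPUT_POWER_ON, INPUT_POWER_OFF } power_input_t;
typedef enum { LED_OFF, LED_ON } led_output_t;

/* One transition: consume an input, update the state, return the output.
   An input with no matching transition leaves the state unchanged
   (assumption, not shown on the slide). */
led_output_t power_step(power_state_t *state, power_input_t input)
{
    assert(state != NULL);
    switch (*state) {
    case STATE_OFF:
        if (input == INPUT_POWER_ON)
            *state = STATE_ON;       /* Off -> On: Power-on / LED-on */
        break;
    case STATE_ON:
        if (input == INPUT_POWER_OFF)
            *state = STATE_OFF;      /* On -> Off: Power-off / LED-off */
        break;
    }
    return (*state == STATE_ON) ? LED_ON : LED_OFF;
}
```

Starting from STATE_OFF, a call with INPUT_POWER_ON returns LED_ON and leaves the machine in STATE_ON. The input and the output both live outside the function, matching the primer.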
Example Instruction Format
For the Example …
– Operator (8 bits) – 256 unique instructions
– Input Operand Registers (8 bits each)
– Output Register (8 bits)
– Add Contents of R0 and R1 and Write Back to R3

Field      Operator      Operand-1    Operand-2    Operand-3
Width      8-bit         8-bit        8-bit        8-bit
Example    add (0xD7)    R0 (0x00)    R1 (0x01)    R3 (0x03)
Binary     1101_0111     0000_0000    0000_0001    0000_0011
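A minimal sketch of packing and unpacking this four-field format in C. The function names are hypothetical, and placing the operator in the most significant byte of a 32-bit word is an assumption (the slide gives only the left-to-right field order). The destination register is encoded as 0000_0011 (3), following the binary row.

```c
#include <assert.h>
#include <stdint.h>

/* add opcode from the example: 1101_0111 */
enum { OP_ADD = 0xD7 };

/* Pack four 8-bit fields into one 32-bit word, operator in the top byte.
   The byte placement is an assumption made for illustration. */
uint32_t encode_instruction(uint8_t op, uint8_t src1, uint8_t src2, uint8_t dst)
{
    return ((uint32_t)op << 24) | ((uint32_t)src1 << 16) |
           ((uint32_t)src2 << 8) | (uint32_t)dst;
}

/* Unpack the four fields again, as a decode stage would. */
void decode_instruction(uint32_t word, uint8_t *op, uint8_t *src1,
                        uint8_t *src2, uint8_t *dst)
{
    assert(op && src1 && src2 && dst);
    *op   = (uint8_t)(word >> 24);
    *src1 = (uint8_t)(word >> 16);
    *src2 = (uint8_t)(word >> 8);
    *dst  = (uint8_t)word;
}
```

Under these assumptions, encode_instruction(OP_ADD, 0x00, 0x01, 0x03) yields the word 0xD7000103, which is the binary row read left to right.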
The ALU as a State Machine
4 State Machine – 4 Stages of Execution for each ASM Instruction
State Transitions (Input / Output):
    Start -> Ready (Written Back): Power-on / Ready
    Ready (Written Back) -> Fetched: {Clk-1, IP} / {Instruction}
    Fetched -> Decoded: {Clk-2, Instruction} / {Opcode, R0, R1, R3}
    Decoded -> Executed: {Clk-3, EU-Select} / Result
    Executed -> Ready (Written Back): Clk-4 / R3
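The four stages can be sketched as a C state machine that advances one state per clock tick. This is a simplified model under stated assumptions: the struct layout and names are hypothetical, only the add opcode (0xD7) from the example instruction format is modeled, the instruction word is assumed to carry the operator in its top byte, and incrementing IP during fetch is an assumption not shown on the slide.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { READY, FETCHED, DECODED, EXECUTED } alu_state_t;

enum { OP_ADD = 0xD7 };              /* only opcode modeled here */

typedef struct {
    alu_state_t state;
    uint32_t ip;                     /* instruction pointer (program index) */
    uint32_t instruction;            /* latched by fetch */
    uint8_t op, src1, src2, dst;     /* latched by decode */
    uint32_t result;                 /* latched by execute */
    uint32_t reg[256];               /* 8-bit register numbers */
} machine_t;

/* Advance the machine by one clock phase. */
void clock_tick(machine_t *m, const uint32_t *program)
{
    assert(m && program);
    switch (m->state) {
    case READY:                      /* Clk-1: {IP} / {Instruction} */
        m->instruction = program[m->ip++];
        m->state = FETCHED;
        break;
    case FETCHED:                    /* Clk-2: {Instruction} / {Opcode, regs} */
        m->op   = (uint8_t)(m->instruction >> 24);
        m->src1 = (uint8_t)(m->instruction >> 16);
        m->src2 = (uint8_t)(m->instruction >> 8);
        m->dst  = (uint8_t)m->instruction;
        m->state = DECODED;
        break;
    case DECODED:                    /* Clk-3: {EU-Select} / Result */
        m->result = (m->op == OP_ADD)
                  ? m->reg[m->src1] + m->reg[m->src2]
                  : 0;               /* unmodeled opcodes produce 0 (assumption) */
        m->state = EXECUTED;
        break;
    case EXECUTED:                   /* Clk-4: write result back */
        m->reg[m->dst] = m->result;
        m->state = READY;
        break;
    }
}
```

With R0 = 2, R1 = 3, and a one-word program holding 0xD7000103, four ticks return the machine to READY with R3 = 5.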
ALU States
[Figure: ALU block diagram with IP, Decode, Ctl, EU, WB, and Clk blocks, registers R0, R1, and R3, and clock phases 1-4]
Execution Unit
Arithmetic or Logical Operation (Combinational Logic)
– Applied to Latched Operand Registers
– Provides Output to Write-Back Unit
– E.g. Add Unsigned Numbers with Carry and Overflow
E.g. Add Two Unsigned 32-Bit Integers, With Carry, No Overflow
      0101_0000_1111_0101_1000_1000_0000_0000 (1,358,268,416)
    + 0101_0000_1111_0101_1000_1000_0000_0000 (1,358,268,416)
    -----------------------------------------
      1010_0001_1110_1011_0001_0000_0000_0000 (2,716,536,832)
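The addition above can be reproduced with a bit-by-bit ripple-carry adder, which mirrors how combinational logic chains full adders. This is a sketch for illustration only: the function name is hypothetical, and "overflow" is taken to mean a carry out of bit 31 (the unsigned result not fitting in 32 bits), which is an assumed reading of the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two unsigned 32-bit values one bit at a time.
   *carry_out receives the carry out of bit 31 (1 means the unsigned
   sum did not fit in 32 bits). */
uint32_t ripple_add32(uint32_t a, uint32_t b, int *carry_out)
{
    uint32_t sum = 0;
    unsigned carry = 0;

    assert(carry_out != NULL);
    for (int i = 0; i < 32; i++) {
        unsigned x = (a >> i) & 1u;
        unsigned y = (b >> i) & 1u;
        unsigned s = x ^ y ^ carry;             /* full-adder sum bit */
        carry = (x & y) | (carry & (x ^ y));    /* full-adder carry bit */
        sum |= (uint32_t)s << i;
    }
    *carry_out = (int)carry;
    return sum;
}
```

For the operands on the slide (0x50F58800 twice), carries propagate between bit positions but none leaves bit 31, so the sum is 0xA1EB1000 with carry_out equal to 0. Adding 1 to 0xFFFFFFFF instead wraps to 0 with carry_out equal to 1.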
Programming Language Concepts
ALU and Machine Code
Assembly Mnemonics for Machine Code
– opcode, operands
– E.g. sub r1, r1, r0
Machine (assembly) code generation for C statement
– a = a - b;
        ldr   r3, [fp, #-16]    /* r3=a */
        ldr   r2, [fp, #-20]    /* r2=b */
        rsb   r3, r2, r3        /* r3=r3-r2 */
        str   r3, [fp, #-16]    /* a=r3 (a=a-b) */
Basic Formula Translation to basic ASM block
– FORTRAN
– ALGOL
A More Detailed Look
http://mercury.pr.erau.edu/~siewerts/cs332/code/cs332_code/lcm/
– lcmarm.s – Hand written assembly code for ARM
– lcmx86.s – GCC machine generated assembly code
– objdump -d lcmc.o – disassembly of object code using binutils
Find ASM Basic Blocks that Map to C Statements (gcc -S to kick out ASM)
while(a != b)
{
    if(a > b) a = a - b;
    else b = b - a;
}

        b     WHLCHK
LOOPBDY:
        ldr   r2, [fp, #-16]    /* r2=a */
        ldr   r3, [fp, #-20]    /* r3=b */
        cmp   r2, r3
        ble   ELSBR             /* a <= b: take the else branch */
        ldr   r3, [fp, #-16]    /* r3=a */
        ldr   r2, [fp, #-20]    /* r2=b */
        rsb   r3, r2, r3        /* r3=r3-r2 */
        str   r3, [fp, #-16]    /* a=r3 (a=a-b) */
        b     WHLCHK
ELSBR:
        ldr   r3, [fp, #-20]    /* r3=b */
        ldr   r2, [fp, #-16]    /* r2=a */
        rsb   r3, r2, r3        /* r3=r3-r2 */
        str   r3, [fp, #-20]    /* b=r3 (b=b-a) */
WHLCHK:
        ldr   r2, [fp, #-16]    /* r2=a */
        ldr   r3, [fp, #-20]    /* r3=b */
        cmp   r2, r3            /* is a == b? */
        bne   LOOPBDY
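For reference, the loop above can be wrapped in a complete C function. The name gcda comes from the lecture's function signature; the return statement and the positive-argument precondition are assumptions added here, since the loop would not terminate if either argument were zero.

```c
#include <assert.h>

/* Subtraction-based GCD; the loop body is the one shown above.
   Assumes a > 0 and b > 0 (assumption added for this sketch). */
int gcda(int a, int b)
{
    assert(a > 0 && b > 0);
    while (a != b) {
        if (a > b) a = a - b;
        else       b = b - a;
    }
    return a;    /* a == b here; both hold the GCD */
}
```

Compiling this with gcc -S is one way to compare machine-generated basic blocks against the hand-labelled blocks LOOPBDY, ELSBR, and WHLCHK.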
ARM Register Notes
User Mode (Common), FIQ/IRQ (Interrupts), Supervisor
EABI Defines Use for Function Calls (Nested)
ARM Procedure Call Standard (EABI – Important for Linking and Debugging)
BL Instruction Does a Branch and Link – Key for Calling a Subroutine and Returning
SP is a Pointer for Local Temporary Data (as Expected)
IP is a Scratch Register (Intra-Procedure Call), e.g. for a Veneer – to Offset in Address Space
FP is a Frame Pointer for Call Stack (Context of Subroutine) that Forms a Back-Trace Linked List
Why the FP and SP?
FP Linked List Forms a Back-Trace
In Each FP Structure We Store Context of Calling Procedure [PC, LR, SP, FP]
Locals go Below the FP
http://www.heyrick.co.uk/assembler/apcsintro.html
ASM Observations
Based on ARM Register Use Conventions for Procedure Calls
– In ARM ASM, This is APCS (ARM Procedure Call Standard) and ABI (Application Binary Interface)
– Also From Machine Generated Labels
Allows C code to call Hand Written ASM
How to Use Registers (Convention)

int gcda(int a, int b) { }

gcda:
        mov   ip, sp
        stmfd sp!, {fp, ip, lr, pc}    /* store multiple to save calling context (return addresses) */
        sub   fp, ip, #4
        sub   sp, sp, #8
        str   r0, [fp, #-16]           /* first two arguments stored */
        str   r1, [fp, #-20]
Compiler Code Generation
See Abstract Syntax Tree for GCD (p. 32 of text)
Defines Organization of Basic Blocks and Scope
Annotations of Types, Symbols (Attributes)
This is an Intermediate Representation Between Front-End and Back-End
Could We Generate Blocks of ASM from This?
Would This Give Us Basic Formula Translation? (FORTRAN)
Programming Language Syntax
Parsers (Regular Expressions)
Regular Expressions
A regular expression is one of the following:
– A character
– The empty string, denoted by ε
– Two regular expressions concatenated
– Two regular expressions separated by | ("OR")
– A regular expression followed by the Kleene star (concatenation of zero or more strings generated by that expression)
Use for example to Define Simple Mathematical Expressions Allowed in a Language
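As one illustration, a flat arithmetic expression (unsigned integers separated by operators, no parentheses) can be described by a regular expression and checked with the POSIX regcomp/regexec functions. The pattern and the function name below are assumptions chosen for this sketch, not part of the lecture. In the pattern, [0-9] is an alternation of single characters, juxtaposition is concatenation, * is the Kleene star, and + is shorthand for one or more repetitions.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text is a sequence of unsigned integers separated by
   +, -, * or / with no spaces or parentheses, and 0 otherwise.
   The pattern is an illustrative choice. */
int is_simple_expression(const char *text)
{
    static const char *pattern = "^[0-9]+([-+*/][0-9]+)*$";
    regex_t re;
    int matched;

    assert(text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                    /* pattern failed to compile */
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

The strings "3+4*5" and "10-4-3" match, while "3+" and "a+1" do not. Nested parentheses are outside what a regular expression can describe, which is where a context-free grammar comes in.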
Context Free Grammar
CFG Productions
– Expression grammar with precedence and associativity
– Rules make use of other rules on RHS
– Can Generate a Parse Tree using Rules
Example 1
Parse tree for expression grammar (with precedence) for 3 + 4 * 5
Example 2
Parse tree for expression grammar (with left associativity) for 10 - 4 - 3
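Both examples can be checked with a small recursive-descent evaluator, where each grammar rule becomes one C function. The grammar below is an assumed simplification (only +, -, *, parentheses, and unsigned integers), the function names are hypothetical, and error handling is omitted to keep the sketch short. Precedence comes from expr calling term, and left associativity comes from the loops that fold operands in left to right.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed grammar:
     expr   -> term   { ('+' | '-') term }
     term   -> factor { '*' factor }
     factor -> number | '(' expr ')'                                  */

static const char *cursor;           /* next unread input character */

static int parse_expr(void);

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cursor)) cursor++;
}

static int parse_factor(void)
{
    int value = 0;
    skip_spaces();
    if (*cursor == '(') {
        cursor++;                    /* consume '(' */
        value = parse_expr();
        skip_spaces();
        if (*cursor == ')') cursor++;
        return value;
    }
    while (isdigit((unsigned char)*cursor))
        value = value * 10 + (*cursor++ - '0');
    return value;
}

static int parse_term(void)
{
    int value = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cursor != '*') return value;
        cursor++;
        value *= parse_factor();     /* '*' binds tighter than '+' or '-' */
    }
}

static int parse_expr(void)
{
    int value = parse_term();
    for (;;) {
        skip_spaces();
        if (*cursor == '+')      { cursor++; value += parse_term(); }
        else if (*cursor == '-') { cursor++; value -= parse_term(); }
        else return value;           /* folding left to right: left assoc. */
    }
}

/* Evaluate an expression string; not reentrant because of cursor. */
int evaluate(const char *text)
{
    assert(text != NULL);
    cursor = text;
    return parse_expr();
}
```

With this grammar, "3 + 4 * 5" evaluates to 23 because the multiplication is grouped first, and "10 - 4 - 3" evaluates to 3 because it is grouped as (10 - 4) - 3.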
State Machine Example
A Simple Calculator (FSA – Finite State Automaton)
Ted Hoff – faced with changing requirements, he is considered, along with Stanley Mazor, the inventor of programmable ICs, or the microprocessor! (Robert Noyce Biography)
Take Away
Software and in Particular Interpreters Give Us Flexible Syntax and Semantics
Simpler to Change Front-end Syntax and Back-end Semantics Compared to State Machine
Realized on Project at Intel to Build Calculator for Japanese Customer
– Intel Really Wanted to Sell Japanese Memory Chips (at the time)
– Needed to Keep up with Requirements Changes on Logic
– Inspired the Intel 4004, Considered First Microprocessor
References
1. Stoy, Joseph E. Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory. MIT Press, 1977.
2. Copeland, B. Jack. "The Church-Turing Thesis." Stanford Encyclopedia of Philosophy (2002).
3. Van Emden, Maarten H., and Robert A. Kowalski. "The semantics of predicate logic as a programming language." Journal of the ACM (JACM) 23.4 (1976): 733-742.
4. Hoare, Charles Antony Richard. "An axiomatic basis for computer programming." Communications of the ACM 12.10 (1969): 576-580.