Compilers
Purpose of Course showclose
Because we have compiler programs, software developers often take the process of compilation for granted. However, as a software developer, you should cultivate a solid understanding of how compilers work in order to develop the strongest code possible and fully understand its underlying language. In addition, the compilation process comprises techniques that are applicable to the development of many software applications. As such, this course will introduce you to the compilation process, present foundational topics on formal languages and outline each of the essential compiler steps: scanning, parsing, translation and semantic analysis, code generation, and optimization. By the end of the class, you will have a strong understanding of what it means to compile a program, what happens in the process of translating a higher-level language into a lower-level language, and the applicability of the steps of the compilation process to other applications.
Course Information showclose
Welcome to CS304. Below, please find some general information on the course and its requirements.
Course Designer: JM Perry
Primary Resources:
Note: The MIT, Stanford, and Berkeley resources overlap. However, they reinforce each other, have differences in presentation, examples, and topic emphasis, and provide the opportunity to review the material. Moreover, they offer a choice of perspective and style, which some students may find appealing. There are several ways to use these resources. All students should use the Mogensen text. Of the MIT, Stanford, and Berkeley resources, a student may 1) choose the one whose style is most appealing and stick with it, occasionally using the others where they are beneficial, 2) use all three resources totally, or 3) for each topic/subunit select the resource you find most helpful. Lastly, the Berkeley resource includes video lectures, which can be useful for explaining certain points of a topic.
- Open courseware from MIT: 6.035 Computer Language Engineering
- Open courseware from Stanford University: CS143 Compilers
- Open courseware from Berkeley: CS164 Programming Languages and Compilers Video Lectures and Lecture Notes.
- Torben Ægidius Mogensen’s textbook, Basics of Compiler Design
- Wikipedia
Requirements for Completion: Passing grade on quizzes and the final exam; achieve the objectives of each unit and the goals of the course; ability to apply the concepts taught in the course.
Time Requirements: This course should take approximately 146 hours to complete.
Tips/Suggestions: As you study the topics in the course, keep in mind the application of compiler techniques and tools to the development of other types of software.
Learning Outcomes showclose
- Describe the compilation process and explain the function of the components that comprise the structure of a compiler.
- Apply concepts of formal languages and finite-state machines to the translation of computer languages.
- Identify the compiler techniques, methods, and tools that are applicable to other software applications.
- Describe the challenges and state-of-the-practice of compiler theory and practice.
Course Requirements showclose
√ Have access to a computer.
√ Have continuous broadband Internet access.
√ Have the ability/permission to install plug-ins or software (e.g., Adobe Reader or Flash).
√ Have the ability to download and save files and documents to a computer.
√ Have the ability to open Microsoft files and documents (.doc, .ppt, .xls, etc.).
√ Have competency in the English language.
√ Have read the Saylor Student Handbook.
√ Have a basic mastery of the material in CS101, CS102, CS201, CS202, and CS303.Unit Outline show close
Expand All Resources Collapse All Resources
-
Unit 1: Introduction to Compilers
The compilation process is one of the stepsin executing a program. Understanding how compilerswork and what goes on “behind the scenes” will help you get better at developing software. This unit will first provide you with an introduction to the compiler, its history, compiler structure and design, and the types of compilers. By the end of this unit, you will be able to describe the steps of the compilation process.
Unit 1 Time Advisory show close
Unit 1 Learning Outcomes show close
-
1.1 System View of Compilers
Note: A compiler is a complex software system, and, as such, has a system life cycle that starts with a need, transitions to design and construction/implementation, undergoes testing, transitions to use in its intended environment, undergoes maintenance over its life, and ends with “disposal” and archival storage. The activities, various work products, and documentation that comprise the processes used during the system’s life must be carefully planned. The part of the system life cycle that deals with the design and construction of the compiler is called the development life cycle. This course focuses on the development life cycle, but the overall system life cycle must be kept in mind.
The term “compiler process” is related to, but different from the system (life cycle) process. The “compiler process” refers to the processing that a compiler does when it performs its function of translation of a source program to a target program. It also refers to the approach taken to design a compiler. Design, in fact, is one activity in the system (life cycle) process. - 1.2 Overview of Compilers
-
1.2.1 Compiler Definitions, Terminology, and Anatomy of a Compiler
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 1: “Introduction”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 1: “Introduction” (PDF)
Instructions: For a definition of a compiler and some terminology, study slides 13–26. For an anatomy of a compiler see slides 27–47. For examples of optimization see slides 48–76. These slides have good examples of compiler output for a given input and a lot of examples of optimizations. A compiler translates a high-level language to a low-level language.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 00: “Introduction to Compilers”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 00: “Introduction to Compilers” (PDF)
Instructions: For a slightly different overview of compilers, select the first link in the lecture column on the far right labeled 00: Introduction to Compilers. Please study slides 8–47. These slides make an analogy between compilation and the equivalence-preserving transformations of an electrical circuit.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Wikipedia’s “Compilers”
Link: Wikipedia’s “Compilers” (PDF)
Instructions: Read the short paragraph on Related Techniques, which defines related terms: language translator, assembler, disassembler, and decompiler.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 1: “Introduction”
-
1.2.2 History
- Reading: Wikipedia’s “Compiler History”
Link: Wikipedia’s “Compiler History” (PDF)
Instructions: Please read the interesting history of the early compilers and computer pioneers, such as Grace Hopper, who worked on the first COBOL compiler.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Reading: Wikipedia’s “Compiler History”
-
1.2.2.1 Translation
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read sections 1.1, 1.2, and 1.3. Note that “compile” means to translate from a high-level language, used by humans, to a low-level language used by a computer. A high-level language uses concepts, objects, and tasks performed by a human, whereas a low-level or machine language uses concepts, objects, and tasks performed by machines. This reading also applies to the subsections 1.2.2.1.1 – 1.2.2.1.3, 1.2.2.2, and to section 1.3 of this course.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction”
-
1.2.2.1.1 Translate Source Language to Target Language
Note: The reading for this subsection is covered by the reading for 1.2.2.1 above. The input language to a compiler, typically a programming language, is called the source language. The output language of a compiler is called the target language or object code, typically an assembly language or machine language.
-
1.2.2.1.2 Object Code and Executables
Note: The reading for this subsection is covered by the reading for 1.2.2.1.
-
1.2.2.1.3 Platform Independent Compilers
Note: The reading for this subsection is covered by the reading for 1.2.2.1 above. In the textbook reading, the author describes seven phases of a compiler. The middle phase is called Intermediate Code Generation. The intermediate code is independent of a particular target machine. One of the back-end phases is called Machine Code Generation, which translates the machine-independent code to machine-dependent code for a particular computer.
-
1.2.2.2 Comparison between Compiler and Interpreter
Note: The reading for subsection 1.2.2.1 above covers this subsection.
-
1.2.3 Other Applications
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read section 1.4, which gives reasons for studying compilers. In the discussion, the author suggests other applications for compiler techniques and tools. He specifically mentions domain-specific languages. Other applications derive from techniques and methods used in compilers. For example, if the input to a software system can be described using a formal grammar, then lexical- and syntax-phase techniques may be the most effective and efficient to use for inputting, verifying, and representing the data for processing. Front-end techniques can also be applied via software tools to the enforcement of development and programming standards and procedures. Knowledge of compiler techniques and methods helps in the design of new programming languages, new hardware architectures, generation of software applications, and the design of software tools.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 1: Introduction”
-
1.2.4 Hardware Compilation
- Reading: Wikipedia: “Compilers”
Link: Wikipedia: “Compilers” (PDF)
Instructions: Please read the short paragraph on Hardware Compilation. Note that hardware compilation refers to the translation of a software program to a hardware representation.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Reading: Wikipedia: “Compilers”
- 1.3 Systems Process for Development of a Compiler
-
1.3.1 Need for Compilers
Note: The reading in section 1.2.2.1 above applies to this section. The need for compilers arises from the need for high-level languages, which are more relevant for problem solving and software development by humans. Moreover, high-level languages are machine independent.
-
1.3.2 Development Life Cycle Process for a Compiler
Note: Since development of a compiler is a relatively complex system-development effort, typically having many users and developers, and will be maintained over a life of many years, a formal process should be used for its development. The development process should extend from requirements through verification and validation, and should include reviews, tests, analysis and measures, quality assurance, configuration control, and key documentation. The development process is a part of the overall system life cycle process, which additionally, includes deployment, maintenance, disposal and archival storage. The compiler development process should consist of procedures for writing and documenting the needs and requirements, the architecture and design, construction, integration, and verificaton and validation of the compiler. Documentation should also include the formal foundationsand techniques used, tradeoffs made, alternatives evaluated, and the chosen alternative for design or implementation. Full coverage of all of these indetail is beyond the scope of this course. As we proceed through this course, however, we include high-level needs, requirements, functions, performance considerations, and verification and validation issues for a compiler and its parts.
- 1.4 Compiler Design Overview
-
1.4.1 One-Pass vs. Multi-Pass
- Reading: Wikipedia: “One-pass versus multi-pass compilers”
Link: Wikipedia: “One-pass versus multi-pass compilers” (PDF)
Instructions: Please read the paragraph on one-pass and multi-pass compilers. This distinction was more relevant in the early days of computing, when computer memory was limited and processing time was much slower compared to current computers. This reading applies to all the subsections of 1.4.1.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Reading: Wikipedia: “One-pass versus multi-pass compilers”
-
1.4.1.1 One-Pass
Note: A “pass” refers to reading and processing the input, i.e., source or source representation. The reading for 1.4.1 gives the rationale for a one-pass design—namely, a one-pass compiler is simpler to implement. However, more sophisticated processing, such as optimizations, may require multiple passes over the source representation.
-
1.4.1.2 Multi-Pass
Note: The reading for 1.4.1 above includes information on a multi-pass design. The compiler is composed of components, which perform the steps of the compilation process. Some of the components may make a separate pass over the source representation—thus, the name of multi-pass. Multi-pass compilers are used for translating more complex languages (for example, high-level language to high-level language), and for performing more complete optimizations. Moreover, multi-pass compilers are easier to verify and validate, because testing and proof of correctness can be accomplished for the smaller components. Multi-pass compilers are also used for translation to an assembler or machine language for a theoretical machine, and for translation to an intermediate representation such as bytecode. Bytecode, or pcode (portable code), is a compact encoding that uses a one-byte instruction, and parameters for data or addresses, to represent the results of syntax and semantic analysis. The bytecode or pcode is independent of a particular machine. It is executed by a virtual machine residing on the target machine.
-
1.4.2 Structure of a Compiler
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 01: “CS143 Course Overview”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 01: “CS143 Course Overview” (PDF)
Instructions: Select the link and read the CS143 Course Overview handout (not the lecture). You have studied the lecture “Introduction to Compilers” as part of section 1.2.1 above. The handout gives an overview of the structure of a compiler with a good explanation. This handout also applies to subsections 1.4.2.1 and 1.4.2.2 below.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 01: “CS143 Course Overview”
-
1.4.2.1 Front End
Note: The Course Overview handout, referenced in section 1.4.2 above, pages 2–5, to the end of Intermediate Code Generation, describes the front-end process, or analysis stage, of the compilation process. The intermediate code used as an example is TAC, three-address code. The front end, or analysis, consists of lexical analysis, syntax analysis or parsing, semantic analysis, and intermediate code generation. Lexical analysis may be preceded by preprocessing, depending on the language and the development environment.
-
1.4.2.2 Back End
Note: The Course Overview handout, referenced in section 1.4.2 above, also applies to this subsection. Read the handout pages 5–7, up to Historical Perspective. These pages explain the back-end, or synthesis state, of the compilation process, which includes intermediate code optimization, object code generation, and object code optimization. The symbol table and error-handling topics apply to both the front end and to the back end. The section on one-pass versus multi-pass complements the previous explanation, in sections 1.4.1.1 and 1.4.1.2, of the number of “passes” that a compiler uses.
-
1.4.2.3 Some More History
Note: The Course Overview handout has a very good section on history, which helps us understand why the structure of a compiler is as described in previous sections. Read the Historical Perspective section, which shows the important influence of programming languages on the development of compilers.
-
Unit 2: Formal Languages and Formal Grammar
Formal languages and formal grammars are the theoretical foundation for computer languages and the compilation process. Formal languages are defined by their grammars, which specify the syntax of the languages.
Unit 2 Time Advisory show close
Unit 2 Learning Outcomes show close
-
2.1 Motivation for Formal Languages
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars” (PDF)
Instructions: Study slides 2 – 8, and slides 38 – 41. Regular and context-free languages are introduced. Also, read slides 68 – 70.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 08: “Formal Grammars”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 08: “Formal Grammars” (PDF)
Instructions: Select the Handout on Formal Grammars and read pages 10 – 11, which give some history of FORTRAN and ALGOL.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars”
-
2.2 Formal Grammars
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing” (PDF)
Instructions: Please study slides 41–67.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 03: “Lexical Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 03: “Lexical Analysis” (PDF)
Instructions: Select the Handout on Formal Grammars and read pages 1 – 10. Formal languages are defined by formal grammars. Regular and context-free grammars are applied in scanning and parsing of programming languages. Formal grammars define the syntax of a formal language.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing”
-
2.3 Syntax of Formal Languages
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 6: “Parsing”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 6: “Parsing” (PDF)
Instructions: Select the link for Lecture 6 and study the slides.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 6: “Parsing”
-
2.4 Semantics of Formal Languages
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 16 “Static Semantics Overview”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 16 “Static Semantics Overview” (PDF)
Instructions: Select the PDF link for lecture notes 16 and study slides 1 – 7. Note that these notes are associated with lecture 17.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 17”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 17" (YouTube)
Instructions: Listen to a portion of the video lecture 16 at location 12:12 (12 minutes and 12 seconds into the lecture) to 15:00, where Professor Hilfinger talks about semantics.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Assessment: MIT: S. Amarasinghe and M. Rinard, 6.035 Computer Language Engineering: “Practice Quiz 1”
Link: MIT: S. Amarasinghe and M. Rinard, 6.035 Computer Language Engineering: “Practice Quiz 1” (PDF)
Instructions: Select the PDF and complete exercises 1, 2, and 6. In exercise 1, just do the regular expression part of this exercise, which will give you practice in writing regular expressions for a regular language and in forming an FSA (finite state automaton—assessment 2) to recognize the language. The exercise takes the following steps:
regular language → regular expression
For exercise 2: Again we are given a regular language and asked to derive a regular expression that defines the language. L = {0i 1j | i is even and j is odd}. Then, complete exercise 6 as listed.
The reference's solutions are given at the end of the PDF. Additional exercise and solution information is available in The Saylor Foundation’s “Answer Key to CS304 Assessment 1.” (PDF)See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 16 “Static Semantics Overview”
-
Unit 3: Finite State Machines
Finite state machines (FSM), also called finite state automata (FSA), are conceptual models for recognizing, parsing, and generating strings in a formal language. An FSM can be used to recognize (i.e., determine) whether a string adheres to the syntax of a language. Moreover, an FSM can be used to build a syntax tree, which shows the derivation (i.e., how the string was constructed) of the string. This unit introduces (or reviews) FSMs, which are covered in detail in other courses (for example, CS202: Discrete Structures).
Unit 3 Time Advisory show close
Unit 3 Learning Outcomes show close
-
3.1 Definitions
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars” (PDF)
Instructions: Study slides 9 – 33. These slides describe and give examples of regular languages/regular expressions and their corresponding finite state machine.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 2 “Lexical Analysis, Regular Expressions” and Notes for Lecture 3 “FSA”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 2 “Lexical Analysis, Regular Expressions” (PDF) and Notes for Lecture 3 “FSA” (PDF)
Instructions: Select the PDF link for lecture notes 2 and study all the slides. Then, select the PDF link for lecture notes 3 and study slides 1 – 5.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars
-
3.2 Some Results and Examples of FSAs
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars” (PDF)
Instructions: Study slides 34 – 76. These slides give results and examples of conversion from a non-deterministic finite state automaton (NFA) to a deterministic finite state automaton (DFA); regular expressions or strings; context free grammars and pushdown automata (PDA); and context sensitive grammars and Turing machines. They, also, introduce parsing and abstract syntax trees.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 3 “FSA”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 3 “FSA” (PDF)
Instructions: Select the PDF link for lecture notes 3 and study slides 6 and 7.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 2: “Specifying languages with regular expressions and context-free grammars”
-
3.3 Some Applications of FSA
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 3 “FSA”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 3 “FSA” (PDF)
Instructions: Select the PDF link for lecture notes 3 and study slides 8 to the end. These notes repeat some of the material from section 3.2 and provide additional practice in applying FSAs.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Assessment: MIT: S. Amarasinghe and M. Rinard, 6.035 Computer Language Engineering: “Practice Quiz 1”
Link: MIT: S. Amarasinghe and M. Rinard, 6.035 Computer Language Engineering: “Practice Quiz 1” (PDF)
Instructions: Select the PDF for Practice Quiz 1. In this assessment, we will extend and do exercises 1, 2. In addition, you will need to complete exercise 3. For exercise 1: Do the FSA part of this exercise. This exercise gives you practice in forming an FSA (finite state automaton) to recognize a language. The exercise takes the following steps:
regular language → regular expression → FSA that recognizes the given language
For exercise 2, give the FSA for the regular expression of this exercise. Then, complete exercise 3 as listed.
The reference's solutions are given at the end of the PDF. For the exercises on FSAs, you should use paper and pencil to draw the FSA as it is formed. Additional exercise and solution information is available in The Saylor Foundation’s “Answer Key to CS304 Assessment2.”(PDF)See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 3 “FSA”
-
Unit 4: Scanning and Lexical Analysis
Lexical analysis is performed by a scanner, one of the front-end components of a compiler. The foundation for lexical analysis is provided by regular grammars and finite state automata. This unit studies scanners and lexical analysis in terms of development process products: requirements, functions, design, construction, and test. The verification of a scanner is done through testing. Validation is based on the programming language specifications, and operation of the scanner as a component of the compiler or application system that uses it.
Unit 4 Time Advisory show close
Unit 4 Learning Outcomes show close
-
4.1 Lexical Analysis Introduction and Overview
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video: “Lecture 2”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video: “Lecture 2” (YouTube)
Instructions: Please watch from the 4-minute mark to the 30-minute mark, and from the 37-minute mark to the end. This video gives you a glimpse into lexical analysis, which will be studied in more depth in the remaining sections of this unit.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video: “Lecture 2”
-
4.2 Requirements for a Scanner
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis” (PDF)
Instructions: Select the PDF for Handout 3 and read the Basics on pages 1–2. Also, study the lecture, slides 1 – 12. This reading also applies to section 4.3.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis”
-
4.3 Functions of a Scanner
Note: The reading in 4.2 also applies to this subunit. Pages 1 and 2 of the notes includescanner functions. Also, study lecture slides 13 – 46.
-
4.4 Review of Regular Expressions, FSAs, and Regular Languages
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 2: Lexical Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 2: Lexical Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please look over Chapter 2 on Lexical Analysis. If you are comfortable with this material just review it. If you feel you are somewhat uncomfortable with this material, read or study it to become comfortable with it; it is the foundation for much of our work on compilers. This reading applies to subsections 4.4.1 – 4.4.4.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 2: Lexical Analysis”
-
4.4.1 Regular Expressions to NFAs
Note: The reading of 4.4 applies to this subsection. A regular expression has associated non-deterministic finite automata (NFA) that accept it.
-
4.4.2 NFAs to DFAs
Note: The reading of 4.4 applies to this subsection. An NFA can be converted to a DFA, which is a finite automaton where the state transition is determined by the input symbol and the current state.
-
4.4.3 Space and Speed Optimization of DFAs
Note: The reading of 4.4 applies to this subsection. The recognition of a regular expression in a regular language can be done in several ways: by operating on the expression directly, using an NFA, or using a DFA. Also, an automaton can be transformed to an equivalent having fewer states. The size (that is, the number of states) and the speed of the automaton determine the most efficient way to recognize a regular expression. Note in section 2.7 of the reading the time estimates for processing an expression by a DFA and by an NFA.
-
4.4.4 Power of Regular Languages
Note: The reading of 4.4 applies to this subsection also. Regular languages include many languages and can be used for many applications, including scanners. Further, given regular languages, their union, concatenation, repetition, set difference, and intersection are also regular languages—we say that regular languages are closed under these operations. Regular languages can be expressed, equivalently, using regular expressions, NFAs, or DFAs. A key limiting characteristic of regular languages is seen from DFAs. A DFA is a finite automaton, and, thus, can remember only a finite number of symbols. Hence, a DFA cannot recognize a string of the type an b cn, for any n (because for any n it will require an infinite number of states to remember the number of a’s). Most computer languages are not regular and we will need to use a larger formal language class to parse them.
-
4.5 Design of a Scanner
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis” (PDF)
Instructions: Select the PDF for Handout 3 and read from page 3 to the end, including Scanner Implementation 1 and 2, and the FORTRAN I Case Study. Also, study the lecture, slides 46–216.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 3 and Lecture 1: “Lexical Analysis”
-
4.6 Construction of a Scanner
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 2 “Lexical Analysis, Regular Expressions”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 2 “Lexical Analysis, Regular Expressions” (PDF)
Instructions: Select the PDF link for lecture notes #2, Lexical Analysis and Regular Expressions. Study all of its slides.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 2”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 2” (YouTube)
Instructions: Select the video link and watch Lecture 2.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 4 and Lecture 2: “Introduction to FLEX”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 4 and Lecture 2: “Introduction to FLEX” (PDF)
Instructions: Select the PDF for Handout 4 and study the notes. Also, select the PDF for Lecture 2 and study the Introduction to FLEX slides. FLEX is a scanner generator. It produces a scanner, given a description of the patterns to be identified and actions to take for each token.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 2: Lexical Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 2: Lexical Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read section 2.9.1. Scanners are usually not written anew, but are generated by tools called scanner (or parser) generators.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 4 “FLEX in a Nutshell” and Lecture 2: “Introduction to FLEX”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 4 “FLEX in a Nutshell” (PDF) and Lecture 2: “Introduction to FLEX” (PDF)
Instructions: Select the PDF for Handout 4 and also Lecture 2 and read about FLEX.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Readings Lecture 2 “Flex Manual”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Readings Lecture 2 “Flex Manual” (PDF)
Instructions: Select the PDF link for the reading for the “Flex Manual,” which contains details on the Flex scanner generator.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 2 “Lexical Analysis, Regular Expressions”
-
4.7 Verification and Validation of a Scanner
Note: Scanners for production compilers or software systems must be verified (by proof, demonstration, review, or test, that shows that the requirements, design, and performance specifications are met) and validated (also by proof, demonstration, review, or test that shows that the scanner satisfies the needs of its users and its role in a larger containing system—either compiler or system application). The amount of effort expended in verifying and validating a scanner is dependent on the purpose and intended use of the scanner. For both verification and validation, the description of the input language must be shown to be correct. If the scanner is generated, then the quality of the scanner depends on the reliability and effectiveness of the generator.
- Assessment: ePaperPress: Tom Niemann's Tutorial on LEX and YACC
Link: ePaperPress: Tom Niemann's Tutorial on Lex and Yacc (PDF)
Instructions: Click on the link above to access the ePaperPress website. Then, click on the hyperlink for the PDF format. In this exercise, you will be introduced to LEX and YACC, which stand for Lexer (short for lexical analyzer) and Yet Another Compiler-Compiler. They are generators, i.e. programs that generate other programs, in this case, a scanner and a parser. Our focus in this Unit 4 Assessment will be on LEX, actually, FLEX – which stands for Fast Lexer. Read the entire PDF tutorial on LEX; it provides a concise description of LEX and YACC. Then, please follow these instructions for each exercise:
Exercise #1: Draw a high level block diagram depicting LEX and a C compiler, together with the input and output files for each of them.
Exercise #2: Give the LEX input file for replacing 00 with the text 'error' in the language L of Exercise #1 from Assessment 1.
Exercise #3 (optional): For this exercise, we will be using several software tools: a generator for scanners (lexical analyzers), a C compiler, and text editor for creating input files. We will use a scanner generator, called FLEX for creating scanners. FLEX is based on the earlier, original LEX. There are several good compilers available, cyngwin and gcc. We will use gcc. There are, also, several editors for creating input files; we'll use Notepad++.
a) Install FLEX on your computer. Installation guidance is given below.
b) If you do not have a C compiler, install a C compiler on your computer. Instructions for installing the gcc compiler can be found on YouTube linked here.
c) Check your answer to Exercise 2 by using FLEX and your C compiler to create a scanner; test the scanner by running it on a collection of sample input strings that are in the language of Exercise #1 from Assessment 1. Also, run it on some strings that are not in the language of Exercise #1.
You can check your answers with The Saylor Foundation’s “Answer Key” (PDF). This assessment should take approximately 3 hours to complete (not including installation of Flex and the Gcc compiler).
Installation Notes for FLEX, Bison, and the Gcc (Gnu Compiler Collection) Compiler
These notes give suggestions on the installation of FLEX, Bison (open source software for the original Lex and Yacc), and the gcc compiler.
To download and install FLEX (used for Exercise #2 and Exercise #3, and Bison used for Exercise #3):
1) Use your search engine to find a free software version of Flex and Bison to download; for example, type in 'free software foundation fast lexical analyzer' or 'free software foundation software for Yacc;’ look over the results and select a copy of FLEX and Bison to investigate. Be careful to avoid risky sites.
Look for a specific version of FLEX, for example, search for 'FLEX for Windows.'
OR
2) Go to the location here for Flex, here for a Windows version for Flex, and here for Bison.
3) After selecting a 'reliable' site, such as gnu.org, look for a version of FLEX for your operating system and follow the instructions for downloading and installing it, including documentation. This should also give you Bison. After downloading FLEX, if Bison is not included, repeat the steps 1), 2), 3) for Bison.
4) For a C compiler, you may already have one installed on your machine. If not use your search engine to search for “cyngwin C compiler” or “gcc C compiler” and look for a free software version that you can download. Be caution of risky web sites. For the gcc compiler, for example, search for “codeblocks.org gcc compiler”. Some of the results may refer to mingcc (minimal gcc compiler), which will work. Installation of a compiler can be involved and the use of an installer is recommended. A good tutorial on the installation of the gcc compiler can be viewed on YouTube.
Make sure that the version of FLEX or LEX, and Yacc and Bison, you obtain are compatible.
You can check your answers with The Saylor Foundation’s “Answer Key” (PDF)
Terms of Use: Please respect the copyright and terms of use on the webpages displayed above.See a broken link? Please let us know!
- Assessment: ePaperPress: Tom Niemann's Tutorial on LEX and YACC
-
Unit 5: Parsing and Syntax Analysis
The next step of the compilation process is parsing. This step also has a foundation in formal languages and automata. Parsing takes input from the Lexical Analysis step and builds a parse tree, which will be used in future steps to develop the machine code. In this unit, we will define parsing and identify its uses. We will also discuss two parsing strategies, Top-Down Parsing and Bottom-Up Parsing, examining what it means to approach parsing from each standpoint and taking a look at an example of each. By the end of the unit, you will understand parsing techniques with regards to compilers, and be able to discuss each of the two main approaches.
Unit 5 Time Advisory show close
Unit 5 Learning Outcomes show close
-
5.1 Parser Introduction and Overview
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read through section 3.1.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes Lecture 6 “Parsing”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes Lecture 6 “Parsing” (PDF)
Instructions: Select the PDF link for the Notes for Lecture 6 on parsing. Again, this material overlaps some of the readings later in the class. However, they add additional information and have a practical perspective.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
-
5.2 Requirements of a Parser
- Reading: Wikipedia: “Parsing”
Link: Wikipedia: “Parsing” (PDF)
Instructions: Please read the following three paragraphs: “Programming languages,” “Overview of the process,” and “Types of Parsers.” Requirements of a parser include: build internal representation of the input tokens (which come from the scanner); check that the input string of tokens is legal in the language of the compiler (i.e., that it can be derived from the start symbol); and determine the derivation steps for the input string of tokens. In addition, the parser should be reliable, efficient (how efficient depends on the intended use), user friendly (i.e., provide clear, accurate messages), and supportable (assuming that the parser will be used for a long time).
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Reading: Wikipedia: “Parsing”
-
5.3 Functions of a Parser
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read subsections 3.4 – 3.6. The functions of a parser include: building an internal representation of the derivation tree and related parser information, and resolving ambiguities of the language pertaining to the input string of tokens.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 3: “Top Down Parsing”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 3: “Top-Down Parsing” (PDF)
Instructions: Select the PDF for Lecture 3 and read slide 4. The first bullet is a requirement statement and the third bullet is a function statement. An additional function is the output of meaningful and accurate messages, including error messages.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
-
5.4 Formal Language Considerations
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Note that this subsection is just a review of what has been covered in units 2 and 3. Select the link and click “Get the book contents” to download a PDF version of the book. Please read subsections 3.2 and 3.3. The main idea is that context free languages are used to build efficient parsers, but are supplemented with special techniques to resolve ambiguities.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
- 5.5 Design of a Parser
-
5.5.1 Top-Down Parsers
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please study 3.7 – 3.13, on pages 68 – 88.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 9 and Lectures 3 and 4: “Top-Down Parsing”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 9 and Lectures 3 and 4: “Top-Down Parsing” (PDF)
Instructions: Select the PDF for Notes 9 and Lectures 3 and 4 on Top-Down parsing.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 4: “Top-Down Parsing”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 4: “Top-Down Parsing” (PDF)
Instructions: This material overlaps some of the previous readings. However, it presents the material using examples. Scan over the slides, studying those that you feel will benefit you.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original version of this article here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 8 “Top-down parsers.”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 8 “Top-down parsers.” (PDF)
Instructions: Select the PDF links. This material overlaps previous readings, but provides a practical view. Look over the material and study the parts you feel will benefit you.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 7” and “Lecture 8”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 7” (YouTube) and “Lecture 8” (YouTube)
Instructions: Please view both lectures.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
-
5.5.2 Bottom-Up Parsers
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please study from 3.14 – 3.18, on pages 88 – 104.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 10 and 11 and Lectures 4, 5, 6: “Bottom-Up Parsing”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 10 and 11 and Lectures 4, 5, 6: “Bottom-Up Parsing” (PDF)
Instructions: Select the PDF for Notes 10 and 11, and Lectures 4, 5 (1–263), 6 (1–47) on Bottom-Up parsing.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing” (PDF)
Instructions: This material overlaps some of the previous readings. However, it presents the material on bottom-up parsing using examples. Scan slides 1–59, studying those that you feel will benefit you.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original version of this article here (HTML).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lectures 10 and 12 “Parsing: Earley’s Algorithm” and “Bottom-up parsing”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lectures 10 and 12 “Parsing: Earley’s Algorithm” (PDF) and “Bottom-up parsing.” (PDF)
Instructions: Select the PDF links. This material overlaps and reinforces previous readings, and, in addition, provides a practical view. Look over the material and study that which you feel will benefit you.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 10,” “Lecture 11,” “Lecture 12,” “Lecture 13,” “Lecture 14”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 10,” (YouTube) “Lecture 11,” (YouTube) “Lecture 12,” (YouTube) “Lecture 13,” (YouTube) “Lecture 14” (YouTube)
Instructions: View the entirety of the lectures above.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
-
5.6 Construction of a Parser
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please study from 3.15 to the end of Chapter 3.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 12 “Bottom-up parsing.”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 12 “Bottom-up parsing.” (PDF)
Instructions: Select the PDF links. Study slides 36 – 48. Note slide 47, which summarizes in a diagram the various parsing algorithms.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 13,” and “Lecture 14”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 13,” (YouTube) and “Lecture 14” (YouTube)
Instructions: Select the lecture 13 video and view from the 26-minute mark to the end of the video lecture. Also, view lecture 14.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 12, 14, and 16 and Lectures 5, 6, and 7: “Bottom-Up Parsing”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Notes 12, 14, and 16 and Lectures 5, 6, and 7: “Bottom-Up Parsing” (PDF)
Instructions: Select the PDF for Notes 12, 14, and 16 and study them. Select lectures 5 (264 to the end), 6 (48 to the end), and 7 (91 to the end) and read the indicated slides.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing” and “Parse table formats” and Lecture 4: “Intermediate formats”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 3: “Introduction to shift-reduce parsing” (PDF) and “Parse table construction” (PDF)
Instructions: This material overlaps, but also complements, some of the previous readings. Scan over the slides, studying those that you feel will benefit you. In particular, in the introduction to shift-reduce parsing, just look at slides 60 to the end.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original version of this article here (PDF).See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 3: Syntax Analysis”
-
5.7 Verification and Validation of a Parser
Note: The note in section 4.7 on Verification and Validation (V&V) of a Scanner also applies to the V&V of a parser. If a parser is constructed using a parser generator, V&V depends on the V&V of the generator and the correctness of the input to the generator. Formal proof methods, reviews, and testing can be used to verify the correctness of a parser (i.e., that it produces the correct derivation for a legal program and that it outputs correct error messages for incorrect syntax in the program). Formal proof methods alone are not practical for current programming languages, but in conjunction with testing can be used to verify the correctness of a parser. Validation (i.e., satisfaction of the requirements of a parser) requires testing and operational use.
- Assessment: ePaperPress: Tom Niemann's “Tutorial on LEX and YACC”
Link: ePaperPress: Tom Niemann's “Tutorial on LEX and YAC” (PDF)
Instructions: Select the link above and choose the PDF format. In this exercise, you will be introduced to LEX and YACC, which stand for Lexer (short for lexical analyzer) and Yet Another Compiler-Compiler. They are generators, i.e. programs that generate other programs, in this case, a scanner and a parser. Our focus in this Assessment will be on YACC. Select the link for the PDF for Niemann's Tutorial above, and read the part of the tutorial on YACC. This tutorial explains YACC and how YACC and LEX interface. LEX and YACC are the original programs, and just as Flex is an open software version for LEX, Bison is an open software version for YACC.
Exercise #1: Add a block diagram depicting YACC, C compiler, and the connections with their input and out files, and the connections to LEX.
Exercise #2: In Assessment 3, we developed a scanner for recognizing the language L of Exercise #1 of Assessment 1. Here, instead of using LEX, we will use YACC. YACC builds a context free syntax analyzer, i.e. a parser for the language we want to analyze; LEX builds a regular expression analyzer. Because the former is more powerful than the latter, whatever LEX can do, YACC should be able to also do. For this exercise, describe L by a grammar that can be input to YACC to create a parser for recognizing the strings of L.
Exercise #3 (Optional): For this exercise, we will be build the parser for the language L, used in #3 above, to check our solution to #3. We will use several software tools: a generator for scanners (lexical analyzers), a C compiler, and text editor for creating input files, and a parser generator. We will use a scanner generator, called FLEX, based on the original LEX. The particular version of Flex we will use is Win_Flex, a FLEX scanner generator that runs on Windows. There are several good compilers available, cyngwin and gcc. We will use gcc. There are also several editors for creating input files; we will use Notepad++. For the parser generator, we will use Bison, an open source parser generator based on the original YACC. The version of Bison that we will use is Win_Bison, a version that runs on Windows. You may download the executables for Win_Flex and Win_Bison.
You can check your answers with The Saylor Foundation’s “Answer Key” (PDF). This assessment should take approximately 3 hours to complete (not including installation of Bison and the Gcc compiler).
Installation Guidance for installing Flex, Bison, and the gcc compiler were given at the end of the problem statement for Asseesment 3 Exercise #4. Additional guidance is included in the following problem statement.
a) Install Bison on your computer. (FLEX should already be installed from Exercise #3). Installation of Bison may have been done when you installed FLEX. If not, see the Installation guidance in the instructions for Assessment 3. The installation of Bison includes documentation. Good manuals can be found here for FLEX for Windows and here for Bison for Windows.
b) If you do not have a C compiler, install a C compiler on your computer. Instructions for installing the gcc compiler can be found on YouTube.
c) Check your answer to Exercise 2 by using Bison and your C compiler to generate a simple parser for language L – strings of 0's and 1's, not including 00, and ending in 1. Test the parser by running it on a collection of sample input strings that are in the language L. Also, run it on some strings that are not in L.
Terms of Use: Please respect the copyright and terms of use on the webpages displayed above.See a broken link? Please let us know!
- Assessment: ePaperPress: Tom Niemann's “Tutorial on LEX and YACC”
-
Unit 6: Syntax Directed Translation and Semantic Analysis
Semantic Analysis takes input from the parsing process and prepares the code for the code-generation step. In this unit, we will discuss this process in detail, learning about scope and type-checking, type expression, type equations, and type inference.
Unit 6 Time Advisory show close
Unit 6 Learning Outcomes show close
-
6.1 Syntax-Directed Translation and Attribute Grammars
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 5: Interpretation”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 5: Interpretation” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please look over Chapter 5 on interpretation, where execution of a program takes place as the derivation is produced.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 16 “Syntax-directed translation”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 16 “Syntax-directed translation” (PDF)
Instructions: Please study the definitions and examples. Syntax-directed translation and attribute grammars are techniques for using the parser to drive the translation directly. Attributes are properties of grammar symbols, and the attributes take on values. Rules associated with each grammar production specify how to compute the value of the attributes.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 5: Interpretation”
-
6.2 Intermediate Representation
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 5: “Intermediate formats”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 5: “Intermediate formats” (PDF)
Instructions: Please study the slides on intermediate representations. Data, as well as computations and flow of control, have intermediate representations. This reading also applies to subsections 6.2.1 and 6.2.2 below.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original version of this article here (PDF).See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 5: “Intermediate formats”
-
6.2.1 Representation of Program Data—Symbol Tables
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 4: Scopes and Symbol Tables”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 4: Scopes and Symbol Tables” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read Chapter 4 on scope of names and symbol tables.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 4: Scopes and Symbol Tables”
-
6.2.2 Representation of Program Computation
Note: The reading of section 6.2, slides 68 to the end, cover this subsection topic.
-
6.3 Functions of Semantic Analysis
Note: Semantic analysis includes checking the scope of names in a program and checking for type errors.
-
6.3.1 Scope Checking of Names in a Program
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 18 “Semantic analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 18 “Semantic analysis” (PDF)
Instructions: Select the PDF for Handout 18 and read this overview of semantic analysis. See slide 10 for the functions of semantic analysis.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis” (PDF)
Instructions: Select the PDF for Lecture 8 and study slides 5 – 103.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 18 “Semantic analysis”
-
6.3.2 Scope Checking with Inheritance
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis” (PDF)
Instructions: Select the PDF for Lecture 8 and study slides 104 – 137 on scope checking for object-oriented languages.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage aboveSee a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis”
-
6.3.3 Static vs. Dynamic Scope Checking
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis” (PDF)
Instructions: Select the PDF for Lecture 8 and study slides 138 to the end, on scope checking at run-time. Static scope checking occurs at compile-time; dynamic scope checking occurs at run-time. The scope checking in 6.3.1 and 6.3.2 above is static scope checking.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 16 “Static Semantics Overview”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 16 “Static Semantics Overview” (PDF)
Instructions: Select the PDF link. These slides overlap the readings above, but the 10 slides give a nice review of the material.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 8: “Semantic Analysis”
-
6.3.4 Type Checking
Note: Type checking is one of the static (i.e., compile-time) checks of semantic analysis. Others are flow-of-control and uniqueness checks. Checking is done when the intermediate representation is being created.
- Reading: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking”
Link: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read Chapter 6 (pages 133 – 143), which presents an overview of type checking, nicely organized: symbol table environment, type checking for expressions, functions, and then, for programs.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking”
-
6.3.4.1 Type Expressions, Type Equivalence, Type Inference, and What to Check
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 6: “Semantic Analysis”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 6: “Semantic Analysis” (PDF)
Instructions: Please study the slides on type systems and what to check for when building intermediate representations for various language constructs.
Terms of Use: The article above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original version of this article here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 17 “Types”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 17 “Types” (PDF)
Instructions: Select the PDF link. There is overlap between this reading and the above readings. Look over slides 2 through 14 and use them as a review or supplement to the above readings. Slides 15 through 31 give examples of type inference for the languages Prolog, Java, and Python.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 18,” “Lecture 19,” “Lecture 20,” and “Lecture 21”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 18,” “Lecture 19,” “Lecture 20,” and “Lecture 21” (YouTube)
Instructions: Video lectures 18 to 21 correspond to these notes and are optional. They may be helpful in understanding some of the notes.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 6: “Semantic Analysis”
-
6.3.4.2 Type Systems as Proof Systems—Type Checking as Proofs
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 9: “Type Checking”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 9: “Type Checking” (PDF)
Instructions: Select the PDF for Lecture 9 and study the very interesting presentation on type checking by proofs.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 22 “Type Inference and Unification”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 22 “Type Inference and Unification” (PDF)
Instructions: Select the PDF link. Look at slides 1 through 8. The slides are somewhat formal and present a “type calculus.” Use these slides to add to your understanding of types.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 22”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 22” (YouTube)
Instructions: Video lecture 22 corresponds to these notes and is optional. If it adds to the notes and helps you, view it.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 9: “Type Checking”
-
6.3.4.3 Applications of Type Proofs
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 10: “Type Checking II”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 10: “Type Checking II” (PDF)
Instructions: Select the PDF for Lecture 10 and study the application of the type proof system (introduced above) to the detection of type errors.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 10: “Type Checking II”
-
6.3.4.4 Type Equations, Unification and Binding of Type Expressions
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 22 “Type Inference and Unification”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 22 “Type Inference and Unification” (PDF)
Instructions: Select the PDF link. Look at slides 8 through 19. The slides supplement the readings above with type examples. A binding is a substitution of a type expression for a type variable.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 22 “Type Inference and Unification”
-
6.4 Verification and Validation of Semantic Analysis
Note: The note in section 5.7 on Verification and Validation (V&V) of a Parser also applies to the V&V of semantic analysis. By verification, we mean that the semantic analyzer performs a correct analysis relative to the programming language definition. Formal proof methods using a type calculus, reviews, and testing can be used to verify the correctness of the semantic analysis. Verification includes showing that the analyzer performs the required analysis, adheres to the programming language definition, and outputs correct error messages. “Valid” means that the semantic analysis satisfies its end requirements in the overall compilation process. Requirements include processing time and memory space targets, maintainability, and reliability.
- Assessment: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking”
Link: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking” (PDF)
Instructions: Please click on the link above, and select “Get the Book Contents” to download the PDF file of Mogensen’s “Chapter 6: Type Checking.” Complete the following:
Exercise #1: Do Exercise 6.1 in Mogensen’s “Chapter 6 Type Checking” on page 143.
Exercise #2: Do Exercise 6.2 and 6.3 in Mogensen’s “Chapter 6 Type Checking” on pages 143 and 144.
These exercises will give you some practice with semantic analysis and are based on Chapter 6, which covers type checking. Then, please check your answers against the solutions in the Saylor Foundation’s “Answer Key to Assessment 5.” (PDF)
This assessment should take you approximately 3 hours to complete.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Assessment: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 6: Type Checking”
-
Unit 7: Runtime Environment
Runtime environment considerations include organization of the compiled program, storage, building blocks of a compiled program, and different runtime configurations. This unit has some overlap with Unit 8. The emphasis in this unit is on representation of needed data structures and techniques for objects and inheritance. The emphasis of Unit 8 is on instruction generation.
Unit 7 Time Advisory show close
Unit 7 Learning Outcomes show close
-
7.1 Overview of Runtime Environments
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 22: “Runtime Environments”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 22: “Runtime Environments” (PDF)
Instructions: Select the PDF for Handout 22. Read the handout, which discusses data representations and their organization in memory for a program. This reading also applies to subunits 7.1.3 through 7.1.5 below. After semantic analysis, intermediate representations are encoded. This is the last step of the “front-end” of the compilation process. The relationship of front ends to back ends can be many-to-one or one-to-many. In the former case, a single back end is used for several languages, each handled by its own front end. In the latter case, one front end handles the input, source language, and the back end is used for several target machines, each having its own back end.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 22: “Runtime Environments”
-
7.1.1 Functions and Requirements of Intermediate Representation (IR)
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments” (PDF)
Instructions: Select the PDF for Lecture 11. Read the functions of IR generation on slides 6 and 7. Some requirements for IR generation are on slide 8.
Note: This reading applies to 7.1.2. Slides 9 through 18 present a generic design for IR generation. Note: This reading also applies to 7.1.3.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments”
-
7.1.2 Design of the IR Step
Note: The reading of 7.1.1 applies to this subunit.
-
7.1.3 Data Structure Encoding in Memory
Note: The reading of 7.1 applies to this subunit. Encoding of primitive types and arrays are discussed. From the 7.1.1 reading, see slides 19 through 37 for additional coverage.
-
7.1.4 Scope and Extent of Variables
Note: The reading of 7.1 applies to this subunit. Scope of a variable is the lexical area of a program in which the variable can be used. Extent, or lifetime, is the period of time that a variable exists.
-
7.1.5 Runtime Stack and Parameter Passing
Note: In addition to the following reading, the reading of 7.1 applies to this subunit.
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments” (PDF)
Instructions: Select the PDF for Lecture 11. See slides 38 through 118 for a discussion of the stack, activation trees, closure and coroutines, and parameter passing.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 25 “Run-time Organization”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 25 “Run-time Organization” (PDF)
Instructions: Select the PDF link for the notes for Lecture 25. This reading continues the topic of 7.1, with some differences. In this presentation, IR is treated as part of code generation, and it has a lot of detail on the run-time encoding for procedures and functions.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 25,” “Lecture 26,” “Lecture 27”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 25,” (YouTube) “Lecture 26,” (YouTube) “Lecture 27” (YouTube)
Instructions: Please view the lectures above.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 11: “Runtime Environments”
-
7.2 More Complicated Representations: Objects and Inheritance
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 12: “Runtime Environments II”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 12: “Runtime Environments II” (PDF)
Instructions: Select the PDF for Lecture 12. Read the lecture slides. Encoding of objects, inheritance, and dynamic type checking are described, including difficulties and solutions.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 12: “Runtime Environments II”
-
7.3 Encoding of Exceptions and OOP (Object-Oriented Programs)
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 28 “Exceptions, OOP”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 28 “Exceptions, OOP” (PDF)
Instructions: Select the PDF link for the notes for Lecture 28. This reading continues the topic of 7.2.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 28,” “Lecture 29,” “Lecture 30”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 28,” (YouTube) “Lecture 29,” (YouTube) “Lecture 30” (YouTube)
Instructions: Select the link for the three video lectures. Scan over these and listen to the parts that will benefit you. They correspond to the Notes for Lecture 28.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 28 “Exceptions, OOP”
-
Unit 8: Code Generation
This unit is closely related to Unit 7, where the emphasis was on representation of data structures needed for run-time. While there will be some overlap, the emphasis in this unit is on instruction-level intermediate code generation and from intermediate code to target code.
Unit 8 Time Advisory show close
The lastphase (or next to the last phase if there is a code optimization phase) of the compilation process is code generation, where the output from the previous steps is finally translated into machine code, ready to execute on the target platform. In this unit, we will start with a discussion of code generation in general before moving on to a more detailed description of the code generation process. This will include an in-depth discussion of three main areas: Instruction Selection, Instruction Scheduling, and Register Allocation. By the end of this unit, you will have a firm understanding of the code generation process.
Unit 8 Learning Outcomes show close
-
8.1 Introduction to Intermediate Code Generation
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: “Intermediate-Code Generation”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: Intermediate-Code Generation” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read Chapter 7. Read the entire chapter, which discusses generation to a relatively low-level intermediate language. Section 7.9 overlaps some of the previous readings. This reading also applies to 8.1.1.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: “Intermediate-Code Generation”
-
8.1.1 Requirements and Overview
Note: The reading of 8.1 applies to this subunit. Requirements of intermediate-code generation are included in sections 7.1 and 7.2.
-
8.1.2 Examples of Abstract Syntax Tree to Intermediate-Code Generation
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 23: “Intermediate Representation”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 23: “Intermediate Representation” (PDF)
Instructions: Select the PDF for Handout 23. Intermediate representation of a source program may be part of the front end, may be part of the back end, depending on the design of the compiler and its intended use, or may be simply called the “middle part” of the compiler between the front end and back end. Handout 23 covers IR in the context of code generation.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage aboveSee a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 23: “Intermediate Representation”
- 8.2 Detailed Example: Three Address Code (TAC)
-
8.2.1 TAC Language and Generation from an AST
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 24: “TAC Examples” and Lecture 13 “TAC”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 24: “TAC Examples” (PDF) and Lecture 13 “TAC” (PDF)
Instructions: Select the PDF for Handout 24. TAC is a generic assembly language. Read Handout 24. Slides 1 through 149 give examples of TAC statements, function calls and encoding of objects.
Note: This reading also applies to 8.2.2. Slides 150 to 173 give examples of generation of TAC from an AST (abstract syntax tree). Select the link for the video, skim it and, if it is helpful to you, watch it.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 24: “TAC Examples” and Lecture 13 “TAC”
-
8.2.2 Generation of TAC
Note: The reading of 8.2.1 applies to this subunit. The lecture slides 150 through 173 describe the generation of TAC.
-
8.3 Additional Intermediate Language Generation Examples
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 31: “Code Generation”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 31: “Code Generation” (PDF)
Instructions: Select the PDF link for the notes for Lecture 31 on Code Generation. These discuss generation of intermediate code for a stack machine, stack machine with accumulator, and for a TAC machine.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 31”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 31” (YouTube)
Instructions: The video lecture 31 is optional, and is listed as an aid if you have any questions on the notes.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 31: “Code Generation”
- 8.4 Generation of Machine Code
-
8.4.1 Introduction: Intermediate Code to Machine Code
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 8: Intermediate-Code Generation”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 8: Machine-Code Generation” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read Chapter 8. It describes the translation from a low-level intermediate language to machine code. This translation addresses the handling of restrictions on the machine language; for example, a finite number of registers is a restriction of a target machine and its machine language. The intermediate language assumes an unlimited number of registers, i.e., virtual register machine.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 8: Intermediate-Code Generation”
-
8.4.2 Details: Intermediate Code to Machine Code
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 34: “Registers, Functions, Parameters”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 34: “Registers, Functions, Parameters” (PDF)
Instructions: Select the PDF link for the notes for Lecture 34 on Code Generation. Read these excellent notes on generation of machine code from intermediate code.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 9: “Register Allocation” and “Chapter 10: Function Calls”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 9: Register Allocation” (PDF) and “Chapter 10: Function Calls” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please look over Chapters 9 and 10. Chapter 9 overlaps the above reading, and is a supplementary discussion of the problem of mapping a large number of variables used in an intermediate language translation into a smaller number of registers, plus memory locations available on the target machine. Chapter 10 covers function calls using a stack, activation records, and frame pointers. Look over these chapters and read the parts that will add to your understanding of register allocation and function calls.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 32,” “Lecture 33,” “Lecture 34”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lectures: “Lecture 32,” (YouTube) “Lecture 33,” (YouTube) “Lecture 34” (YouTube)
Instructions: Please watch the videos above.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lectures 17 and 18: “Register Allocation” and “Garbage Collection”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lectures 17 and 18: “Register Allocation” (PDF) and “Garbage Collection” (PDF)
Instructions: Select the PDF for Lectures 17 and 18. These are very well-done formal presentations and give a lot of detail. Register allocation, linear scan and Chaitin’s algorithm are explained. Regarding garbage collection, reference counting, mark-and-sweep, and stop-and-copy are explained.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 34: “Registers, Functions, Parameters”
-
8.5 Verification and Validation of Code Generation
Note: As with previous parts of a compiler and previous steps of the compilation process, we are interested in the verification (i.e., does the code generator work correctly) and the validation (i.e., does the code generator satisfy the requirements as part of the compiler). The same kinds of techniques used for the previous parts of a compiler apply, as well, to the code generator.
If our code generator produces intermediate code and we have a known “correct” generator from intermediate code to machine code, then our V&V efforts can focus on the translation to intermediate code. Moreover, if we have a known “correct” compiler from another source language that is similar to our source language and satisfies similar requirements, we could write a simple translator between them and show that the simple translator is correct. Again, if we want to verify intermediate code to assembler code, and we have a known “correct” generator from the same intermediate language to another assembler language, we could write a simple translator between the assembler languages and then verify that translator.
The use of an intermediate language has another benefit: it can make the use of formal proofs of correctness easier than their use on AST to target language generation. Lastly, measurements of process time and memory usage will be needed to validate time and space requirements.- Assessment: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: Intermediate Code Generation”
Link: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: Intermediate-Code Generation” (PDF)
Instructions: Please click on the link above, and then select “Get the book contents” to download a PDF version of the book. Complete the following:
Exercise #1: Do Exercise 7.1 in Mogensen’s “Chapter 7 Intermediate-Code Generation” on page 173.
Exercise #2: Do Exercise 8.2 in Mogensen’s “Chapter 8 Machine-Code Generation” on page 189.
These exercises will give you some practice with code generation and are based on Chapter 7, which covers intermediate code generation, and Chapter 8, which covers machine code generation. Intermediate code can be high level, i.e. close to the input language, or low level, i.e. close to the target machine language, and, hence, though Exercise #2 jumps to Chapter 8, the process of syntax directed translation is the same. These exercises illustrate the process for syntax directed translation to intermediate code or machine code. You can check your answers with the Saylor Foundation’s “Answer Key” (PDF).
This assessment should take you approximately 3 hours to complete.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Assessment: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 7: Intermediate Code Generation”
-
Unit 9: Code Optimization
Simply compiling and executing a program is not enough to get the most out of your code. It is the optimization process that allows your code to run as effectively and efficientlyas possible. In this unit, we will first take a look at optimization, learning what it is and why we are interested in it. Next, we will review different optimization categories, including Peephole, Local, Loop, Language Dependent, and Machine Dependent. We will conclude with a discussion of different optimization techniques. By the end of this unit, you will have a basic understanding of a wide range of optimization techniques and how they improve the effectiveness of your program.
Unit 9 Time Advisory show close
Unit 9 Learning Outcomes show close
-
9.1 The What and Why of Code Optimizations
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 11: Analysis and Optimisation”
Link: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 11: Analysis and Optimisation” (PDF)
Instructions: Select the link and click “Get the book contents” to download a PDF version of the book. Please read Chapter 11. Read the introductory section, up to 11.1.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 11: Analysis and Optimisation”
-
9.2 Fundamentals of Code Optimization
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 36: “Local Optimization”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 36: “Local Optimization” (PDF)
Instructions: Select the PDF link for the notes for Lecture 36, slides 1 through 8. This reading also applies to subsections 9.2.1 through 9.2.4.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 9: “Introduction to Program Analysis and Optimization”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lecture 9: “Introduction to Program Analysis and Optimization” (PDF)
Instructions: Please read the slides, which introduce the topics of this unit.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 36: “Local Optimization”
-
9.2.1 Basic Blocks and CFG (Control Flow Graph)
Note: The reading for 9.2 applies to this subsection.
-
9.2.2 Types of Optimizations
Note: The reading for 9.2 applies to this subsection.
-
9.3 Local Intermediate Code Optimizations: Definitions and Examples
Note: The link for 9.2 applies to this subsection. Read slides 9 through 24. Slide 27 lists some challenges. Slides 25 and 26 apply to subsection 9.5 below on Code Optimization. Peephole optimization is a technique for improving assembly code.
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” and Lecture 14: “IR Optimization”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” (PDF) and Lecture 14: “IR Optimization” (PDF)
Instructions: Select the PDF for Handout 26. Read pages 1 through 10. This material overlaps with the previous reading, but it is a nice narrative summary of local optimizations. Select the PDF for the lecture and read it. The lecture is an excellent unified formal treatment of the topic
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” and Lecture 14: “IR Optimization”
-
9.4 Global Intermediate Code Optimizations: Definitions and Examples
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 37: “Global Optimization”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 37: “Global Optimization” (PDF)
Instructions: Select the PDF link for the notes for Lecture 37. Read the slides. Global optimization uses forward analysis (e.g., constant propagation), which moves information forward, and backward analysis (e.g., liveness), which moves information backward. Note slide 5, which states that dynamic properties of a program are undecidable.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” and Lectures 15 and 16: “Global Optimization” and “Global Optimization II”
Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” (PDF) and Lectures 15 and 16: “Global Optimization” (PDF) and “Global Optimization II” (PDF)
Instructions: Select the PDF for Handout 26. Read pages 10 through 16. This material overlaps the previous reading, but it is a nice narrative summary of global optimizations. In addition, it includes machine optimizations. Note the last section, “Optimization Soup,” which mentions the limitations of optimization. Select the PDF for the lectures and read them. The lectures cover the material in more detail and present it in a unified formal way using semilattices.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 10, 11 and 12: “Introduction to dataflow analysis” and “Foundations of dataflow analysis”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 10, 11 and 12: “Introduction to dataflow analysis” (PDF) and “Foundations of dataflow analysis” (PDF)
Instructions: Please scan over the slides. They repeat some of the material from above readings. However, they are an excellent review, and go into detail on the formalism that underlies global optimizations.
Terms of Use: The resources above are released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). They are attributed to MIT and the original resources can be found here (PDF).See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 37: “Global Optimization”
-
9.5 Code Optimization
Note: Slides 25 and 26 of the reading for 9.3, on peephole optimization, apply to this section.
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization” (PDF)
Instructions: Select the PDF for Lecture 19, slides 1 through 225. This chapter discusses optimizations that depend on the machine. Read about pipelining, data dependency, instruction scheduling, locality optimization, loop reordering, structure peeling, and optimizations for parallelism.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 13, 14, 15, 16, 17: “Introduction to code optimization: instruction scheduling,” “Loop optimization: instruction scheduling,” “More loop optimizations,” “Register allocation,” “Parallelization”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 13, 14, 15, 16, 17: “Introduction to code optimization: instruction scheduling,” (PDF) “Loop optimization: instruction scheduling,” (PDF) “More loop optimizations,” (PDF) “Register allocation,” (PDF) “Parallelization” (PDF)
Instructions: Please scan over the slides. If you find a part that is helpful and adds to your knowledge of optimizations, read it. These slides repeat some of the material from above readings. However, they are an excellent review, concise and clear, and reinforce key points. They also provide additional examples.
Terms of Use: The resources above are released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). They are attributed to MIT and the original resources can be found here (PDF).See a broken link? Please let us know!
- Reading: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 18: “Memory optimization”
Link: MIT: S. Amarasinghe and M. Rinard’s 6.035 Computer Language Engineering: Lectures 18: “Memory optimization” (PDF)
Instructions: Please read the slides on memory optimization. Recall that optimization includes memory as well as speed.
Terms of Use: The resource above is released under an Attribution-NonCommercial-ShareAlike License 3.0 Unported (CC BY-NC-SA 3.0) (HTML). It is attributed to MIT and the original resource can be found here (PDF).See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization”
-
9.6 Verification and Validation of Code Optimization
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” and Lecture 14: “IR Optimization”
Link: Link: Stanford University: Keith Schwarz’s Handout 26: “Code Optimization” (PDF) and Lecture 14: “IR Optimization” (PDF)
Instructions: Select the PDF for Lecture 14. Read slides 18 through 20 and slides 24 through 32. These slides essentially state that an optimization must not change the behavior of the generated code. This includes introducing errors and changing the semantics. The V&V (verification and validation) techniques mentioned for the other parts of a compiler also apply to optimization. In addition, these slides suggest that the formalisms for optimization—e.g., soundness—are directly applicable.
Terms of Use: Please respect the copyright and terms of use displayed on the webpages above.See a broken link? Please let us know!
- Assessment: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 11: Analysis and Optimization”
Link: University of Copenhagen: Torben Ægidius Mogensen’s Basics of Compiler Design: “Chapter 11: Analysis and Optimization” (PDF)
Instructions: Please click on the link above, and select “Get the book contents” to download a PDF version of the book. Complete the following:
Exercise #1: Do Exercise 11.1 parts a) and b) in Mogensen’s “Chapter 11 Analysis and Optimization” on page 254.
This exercise will give you some practice with code optimization and is based on Chapter 11, which covers analysis and optimization. The exercise uses common subexpression elimination as an example of optimization. You can check your answers with the Saylor Foundation’s “Answer Key” (PDF).
This assessment should take you approximately 3 hours to complete.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Handout 26: “Code Optimization” and Lecture 14: “IR Optimization”
-
Unit 10: Compiler Verification and Validation
The verification and validation (V&V) of a compiler consists of the V&V of its parts: scanner or lexical analyzer, syntax analyzer, semantic analyzer, Intermediate Code Generator, Intermediate Code Optimizer, Code Generator, and Code Optimizer. See the V&V notes for each of these. V&V is based on a careful plan, a well-defined compiler process (including version control or more extensive configuration control, and efficient and effective documentation), peer reviews, measurements, (including defect analysis), component tests, formal proofs, and integration tests. It can utilize known correct tools, such as parser generators, code generation templates, and other compilers and compiler parts. In addition, a formal certification of the compiler can be done by an independent organization.
Unit 10 Time Advisory show close
Remember that verification pertains to correctness, which means satisfaction of specifications; and validation pertains to user needs, which means satisfaction of user operational requirements. Verification addresses, for example, correctness of results of scanning, semantic analysis, code generation, and code optimization. Validation includes meeting operational, functional, performance, and physical requirements for processing time and memory space, and also, the “ilities”—reliability, availability, maintainability, and usability.
V&V depends on the complications of the source and target programming languages, the intended use of the compiler, number of front ends to back ends, available tools, run-time environments and machine architecture, and the system that it will interface with.
Finally, certification of a compiler—i.e., compliance to a standard—may be necessary for safety or security of critical programs.
Unit 10 Learning Outcomes show close
- Reading: Wikipedia: “ANSI C”
Link: Wikipedia: “ANSI C” (PDF)
Instructions: Select the link and read about certification of ANSI C compilers.
Terms of Use: Thearticle above is released under a Creative Commons Attribution-Share-Alike License 3.0 (HTML). You can find the original Wikipedia version of this article here (HTML).See a broken link? Please let us know!
- Assessment: The Saylor Foundation’s “Assessment 8”
Link: The Saylor Foundation’s “Assessment 8” (PDF)
Instructions: Please review subunits 4.7, 5.7, 6.4, 8.5, and 9.6 of this course. They discuss verification and validation of lexical analyzers, parsers, semantic analyzers, code generators, and code optimizers, respectively. In this unit (Unit 10) and in these exercises, we combine the ideas of those sections and look at an integrated approach to the verification and validation of the compiler as a whole.
Exercise #1: Outline an approach for verifying software that translates from one programming language to another.
Exercise #2: Outline an approach for validating software that translates from one programming language to another.
This assessmentshould take you approximately 1 hour and 30 minutes to complete.
You can check your answers with the Saylor Foundation’s “Answer Key” (PDF)See a broken link? Please let us know!
- Reading: Wikipedia: “ANSI C”
-
Unit 11: Compiler Summary and Next Steps
This unit concludes our course study of compilers. We summarize the topics we have studied in the course and look ahead to further study to build upon what we have learned in this course. The topics covered may at first seem limited in their application to the development of compilers. However, their application is much broader, and helps us write better programs. The techniques, abstractions, and data structures are applicable to many other applications, including safety, security, high-performance applications, data and control flow analysis, internal or intermediate representation and encoding of structures, and optimization for external dependencies.
Unit 11 Time Advisory show close
Unit 11 Learning Outcomes show close
-
11.1 Compiler Summary
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization”
Link: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization” (PDF)
Instructions: Select the PDF for Lecture 19. Read slides 226 through 234, which provide a concise outline of the topics covered in our course. The link for this reading also applies to 11.2 below.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 39: “Summary”
Link: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Notes for Lecture 39: “Summary” (PDF)
Instructions: Select the PDF link for the notes for Lecture 39. Read the slides.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Web Media: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 39”
Link: YouTube: University of California, Berkeley: Paul Hilfinger’s CS164 Programming and Compilers: Video Lecture: “Lecture 39” (PDF)
Instructions: Select the link for the lecture and watch the video.
Terms of Use: Please respect the copyright and terms of use displayed on the webpage above.See a broken link? Please let us know!
- Reading: Stanford University: Keith Schwarz’s CS143 Compilers: Lecture 19: “Code Optimization”
-
11.2 Next Steps
Note: The link of 11.1 also applies to this subunit. See Stanford slides 235 to 239 and the last slide of the Berkeley notes.
-
Final Exam
- Final Exam: The Saylor Foundation's CS304 Final Exam
Link: The Saylor Foundation's CS304 Final Exam
Instructions: You must be logged into your Saylor Foundation School account in order to access this exam. If you do not yet have an account, you will be able to create one, free of charge, after clicking the link.See a broken link? Please let us know!
- Final Exam: The Saylor Foundation's CS304 Final Exam
Questions? Consult the FAQs!

