Course Overview
This is a graduate course on the design and implementation of
optimizing compilers for modern programming languages and
computer architectures. It emphasizes the use of compiler generation
tools and of algorithms for tree- and graph-based representations
of programs.
A substantial compiler construction project is the principal
assignment in this class.
Prerequisites
Students should be familiar with the following areas (as covered
in the indicated UNC courses):
- Regular and context-free grammars and their recognizing automata
(COMP 181: Models of Languages and Computation)
- Modern programming language concepts (e.g. strong typing, scope
rules, user-defined datatypes) and their implementation techniques
(e.g. storage management, procedure call disciplines)
(COMP 144: Programming Language Concepts)
In addition, familiarity with the general structure and operation of
compilers, interpreters, assemblers, and linkers is recommended but not
required (COMP 140: Translators).
Text and Readings
The general text for the class will be Appel's
Modern Compiler Implementation in Java.
Two other versions of this text, respectively using C and ML as the compiler
implementation language, are interchangeable
with the Java text.
Students who already have a suitable general text on compilers
and plan to implement advanced optimizations may prefer one of the
advanced texts, such as Morgan's Building an Optimizing Compiler
or Allen & Kennedy's Optimizing Compilers for Modern Architectures.
Some material will be covered from supplementary readings
to be distributed in class.
Assignments and Grading
Grades will be based largely on the course project. A small number of
written assignments, together with class participation, contribute up
to 30% of the grade.
Course Project
The standard project is the construction of a compiler for a small
language: Tiger, the mini-language described in Appel's text, or
mini-Java, a subset of the Java language.
A standard compiler should generate code for a MIPS R2000/R3000 processor
to be executed under the SPIM simulator or linked to run on one of the
SGIs remaining in the department.
The project may be performed individually or by a team of two students.
The basic compiler can be extended in a number of different ways:
- Incorporation of additional language constructs
(e.g. abstract data types, generics)
- Inclusion of non-trivial program optimizations
(e.g. register allocation,
common subexpression elimination, strength reduction)
Each extension carries a point score. An extension is expected for a
team of two students, and recommended for individual projects.
In a team project, the point score of each student is 80% of the point
score for the project.
An alternate compiler project (differing in source language,
implementation language, or target language) can be substituted with
permission of the instructor.
The project and team composition must be finalized by September 10.
The project will include intermediate milestones and deliverables.
Syllabus
Topics and approximate number of lectures to be spent on each:
- Introduction (1)
- Lexical Analysis (2)
  - description of lexical structure using regular expressions
  - scanner operation
  - scanner generators: Lex, JFlex
- Syntactic Analysis (5)
  - description of syntactic structure using context-free grammars
  - top-down and bottom-up parsers
  - bottom-up table-driven parsers: LR, SLR, LALR
  - bottom-up parser generators: Yacc, CUP, SableCC
- Semantic Analysis (5)
  - abstract syntax trees (AST): specification and traversal
  - symbol tables and identifier resolution
  - type checking and type inference
- Code Generation (5)
  - intermediate representation trees, basic blocks, and traces
  - ABI: memory layout, register conventions, and parameter passing
  - instruction selection
- Program Optimization (7)
  - control and data flow graphs
  - dataflow analysis
  - register allocation
  - value numbering and common-subexpression elimination
  - loop optimization
  - static single-assignment form
  - dependence analysis
- Special topics [examples] (4)
  - partial evaluation and abstract interpretation
  - optimizing object-oriented languages
  - optimizing for the memory hierarchy
  - vectorization
This page is maintained by prins@cs.unc.edu.
Send mail if you find problems.