Compilers

This is the web-site for the Spring 2020 offering of graduate-level compilers at the University of Virginia.

Logistics

Time
Tue/Thur 15:30-16:45
Location
Mech. Eng. 339
Reading
No required textbook, we will work through the lecture notes developed by Anders Møller and Michael Schwartzbach on Static Program Analysis (SPA), and code (e.g., LLVM APIs, TIP Scala)

Expectations

This is a graduate course that will require you to learn new algorithms and implement those algorithms in multiple settings. This may require a degree of independence that you have not have had to practice in prior studies. This is a course where you will learn both analytically, e.g., by reading, thinking, and discussing, and by doing, e.g., by implementing systems and evaluating them.

Grading

There will be three different kinds of learning experiences in the class that will be graded:
Exercises (42%)
3 programming exercises, in Scala and C++, that add functionality to the TIP compiler and implement an LLVM pass
Pass analysis (12%)
read, understand, and explain an LLVM pass
Project (46%)
a programming project based in LLVM that, e.g., adds features to the existing TIP compiler, extends the TIP language and compiler.

Syllabus

The course syllabus is linked here.

Projects

This is a list of project ideas that will provide the chance to dig in to the details of representing, analyzing, and transforming programs with LLVM. These are based, primarily, on extending the implementation of tipc.
Type Checking
Currently tipc assumes that a program has been type checked by the Scala TIP analyzer. This project would do the type checking within tipc by implementing a union-find engine and the analog of TIP's type analysis.
Using Type Annotations
Running Scala TIP type checker produces a type-annotated TIP program, with a .ttip suffix. Currently tipc does not consume these annotations, but it could be extended to do so and this would pave the way for handling the rest of the TIP language, e.g., records.
Symbolic Traces
TIP includes a ConcolicEngine which mixes execution of the program with computation of the symbolic constraints governing the execution of a path. In this project, you will compute such symbolic constraints by instrumenting LLVM bitcodes and then emitting the constraints. An external script can be used to generate new inputs from the constraints to drive exploration of new program paths.
Arrays
TIP doesn't have these, but you can add them. This will require extending the TIP grammar and type checker. One approach would be to extend Scala TIP parser and type analysis and then consuming the type annotations generated in the .ttip file to perform code generation for array constructs.
Port Analyses
TIP implements a variety of analyses, reimplement several of them as an LLVM pass applied to bitcode generated by tipc. Compare your analysis results to existing LLVM analyses targetting the same problem.
Monotone Dataflow Framework
Build a monotone data flow analysis framework, i.e., providing solver, lattice, and other capabilities as in TIP scala, but instead in LLVM. Test it by building a few different analyses and compare your results to what TIP computes.
You will propose your project and have it accepted by me prior to March 29, 2020. I welcome preliminary conversations about possible projects.

Passes

This is a list of LLVM passes that you can select to study. Interesting studies might involve comparing two related analyses, e.g., constant propagation vs. sparse conditional constant propagation. These are either pure analysis passes or transformation passes that make use of computed analysis results. There are many more examples listed here

Resources

Helpful resources will be posted here as the need arises.

Schedule

The course will be broken into three parts: ~7 weeks of coverage of static program analysis, ~3 weeks covering static analysis in LLVM, and ~1 weeks covering code generation topics. We will also spend time discussing and working on projects. We may adapt the schedule as we go.

Date Topic Reading To Do
01/14/20 Class overview - -
01/16/20 TIP read SPA 1 and 2 -
01/21/20 Types read SPA 3 HW1 type analysis
01/23/20 Lattice Theory read SPA 4 -
01/28/20 Lattice Theory - -
01/30/20 Lattice Theory - -
02/04/20 Data flow analysis read SPA 5.1-5.10 HW1 type analysis due due
02/06/20 Widening and narrowing read SPA 5.11-5.12 HW2 data flow analysis
02/11/20 Widening and narrowing - -
02/13/20 Path sensitive analysis read SPA 6 -
02/18/20 Interprocedural analysis read SPA 7 -
02/20/20 Interprocedural analysis - HW2 data flow analysis due
02/25/20 LLVM, tipc, and projects - -
02/27/20 LLVM Passes - -
03/03/20 SSA SSA is Functional Programming -
03/05/20 Pass and Project Discussion - HW3 LLVM analysis, HW4 pass analysis
03/10/20 no class (Spring Break) - -
03/12/20 no class (Spring Break) - -
03/17/20 no class - -
03/19/20 Pass and Project Discussion - -
03/24/20 CFA read SPA 8, lectures posted to collab -
03/26/20 Pointer analysis read SPA 9, lectures posted to collab Project Proposal due Friday
03/31/20 Project Meetings: Nick, Farzana - -
04/02/19 Project Meetings: Trey, Andrew, Rohit - HW3 llvm analysis due
04/07/19 no class - -
04/09/19 Pass Presentations: Trey (Sparse Conditional CP), Andrew (Alias Analysis) - -
04/14/19 Pass Presentations: Nick (Stack Safety), Farzana, Rohit (DCE) - HW4 pass analysis due
04/16/19 TBD - -
04/21/19 Project Presentations: Trey, Andrew - -
04/23/19 no class - -
04/28/19 Project Presentations: Nick, Farzana, Rohit - Project due Friday