Homework #5

CS 374 Compilers

Due: Tuesday 3/6 at the beginning of class

Note: This work is to be done in assigned groups. Each group will submit one assignment. Although you may divide the work, both team members should be able to present/describe their partner's work upon request.

In the previous assignment, you constructed an abstract syntax tree (AST) with the necessary methods to use the visitor programming pattern. In this assignment, you will create two visitors. The first visitor will traverse the relevant declaration portions of the AST to build a symbol table (see Figure 5.7) of all MiniJava program identifier information. The second visitor traverses the relevant code portions of the AST in order to type-check the program and verify that it is a legal program. For example, 1 + (2 < 3) will legally parse, but is an illegal attempt to add an int type to a boolean type. Also, our parser does not check for the correct number of arguments in a function call, multiply defined identifiers, undefined identifiers, etc. In this phase, we'll do all necessary checking to ensure the legality of a program, and supply reasonable error messages if there are semantic errors in the code.

0. Preparation: You'll be using the same files from the previous homework. In addition, use this new Main and add the following files to your directory tree. Review the visitor pattern of 4.3. Then review the given examples of AST visitors from the last assignment. PrettyPrintVisitor takes the abstract syntax tree and prints the corresponding MiniJava code. ASTPrintVisitor prints an abstract syntax tree in the form Node(Subnode(...), ..., Subnode(...)) with occasional line breaks. Read Chapter 5 and give special attention to Figure 5.7.

1. Building the Symbol Table: For this part of the assignment, you'll create a visitor BuildSymbolTableVisitor (BSTV) that extends DepthFirstVisitor. When accepted by a Program root, BSTV should initialize a public symbolTable field with an empty symbol table and proceed to populate it with all non-main-method classes, methods, fields, parameters, and local variables. To make this easy and reduce the amount of parameter passing, you're encouraged to keep track of the current class and method as BSTV fields. You'll only need to visit parts of the tree that are relevant to building the symbol table. For example, a variable declaration is relevant, but a variable usage is not.
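As a rough illustration of the "current class and method as fields" idea, here is a minimal sketch. It is self-contained, so it does not use the real syntaxtree or visitor packages: DeclVisitor, ClassNode, MethodNode, VarNode, and BuildSketchVisitor are hypothetical stand-ins, parameters are omitted for brevity, and a plain list of strings stands in for the symbol table. The point is only that one visit method for variable declarations can tell a field from a local by checking the visitor's own state.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the course's AST and visitor classes. The names and shapes
// here are assumptions for illustration, not the real syntaxtree/visitor API.
interface DeclVisitor {
    void visit(ClassNode n);
    void visit(MethodNode n);
    void visit(VarNode n);
}

class VarNode {
    final String type, name;
    VarNode(String type, String name) { this.type = type; this.name = name; }
    void accept(DeclVisitor v) { v.visit(this); }
}

class MethodNode {
    final String returnType, name;
    final List<VarNode> locals = new ArrayList<>();
    MethodNode(String returnType, String name) {
        this.returnType = returnType;
        this.name = name;
    }
    void accept(DeclVisitor v) { v.visit(this); }
}

class ClassNode {
    final String name;
    final List<VarNode> fields = new ArrayList<>();
    final List<MethodNode> methods = new ArrayList<>();
    ClassNode(String name) { this.name = name; }
    void accept(DeclVisitor v) { v.visit(this); }
}

class BuildSketchVisitor implements DeclVisitor {
    // Stand-in for a real symbol table: one line per recorded declaration.
    final List<String> entries = new ArrayList<>();
    private String currentClass;   // null when not inside a class
    private String currentMethod;  // null when not inside a method

    public void visit(ClassNode n) {
        currentClass = n.name;
        entries.add("class " + n.name);
        for (VarNode f : n.fields) f.accept(this);
        for (MethodNode m : n.methods) m.accept(this);
        currentClass = null;
    }

    public void visit(MethodNode n) {
        currentMethod = n.name;
        entries.add("method " + currentClass + "." + n.name + " : " + n.returnType);
        for (VarNode l : n.locals) l.accept(this);
        currentMethod = null;
    }

    public void visit(VarNode n) {
        // The same node type is a field or a local depending on where we are.
        boolean inMethod = (currentMethod != null);
        String kind = inMethod ? "local" : "field";
        String owner = inMethod ? currentClass + "." + currentMethod : currentClass;
        entries.add(kind + " " + owner + "." + n.name + " : " + n.type);
    }
}
```

In your own BSTV the same bookkeeping would presumably live in the overridden visit methods for class, method, and variable declaration nodes, with the list replaced by calls into your symbol table.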

Since the Ch. 5 project specifications are very loose, you are free to implement the SymbolTable class as you wish. (Mine is thrown into package visitor, but you may use an alternate package.) I would recommend designing a set of classes (e.g. Class, Method, Variable) and designing a data structure that follows that of Figure 5.7. Your symbol table will consist of a HashMap (or Hashtable) mapping Strings to Class objects. Each Class object will have a table for field variables, and a table for methods. Although Section 5.1's abstract treatment of symbol tables doesn't stress this, you will at some point need to commit to an ordering of these fields when we lay out our objects in memory. (A natural ordering is the order of declaration.) Not all Dictionary or Map classes will return an Enumeration or Iterator with ordering commitments. Now or later, you'll need to commit to and record an ordering of fields. Each Method object will have a return type, a table for parameters, and a table for locals. The number and ordering of parameters must be recorded for type-checking method calls. Thus you may wish to maintain ArrayLists of identifiers alongside your tables. Store whatever information you need, and feel free to add more as necessary in later stages.

While it is true that industrial strength symbol tables facilitate efficient lookup through the use of hash tables, etc., your key design goals here are correctness and simplicity, in that order. If you don't make use of the author's Symbol package interfaces, that's fine. Also, I'll not be overly concerned about the efficiency of your code.

Finally, it will be important for you to be able to check the correctness of your symbol table. You should therefore implement a toString method that summarizes the contents of the symbol table. Example outputs are given here.
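One possible shape for such a table, with a summarizing toString, is sketched below. Everything in it is an assumption made for illustration: the names ClassInfo and MethodInfo (the Info suffix just avoids a clash with java.lang.Class), the use of plain Strings for types, and the output format, which is not the format of the course's example outputs. Note also that it swaps in LinkedHashMap, which iterates in insertion order, instead of a HashMap with a separate ArrayList of identifiers. Either approach records declaration order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symbol table classes; types are kept as plain Strings here.
class MethodInfo {
    final String returnType;
    // name -> type; LinkedHashMap keeps entries in declaration (insertion) order.
    final Map<String, String> params = new LinkedHashMap<>();
    final Map<String, String> locals = new LinkedHashMap<>();

    MethodInfo(String returnType) { this.returnType = returnType; }

    // Parameter types in declaration order, for checking call arguments.
    List<String> paramTypes() { return new ArrayList<>(params.values()); }
}

class ClassInfo {
    final Map<String, String> fields = new LinkedHashMap<>();
    final Map<String, MethodInfo> methods = new LinkedHashMap<>();
}

class SymbolTable {
    final Map<String, ClassInfo> classes = new LinkedHashMap<>();

    // Returns the new entry, or null if the class name is already defined.
    ClassInfo addClass(String name) {
        if (classes.containsKey(name)) return null;
        ClassInfo c = new ClassInfo();
        classes.put(name, c);
        return c;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, ClassInfo> ce : classes.entrySet()) {
            sb.append("class ").append(ce.getKey()).append('\n');
            appendVars(sb, "  field ", ce.getValue().fields);
            for (Map.Entry<String, MethodInfo> me : ce.getValue().methods.entrySet()) {
                MethodInfo m = me.getValue();
                sb.append("  method ").append(m.returnType)
                  .append(' ').append(me.getKey()).append('\n');
                appendVars(sb, "    param ", m.params);
                appendVars(sb, "    local ", m.locals);
            }
        }
        return sb.toString();
    }

    // Appends one "<prefix><type> <name>" line per variable, in stored order.
    private static void appendVars(StringBuilder sb, String prefix, Map<String, String> vars) {
        for (Map.Entry<String, String> v : vars.entrySet()) {
            sb.append(prefix).append(v.getValue()).append(' ').append(v.getKey()).append('\n');
        }
    }
}
```

Having addClass return null on a duplicate gives the visitor a natural place to report a multiply defined identifier. The same pattern could be applied to fields, methods, parameters, and locals.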

2. Type-checking: For this part of the assignment, you'll create a visitor TypeCheckVisitor (TCV) that extends TypeDepthFirstVisitor (like DepthFirstVisitor, but returning types). TCV should be constructed with the symbolTable of the BSTV. Again, to make this easy and reduce the amount of parameter passing, you're encouraged to keep track of the current class and method as TCV fields. You'll only need to visit parts of the tree that are relevant to type-checking. For example, a variable usage is relevant, but a variable declaration is not. Here is a sampling of possible MiniJava type-checking errors:

Return expression does not match return type for method ____
If statement condition must be of type boolean
Type mismatch in assignment to ___
And right side must be of type boolean
Multiplication left side must be of type integer
Argument ___ of ___.___ must be of type ___

Each of these errors can be implemented as a "System.err.println" message followed by "System.exit(1)". Finally, you may wish to divide the classes within the visitor into two categories, ones that return Types (e.g. expressions), and ones that don't, returning null (e.g. statements).
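To make the print-then-exit approach concrete, here is a small self-contained sketch that type-checks the 1 + (2 < 3) example from the introduction. The mini-AST (Exp, IntLit, LessExp, PlusExp), the TypeCheckSketch class, the requireInt helper, and the use of Strings for types are all hypothetical stand-ins. The real TCV would work over the course's syntaxtree nodes and return Type objects.

```java
// Hypothetical mini-AST: just enough expression forms to show the idea.
abstract class Exp {
    abstract String accept(TypeCheckSketch v);
}

class IntLit extends Exp {
    final int value;
    IntLit(int value) { this.value = value; }
    String accept(TypeCheckSketch v) { return v.visit(this); }
}

class LessExp extends Exp {
    final Exp left, right;
    LessExp(Exp left, Exp right) { this.left = left; this.right = right; }
    String accept(TypeCheckSketch v) { return v.visit(this); }
}

class PlusExp extends Exp {
    final Exp left, right;
    PlusExp(Exp left, Exp right) { this.left = left; this.right = right; }
    String accept(TypeCheckSketch v) { return v.visit(this); }
}

class TypeCheckSketch {
    // Types are plain Strings in this sketch (an assumption, for brevity).
    static final String INT = "int";
    static final String BOOLEAN = "boolean";

    // Report a semantic error as suggested above: print, then exit.
    void error(String message) {
        System.err.println(message);
        System.exit(1);
    }

    String visit(IntLit n) { return INT; }

    String visit(LessExp n) {
        requireInt(n.left, "Less-than left side");
        requireInt(n.right, "Less-than right side");
        return BOOLEAN;  // a comparison of two ints yields a boolean
    }

    String visit(PlusExp n) {
        requireInt(n.left, "Addition left side");
        requireInt(n.right, "Addition right side");
        return INT;
    }

    // Type-checks the operand, then insists that its type is int.
    private void requireInt(Exp e, String what) {
        if (!INT.equals(e.accept(this))) {
            error(what + " must be of type integer");
        }
    }
}
```

With this sketch, checking new PlusExp(new IntLit(1), new LessExp(new IntLit(2), new IntLit(3))) reaches error with "Addition right side must be of type integer", because the right operand's visit returns boolean. Keeping error as its own method is a design choice that makes it easy to swap in an exception while you are testing.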

Supplemental Chapter Comments

(These comments are to supplement your reading. Question(s) asked are for you to think about on your own and need not be turned in with the homework.)

5.1: It should be noted that symbol tables do not merely map symbols to types. Symbols are mapped to bindings which are whatever we want them to be for a given language or scope. For instance, a simple local variable i may map to the type "int", but a class C will map to a binding which keeps track of fields and methods. Later on, we'll want to (1) know how an object's fields are laid out in memory, and (2) set/get labels that indicate the beginning of the assembly code for each class method. Think of the symbol table as a data structure which keeps track of identifier information throughout the compilation process. Figure 5.7 is the most important to grasp in this chapter.

5.2: <insert flame here> It's unclear how the first-phase example Program 5.8 is related to the interface of Section 5.1. For our purposes, you'll think of the MiniJava symbol table as having the structure of Figure 5.7. Scoping is rather simple in MiniJava. Variable references occur only in method statements. In such scope, one can refer to a method's locals and parameters, classes, and their methods. One cannot refer to object fields (which could be accessed through getter methods). All local declarations are at the beginning of a method, so there is no nesting of scopes. More details on how to type-check MiniJava are given in the assignment above.
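Because there is no nesting of scopes, resolving a variable name inside a method can be a flat lookup rather than a walk up a chain of scopes. The sketch below follows the scoping rules exactly as described in the comment above. MethodScope and lookup are hypothetical names, and types are plain Strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-method scope: two flat tables, no nesting.
class MethodScope {
    final Map<String, String> params = new LinkedHashMap<>();  // name -> type
    final Map<String, String> locals = new LinkedHashMap<>();  // name -> type

    // Returns the declared type of a variable used in the method body,
    // or null if the identifier is undefined in this method.
    String lookup(String id) {
        String type = locals.get(id);
        return (type != null) ? type : params.get(id);
    }
}
```

A null result is where a type-checker would report an undefined identifier. The order in which the two tables are consulted should not matter if the symbol-table pass already rejects a local that reuses a parameter's name.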