This phase of the compiler project involves designing symbol table management routines, which will be accessed by the lexical analyzer and semantic routines.
Our compiler uses three symbol tables:
a constant table, which contains constants (represented as strings) used in the program;
a global symbol table, which contains names known throughout all scopes of the Pascal program;
a local symbol table, which contains names known in a nested block while it is being parsed.
The constant and global tables are set up at the very outset of compilation, before the first token is scanned, and are not deleted until the compilation process terminates. The global table initially contains the names of all built-in functions, in our case, read and write, the (internal) program name main, and input and output, the filenames passed to the Pascal program). The names read and write should be stored in the symbol table as procedure entries (with 0 parameters). The names input and output can be stored as variable entries. (note that input and output are never referenced after the first mention, but they need to be in the symbol table so the names are not used otherwise. The constant table is initally empty.
Local tables are instantiated when the compiler begins to parse a procedure or function and deleted when parsing of that procedure or function ends. All of the information in each of the symbol tables is accessed using hashing. Each symbol table object will have the capability to insert a new entry and lookup an entry.
NOTE: Do not use the symbol table class to create a keyword table in the lexical analyzer–make it a simple hash table. Using the symbol table class in the lexical analyzer creates a a circular dependency that Maven and IntelliJ get very upset about.
The table management routines will do the following:
insert or search for a given identifier in the symbol table.
search for a given constant, and if not found, insert the constant in the table.
The table management routines also have to know if the parser is currently dealing with the main program or a procedure or function. Identifiers declared in the main program are stored in the global table, and identifiers declared in procedures and functions are stored in the local table. Identifiers referenced in a procedure or function are searched for first in the local table, and if not found, then in the global table. After a procedure or function is scanned and parsed, the local table containing identifiers for that procedure or function is deleted. Again, without the semantic actions, we cannot determine if we are in global or local scope.
Note that the name of a procedure or function is global–it is known in the main program and therefore stored as a global identifier.
The symbol tables are used by the semantic actions, which are called at appropriate points in the parse to insert names and information associated with them (when parsing declarations) and look up names used in the body of the program (to see that they have been declared and are used appropriately according to the specifications of their declarations). The semantic actions also use information about the stage of the parse to determine whether an identfier is inserted into the local or global table.
At this point, without the semantic actions, most of the information that will eventually be stored in the symbol tables is not available. So for this assignment, you will just set up the symbol table class and the symbol table entry classes for later use and test that they work.
For this assignment, you should do the following:
- Define a SymbolTable class including:
a hashtable, capacity (size) to be passed as a parameter to the constructor; see the description of the Java HashTable class for a discussion of considerations for determining initial capacity
a lookup method
an insert method
a size method that returns the number of entries in the table
a dumpTable method that prints the contents of the table (name and the contents of each field in the entry (most empty for now)
- a public method installBuiltins that installs the following reserved names in the symbol table:
All of these entries should have isReserved (see below) set to true.
MAIN, READ, and WRITE as Procedure entries with 0 parameters
INPUT and OUTPUT as IODevice entries
- Define a SymbolTableEntry base class. This class includes the following boolean methods, all of which return false in the base class and are over-ridden within individual entries as needed (upon instantiation if possible, or later by the semantic actions):
- Define a a set of sub-classes that extend the SymbolTableEntry class, one for each of the following types:
ArrayEntry (name [string], address [int], type [TokenType], upperBound [int], lowerBound [int])
ConstantEntry (name [string], type [TokenType])
FunctionEntry (name [string], numberOfParameters [int], parameterInfo [List], result [VariableEntry])
ProcedureEntry (name [string], numberOfParameters [int], parameterInfo [List])
VariableEntry (name [string], address [int], type [TokenType])
IODeviceEntry (name [string])
Define a SymbolTable Error class. The only error at this point (and maybe ever) is an attempt to insert a name into a table where it already exists.
Depending on the type of the entry, different fields will be required. The basic fields and their types for each are listed above. In this assignment only the name field will be used for all entry types except ProcedureEntry, which will also set numberOfParameters when inserting built-in names.
NOTE: You should enter the names of built-in procedures and IO devices to conform to the way your lexical analyzer normalizes names, as either all upper or all lower case.
Please follow the Submission instructions.