Now that we have a working symbol table, we can build the scanner and parser required to recognize TVI programs. I have provided several files that contain data types and class definitions that you can use as a starting point. In particular, I have provided a scanner class that recognizes most of the TVI tokens and only needs to be extended to use your symbol table to recognize keywords and opcode names.

The BaseScanner class recognizes numbers, identifiers, and the single-character tokens. It does not recognize TVI keywords or opcodes, and it reads input from the standard input stream. The class's only interaction with the input stream is through the protected virtual function FGetChar. The Scanner class extends the base class and overrides FGetChar to read characters from a file rather than from the standard input stream. The Scanner class also adds Open and Close methods, as well as a parameterized constructor that opens a file when the scanner is created. The TVIScanner class extends the Scanner class and overrides the FCollectId method, which the BaseScanner class uses to collect an identifier from the input stream. The TVIScanner override first invokes the base class method and then consults the symbol table to decide whether the identifier is a keyword or an opcode.
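The relationship between the three classes can be sketched roughly as follows. This is not the provided code: the class and method names BaseScanner, Scanner, TVIScanner, FGetChar, FCollectId, Open and Close come from the description above, but every signature, the Token and TokenKind types, the NextToken method, the one-character lookahead, and the map used as a symbol table are assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>

// Hypothetical token kinds; the provided headers define the real ones.
enum TokenKind { T_id, T_keyword, T_opcode, T_eof };

struct Token {
    TokenKind kind;
    std::string text;
};

// Stand-in for your symbol table (assumed interface): reserved word -> kind.
using SymbolTable = std::map<std::string, TokenKind>;

class BaseScanner {
public:
    virtual ~BaseScanner() = default;

    // Returns the next identifier, or T_eof at end of input. This sketch
    // skips everything else; the real class also returns numbers and
    // single-character tokens.
    Token NextToken() {
        int c = ReadChar();
        while (c != EOF && !IsIdStart(c)) c = ReadChar();
        if (c == EOF) return Token{T_eof, ""};
        return FCollectId(c);
    }

protected:
    // The only point of contact with the input source.
    virtual int FGetChar() { return std::getchar(); }

    // Collects an identifier whose first character is `first`.
    virtual Token FCollectId(int first) {
        assert(IsIdStart(first));
        std::string text(1, static_cast<char>(first));
        int c = ReadChar();
        while (c != EOF && (std::isalnum(c) || c == '_')) {
            text += static_cast<char>(c);
            c = ReadChar();
        }
        lookahead_ = c;  // keep the character that ended the identifier
        has_lookahead_ = true;
        return Token{T_id, text};
    }

private:
    static bool IsIdStart(int c) { return std::isalpha(c) || c == '_'; }

    int ReadChar() {
        if (has_lookahead_) {
            has_lookahead_ = false;
            return lookahead_;
        }
        return FGetChar();
    }

    int lookahead_ = EOF;
    bool has_lookahead_ = false;
};

// Reads from a file instead of standard input.
class Scanner : public BaseScanner {
public:
    Scanner() = default;
    explicit Scanner(const std::string& path) { Open(path); }
    bool Open(const std::string& path) {
        file_.open(path);
        return file_.is_open();
    }
    void Close() { file_.close(); }

protected:
    int FGetChar() override { return file_.get(); }

private:
    std::ifstream file_;
};

// Reclassifies identifiers that the symbol table knows as reserved words.
class TVIScanner : public Scanner {
public:
    explicit TVIScanner(const SymbolTable& symbols) : symbols_(symbols) {}

protected:
    Token FCollectId(int first) override {
        Token token = Scanner::FCollectId(first);  // base class collects it
        auto found = symbols_.find(token.text);
        if (found != symbols_.end()) token.kind = found->second;
        return token;
    }

private:
    const SymbolTable& symbols_;
};
```

Because all input flows through FGetChar, a subclass can feed the scanner from any source (a file, a string in a unit test) without touching the token-recognition code. In this sketch, an identifier such as DATA comes back as T_keyword only if the symbol table was loaded with it beforehand.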

 

The parser will need to make two passes over the input file. During the first pass, the parser enters user-defined variable names, labels, and function names into the symbol table. The algorithm for the first pass is:

set memory_offset = 0
get a token
while the token isn't eof do
	if token == T_code then set mode = code section
	else if token == T_data then set mode = data section
	else if mode == data section then
		parse a data declaration
		for each variable name in the data declaration do
			add variable name to the symbol table with memory_offset
			add sizeof(variable) to the memory_offset
		end for
	else if mode == code section then
		if token == T_int and next token == T_colon then
			the T_int is a label, add it to the symbol table
		else if token == T_begin then
			the next token is a T_id that is the name of the function
			add the function name to the symbol table
		end if
	end if
	get the next token
end while

Remember that a variable may be initialized in the data section, so the first pass must step over any initializer while collecting names. For example:

DATA
	LONG i, j=1, k = 2, l, m, n
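One way the first pass could look in C++ is sketched below, including the handling of initializers in a data declaration. The token names T_code, T_data, T_int, T_colon, T_begin and T_id come from the algorithm above; everything else is an assumption for illustration: the remaining token kinds, the Token and Symbol types, the use of a pre-scanned token vector instead of the scanner, a LONG size of 4 bytes, and the choice to store a token index as the value of a label or function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// T_type, T_comma, T_assign, T_opcode and T_eof are hypothetical names.
enum TokenKind {
    T_code, T_data, T_type, T_id, T_int, T_colon,
    T_comma, T_assign, T_begin, T_opcode, T_eof
};

struct Token {
    TokenKind kind;
    std::string text;
};

enum class SymbolKind { Variable, Label, Function };

struct Symbol {
    SymbolKind kind;
    std::size_t value;  // memory offset for variables, token index otherwise
};

using SymbolTable = std::map<std::string, Symbol>;

const std::size_t kLongSize = 4;  // assumed size of a LONG

SymbolTable FirstPass(const std::vector<Token>& tokens) {
    enum class Mode { None, Code, Data };
    Mode mode = Mode::None;
    SymbolTable table;
    std::size_t memory_offset = 0;

    // Reading past the end behaves like reading T_eof.
    auto kind_at = [&](std::size_t k) {
        return k < tokens.size() ? tokens[k].kind : T_eof;
    };

    std::size_t i = 0;
    while (kind_at(i) != T_eof) {
        const TokenKind kind = kind_at(i);
        if (kind == T_code) {
            mode = Mode::Code;
        } else if (kind == T_data) {
            mode = Mode::Data;
        } else if (mode == Mode::Data && kind == T_type) {
            // Declaration: type name [= value] {, name [= value]}
            std::size_t j = i + 1;
            while (kind_at(j) == T_id) {
                table[tokens[j].text] = Symbol{SymbolKind::Variable, memory_offset};
                memory_offset += kLongSize;
                ++j;
                if (kind_at(j) == T_assign) j += 2;  // skip "= value"
                if (kind_at(j) != T_comma) break;
                ++j;  // skip the comma
            }
            assert(j > i);  // the declaration consumed at least the type
            i = j - 1;      // last token that belongs to the declaration
        } else if (mode == Mode::Code) {
            if (kind == T_int && kind_at(i + 1) == T_colon) {
                table[tokens[i].text] = Symbol{SymbolKind::Label, i};
            } else if (kind == T_begin && kind_at(i + 1) == T_id) {
                table[tokens[i + 1].text] = Symbol{SymbolKind::Function, i};
            }
        }
        ++i;  // get the next token
    }
    return table;
}
```

For the declaration shown above, this sketch would assign offsets 0, 4, 8, 12, 16 and 20 to i, j, k, l, m and n. The initializer values 1 and 2 are skipped inside the data section, so they are never mistaken for labels. A real first pass would also report malformed declarations rather than silently moving on.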

 

During the second pass, the parser can skip over data sections and parse the instructions in the code sections. Code sections contain lists of opcodes and procedures. A procedure is a list of opcodes between the PROCBEGIN and PROCEND keywords.

A complete grammar for the TVI language can be found here.

The next step will be execution, and it is time to give some thought to how that will be done. There are two options.

  1. The parser first creates a list of instructions. When the parser is finished, the list of instructions is executed. The benefits of this approach are speed and a simpler parser and scanner. The drawback is the added step of creating an instruction list, but this is minor since creating the list is straightforward.
  2. The parser executes the opcodes as it parses them. Function calls and gotos are implemented by repositioning the read pointer of the scanner to the appropriate position in the file and continuing parsing/executing. The advantage of this approach is that there is no separate execution stage to write: less code, less complexity, fewer bugs. The drawbacks are that it will be slower, and that the second pass of the parser does not complete until the program finishes executing. The parser and scanner will also need to be slightly more complicated, because the program execution code is mixed in with the parsing code and the scanner needs to be able to jump to arbitrary locations in the file.
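To make the first option more concrete, here is a minimal sketch of executing an instruction list once parsing is done. None of it is the real TVI instruction set: the opcodes, the Instruction layout, the label table mapping a numeric label to an instruction index, and the Execute and Target functions are all hypothetical. The point is only that a goto becomes an assignment to the program counter instead of a seek in the source file.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical opcodes, not the real TVI instruction set.
enum class Op { LoadConst, Add, JumpIfLess, Goto, Halt };

struct Instruction {
    Op op;
    int a;  // operand meaning depends on op, see Execute
    int b;
    int c;
};

// Numeric label -> index of the instruction that follows it.
using LabelTable = std::map<int, std::size_t>;

// Resolves a label to an instruction index.
std::size_t Target(const LabelTable& labels, int label, std::size_t size) {
    auto found = labels.find(label);
    assert(found != labels.end() && found->second <= size);
    return found->second;
}

// Runs the program and returns the final contents of memory.
std::vector<int> Execute(const std::vector<Instruction>& program,
                         const LabelTable& labels, std::size_t slots) {
    std::vector<int> memory(slots, 0);
    std::size_t pc = 0;  // index of the next instruction to run
    while (pc < program.size()) {
        const Instruction& in = program[pc];
        switch (in.op) {
        case Op::LoadConst:  // memory[a] = constant b
            memory.at(in.a) = in.b;
            ++pc;
            break;
        case Op::Add:  // memory[a] = memory[b] + memory[c]
            memory.at(in.a) = memory.at(in.b) + memory.at(in.c);
            ++pc;
            break;
        case Op::JumpIfLess:  // if memory[a] < memory[b] goto label c
            pc = memory.at(in.a) < memory.at(in.b)
                     ? Target(labels, in.c, program.size())
                     : pc + 1;
            break;
        case Op::Goto:  // goto label a
            pc = Target(labels, in.a, program.size());
            break;
        case Op::Halt:
            return memory;
        }
    }
    return memory;
}
```

With this layout, the second pass only appends Instruction values to a vector, and the labels recorded during the first pass (or during list construction) are resolved to indices. Under the second option the same switch statement would live inside the parser, and the jump cases would reposition the scanner instead of changing pc.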

Each group must submit

Each group must not submit
