
CMPU-331 Compilers (Spring 2019)


This project brings together most of the computer science and software development techniques you have learned in a single, semester-long effort. You will implement a compiler made up of four parts: lexical analysis routines, a parser, symbol table management routines, and semantic routines. Because of its size and complexity, the semantic routine subsystem is broken into four sections. Each piece is integrated with what has been completed previously, so that by the end of the semester you will have created a single, complex program.

This is not a trivial challenge.

A compiler is a complex, strongly interconnected and interoperating program.

Please do not leave any part of it until the last minute.


Component | Date released | Date due | Grade | Specification
I Lexical Analyser (Lexer) | Thursday, 24th January | Thursday, 7th February | 4 | Lexical Analyser
II Parser | Thursday, 7th February | Thursday, 21st February | 3 | Parser
III Symbol Table routines, Semantic Actions I | Thursday, 21st February | Tuesday, 5th March | 3 | Symbol Table Routines, Semantic Actions I
IV Semantic Actions II | Thursday, 7th March | Sunday, 31st March | 4 | Semantic Actions II
V Semantic Actions III | Tuesday, 26th March | Thursday, 11th April | 5 | Semantic Actions III
VI Semantic Actions IV | Tuesday, 9th April | Tuesday, 23rd April | 5 | Semantic Actions IV
Subtotal | | | 24 |
VII Complete Compiler | Thursday, 2nd May | Sunday, 12th May | 36 |

Programming Languages

You may write this project either in Java or Python.

  • If you wish to code in Java, I recommend IntelliJ IDEA. (Click 'Download' and then select 'Community'.)
  • If you use Python, I recommend PyCharm. (Click 'Download Now' and then select 'Community'.)

Both of these IDEs are available on the CS department Linux systems.

Please choose a language with which you are familiar and can use confidently. Changing languages part-way through the project is strongly discouraged.


Lexical Analyser

The lexical analyser (“lexer”) isolates tokens from an input source file containing a program written in the language it is programmed to recognise.

Full specification.
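
As a rough illustration of what "isolating tokens" means, here is a minimal sketch in Python. The token names, the regular expressions, and the `tokenize` function are all assumptions made for this example; the actual token set and the required implementation technique are defined in the full specification, which may well call for a hand-written scanner instead of regular expressions.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # token category, e.g. "NUMBER"
    text: str  # the matched characters (the lexeme)
    line: int  # 1-based source line, useful in error messages


# Hypothetical token set, for illustration only.
# Order matters: earlier alternatives win, and ERROR catches anything else.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/]"),
    ("SEMI", r";"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield the tokens found in source, ending with an EOF token."""
    line = 1
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "NEWLINE":
            line += 1          # track position, emit nothing
        elif kind == "SKIP":
            continue           # whitespace separates tokens but is not one
        elif kind == "ERROR":
            raise SyntaxError(f"line {line}: unexpected character {text!r}")
        else:
            yield Token(kind, text, line)
    yield Token("EOF", "", line)


if __name__ == "__main__":
    for token in tokenize("x := 42;"):
        print(token)
```

The explicit EOF token is a design choice of this sketch: it gives a later phase a definite end marker to test for, so it never has to index past the end of the token stream.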


Parser

The parser checks that the stream of tokens conforms to the rules of the language; that is, that the input is syntactically correct.

Full specification.
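
To make "conforms to the rules of the language" concrete, the sketch below checks a token stream against a deliberately tiny grammar. Everything in it is hypothetical: the grammar (statement → IDENT ASSIGN expr SEMI; expr → term { OP term }; term → IDENT | NUMBER), the token kinds, and the recursive-descent style. The course grammar and the required parsing method are given in the full specification and may differ substantially.

```python
from typing import List, Tuple

TokenPair = Tuple[str, str]  # (kind, text); a simplified, assumed token shape


class ParseError(Exception):
    """Raised when the token stream violates the toy grammar."""


class Parser:
    def __init__(self, tokens: List[TokenPair]) -> None:
        # Append an end marker so peek() is always safe.
        self.tokens = tokens + [("EOF", "")]
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos][0]

    def expect(self, kind: str) -> None:
        """Consume one token of the given kind or report a syntax error."""
        if self.peek() != kind:
            raise ParseError(f"expected {kind}, found {self.peek()} at token {self.pos}")
        self.pos += 1

    def parse_program(self) -> None:
        # program -> { statement }
        while self.peek() != "EOF":
            self.parse_statement()

    def parse_statement(self) -> None:
        # statement -> IDENT ASSIGN expr SEMI
        self.expect("IDENT")
        self.expect("ASSIGN")
        self.parse_expr()
        self.expect("SEMI")

    def parse_expr(self) -> None:
        # expr -> term { OP term }
        self.parse_term()
        while self.peek() == "OP":
            self.expect("OP")
            self.parse_term()

    def parse_term(self) -> None:
        # term -> IDENT | NUMBER
        if self.peek() in ("IDENT", "NUMBER"):
            self.pos += 1
        else:
            raise ParseError(f"expected IDENT or NUMBER, found {self.peek()} at token {self.pos}")


def is_syntactically_correct(tokens: List[TokenPair]) -> bool:
    """Return True if the tokens form a valid program in the toy grammar."""
    try:
        Parser(tokens).parse_program()
        return True
    except ParseError:
        return False
```

For instance, the tokens for `x := 1 + y;` are accepted, while those for `x := + ;` are rejected because an operator appears where a term is required.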


Submission Guidelines

All components are due by 11:59:59 pm on the date specified. Late work is subject to the late policy described below.

Please note that the due date for the complete compiler, Sunday, 12th May, 2019, is to ensure that you have at least two full days of the Study Period free from requirements for this class. Consequently, no late work will be accepted for the final project submission (except in the case of extenuating circumstances).

General Instructions

  • You should submit your assignments using the CS Dropbox facility.
  • Supply one test file with your code.
  • You must include a README.txt file with your submission, detailing any necessary information (e.g. how to run the code, any issues with the program). This file should also list any noteworthy changes to code for previous phases included in the current submission.

CS Dropbox

To submit your work, ensure that you are logged into your Vassar CS account and use the following command:

submit331 [assignment_ID] [directory_to_submit]

If you have any problems with this, please contact me or Mr Jerry Bailie, the CS IT manager, as soon as possible.

Directory Names

The following directory names should be used to submit your code:

Component | Assignment_ID
Lexical Analyser | 01_Lexer
Parser | 02_Parser
Symbol Tables and Semantic Actions (part 1) |
Semantic Actions (part 2) | 04_SemAct2
Semantic Actions (part 3) | 05_SemAct3
Semantic Actions (part 4) | 06_SemAct4
Complete compiler | 07_Complete

As well as permitting an overview of the development of your compiler, this also ensures that there is at least one backup of your work.

Late Policy

General Policy

  • Work submitted late will be subject to a 10% penalty per day or part day late. This is a flat, not multiplicative, penalty: for example, work submitted 2 days late will incur a penalty of 20% subtracted from the base assessed grade.
  • Work submitted more than 4 days late will receive a grade of zero.
  • This policy does not apply in the case of established academic easement (mitigating circumstances, athletic fixtures, etc.).
  • This policy is in abeyance for the duration of any Slip Days claimed for that phase.
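
The arithmetic of the general policy can be sketched as follows. This is one reading of the rules above, not an official grading tool: the function name `penalised_grade` is made up for the example, and it assumes that grades are expressed as percentages and that "10% per day" means ten percentage points deducted from the base grade. Easements and Slip Days are not modelled here.

```python
import math

PENALTY_PER_DAY = 10  # percentage points per day or part day (assumed interpretation)
MAX_DAYS_LATE = 4     # anything later than this receives zero


def penalised_grade(base_percent: float, hours_late: float) -> float:
    """Return the grade after the late penalty, under the assumptions stated above."""
    if hours_late <= 0:
        return base_percent                   # on time: no penalty
    days_late = math.ceil(hours_late / 24)    # a part day counts as a whole day
    if days_late > MAX_DAYS_LATE:
        return 0.0                            # more than 4 days late
    # Flat penalty: days are added up, not compounded.
    return max(0.0, base_percent - PENALTY_PER_DAY * days_late)


if __name__ == "__main__":
    print(penalised_grade(85, 30))  # 30 hours late counts as 2 days
```

Rounding up with `math.ceil` is what implements "per day or part day": a submission one hour past the deadline is treated as one full day late.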

Slip Days

Slip Days (sometimes known as 'Grace Days') are days which may be invoked for any project submission if a student feels that they need more time to complete the work for that phase.

  1. A Slip Day is used to 'forgive' one (1) day's lateness for a project submission, no questions asked. There are two exceptions:
    1. Slip Days may not be applied to the paper exam.
    2. Slip Days may not be applied to the final submission.
  2. Each student has five (5) Slip Days for the whole semester.
  3. To claim Slip Days, you must request them by email no later than 12 hours before the deadline (i.e. by 12 noon on the due date).
  4. No single phase of the project may have more than two (2) Slip Days applied to it.
  5. The final submission (the completed project) may not have Slip Days applied to it.
  6. Once Slip Days are claimed, they are considered spent and cannot be recovered.
  7. If work is submitted after the claimed Slip Days have elapsed, it is subject to the usual penalty at the rate which would have been incurred had the Slip Day(s) not been applied. (For example, if you ask for two Slip Days but submit the day after the second Slip Day concludes, the work is 3 days late and will be subject to the standard 30% penalty.)
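
The interaction between Slip Days and the late penalty can be sketched in a few lines. The function names below are hypothetical, and the sketch encodes one interpretation of the rules above: lateness is counted from the original deadline, Slip Days cancel the penalty only if the work arrives within them, and the general "more than 4 days means zero" rule is assumed to still apply. It covers ordinary project phases only, since Slip Days cannot be used for the paper exam or the final submission.

```python
import math

SLIP_DAYS_PER_SEMESTER = 5
MAX_SLIP_DAYS_PER_PHASE = 2


def can_claim(already_spent: int, requested: int) -> bool:
    """Check a Slip Day request against the per-phase and per-semester limits."""
    return (
        0 < requested <= MAX_SLIP_DAYS_PER_PHASE
        and already_spent + requested <= SLIP_DAYS_PER_SEMESTER
    )


def penalty_points(hours_late: float, slip_days_claimed: int = 0) -> int:
    """Return the percentage points deducted; 100 stands for a grade of zero."""
    if hours_late <= 0:
        return 0
    days_late = math.ceil(hours_late / 24)  # part days count as whole days
    if days_late <= slip_days_claimed:
        return 0                            # covered by the claimed Slip Days
    if days_late > 4:
        return 100                          # assumed: general cut-off still applies
    # Slip Days were overrun, so the full lateness is penalised (rule 7).
    return 10 * days_late


if __name__ == "__main__":
    # Two Slip Days claimed, submitted during the third day after the deadline.
    print(penalty_points(60, slip_days_claimed=2))
```

This reproduces the example in rule 7: with two Slip Days claimed and a submission on the third day, the deduction is the full 30 points, not 10.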