class: center, middle

# Symbol Tables

_CMPU 331 - Compilers_

---

# Recap: Stages of Compilation

1. Lexical Analysis
2. Parsing
3. Semantic Analysis
4. Optimization
5. Code Generation

![multistage pipeline](/~cs331/images/lectures/multistage.png)

(source: _Language Implementation Patterns_ by Terence Parr)

---

# Recap: Lexical Analysis

First step is to recognize the tokens (words):

```
(+ 3 4)
```

We could identify the useful tokens as:

```
OPENPAREN ADDOP INTEGER CLOSEPAREN
```

---

# Recap: Parsing (Syntactic Analysis)

Second step is to understand the structure (syntax):

![scheme diagram](/~cs331/images/lectures/scheme_diagram.png)

---

# Recap: Semantic Analysis

Third step is to understand the "meaning"

For compilers this means a limited form of analysis to catch inconsistencies

---

# Recap: Semantic Analysis

Examples:

> _Jack said Jerry left his assignment at home._

Who does "his" refer to?

```c
int jack = 3;
{
    int jack = 4;
    print(jack);
}
```

Which `jack` gets printed? 3 or 4?

---

# Semantic Analysis

Why separate semantic analysis?

* Parsing can't catch some errors
* Not all language constructs are context-free

---

# Semantic Analysis

Many possible kinds of checks:

* Identifiers are declared before use
* Types
* Reserved identifiers (keywords) are not misused
* Functions defined only once
* Classes defined only once
* Methods in a class defined only once
* Inheritance relationships
* And others...

The requirements depend on the language

---

# Scope

Matching identifier declarations with uses

* The _scope_ of an identifier is the portion of a program where that identifier is accessible.
* The same identifier may refer to different things in different parts of the program.
* An identifier may have restricted scope.

---

# Scope in DL

DL identifiers are introduced by:

* Variable declarations (introduce variable names)
* Function declarations (introduce function names and parameter names)

```c
/* Declare a function named 'timestwo', with one parameter.
*/
timestwo(x);
int y;
{
    y = 2;
    return(x * y)
}

/* Call 'timestwo' in the main body of the program. */
int z;
{
    z = timestwo(3)
}
```

---

# Scope in Other Languages

In other languages identifiers might be introduced by:

* Variable declarations (introduce variable names)
* Function declarations (introduce function names and parameter names)
* Class declarations (introduce class names)
* Method definitions (introduce method names)
* etc...

---

# Semantic Analysis

* Much of semantic analysis can be expressed as a recursive descent of an AST
    * Before: start processing an AST node _n_
    * Recurse: process the children of _n_
    * After: finish processing the AST node _n_
* When performing semantic analysis on a portion of the AST, we need to know which identifiers are defined

---

# Symbol Tables

* A _symbol table_ is a data structure that tracks the current bindings of identifiers

```c
timestwo(x);
int y;
{
    y = 2;
    return(x * y)
}
```

name | type | scope | use
---- | ---- | ----- | ---
`x` | `INT` | local | parameter
`y` | `INT` | local | variable

* How do we build a symbol table?
    * Before: add definitions of `x` and `y` to function scope
    * Recurse: to function body
    * After: remove definitions of `x` and `y` from function scope

---

# Symbol Tables

* First approximation, implement symbol table as simple stack
* Operations:
    * `add_symbol(x)` push `x` and associated info on the stack
    * `find_symbol(x)` search stack, starting from top, return first `x` found, or false if none found
    * `remove_symbol()` pop a symbol off the stack
* Works as long as declarations are perfectly nested
* Can't tell the difference between a variable declared in an outer scope (ok), and a variable redeclared in the same scope (error)

---

# Symbol Tables

* Second approximation, implement symbol table as nested data structure
* Operations:
    * `enter_scope()` start a new nested scope
    * `add_symbol(x)` add a symbol `x` to the current scope
    * `find_symbol(x)` search current scope first, then outer scopes, return first `x` found, or false if none found
    * `check_local(x)` true if `x` defined in current scope
    * `exit_scope()` exit current scope

The semantic analysis project template will include a symbol table manager.
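
---

# Symbol Tables

The nested-scope operations can be sketched in a few lines of Python. This is only an illustration, not the project template's manager: the `SymbolTable` class, the list-of-dicts representation, and the extra `info` argument to `add_symbol` are all assumptions made for this example.

```python
class SymbolTable:
    """Hypothetical nested-scope symbol table (illustrative sketch only)."""

    def __init__(self):
        # One dict per scope; the innermost scope is last.
        self.scopes = [{}]

    def enter_scope(self):
        # Start a new nested scope.
        self.scopes.append({})

    def exit_scope(self):
        # Leave the current scope, discarding its symbols.
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the outermost scope")
        self.scopes.pop()

    def check_local(self, name):
        # True only if `name` is defined in the current scope.
        return name in self.scopes[-1]

    def add_symbol(self, name, info):
        # Redeclaring a name in the same scope is an error.
        if self.check_local(name):
            raise KeyError(f"'{name}' redeclared in the same scope")
        self.scopes[-1][name] = info

    def find_symbol(self, name):
        # Search the current scope first, then outer scopes.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return False


# Mirror the earlier `jack` example: an inner declaration shadows the outer one.
table = SymbolTable()
table.add_symbol("jack", {"type": "INT", "value": 3})
table.enter_scope()
table.add_symbol("jack", {"type": "INT", "value": 4})  # ok: different scope
inner = table.find_symbol("jack")
table.exit_scope()
outer = table.find_symbol("jack")
print(inner["value"], outer["value"])  # prints: 4 3
```

Unlike the simple stack, `check_local` lets `add_symbol` distinguish shadowing an outer declaration (ok) from redeclaring a name in the same scope (error).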