class: center, middle

# Optimization

_CMPU 331 - Compilers_

---

# Recap - Classification of Optimizations

For languages like C, there are three granularities of optimizations.

1. Local optimizations
    * Apply to a basic block in isolation
2. Global optimizations
    * Apply to a control-flow graph (for a function/method body) in isolation
3. Inter-procedural optimizations
    * Apply across function/method boundaries

Most compilers do (1), many do (2), few do (3)

---

# Recap - Basic Blocks and Control-Flow Graph

The body of a function (or method) can be represented as a control-flow graph.

![basic blocks](/~cs331/images/lectures/basic_blocks.png)

_(source: "Code generation for LLVM" by Magnus Myreen)_

---

# Local Optimization

Recall the simple basic-block optimizations

* Constant propagation
* Dead code elimination

```c
x = 3              x = 3
y = z * w    =>    y = z * w    =>    y = z * w
q = x + y          q = 3 + y          q = 3 + y
```

---

# Global Optimization

These optimizations can be extended to an entire control-flow graph

![global optimization](/~cs331/images/lectures/control_flow_start.png)

---

# Global Optimization

These optimizations can be extended to an entire control-flow graph

![copy propagation](/~cs331/images/lectures/control_flow_copy_prop.png)

---

# Correctness

* How do we know it is OK to globally propagate constants?
* There are situations where it is incorrect:

![no copy propagation](/~cs331/images/lectures/control_flow_no_copy_prop.png)

---

# Correctness

To replace a use of `x` by a constant `k` we must know that:

* On every path to the use of `x`, the last assignment to `x` is `x = k`

![copy propagation](/~cs331/images/lectures/control_flow_copy_prop.png)

---

# Correctness

To replace a use of `x` by a constant `k` we must know that:

* On every path to the use of `x`, the last assignment to `x` is `x = k`

![no copy propagation](/~cs331/images/lectures/control_flow_no_copy_prop.png)

---

# Correctness

* The correctness condition is not trivial to check
* "All paths" includes paths around loops and through branches of conditionals
* Checking the condition requires global analysis of the entire control-flow graph

---

# Global Analysis

Global optimization tasks share several traits:

* The optimization depends on knowing a property _X_ at a particular point in program execution
* Proving _X_ at any point requires knowledge of the entire program
* It is OK to be conservative.
If the optimization requires _X_ to be true, then we want to know either
    * _X_ is definitely true
    * Don't know if _X_ is true
* It is always safe to say "don't know"

---

# Global Analysis

* _Global dataflow analysis_ is a standard technique for solving problems with these characteristics
* Global constant propagation is one example of an optimization that requires global dataflow analysis

---

# Global Constant Propagation

* Global constant propagation can be performed at any point where the property _P_ holds:

> _P_ = On every path to the use of `x`, the last assignment to `x` is `x = k`

* Consider the case of computing _P_ for a single variable `x` at all program points

---

# Global Constant Propagation

* To make the problem precise, we associate one of the following values with `x` at every program point
    * ⊥ means "this statement never executes (or we don't know yet)"
    * _C_ means "`x` is constant _C_"
    * ⊤ means "`x` is not a constant"

![constant annotation](/~cs331/images/lectures/control_flow_constant_annotation.png)

---

# Global Constant Propagation

Given global constant information, it is easy to perform the optimization

* Simply inspect the property `x=?` associated with a statement that uses `x`
* If `x` is constant at that point, replace that use of `x` by the constant

But how do we compute the `x=?` properties?

* The idea: The analysis of a complicated program can be expressed as a combination of simple rules relating the change in information between adjacent statements.

---

# Global Constant Propagation

* The idea is to "push" or "transfer" information from one statement to the next
* For each statement _s_, we compute information about the value of `x` immediately before and after _s_
* Define a _transfer_ function that transfers information from one statement to another

---

# Global Constant Propagation

* Rule 1: if `x=`⊤ after any statement immediately preceding _s_, then `x=`⊤ before _s_.
* Rule 2: if `x` has different constant values _C_, _D_...
after some statements immediately preceding _s_, then `x=`⊤ before _s_.
* Rule 3: if `x=`⊥ after some statements immediately preceding _s_, but has a single constant value _C_ after all the other statements, then `x=`_C_ before _s_.
* Rule 4: if `x=`⊥ after all statements immediately preceding _s_, then `x=`⊥ before _s_.

---

# Global Constant Propagation

* Rules 1-4 relate the **after** of one statement to the **before** of the next statement
* Now we need rules relating the **before** of a statement to the **after** of the same statement

---

# Global Constant Propagation

* Rule 5: if `x=`⊥ before _s_, then `x=`⊥ after _s_.
* Rule 6: if the statement _s_ assigns a constant _C_ to `x`, then `x=`_C_ after _s_.
* Rule 7: if the statement _s_ assigns the result of a function to `x`, then `x=`⊤ after _s_.
* Rule 8: if the statement _s_ assigns a value to any variable other than `x`, then `x=?` after _s_ is the same as it was before _s_.

---

# Global Constant Propagation

1. For every entry _s_ to the program, set `x=`⊤
2. Set `x=`⊥ everywhere else
3. Pick some statement not satisfying rules 1-8, and update using the appropriate rule
4. Repeat (3) until all points satisfy rules 1-8

---

# Liveness Analysis

Once constants have been globally propagated, we would like to eliminate dead code

* After constant propagation, `x = 3` is dead if `x` isn't used elsewhere:

![copy propagation](/~cs331/images/lectures/control_flow_copy_prop.png)

---

# Liveness

A variable `x` is _live_ at statement _s_ if:

* Some statement _s'_ uses `x`
* There is a path from _s_ to _s'_
* That path has no intervening assignment to `x`

---

# Global Dead Code Elimination

* A statement `x = `...
is dead code if `x` is dead after the assignment
* Dead statements can be deleted from the program
* But we need liveness information to know if `x` is dead

---

# Liveness Analysis

* We can express liveness in terms of information transferred between adjacent statements, just like constant propagation
* Liveness is simpler than constant propagation, since it is a boolean property (true or false)

---

# Liveness Analysis

* Rule 1: `x` is live after statement _p_ if `x` is live before any statement _s_ that immediately follows _p_.
* Rule 2: `x` is live before statement _s_ if _s_ uses `x` in the right-hand side of the statement (like `y = x + 5` or `y = f(x)`).
* Rule 3: `x` is not live before statement _s_ if _s_ assigns a value to `x` without using `x` on the right-hand side (like `x = 5` or `x = e`, where `e` does not mention `x`).
* Rule 4: if statement _s_ doesn't mention `x` at all, then whatever liveness value `x` had after _s_ is still the same before _s_.

---

# Liveness Analysis

1. Initially, set all liveness properties to false
2. Pick some statement where one of the rules 1-4 does not hold, and update it using the appropriate rule
3. Repeat (2) until all the statements satisfy rules 1-4

---

# Liveness Analysis

* The liveness property can change from false to true, but not the other way around
* Each value can change only once, so termination is guaranteed
* Once the analysis is computed, it is simple to eliminate dead code

---

# Forward and Backward Analysis

We've seen two kinds of analysis:

* Constant propagation is a forward analysis: information is pushed from **before** to **after** a statement
* Liveness is a backward analysis: information is pushed from **after** a statement back towards **before** the statement

There are many other global flow analyses:

* Most can be classified as either forward or backward
* Most also follow the methodology of local rules relating information between adjacent program points
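
---

# Liveness Analysis

The liveness rules and the "repeat until all rules hold" loop can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the lecture: each statement is a hypothetical tuple `(defined, used, successors)`, and the names `PROGRAM`, `liveness`, and `dead_assignments` are made up for this sketch. The example program is the local-optimization example after constant propagation, with an assumed final `return q`.

```python
# Hypothetical statement encoding (an assumption of this sketch):
#   (variable assigned or None, set of variables used, indices of successor statements)
#   0: x = 3
#   1: y = z * w
#   2: q = 3 + y
#   3: return q
PROGRAM = [
    ("x", set(), [1]),
    ("y", {"z", "w"}, [2]),
    ("q", {"y"}, [3]),
    (None, {"q"}, []),
]


def liveness(program):
    """Return (live_before, live_after): one set of live variables per statement."""
    n = len(program)
    # Step 1: every liveness property starts out false (empty sets).
    live_before = [set() for _ in range(n)]
    live_after = [set() for _ in range(n)]
    changed = True
    while changed:  # Steps 2-3: update until every statement satisfies the rules.
        changed = False
        for i in reversed(range(n)):  # Any order works; backward converges quickly.
            defined, used, succs = program[i]
            # Rule 1: live after s if live before some immediate successor.
            after = set().union(*(live_before[j] for j in succs))
            # Rules 2-4: used variables are live before s; the assigned
            # variable is not (unless also used); everything else passes through.
            before = used | (after - {defined})
            if after != live_after[i] or before != live_before[i]:
                live_after[i], live_before[i] = after, before
                changed = True
    return live_before, live_after


def dead_assignments(program):
    """Indices of assignments whose target is not live after the statement."""
    _, live_after = liveness(program)
    return [i for i, (defined, _, _) in enumerate(program)
            if defined is not None and defined not in live_after[i]]
```

Here `dead_assignments(PROGRAM)` reports statement 0 (`x = 3`) as the only dead assignment, matching the dead code elimination step of the local example. Sets only ever grow during the loop, which mirrors the termination argument: a property can flip from false to true but never back.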