#### Script generated by TTT Title: Petter: Compilerbau (06.07.2015) Date: Mon Jul 06 14:18:07 CEST 2015 Duration: 98:11 min Pages: 63 ### Overloading and Coercion Some operators such as + are *overloaded*: - + has several possible types for example: int +(int,int), float +(float, float) but also float\* +(float\*, int),int\* +(int, int\*) - depending on the type, the operator + has a different implementation - determining which implementation should be used is based on the arguments only Coercion: allow the application of + to int and float. - instead of defining + for all possible combinations of types, the arguments are automatically coerced - this coercion may generate code (e.g. conversion from int to float) - conversion is usually done towards more general types i.e. 5+0.5 has type **float** (since **float** $\geq$ **int**) ### Overloading and Coercion Some operators such as + are *overloaded*: - has several possible types for example: int +(int,int), float +(float, float) but also float\* +(float\*, int),int\* +(int, int\*) - depending on the type, the operator + has a different implementation - determining which implementation should be used is based on the arguments only 225/288 # Coercion of Integer-Types in C: Promotion C defines special conversion rules for integers: promotion ... where a conversion has to happen via all intermediate types. # Coercion of Integer-Types in C: Promotion C defines special conversion rules for integers: promotion ``` \begin{array}{ll} \text{unsigned char} & \leq & \text{unsigned short} \\ \text{signed char} & \leq & \text{signed short} \end{array} \leq \text{int} \leq \text{unsigned int} ``` ... where a conversion has to happen via all intermediate types. subtle errors possible! Compute the character distribution of str: ``` char* str = "..."; int dist[256]; memset(dist, 0, sizeof(dist)); while (*str) { dist[(unsigned) *str] ++; str++; }; ``` Note: unsigned is shorthand for unsigned int. ### Subtypes - on the arithmetic basic types char, int, long, etc. there exists a rich subtype hierarchy - here $t_1 \le t_2$ means that the values of type $t_1$ - form a subset of the values of type $t_2$ ; - 2 can be converted into a value of type $t_2$ ; - $\bigcirc$ fulfill the requirements of type $t_2$ . 227/288 #### 226/288 # **Subtypes** - on the arithmetic basic types char, int, long, etc. there exists a rich subtype hierarchy - here $t_1 \le t_2$ , means that the values of type $t_1$ - form a subset of the values of type $t_2$ ; - 2 can be converted into a value of type $t_2$ ; - of tulfill the requirements of type $t_2$ . Example: assign smaller type (fewer values) to larger type ``` t_1 \quad x; t_2 \quad y; y = x; ``` # Subtypes - on the arithmetic basic types char, int, long, etc. there exists a rich subtype hierarchy - here $t_1 \le t_2$ , means that the values of type $t_1$ - form a subset of the values of type $t_2$ ; - 2 can be converted into a value of type $t_2$ ; - $\bigcirc$ fulfill the requirements of type $t_2$ . Example: assign smaller type (fewer values) to larger type $\begin{aligned} t_1 & x; \\ t_2 & y; \\ y &= x; \end{aligned}$ extend the subtype relationship to more complex types # **Example: Subtyping** #### Observe: ``` string extractInfo( struct { string info; } x) { return x.info; } ``` - we would like extractInfo to be applicable to all argument records that contain a field string info - use deduction rules to describe when $t_1 \leq t_2$ should hold - the idea of subtyping on values is related to subtyping as implemented in object-oriented languages ### Example: Subtyping #### Observe: ``` string extractInfo( struct { string info; } x) { return x.info; } ``` - we would like extractInfo to be applicable to all argument records that contain a field string info - use deduction rules to describe when $t_1 < t_2$ should hold - the idea of subtyping on values is related to subtyping as implemented in object-oriented languages 888 # Rules for Well-Typedness of Subtyping # Rules and Examples for Subtyping #### Examples: ``` \begin{array}{lll} \text{struct } \{\text{int } a; \text{ int } b; \} & \text{struct } \{\text{float } a; \} \\ \text{int (int)} & \text{float (float)} \\ \text{int (float)} & \text{float (int)} \end{array} ``` # Co- and Contra Variance #### Definition Given two function types in subtype relation $s_0(s_1, \ldots s_n) \le t_0(t_1, \ldots t_n)$ then we have - co-variance of the return type $s_0 \le t_0$ and - contra-variance of the arguments $s_i \ge t_i$ für $1 < i \le n$ # Co- and Contra Variance #### Definition Given two function types in subtype relation $s_0(s_1, \ldots s_n) \le t_0(t_1, \ldots t_n)$ then we have - co-variance of the return type $s_0 \le t_0$ and - contra-variance of the arguments $s_i \geq t_i$ für $1 < i \leq n$ Example from functional languages: $$| int \rightarrow float$$ $\rightarrow | int$ $\leq | int \rightarrow int$ $\rightarrow | float$ 1/288 # Subtypes: Application of Rules (I) Check if $S_1 \leq R_1$ : ``` \begin{array}{rcl} R_1 & = & \mathbf{struct} \; \{ \mathbf{int} \; a; \; R_1 \left( R_1 \right) \; f; \} \\ S_1 & = & \mathbf{struct} \; \{ \mathbf{int} \; a; \; \mathbf{int} \; b; \; S_1 \left( S_1 \right) \; f; \} \\ R_2 & = & \mathbf{struct} \; \{ \mathbf{int} \; a; \; R_2 \left( S_2 \right) \; f; \} \\ S_2 & = & \mathbf{struct} \; \{ \mathbf{int} \; a; \; \mathbf{int} \; b; \; S_2 \left( R_2 \right) \; f; \} \end{array} ``` # Subtypes: Application of Rules (I) Check if $S_1 \leq R_1$ : ### Subtypes: Application of Rules (II) Check if $S_2 \leq S_1$ : ``` egin{array}{lll} R_1 & = & {f struct} \left\{ {f int} \ a; \ R_1 \left( R_1 ight) f; ight\} \ S_1 & = & {f struct} \left\{ {f int} \ a; \ {f int} \ b; \ S_1 \left( S_1 ight) f; ight\} \ R_2 & = & {f struct} \left\{ {f int} \ a; \ R_2 \left( S_2 ight) f; ight\} \ S_2 & = & {f struct} \left\{ {f int} \ a; \ {f int} \ b; \ S_2 \left( R_2 ight) f; ight\} \ \end{array} ``` 222/200 ### Subtypes: Application of Rules (II) Check if $S_2 \leq S_1$ : ``` R_1 = \text{ struct } \{ \text{int } a; \ R_1 \, (R_1) \, f; \} S_1 = \text{ struct } \{ \text{int } \underline{a}; \ \text{int } \underline{b}; \ S_1 \, (S_1) \, f; \} R_2 = \text{ struct } \{ \text{int } \underline{a}; \ R_2 \, (S_2) \, f; \} S_2 = \text{ struct } \{ \text{int } \underline{a}; \ \text{int } b; \ S_2 \, (R_2) \, f; \} S_2 \, | S_1 | a, b f S_2 \, | S_1 | S_2 \, | S_1 | S_1 \, | S_1 \, | S_2 \, | S_2 | a S_1 \, | S_2 \, | S_2 | a S_1 \, | S_2 \, | S_2 | S_1 \, | S_2 \, | S_2 | S_1 \, | S_2 \, | S_2 | S_1 \, | S_2 \, | S_2 | ``` 233/288 # Generating Code: Overview We inductively generate instructions from the AST: - there is a rule stating how to generate code for each non-terminal of the grammar - the code is merely another attribute in the syntax tree - code generation makes use of the already computed attributes # Generating Code: Overview We inductively generate instructions from the AST: - there is a rule stating how to generate code for each non-terminal of the grammar - the code is merely another attribute in the syntax tree - code generation makes use of the already computed attributes In order to specify the code generation, we require - a semantics of the language we are compiling (here: C standard) - a semantics of the machine instructions #### Generating Code: Overview We inductively generate instructions from the AST: - there is a rule stating how to generate code for each non-terminal of the grammar - the code is merely another attribute in the syntax tree - code generation makes use of the already computed attributes In order to specify the code generation, we require - a semantics of the language we are compiling (here: C standard) - a semantics of the machine instructions - → we commence by specifying machine instruction semantics ### The Register C-Machine (R-CMa) We generate Code for the Register C-Machine. The Register C-Machine is a virtual machine (VM). - there exists no processor that can execute its instructions - ... but we can build an interpreter for it - we provide a visualization environment for the R-CMa - the R-CMa has no double, float, char, short or long types - the R-CMa has no instructions to communicate with the operating system - the R-CMa has an unlimited supply of registers 239/28 #### 23//288 # The Register C-Machine (R-CMa) We generate Code for the Register C-Machine. The Register C-Machine is a virtual machine (VM). - there exists no processor that can execute its instructions - ... but we can build an interpreter for it - we provide a visualization environment for the R-CMa - the R-CMa has no double, float, char, short or long types - the R-CMa has no instructions to communicate with the operating system - the R-CMa has an unlimited supply of registers The R-CMa is more realistic than it may seem: - the mentioned restrictions can easily be lifted - the Dalvik VM or the LLVM are similar to the R-CMa - an interpreter of R-CMa can run on any platform #### Virtual Machines A virtual machines has the following ingredients: - any virtual machine provides a set of instructions - instructions are executed on virtual hardware - the virtual hardware is a collection of data structures that is accessed and modified by the VM instructions - ... and also by other components of the run-time system namely functions that go beyond the instruction semantics - the interpreter is part of the run-time system Consider Java as an example: A virtual machine such as the Dalvik VM has the following structure: - S: the data store a memory region in which cells can be stored in LIFO order → stack. - SP: ( $\hat{=}$ stack pointer) pointer to the last used cell in S - beyond S follows the memory containing the heap 241/288 #### Components of a Virtual Machine Consider Java as an example: A virtual machine such as the Dalvik VM has the following structure: - E: the data store a memory region in which cells can be stored in LIFO order → stack. - SP: (≘ stack pointer) pointer to the last used cell in S - beyond S follows the memory containing the heap - C is the memory storing code - each cell of C holds exactly one virtual instruction - C can only be read - PC (\hat{=} program counter) address of the instruction that is to be executed next - PC contains 0 initially 241/288 # Executing a Program - the machine loads an instruction from C[PC] into the instruction register IR in order to execute it - before evaluating the instruction, the PC is incremented by one ``` while true { IR = C[PC]; PC++; execute (IR); } ``` - node: the PC must be incremented before the execution, since an instruction may modify the PC - the loop is exited by evaluating a halt instruction that returns directly to the operating system # Simple Expressions and Assignments in R-CMa Task: evaluate the expression (1+7)\*3 that is, generate an instruction sequence that - computes the value of the expression and - keeps its value accessible in a reproducable way 242/28 # Simple Expressions and Assignments in R-CMa Task: evaluate the expression (1+7)\*3 that is, generate an instruction sequence that - computes the value of the expression and - keeps its value accessible in a reproducable way #### Idea: - first compute the value of the sub-expressions - store the intermediate result in a temporary register - apply the operator - loop # The Register Sets of the R-CMa The two register sets have the following purpose: - the *local* registers $R_i$ - save temporary results - store the contents of local variables of a function - can efficiently be stored and restored from the stack ### Principles of the R-CMa The R-CMa is composed of a stack, heap and a code segment, just like the JVM; it additionally has register sets: - *local* registers are $R_1, R_2, \dots R_i, \dots$ - *global* register are $R_0, R_{-1}, \dots R_i, \dots$ 244/288 # The Register Sets of the R-CMa The two register sets have the following purpose: - the *local* registers $R_i$ - save temporary results - store the contents of local variables of a function - can efficiently be stored and restored from the stack - - save the parameters of a function - store the result of a function #### Note: for now, we only use registers to store temporary computations Idea for the translation: use a register counter i: - registers $R_i$ with j < i are in use - registers $R_j$ with $j \ge i$ are available ### Translation of Simple Expressions Using variables stored in registers; loading constants: ``` \begin{array}{ll} \text{instruction} & \text{semantics} & \text{intuition} \\ \text{loadc } R_i \, c & R_i = c & \text{load constant} \\ \text{move } R_i \, R_j & R_i = R_j & \text{copy } R_j \text{ to } R_i \end{array} ``` ### Translation of Simple Expressions Using variables stored in registers; loading constants: ``` \begin{array}{lll} \text{instruction} & \text{semantics} & \text{intuition} \\ \text{loadc } R_i \ c & R_i = c & \text{load constant} \\ \text{move } R_i \ R_i & R_i = R_j & \text{copy } R_j \text{ to } R_i \end{array} ``` We define the following translation schema (with $\rho x = a$ ): $$\begin{array}{rcl} \operatorname{code_{R}^{i}} c \; \rho & = & \operatorname{loadc} R_{i} \; c \\ \operatorname{code_{R}^{i}} x \; \rho & = & \operatorname{move} R_{i} \; R_{a} \\ \operatorname{code_{R}^{i}} x = e \; \rho & = & \operatorname{code_{R}^{i}} e \; \rho \\ \hline & \operatorname{move} R_{a} \; R_{i} \end{array}$$ 247/288 ### Translation of Simple Expressions Using variables stored in registers; loading constants: ``` \begin{array}{lll} \text{instruction} & \text{semantics} & \text{intuition} \\ \text{loadc } R_i \ c & R_i = c & \text{load constant} \\ \text{move } R_i \ R_j & R_i = R_j & \text{copy } R_j \text{ to } R_i \end{array} ``` We define the following translation schema (with $\rho x = a$ ): $$\operatorname{code}_{\mathrm{R}}^{i} c \rho = \operatorname{loadc} R_{i} c$$ $\operatorname{code}_{\mathrm{R}}^{i} x \rho = \operatorname{move} R_{i} R_{a}$ $\operatorname{code}_{\mathrm{R}}^{i} x = e \rho = \operatorname{code}_{\mathrm{R}}^{i} e \rho$ $\operatorname{move} R_{a} R_{i}$ Note: all instructions use the Intel convention (in contrast to the AT&T convention): op dst $src_1 \dots src_n$ . ### Translation of Expressions Let $op = \{add, \ sub, \ div, \ mul, \ mod, \ le, \ gr, \ eq, \ leq, \ geq, \ and, \ or\}.$ The R-CMa provides an instruction for each operator op. op $$R_i R_j R_k$$ where $R_i$ is the target register, $R_j$ the first and $R_k$ the second argument. Correspondingly, we generate code as follows: $$\operatorname{code}_{\mathbf{R}}^{i} e_{1} \operatorname{op} e_{2} \rho = \operatorname{code}_{\mathbf{R}}^{i} e_{1} \rho$$ $$\operatorname{code}_{\mathbf{R}}^{i+1} e_{2} \rho$$ $$\operatorname{op}_{R_{i}}^{i+1} R_{i} R_{i+1}$$ 247/288 248/28 #### Translation of Expressions Let $op = \{add, sub, div, mul, mod, le, gr, eq, leq, geq, and, or\}$ . The R-CMa provides an instruction for each operator op. op $$R_i R_j R_k$$ where $R_i$ is the target register, $R_j$ the first and $R_k$ the second argument. Correspondingly, we generate code as follows: $$\operatorname{code}_{\mathbf{R}}^{i} e_{1} \operatorname{op} e_{2} \rho = \operatorname{code}_{\mathbf{R}}^{i} e_{1} \rho$$ $$\operatorname{code}_{\mathbf{R}}^{i+1} e_{2} \rho$$ $$\operatorname{op} R_{i} R_{i} R_{i+1}$$ Example: Translate 3 \* 4 with i = 4: #### Translation of Expressions Let op = $\{add, sub, div, mul, mod, le, gr, eq, leq, geq, and, or\}$ . The R-CMa provides an instruction for each operator op. op $$R_i R_j R_k$$ where $R_i$ is the target register, $R_j$ the first and $R_k$ the second argument. Correspondingly, we generate code as follows: $$\operatorname{code}_{\mathbf{R}}^{i} e_{1} \operatorname{op} e_{2} \rho = \operatorname{code}_{\mathbf{R}}^{i} e_{1} \rho$$ $$\operatorname{code}_{\mathbf{R}}^{i+1} e_{2} \rho$$ $$\operatorname{op} R_{i} R_{i} R_{i+1}$$ Example: Translate 3 \* 4 with i = 4: $$\operatorname{code}_{\mathbf{R}}^{4} 3 * 4 \rho = \operatorname{loadc} R_{4} 3$$ $\operatorname{loadc} R_{5} 4$ $\operatorname{mul} R_{4} R_{4} R_{5}$ 248/288 # Managing Temporary Registers Observe that temporary registers are re-used: translate 3 \* 4 + 3 \* 4 with t = 4: $$\operatorname{code}_{R}^{4} 3*4+3*4 \rho = \operatorname{code}_{R}^{4} 3*4 \rho$$ $$\operatorname{code}_{R}^{5} 3*4 \rho$$ add $R_{4} R_{4} R_{5}$ where $$\operatorname{code}_{\mathbf{R}}^{i} \ 3 \star 4 \ \rho = \operatorname{loadc}_{R}^{i} 3$$ $$\operatorname{loadc}_{R}^{i} I_{i+1}^{i+1} 4$$ $$\operatorname{mul}_{R}^{i} R_{i} R_{i}^{i} R_{i}$$ we obtain $$code_{R}^{4} 3*4+3*4 \rho =$$ # Managing Temporary Registers Observe that temporary registers are re-used: translate 3 \* 4 + 3 \* 4 with t = 4: $$\operatorname{code}_{R}^{4} 3 * 4 + 3 * 4 \rho = \operatorname{code}_{R}^{4} 3 * 4 \rho$$ $$\operatorname{code}_{R}^{5} 3 * 4 \rho$$ add $R_{4} R_{4} R_{5}$ where $$code_{R}^{i} 3*4 \rho = loadc R_{i} 3$$ $$loadc R_{i+1} 4$$ $$mul R_{i} R_{i} R_{i+1}$$ we obtain //288 # **Semantics of Operators** The operators have the following semantics: ``` add R_i R_j R_k R_i = R_i + R_k sub R_i R_i R_k R_i = R_i - R_k \operatorname{div} R_i R_i R_k R_i = R_i/R_k \operatorname{mul} R_i \stackrel{\circ}{R}_i R_k R_i = R_i * R_k \operatorname{mod} R_i R_i R_k R_i = sgn(R_k)k wobei |R_i| = n|R_k| + k \land n \ge 0, 0 \le k < |R_k| R_i = \text{if } R_i < R_k \text{ then } 1 \text{ else } 0 le R_i R_j R_k \operatorname{gr} R_i R_j R_k R_i = \text{if } R_i > R_k \text{ then } 1 \text{ else } 0 R_i = \text{if } R_i = R_k \text{ then } 1 \text{ else } 0 \operatorname{eq} R_i R_i R_k R_i = \text{if } R_i \leq R_k \text{ then } 1 \text{ else } 0 leq R_i R_i R_k geq R_i R_i R_k R_i = \text{if } R_i \geq R_k \text{ then } 1 \text{ else } 0 and R_i R_i R_k R_i = R_i \& R_k // bit-wise and R_i = R_i \mid R_k or R_i R_j R_k // bit-wise or ``` # Translation of Unary Operators Unary operators op = $\{neg, not\}$ take only two registers: $$\operatorname{code}_{\mathbf{R}}^{i} \operatorname{op} e \rho = \operatorname{code}_{\mathbf{R}}^{i} e \rho$$ $$\operatorname{op} R_{i} R_{i}$$ 50/288 ### Translation of Unary Operators Unary operators op = $\{neg, not\}$ take only two registers: $$\operatorname{code}_{\mathbf{R}}^{i} \operatorname{op} e \rho = \operatorname{code}_{\mathbf{R}}^{i} e \rho$$ $$\operatorname{op} R_{i} R_{i}$$ Note: We use the same register. Example: Translate -4 into $R_5$ : $$\operatorname{code}_{R}^{5} \boxed{-4} \rho = \operatorname{code}_{R}^{5} 4 \rho$$ $$\operatorname{neg} R_{5} R_{5}$$ # Applying Translation Schema for Expressions Suppose the following function void f(void) { - Let $\rho = \{x \mapsto 1, y \mapsto 2, z \mapsto 3\}$ be the address environment. - Let $R_4$ be the first free register, that is, i = 4. $$\operatorname{code}^{4} x = y + z * 3 \rho = \operatorname{code}_{R}^{4} y + z * 3 \rho$$ $$\operatorname{move}_{R_{1} R_{4}}^{R}$$ 251/288 252/288 # Applying Translation Schema for Expressions Suppose the following function void f(void) { - Let $\rho = \{x \mapsto 1, y \mapsto 2, z \mapsto 3\}$ be the address environment. - Let $R_4$ be the first free register, that is, i = 4. $\sim$ the assignment x=y+z\*3 is translated as move $R_4$ $R_2$ ; move $R_5$ $R_3$ ; loadc $R_6$ 3; mul $R_5$ $R_6$ ; add $R_4$ $R_4$ $R_5$ ; move $R_1$ $R_4$ Code Synthesis # Chapter 3: Statements and Control Structures 53/288 #### **About Statements and Expressions** General idea for translation: $\mathrm{code}^i \ s \ ho$ : generate code for statement s $\mathrm{code}^i_{\mathrm{R}} \ e \ ho$ : generate code for expression e into $R_i$ Throughout: $i, i+1, \ldots$ are free (unused) registers # About Statements and Expressions General idea for translation: $\operatorname{code}^i s \, \rho$ : generate code for statement s $\operatorname{code}^i_{\mathrm{R}} e \, \rho$ : generate code for expression e into $R_i$ Throughout: $i, i+1, \ldots$ are free (unused) registers For an *expression* x = e with $\rho x = a$ we defined: $$\operatorname{code}_{\mathbf{R}}^{i} x = e \ \rho = \left[ \begin{array}{c} \operatorname{code}_{\mathbf{R}}^{i} e \ \rho \\ \operatorname{move} R_{a} R_{i} \end{array} \right]$$ However, x = e; is also an *expression statement*: #### **About Statements and Expressions** General idea for translation: generate code for statement s $code^i s \rho$ generate code for expression e into $R_i$ $code_{R}^{i} e \rho$ Throughout: $i, i + 1, \ldots$ are free (unused) registers For an *expression* x = e with $\rho x = a$ we defined: $$\begin{array}{rcl} \operatorname{code}_{\mathbf{R}}^{i} \boxed{x = e} \, \rho & = & \operatorname{code}_{\mathbf{R}}^{i} \, e \, \rho \\ & & \operatorname{move} \left[ R_{a} \, \underline{R_{i}} \right] \end{array}$$ However, x = e; is also an *expression statement*. Define: $$x = a = 2 = 4$$ $$\operatorname{code}^{i} e_{1} = e_{2}; \ \rho = \operatorname{code}_{R}^{i} e_{1} = e_{2} \ \rho$$ The temporary register $R_i$ is ignored here. More general: $$code^i e$$ ; $\rho = code^i_R e \rho$ ### Translation of Statement Sequences The code for a sequence of statements is the concatenation of the instructions for each statement in that sequence: $$code^{i} (s ss) \rho = code^{i} s \rho$$ $$code^{i} \varepsilon \rho = code^{i} ss \rho$$ $$code^{i} \varepsilon \rho = mpty sequence of instructions$$ Note here: *s* is a statement, *ss* is a sequence of statements # **Jumps** In order to diverge from the linear sequence of execution, we need jumps: PC $$PC = A;$$ # **Conditional Jumps** A conditional jump branches depending on the value in $R_i$ : if $$(R_i == 0) PC = A;$$ #### Management of Control Flow In order to translate statements with control flow, we need to emit jump instructions. • during the translation of an if (c) construct, it is not yet clear where to jump to in case that c is false # Management of Control Flow In order to translate statements with control flow, we need to emit jump instructions. - during the translation of an if (c) construct, it is not yet clear where to jump to in case that c is false - instruction sequences may be arranged in a different order - minimize the number of unconditional jumps - minimize in a way so that fewer jumps are executed inside loops - replace far jumps through near jumps (if applicable) 258/ #### Management of Control Flow In order to translate statements with control flow, we need to emit jump instructions. - during the translation of an if (c) construct, it is not yet clear where to jump to in case that c is false - instruction sequences may be arranged in a different order - minimize the number of unconditional jumps - minimize in a way so that fewer jumps are executed inside loops - replace far jumps through near jumps (if applicable) - organize instruction sequence into blocks without jumps To this end, we define: #### Definition A basic block consists of - a sequence of statements ss that does not contain a jump - a set of outgoing edges to other basic blocks - where each edge may be labelled with a condition ### Basic Blocks and the Register C-Machine The R-CMa features only a single conditional jump, namely jumpz. Outgoing edges must have the following form: #### Formalizing the Translation Involving Control Flow For simplicity of defining translations of instructions involving control flow, we use *symbolic jump targets*. This translation can be used in practice, but a second run through the emitted instructions is necessary to resolve the symbolic addresses to actual addresses. Alternatively, we can emit *relative* jumps without a second pass: - relative jumps have targets that are offsets to the current PC - sometime relative jumps only possible for small offsets ( → near jumps) - if all jumps are relative: the code becomes position independent (PIC), that is, it can be moved to a different address - the generated code can be loaded without relocating absolute jumps generating a graph of basic blocks is useful for *program optimization* where the statements inside basic blocks are simplified Simple Conditional We first consider $s \equiv if$ ( c ) ss. ...and present a translation without basic blocks. #### Idea: - ullet emit the code of c and ss in sequence - insert a jump instruction in-between, so that correct control flow is ensured 261/288 ### **General Conditional** Translation of if (c) tt else ee. ### Example for if-statement Let $\rho = \{x \mapsto 4, y \mapsto 7\}$ and let s be the statement Then $code^i s \rho$ yields: 262/288 # Example for if-statement Let $\rho = \{x \mapsto 4, y \mapsto 7\}$ and let s be the statement Then $code^i s \rho$ yields: ### **Iterating Statements** We only consider the loop $s \equiv \mathbf{while}\ (e)\ s'.$ For this statement we define: 264/288 263/288 # for-Loops The **for**-loop $s \equiv$ **for** $(e_1; e_2; e_3)$ s' is equivalent to the statement sequence $e_1$ ; while $(e_2)$ $\{s'$ $e_3$ ; $\}$ – as long as s' does not contain a **continue** statement. Thus, we translate: $$\operatorname{code}^{i}\operatorname{for}(e_{1};e_{2}|e_{3}) \ s \ ho = \operatorname{code}_{R}^{i}|e_{1}| \ ho$$ $$A : \operatorname{code}_{R}^{i}|e_{2}| \ ho$$ $$\operatorname{jumpz}_{R_{i}} B$$ $$\operatorname{code}_{R}^{i}|e_{3}| \ ho$$ $$\operatorname{code}_{R}^{i}|e_{3}| \ ho$$ $$\operatorname{jump}_{A} A$$ $$VB :$$ #### The switch-Statement #### Idea: - Suppose choosing from multiple options in constant time if possible - use a *jump table* that, at the *i*th position, holds a jump to the *i*th alternative - in order to realize this idea, we need an *indirect jump* instruction