9 years ago · d1a8441caf
--- a/book.tex
+++ b/book.tex
@@ -1175,15 +1175,20 @@ an intermediate language named $C_0$, roughly half-way between $S_0$
 
															 and x86, to provide a rest stop along the way. We name the language
														
 
															 $C_0$ because it is vaguely similar to the $C$
														
 
															 language~\citep{Kernighan:1988nx}. The differences \#4 and \#1,
														
 
															-regarding variables and nested expressions, are handled by the passes
														
 
															-\textsf{uniquify} and \textsf{flatten} that bring us to $C_0$.
														
 
															+regarding variables and nested expressions, will be handled by two
														
 
															+steps, \key{uniquify} and \key{flatten}, which bring us to
														
 
															+$C_0$.
														
 
															 \[\large
														
 
															 \xymatrix@=50pt{
														
 
															-  S_0 \ar@/^/[r]^-{\textsf{uniquify}} & 
														
 
															-  S_0 \ar@/^/[r]^-{\textsf{flatten}} &
														
 
															+  S_0 \ar@/^/[r]^-{\key{uniquify}} & 
														
 
															+  S_0 \ar@/^/[r]^-{\key{flatten}} &
														
 
															   C_0 
														
 
															 }
														
 
															 \]
														
 
															+Each of these steps in the compiler is implemented by a function,
														
 
															+typically a structurally recursive function that translates an input
														
 
															+AST into an output AST. We refer to such a function as a \emph{pass}
														
 
															+because it makes a pass over the AST.
														
 
															 The syntax for $C_0$ is defined in Figure~\ref{fig:c0-syntax}.  The
														
 
															 $C_0$ language supports the same operators as $S_0$ but the arguments
														
@@ -1215,21 +1220,21 @@ To get from $C_0$ to x86-64 assembly requires three more steps, which
 
															 we discuss below.
														
 
															 \[\large
														
 
															 \xymatrix@=50pt{
														
 
															-  C_0 \ar@/^/[r]^-{\textsf{select\_instr.}}
														
 
															-  & \text{x86}^{*} \ar@/^/[r]^-{\textsf{assign\_homes}} 
														
 
															-  & \text{x86}^{*} \ar@/^/[r]^-{\textsf{patch\_instr.}}
														
 
															+  C_0 \ar@/^/[r]^-{\key{select\_instr.}}
														
 
															+  & \text{x86}^{*} \ar@/^/[r]^-{\key{assign\_homes}} 
														
 
															+  & \text{x86}^{*} \ar@/^/[r]^-{\key{patch\_instr.}}
														
 
															   & \text{x86}
														
 
															 }
														
 
															 \]
														
 
															 We handle difference \#1, concerning the format of arithmetic
														
 
															-instructions, in the \textsf{select\_instructions} pass.  The result
														
 
															+instructions, in the \key{select\_instructions} pass.  The result
														
 
															 of this pass produces programs consisting of x86-64 instructions that
														
 
															 use variables.
														
 
															 %
														
 
															 As there are only 16 registers, we cannot always map variables to
														
 
															-registers (difference \#3). Fortunately, the stack can grow quite, so
														
 
															-we can map variables to locations on the stack. This is handled in the
														
 
															-\textsf{assign\_homes} pass. The topic of
														
 
															+registers (difference \#3). Fortunately, the stack can grow quite
														
 
															+large, so we can map variables to locations on the stack. This is
														
 
															+handled in the \key{assign\_homes} pass. The topic of
														
 
															 Chapter~\ref{ch:register-allocation} is implementing a smarter
														
 
															 approach in which we make a best effort to map variables to registers,
														
 
															 resorting to the stack only when necessary.
														
@@ -1238,15 +1243,15 @@ The final pass in our journey to x86 handles an idiosyncrasy of x86
 
															 assembly. Many x86 instructions have two arguments but only one of the
														
 
															 arguments may be a memory reference. Because we are mapping variables
														
 
															 to stack locations, many of our generated instructions will violate
														
 
															-this restriction. The purpose of the \textsf{patch\_instructions} pass
														
 
															-is to fix this problem by replacing every bad instruction with a short
														
 
															-sequence of instructions that use the \key{rax} register.
														
 
															+this restriction. The purpose of the \key{patch\_instructions} pass
														
 
															+is to fix this problem by replacing every violating instruction with a
														
 
															+short sequence of instructions that use the \key{rax} register.
														
 
															 \section{Uniquify Variables}
														
 
															 \label{sec:uniquify-s0}
														
 
															 The purpose of this pass is to make sure that each \key{let} uses a
														
 
															-unique variable name. For example, the \textsf{uniquify} pass could
														
 
															+unique variable name. For example, the \key{uniquify} pass could
														
 
															 translate
														
 
															 \[
														
 
															 \LET{x}{32}{ \BINOP{+}{ \LET{x}{10}{x} }{ x } }
														
@@ -1256,21 +1261,71 @@ to
 
															 \LET{x.1}{32}{ \BINOP{+}{ \LET{x.2}{10}{x.2} }{ x.1 } }
														
 
															 \]
														
 
															-We recommend implementing \textsf{uniquify} as a recursive function
														
 
															-that mostly just copies the input program. However, when encountering
														
 
															-a \key{let}, it should generate a unique name for the variable (the
														
 
															+We recommend implementing \key{uniquify} as a recursive function that
														
 
															+mostly just copies the input program. However, when encountering a
														
 
															+\key{let}, it should generate a unique name for the variable (the
														
 
															 Racket function \key{gensym} is handy for this) and associate the old
														
 
															 name with the new unique name in an association list. The
														
 
															-\textsf{uniquify} function will need to access this association list
														
 
															-when it gets to a variable reference, so we add another paramter to
														
 
															-\textsf{uniquify} for the association list.
														
 
															+\key{uniquify} function will need to access this association list when
														
 
															+it gets to a variable reference, so we add another parameter to
														
 
															+\key{uniquify} for the association list. It is quite common for a
														
 
															+compiler pass to need a map to store extra information about
														
 
															+variables. Such maps are often called \emph{symbol tables}.
														
 
															+
														
 
															+The skeleton of the \key{uniquify} function is shown in
														
 
															+Figure~\ref{fig:uniquify-s0}.  The function is curried so that it is
														
 
															+convenient to partially apply it to an association list and then apply
														
 
															+it to different expressions, as in the last clause for primitive
														
 
															+operations in Figure~\ref{fig:uniquify-s0}.
														
 
															+
														
 
															+\begin{exercise}
														
 
															+Complete the \key{uniquify} pass by filling in the blanks, that is,
														
 
															+implement the clauses for variables and for the \key{let} construct.
														
 
															+\end{exercise}
														
 
															+
														
 
															+\begin{figure}[tbp]
														
 
															+\begin{lstlisting}
														
 
															+   (define uniquify
														
 
															+     (lambda (alist)
														
 
															+       (lambda (e)
														
 
															+         (match e
														
 
															+           [(? symbol?) ___]
														
 
															+           [(? integer?) e]
														
 
															+           [`(let ([,x ,e]) ,body) ___]
														
 
															+           [`(program ,info ,e)
														
 
															+            `(program ,info ,((uniquify alist) e))]
														
 
															+           [`(,op ,es ...)
														
 
															+            `(,op ,@(map (uniquify alist) es))]
														
 
															+           ))))
														
 
															+\end{lstlisting}
														
 
															+\caption{Skeleton for the \key{uniquify} pass.}
														
 
															+\label{fig:uniquify-s0}
														
 
															+\end{figure}
														
 
															+
														
 
															+\begin{exercise}
														
 
															+Test your \key{uniquify} pass by creating three example $S_0$ programs
														
 
															+and checking whether the output programs produce the same result as
														
 
															+the input programs. The $S_0$ programs should be designed to test the
														
 
															+most interesting parts of the \key{uniquify} pass, that is, the
														
 
															+programs should include \key{let} constructs, variables, and variables
														
 
															+that shadow each other.
														
 
															+
														
 
															+[to do: explain the test-compiler function and interpret-S0.]
														
 
															+
														
 
															+%% You can use the interpreter \key{interpret-S0} defined in the
														
 
															+%% \key{interp.rkt} file. The entire sequence of tests should be a short
														
 
															+%% Racket program so you can re-run all the tests by running the Racket
														
 
															+%% program. We refer to this as the \emph{regression test} program.
														
 
															+\end{exercise}
														
 
															+
														
 
															 \section{Flatten Expressions}
														
 
															 \label{sec:flatten-s0}
														
 
															-The purpose of the \textsf{flatten} pass is to get rid of nested
														
 
															-expressions, such as the $\UNIOP{-}{10}$ in the following program,
														
 
															-without changing the behavior of the program.
														
 
															+The \key{flatten} pass will transform $S_0$ programs into $C_0$
														
 
															+programs. In particular, the purpose of the \key{flatten} pass is to
														
 
															+get rid of nested expressions, such as the $\UNIOP{-}{10}$ in the
														
 
															+following program.
														
 
															 \[
														
 
															 \BINOP{+}{52}{ \UNIOP{-}{10} }
														
 
															 \]
														
@@ -1281,14 +1336,19 @@ translated to the following one.
 
															 \[
														
 
															 \begin{array}{l}
														
 
															 \ASSIGN{ \itm{x} }{ \UNIOP{-}{10} } \\
														
 
															-\RETURN{ \BINOP{+}{52}{ \itm{x} } }
														
 
															+\ASSIGN{ \itm{y} }{ \BINOP{+}{52}{ \itm{x} } } \\
														
 
															+\RETURN{ \itm{y} }
														
 
															 \end{array}
														
 
															 \]
														
 
															-We recommend implementing \textsf{flatten} as a recursive function
														
 
															-that returns two things, 1) the newly flattened expression, and 2) a
														
 
															-list of assignment statements, one for each of the new variables
														
 
															-introduced while flattening the expression.
														
 
															+We recommend implementing \key{flatten} as a structurally recursive
														
 
															+function that returns two things: 1) the newly flattened expression,
														
 
															+and 2) a list of assignment statements, one for each of the new
														
 
															+variables introduced while flattening the expression. You can return
														
 
															+multiple things from a function using the \key{values} form and you
														
 
															+can receive multiple things from a function call using the
														
 
															+\key{define-values} form. If you are not familiar with these
														
 
															+constructs, the Racket documentation will be of help.
														
 
															 Take special care for programs such as the following that initialize
														
 
															 variables with integers or other variables.
														
@@ -1302,7 +1362,7 @@ This program should be translated to
 
															 \RETURN{b}
														
 
															 \]
														
 
															 and not the following, which could result from a naive implementation
														
 
															-of \textsf{flatten}.
														
 
															+of \key{flatten}.
														
 
															 \[
														
 
															 \ASSIGN{x.1}{42}\;
														
 
															 \ASSIGN{a}{x.1}\;
														
@@ -1311,13 +1371,22 @@ of \textsf{flatten}.
 
															 \RETURN{b}
														
 
															 \]
														
 
															+\begin{exercise}
														
 
															+Implement the \key{flatten} pass and test it on all of the example
														
 
															+programs that you created to test the \key{uniquify} pass. Also create
														
 
															+three new example programs that are designed to exercise all of the
														
 
															+interesting code in the \key{flatten} pass. 
														
 
															+[to do: add to the test-compiler stuff]
														
 
															+\end{exercise}
														
 
															+
														
 
															+
														
 
															 \section{Select Instructions}
														
 
															 \label{sec:select-s0}
														
 
															-In the \textsf{select\_instructions} pass we begin the work of
														
 
															+In the \key{select\_instructions} pass we begin the work of
														
 
															 translating from $C_0$ to x86. The target language of this pass is a
														
 
															 pseudo-x86 language that still uses variables, so we add an AST node
														
 
															-of the form $\VAR{\itm{var}}$.  The \textsf{select\_instructions} pass
														
 
															+of the form $\VAR{\itm{var}}$.  The \key{select\_instructions} pass
														
 
															 deals with the differing format of arithmetic operations. For example,
														
 
															 in $C_0$ an addition operation could take the following form:
														
 
															 \[
														
@@ -1349,9 +1418,9 @@ procedure.
 
															 \label{sec:assign-s0}
														
 
															 As discussed in Section~\ref{sec:plan-s0-x86}, the
														
 
															-\textsf{assign\_homes} pass places all of the variables on the stack.
														
 
															+\key{assign\_homes} pass places all of the variables on the stack.
														
 
															 Consider again the example $S_0$ program $\BINOP{+}{52}{ \UNIOP{-}{10} }$,
														
 
															-which after \textsf{select\_instructions} looks like the following.
														
 
															+which after \key{select\_instructions} looks like the following.
														
 
															 \[
														
 
															 \begin{array}{l}
														
 
															 (\key{mov}\;\INT{10}\; \VAR{x})\\
														
@@ -1361,7 +1430,7 @@ which after \textsf{select\_instructions} looks like the following.
 
															 \end{array}
														
 
															 \]
														
 
															 The one and only variable $x$ is assigned to stack location
														
 
															-\key{-8(\%rbp)}, so the \textsf{assign\_homes} pass translates the
														
 
															+\key{-8(\%rbp)}, so the \key{assign\_homes} pass translates the
														
 
															 above to
														
 
															 \[
														
 
															 \begin{array}{l}
														
@@ -1388,7 +1457,7 @@ Consider again the following example.
 
															 \[
														
 
															 \LET{a}{42}{ \LET{b}{a}{ b }}
														
 
															 \]
														
 
															-After \textsf{assign\_homes} pass, the above has been translated to
														
 
															+After the \key{assign\_homes} pass, the above has been translated to
														
 
															 \[
														
 
															 \begin{array}{l}
														
 
															 (\key{mov} \;\INT{42}\; \STACKLOC{{-}8})\\
														
@@ -1823,11 +1892,11 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
 
															 \begin{figure}[tbp]
														
 
															 \[
														
 
															 \xymatrix{
														
 
															-  C_0 \ar@/^/[r]^-{\textsf{select\_instr.}}
														
 
															-    & \text{x86}^{*} \ar[d]^-{\textsf{uncover\_live}} \\
														
 
															-    & \text{x86}^{*} \ar[d]^-{\textsf{build\_interference}} \\
														
 
															-    & \text{x86}^{*} \ar[d]_-{\textsf{allocate\_register}} \\
														
 
															-    & \text{x86}^{*} \ar@/^/[r]^-{\textsf{patch\_instr.}} 
														
 
															+  C_0 \ar@/^/[r]^-{\key{select\_instr.}}
														
 
															+    & \text{x86}^{*} \ar[d]^-{\key{uncover\_live}} \\
														
 
															+    & \text{x86}^{*} \ar[d]^-{\key{build\_interference}} \\
														
 
															+    & \text{x86}^{*} \ar[d]_-{\key{allocate\_register}} \\
														
 
															+    & \text{x86}^{*} \ar@/^/[r]^-{\key{patch\_instr.}} 
														
 
															     & \text{x86} 
														
 
															 }
														
 
															 \]