9 years ago · d1a8441caf
--- a/book.tex
+++ b/book.tex
@@ -1175,15 +1175,20 @@ an intermediate language named $C_0$, roughly half-way between $S_0$
 
				 and x86, to provide a rest stop along the way. We name the language
			
 
				 $C_0$ because it is vaguely similar to the $C$
			
 
				 language~\citep{Kernighan:1988nx}. The differences \#4 and \#1,
			
 
				-regarding variables and nested expressions, are handled by the passes
			
 
				-\textsf{uniquify} and \textsf{flatten} that bring us to $C_0$.
			
 
				+regarding variables and nested expressions, will be handled by two
			
 
				+steps, \key{uniquify} and \key{flatten}, which bring us to
			
 
				+$C_0$.
			
 
				 \[\large
			
 
				 \xymatrix@=50pt{
			
 
				-  S_0 \ar@/^/[r]^-{\textsf{uniquify}} & 
			
 
				-  S_0 \ar@/^/[r]^-{\textsf{flatten}} &
			
 
				+  S_0 \ar@/^/[r]^-{\key{uniquify}} & 
			
 
				+  S_0 \ar@/^/[r]^-{\key{flatten}} &
			
 
				   C_0 
			
 
				 }
			
 
				 \]
			
 
				+Each of these steps in the compiler is implemented by a function,
			
 
				+typically a structurally recursive function that translates an input
			
 
				+AST into an output AST. We refer to such a function as a \emph{pass}
			
 
				+because it makes a pass over the AST.
			
 
				 
			
 
				 The syntax for $C_0$ is defined in Figure~\ref{fig:c0-syntax}.  The
			
 
				 $C_0$ language supports the same operators as $S_0$ but the arguments
			
@@ -1215,21 +1220,21 @@ To get from $C_0$ to x86-64 assembly requires three more steps, which
 
				 we discuss below.
			
 
				 \[\large
			
 
				 \xymatrix@=50pt{
			
 
				-  C_0 \ar@/^/[r]^-{\textsf{select\_instr.}}
			
 
				-  & \text{x86}^{*} \ar@/^/[r]^-{\textsf{assign\_homes}} 
			
 
				-  & \text{x86}^{*} \ar@/^/[r]^-{\textsf{patch\_instr.}}
			
 
				+  C_0 \ar@/^/[r]^-{\key{select\_instr.}}
			
 
				+  & \text{x86}^{*} \ar@/^/[r]^-{\key{assign\_homes}} 
			
 
				+  & \text{x86}^{*} \ar@/^/[r]^-{\key{patch\_instr.}}
			
 
				   & \text{x86}
			
 
				 }
			
 
				 \]
			
 
				 We handle difference \#1, concerning the format of arithmetic
			
 
				-instructions, in the \textsf{select\_instructions} pass.  The result
			
 
				+instructions, in the \key{select\_instructions} pass.  The result
			
 
				 of this pass produces programs consisting of x86-64 instructions that
			
 
				 use variables.
			
 
				 %
			
 
				 As there are only 16 registers, we cannot always map variables to
			
 
				-registers (difference \#3). Fortunately, the stack can grow quite, so
			
 
				-we can map variables to locations on the stack. This is handled in the
			
 
				-\textsf{assign\_homes} pass. The topic of
			
 
				+registers (difference \#3). Fortunately, the stack can grow quite
			
 
				+large, so we can map variables to locations on the stack. This is
			
 
				+handled in the \key{assign\_homes} pass. The topic of
			
 
				 Chapter~\ref{ch:register-allocation} is implementing a smarter
			
 
				 approach in which we make a best-effort to map variables to registers,
			
 
				 resorting to the stack only when necessary.
			
@@ -1238,15 +1243,15 @@ The final pass in our journey to x86 handles an idiosyncrasy of x86
 
				 assembly. Many x86 instructions have two arguments but only one of the
			
 
				 arguments may be a memory reference. Because we are mapping variables
			
 
				 to stack locations, many of our generated instructions will violate
			
 
				-this restriction. The purpose of the \textsf{patch\_instructions} pass
			
 
				-is to fix this problem by replacing every bad instruction with a short
			
 
				-sequence of instructions that use the \key{rax} register.
			
 
				+this restriction. The purpose of the \key{patch\_instructions} pass
			
 
				+is to fix this problem by replacing every violating instruction with a
			
 
				+short sequence of instructions that use the \key{rax} register.
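				+
				+As a rough sketch of this idea, and assuming a representation that
				+may differ from the one used in this book, the function below rewrites
				+one instruction. The instruction shape \key{(op src dst)}, the operand
				+forms \key{(stack n)} and \key{(reg rax)}, and the names
				+\key{stack-loc?} and \key{patch-instr} are all hypothetical.
				+\begin{lstlisting}
				+   ;; Assumed operand form for a stack location: (stack n).
				+   (define (stack-loc? a)
				+     (and (pair? a) (eq? (car a) 'stack)))
				+   ;; Returns the list of instructions that replaces instruction i.
				+   (define (patch-instr i)
				+     (match i
				+       ;; Both operands in memory: route the source through rax.
				+       [`(,op ,(? stack-loc? s) ,(? stack-loc? d))
				+        `((mov ,s (reg rax)) (,op (reg rax) ,d))]
				+       ;; Anything else is already legal and is kept as is.
				+       [_ (list i)]))
				+\end{lstlisting}
				+Under these assumptions, \key{(mov (stack -8) (stack -16))} becomes a
				+\key{mov} into \key{rax} followed by a \key{mov} out of \key{rax}.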
			
 
				 
			
 
				 \section{Uniquify Variables}
			
 
				 \label{sec:uniquify-s0}
			
 
				 
			
 
				 The purpose of this pass is to make sure that each \key{let} uses a
			
 
				-unique variable name. For example, the \textsf{uniquify} pass could
			
 
				+unique variable name. For example, the \key{uniquify} pass could
			
 
				 translate
			
 
				 \[
			
 
				 \LET{x}{32}{ \BINOP{+}{ \LET{x}{10}{x} }{ x } }
			
@@ -1256,21 +1261,71 @@ to
 
				 \LET{x.1}{32}{ \BINOP{+}{ \LET{x.2}{10}{x.2} }{ x.1 } }
			
 
				 \]
			
 
				 
			
 
				-We recommend implementing \textsf{uniquify} as a recursive function
			
 
				-that mostly just copies the input program. However, when encountering
			
 
				-a \key{let}, it should generate a unique name for the variable (the
			
 
				+We recommend implementing \key{uniquify} as a recursive function that
			
 
				+mostly just copies the input program. However, when encountering a
			
 
				+\key{let}, it should generate a unique name for the variable (the
			
 
				 Racket function \key{gensym} is handy for this) and associate the old
			
 
				 name with the new unique name in an association list. The
			
 
				-\textsf{uniquify} function will need to access this association list
			
 
				-when it gets to a variable reference, so we add another paramter to
			
 
				-\textsf{uniquify} for the association list.
			
 
				+\key{uniquify} function will need to access this association list when
			
 
				+it gets to a variable reference, so we add another parameter to
			
 
				+\key{uniquify} for the association list. It is quite common for a
			
 
				+compiler pass to need a map to store extra information about
			
 
				+variables. Such maps are often called \emph{symbol tables}.
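				+
				+As a minimal sketch of how such an association list behaves (not a
				+solution to the exercise), the following extends the list once per
				+\key{let} and looks a name up with the standard Racket function
				+\key{assq}. The names \key{outer} and \key{inner} are hypothetical and
				+serve only as illustration.
				+\begin{lstlisting}
				+   ;; Entering (let ([x 32]) ...): map x to a fresh name.
				+   (define outer (cons (cons 'x 'x.1) '()))
				+   ;; Entering the nested (let ([x 10]) ...): new entry goes in front.
				+   (define inner (cons (cons 'x 'x.2) outer))
				+   ;; assq returns the first matching pair, so the innermost wins.
				+   (cdr (assq 'x inner))   ; evaluates to x.2
				+   (cdr (assq 'x outer))   ; evaluates to x.1
				+\end{lstlisting}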
			
 
				+
			
 
				+The skeleton of the \key{uniquify} function is shown in
			
 
				+Figure~\ref{fig:uniquify-s0}.  The function is curried so that it is
			
 
				+convenient to partially apply it to an association list and then apply
			
 
				+it to different expressions, as in the last clause for primitive
			
 
				+operations in Figure~\ref{fig:uniquify-s0}.
			
 
				+
			
 
				+\begin{exercise}
			
 
				+Complete the \key{uniquify} pass by filling in the blanks, that is,
			
 
				+implement the clauses for variables and for the \key{let} construct.
			
 
				+\end{exercise}
			
 
				+
			
 
				+\begin{figure}[tbp]
			
 
				+\begin{lstlisting}
			
 
				+   (define uniquify
			
 
				+     (lambda (alist)
			
 
				+       (lambda (e)
			
 
				+         (match e
			
 
				+           [(? symbol?) ___]
			
 
				+           [(? integer?) e]
			
 
				+           [`(let ([,x ,e]) ,body) ___]
			
 
				+           [`(program ,info ,e)
			
 
				+            `(program ,info ,((uniquify alist) e))]
			
 
				+           [`(,op ,es ...)
			
 
				+            `(,op ,@(map (uniquify alist) es))]
			
 
				+           ))))
			
 
				+\end{lstlisting}
			
 
				+\caption{Skeleton for the \key{uniquify} pass.}
			
 
				+\label{fig:uniquify-s0}
			
 
				+\end{figure}
			
 
				+
			
 
				+\begin{exercise}
			
 
				+Test your \key{uniquify} pass by creating three example $S_0$ programs
			
 
				+and checking whether the output programs produce the same result as
			
 
				+the input programs. The $S_0$ programs should be designed to test the
			
 
				+most interesting parts of the \key{uniquify} pass, that is, the
			
 
				+programs should include \key{let} constructs, variables, and variables
			
 
				+that shadow each other.
			
 
				+
			
 
				+[to do: explain the test-compiler function and interpret-S0.]
			
 
				+
			
 
				+%% You can use the interpreter \key{interpret-S0} defined in the
			
 
				+%% \key{interp.rkt} file. The entire sequence of tests should be a short
			
 
				+%% Racket program so you can re-run all the tests by running the Racket
			
 
				+%% program. We refer to this as the \emph{regression test} program.
			
 
				+\end{exercise}
			
 
				+
			
 
				 
			
 
				 \section{Flatten Expressions}
			
 
				 \label{sec:flatten-s0}
			
 
				 
			
 
				-The purpose of the \textsf{flatten} pass is to get rid of nested
			
 
				-expressions, such as the $\UNIOP{-}{10}$ in the following program,
			
 
				-without changing the behavior of the program.
			
 
				+The \key{flatten} pass will transform $S_0$ programs into $C_0$
			
 
				+programs. In particular, the purpose of the \key{flatten} pass is to
			
 
				+get rid of nested expressions, such as the $\UNIOP{-}{10}$ in the
			
 
				+following program.
			
 
				 \[
			
 
				 \BINOP{+}{52}{ \UNIOP{-}{10} }
			
 
				 \]
			
@@ -1281,14 +1336,19 @@ translated to the following one.
 
				 \[
			
 
				 \begin{array}{l}
			
 
				 \ASSIGN{ \itm{x} }{ \UNIOP{-}{10} } \\
			
 
				-\RETURN{ \BINOP{+}{52}{ \itm{x} } }
			
 
				+\ASSIGN{ \itm{y} }{ \BINOP{+}{52}{ \itm{x} } } \\
			
 
				+\RETURN{ \itm{y} }
			
 
				 \end{array}
			
 
				 \]
			
 
				 
			
 
				-We recommend implementing \textsf{flatten} as a recursive function
			
 
				-that returns two things, 1) the newly flattened expression, and 2) a
			
 
				-list of assignment statements, one for each of the new variables
			
 
				-introduced while flattening the expression.
			
 
				+We recommend implementing \key{flatten} as a structurally recursive
			
 
				+function that returns two things, 1) the newly flattened expression,
			
 
				+and 2) a list of assignment statements, one for each of the new
			
 
				+variables introduced while flattening the expression. You can return
			
 
				+multiple things from a function using the \key{values} form and you
			
 
				+can receive multiple things from a function call using the
			
 
				+\key{define-values} form. If you are not familiar with these
			
 
				+constructs, the Racket documentation will be of help.
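				+
				+For instance, the following small sketch returns two results from one
				+function and binds both at the call site. The function \key{div-mod}
				+is a hypothetical helper, unrelated to the compiler, chosen only to
				+illustrate the two forms.
				+\begin{lstlisting}
				+   ;; Return the quotient and the remainder together.
				+   (define (div-mod n d)
				+     (values (quotient n d) (remainder n d)))
				+   ;; Bind both results: q is 3 and r is 2.
				+   (define-values (q r) (div-mod 17 5))
				+\end{lstlisting}
				+In \key{flatten}, the two results would instead be the flattened
				+expression and the list of assignment statements.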
			
 
				 
			
 
				 Take special care for programs such as the following that initialize
			
 
				 variables with integers or other variables.
			
@@ -1302,7 +1362,7 @@ This program should be translated to
 
				 \RETURN{b}
			
 
				 \]
			
 
				 and not the following, which could result from a naive implementation
			
 
				-of \textsf{flatten}.
			
 
				+of \key{flatten}.
			
 
				 \[
			
 
				 \ASSIGN{x.1}{42}\;
			
 
				 \ASSIGN{a}{x.1}\;
			
@@ -1311,13 +1371,22 @@ of \textsf{flatten}.
 
				 \RETURN{b}
			
 
				 \]
			
 
				 
			
 
				+\begin{exercise}
			
 
				+Implement the \key{flatten} pass and test it on all of the example
			
 
				+programs that you created to test the \key{uniquify} pass. Then create
			
 
				+three new example programs that are designed to exercise all of the
			
 
				+interesting code in the \key{flatten} pass. 
			
 
				+[to do: add to the test-compiler stuff]
			
 
				+\end{exercise}
			
 
				+
			
 
				+
			
 
				 \section{Select Instructions}
			
 
				 \label{sec:select-s0}
			
 
				 
			
 
				-In the \textsf{select\_instructions} pass we begin the work of
			
 
				+In the \key{select\_instructions} pass we begin the work of
			
 
				 translating from $C_0$ to x86. The target language of this pass is a
			
 
				 pseudo-x86 language that still uses variables, so we add an AST node
			
 
				-of the form $\VAR{\itm{var}}$.  The \textsf{select\_instructions} pass
			
 
				+of the form $\VAR{\itm{var}}$.  The \key{select\_instructions} pass
			
 
				 deals with the differing format of arithmetic operations. For example,
			
 
				 in $C_0$ an addition operation could take the following form:
			
 
				 \[
			
@@ -1349,9 +1418,9 @@ procedure.
 
				 \label{sec:assign-s0}
			
 
				 
			
 
				 As discussed in Section~\ref{sec:plan-s0-x86}, the
			
 
				-\textsf{assign\_homes} pass places all of the variables on the stack.
			
 
				+\key{assign\_homes} pass places all of the variables on the stack.
			
 
				 Consider again the example $S_0$ program $\BINOP{+}{52}{ \UNIOP{-}{10} }$,
			
 
				-which after \textsf{select\_instructions} looks like the following.
			
 
				+which after \key{select\_instructions} looks like the following.
			
 
				 \[
			
 
				 \begin{array}{l}
			
 
				 (\key{mov}\;\INT{10}\; \VAR{x})\\
			
@@ -1361,7 +1430,7 @@ which after \textsf{select\_instructions} looks like the following.
 
				 \end{array}
			
 
				 \]
			
 
				 The one and only variable $x$ is assigned to stack location
			
 
				-\key{-8(\%rbp)}, so the \textsf{assign\_homes} pass translates the
			
 
				+\key{-8(\%rbp)}, so the \key{assign\_homes} pass translates the
			
 
				 above to
			
 
				 \[
			
 
				 \begin{array}{l}
			
@@ -1388,7 +1457,7 @@ Consider again the following example.
 
				 \[
			
 
				 \LET{a}{42}{ \LET{b}{a}{ b }}
			
 
				 \]
			
 
				-After \textsf{assign\_homes} pass, the above has been translated to
			
 
				+After the \key{assign\_homes} pass, the above has been translated to
			
 
				 \[
			
 
				 \begin{array}{l}
			
 
				 (\key{mov} \;\INT{42}\; \STACKLOC{{-}8})\\
			
@@ -1823,11 +1892,11 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
 
				 \begin{figure}[tbp]
			
 
				 \[
			
 
				 \xymatrix{
			
 
				-  C_0 \ar@/^/[r]^-{\textsf{select\_instr.}}
			
 
				-    & \text{x86}^{*} \ar[d]^-{\textsf{uncover\_live}} \\
			
 
				-    & \text{x86}^{*} \ar[d]^-{\textsf{build\_interference}} \\
			
 
				-    & \text{x86}^{*} \ar[d]_-{\textsf{allocate\_register}} \\
			
 
				-    & \text{x86}^{*} \ar@/^/[r]^-{\textsf{patch\_instr.}} 
			
 
				+  C_0 \ar@/^/[r]^-{\key{select\_instr.}}
			
 
				+    & \text{x86}^{*} \ar[d]^-{\key{uncover\_live}} \\
			
 
				+    & \text{x86}^{*} \ar[d]^-{\key{build\_interference}} \\
			
 
				+    & \text{x86}^{*} \ar[d]_-{\key{allocate\_register}} \\
			
 
				+    & \text{x86}^{*} \ar@/^/[r]^-{\key{patch\_instr.}} 
			
 
				     & \text{x86} 
			
 
				 }
			
 
				 \]