6 лет назад · 0a14e37bac
--- a/book.tex
+++ b/book.tex
@@ -1326,156 +1326,181 @@ x86_0 &::= & (\key{program} \;\itm{info} \; ((\itm{label} \,\key{.}\, \Block)^{+
 
				 \label{sec:plan-s0-x86}
			
 
				 
			
 
				 To compile one language to another it helps to focus on the
			
 
				-differences between the two languages. It is these differences that
			
 
				-the compiler will need to bridge. What are the differences between
			
 
				-$R_1$ and x86 assembly? Here we list some of the most important ones.
			
 
				+differences between the two languages because the compiler will need
			
 
				+to bridge them. What are the differences between $R_1$ and x86
			
 
				+assembly? Here we list some of the most important ones.
			
 
				 
			
 
				 \begin{enumerate}
			
 
				 \item[(a)] x86 arithmetic instructions typically have two arguments
			
 
				   and update the second argument in place. In contrast, $R_1$
			
 
				   arithmetic operations take two arguments and produce a new value.
			
 
				   An x86 instruction may have at most one memory-accessing argument.
			
 
				-  Some instructions place custom restrictions on the kinds of
			
 
				-  their arguments.
			
 
				+  Furthermore, some instructions place special restrictions on their
			
 
				+  arguments.
			
 
				 
			
 
				 \item[(b)] An argument to an $R_1$ operator can be any expression,
			
 
				   whereas x86 instructions restrict their arguments to be \emph{simple
			
 
				     expressions} like integers, registers, and memory locations.  (All
			
 
				-  other kinds of expressions are called \emph{complex}.)
			
 
				+  the other kinds are called \emph{complex expressions}.)
			
 
				 
			
 
				 \item[(c)] The order of execution in x86 is explicit in the syntax: a
			
 
				-  sequence of instructions, whereas in $R_1$ it is a left-to-right
			
 
				-  depth-first traversal of the abstract syntax tree.
			
 
				+  sequence of instructions and jumps to labeled positions, whereas in
			
 
				+  $R_1$ it is a left-to-right depth-first traversal of the abstract
			
 
				+  syntax tree.
			
 
				 
			
 
				 \item[(d)] An $R_1$ program can have any number of variables whereas
			
 
				-  x86 has only 16 registers.
			
 
				+  x86 has 16 registers and the procedure calls stack.
			
 
				 
			
 
				 \item[(e)] Variables in $R_1$ can overshadow other variables with the
			
 
				   same name. The registers and memory locations of x86 all have unique
			
 
				-  names.
			
 
				+  names or addresses.
			
 
				 \end{enumerate}
			
 
				 
			
 
				 We ease the challenge of compiling from $R_1$ to x86 by breaking down
			
 
				 the problem into several steps, dealing with the above differences one
			
 
				-at a time. We begin by giving a sketch about how we might accomplish
			
 
				-each of these steps, and give each step a name.  We shall then figure
			
 
				-out an ordering of the steps. Finally, to implement the compiler, we
			
 
				-shall write one function, typically recursive, per step. Each function
			
 
				-is called a \emph{pass} of the compiler, because it traverses (passes
			
 
				-over) the entire AST of the program.
			
 
				+at a time.  Each of these steps is called a \emph{pass} of the
			
 
				+compiler, because step traverses (passes over) the AST of the program.
			
 
				+%
			
 
				+We begin by giving a sketch about how we might implement each pass,
			
 
				+and give them names.  We shall then figure out an ordering of the
			
 
				+passes and the input/output language for each pass. The very first
			
 
				+pass has $R_1$ as its input language and the last pass has x86 as its
			
 
				+output language. In between we can choose whichever language is most
			
 
				+convenient for expressing the output of each pass, whether that be
			
 
				+$R_1$, x86, or new \emph{intermediate languages} of our own design.
			
 
				+Finally, to implement the compiler, we shall write one function,
			
 
				+typically a structural recursive function, per pass.
			
 
				 
			
 
				 \begin{description}
			
 
				-\item[\key{select-instructions}] To handle the difference between
			
 
				+\item[Pass \key{select-instructions}] To handle the difference between
			
 
				   $R_1$ operations and x86 instructions we shall convert each $R_1$
			
 
				   operation to a short sequence of instructions that accomplishes the
			
 
				   same task.
			
 
				 
			
 
				-\item[\key{remove-complex-opera*}] To ensure that each argument of an
			
 
				-  operation is a simple expression, we shall introduce temporary
			
 
				-  variables to hold the results of complex subexpressions.
			
 
				+\item[Pass \key{remove-complex-opera*}] To ensure that each
			
 
				+  subexpression (i.e. operator and operand, and hence \key{opera*}) is
			
 
				+  a simple expression, we shall introduce temporary variables to hold
			
 
				+  the results of subexpressions.
			
 
				   
			
 
				-\item[\key{explicate-control}] To make the execution order of the
			
 
				+\item[Pass \key{explicate-control}] To make the execution order of the
			
 
				   program explicit, we shall convert from the abstract syntax tree
			
 
				   representation into a graph representation in which each node
			
 
				   contains a sequence of actions and the edges say where to go after
			
 
				   the sequence is complete.
			
 
				 
			
 
				-\item[\key{assign-homes}] To handle the difference between the
			
 
				+\item[Pass \key{assign-homes}] To handle the difference between the
			
 
				   variables in $R_1$ versus the registers and stack location in x86,
			
 
				   we shall come up with an assignment of each variable to its
			
 
				-  ``home'', that is, to a register or stack location.
			
 
				+  \emph{home}, that is, to a register or stack location.
			
 
				 
			
 
				-\item[\key{uniquify}] This pass deals with the shadowing of variables
			
 
				+\item[Pass \key{uniquify}] This pass deals with the shadowing of variables
			
 
				   by renaming every variable to a unique name, so that shadowing no
			
 
				   longer occurs.
			
 
				 
			
 
				 \end{description}
			
 
				 
			
 
				+The next question is: in what order should we apply these passes? This
			
 
				+question can be a challenging one to answer because it is difficult to
			
 
				+know ahead of time which orders will be better (easier to implement,
			
 
				+produce more efficient code, etc.) so often some trial-and-error is
			
 
				+involved. Nevertheless, we can try to plan ahead and make educated
			
 
				+choices regarding the orderings.
			
 
				+
			
 
				+Let us consider the ordering of \key{uniquify} and
			
 
				+\key{remove-complex-opera*}. The assignment of subexpressions to
			
 
				+temporary variables involving moving subexpressions, which might
			
 
				+change the shadowing of variables an inadvertently change the program.
			
 
				+But if we apply \key{uniquify} first, this will not be an issue. Of
			
 
				+course, this means that in \key{remove-complex-opera*}, we need to
			
 
				+ensure that the new temporary variables are unique.
			
 
				+
			
 
				+Next we shall consider the ordering of the \key{explicate-control}
			
 
				+pass and \key{select-instructions}. It is clear that
			
 
				+\key{explicate-control} must come first because the control-flow graph
			
 
				+that it generates is needed when determing where to place the x86
			
 
				+label and jump instructions.
			
 
				+%
			
 
				+Regarding the ordering of \key{explicate-control} with respect to
			
 
				+\key{uniquify} and \key{remove-complex-opera*}, it perhaps does not
			
 
				+matter very much, but it seems to work well to place
			
 
				+\key{explicate-control} after these other two passes.
			
 
				+
			
 
				+The \key{assign-homes} pass should come after
			
 
				+\key{remove-complex-opera*} and \key{explicate-control}.  The
			
 
				+\key{remove-complex-opera*} pass generates temporary variables, which
			
 
				+also need to be assigned homes, so \key{assign-homes} needs to come
			
 
				+after. Regarding \key{explicate-control}, this pass deletes \emph{dead
			
 
				+  code} (branches that will never be executed), which can remove
			
 
				+variables. Thus it is beneficial to place \key{explicate-control}
			
 
				+prior to \key{assign-homes} so that there are fewer variables that
			
 
				+need to be assigned homes. This is important because the
			
 
				+\key{assign-homes} pass has the highest time complexity.
			
 
				+
			
 
				+Last, we need to decide on the ordering of \key{select-instructions}
			
 
				+and \key{assign-homes}.  These two issues are intertwined, creating a
			
 
				+bit of a Gordian Knot. To do a good job of assigning homes, it is
			
 
				+helpful to have already determined which instructions will be used,
			
 
				+because x86 instructions have restrictions about which of their
			
 
				+arguments can be registers versus stack locations. For example, one
			
 
				+can give preferential treatment to variables that occur in
			
 
				+register-argument positions. On the other hand, it may turn out to be
			
 
				+impossible to make sure that all such variables are assigned to
			
 
				+registers, and then one must redo the selection of instructions. Some
			
 
				+compilers handle this problem by iteratively repeating these two
			
 
				+passes until a good solution is found.  We shall suggest a simpler
			
 
				+approach in which \key{select-instructions} come first, followed by
			
 
				+the \key{assign-homes}, followed by a third pass, named
			
 
				+\key{patch-instructions}, that uses a reserved register (\key{rax}) to
			
 
				+patch-up any outstanding problems regarding instructions that involve
			
 
				+too many memory accesses.
			
 
				+
			
 
				+Figure~\ref{fig:R1-passes} presents the ordering of the compiler
			
 
				+passes in the form of a graph. Each pass is an edge and the
			
 
				+input/output language of each pass is a node.
			
 
				+
			
 
				 UNDER CONSTRUCTION
			
 
				 
			
 
				-The main question then becomes: in what order do we tackle these
			
 
				-differences? This can be a challenging question for a compiler writer
			
 
				-to answer because some orderings may be much more difficult to
			
 
				-implement than others. It is difficult to know ahead of time which
			
 
				-orders will be better so often some trial-and-error is
			
 
				-involved. However, we can try to plan ahead and choose the orderings
			
 
				-based on this planning.
			
 
				-
			
 
				-% (e) uniquify
			
 
				-% (b) rco
			
 
				-% (c) explicate-control
			
 
				-% (a) instr-sel.
			
 
				-% (d) assign-homes (register allocation)
			
 
				-
			
 
				-% (e) -> (b)
			
 
				-
			
 
				-For example, to handle difference (b) (nested expressions), we shall
			
 
				-introduce temporary variables to hold the intermediate results of each
			
 
				-complex subexpression.  To deal with (e) (variable overshadowing) we
			
 
				-shall renaming variables to make sure they have unique names.  The
			
 
				-plan for (b) involves moving expressions, which could change the
			
 
				-shadowing of variables.  However, if we deal with (e) first, then
			
 
				-shadowing will not be an issue. Of course, this means that during (b),
			
 
				-when we insert temporary variables, we need to make sure that they are
			
 
				-unique.
			
 
				-
			
 
				-% (c) -> (a)
			
 
				-To handle difference (c) (order of execution), we shall transform the
			
 
				-program into a control flow graph: each vertex is a basic block,
			
 
				-within which the order of execution is sequential. At the end of each
			
 
				-block there is a jump to one or two other blocks, which form the edges
			
 
				-of the graph. We need to handle this difference prior to (a)
			
 
				-(operations vs. instructions) because it will determine where we need
			
 
				-to generate x86 labels and jump instructions.
			
 
				-% (e),(b) -> (c)
			
 
				-With respect to (e) and (b), it perhaps does not matter very much
			
 
				-whether (c) comes before or after them. We find it convenient to place
			
 
				-(c) after (e) and (b).
			
 
				-
			
 
				-% (b) -> (d), (c) -> (d)
			
 
				-To deal with difference (d) we replace variables with registers and
			
 
				-stack locations. Thus, it makes sense to deal with (b) before (d) so
			
 
				-that (d) can replace both the original variables and the temporary
			
 
				-variables generated in dealing with (b). Also, it's good to handle (c)
			
 
				-before (d) because while analyzing the control flow, we sometimes
			
 
				-notice that some code and the variables it uses are unnecessary, so we
			
 
				-can remove them which speeds up (d).
			
 
				-
			
 
				-% (a) -> (d)
			
 
				-Last but not least, we need to decide on the ordering of (a)
			
 
				-(selecting instructions) and (d) (mapping variables to stack locations
			
 
				-and registers). These two issues are intertwined, creating a bit of a
			
 
				-Gordian Knot. To handle difference (d), we need to map some variables
			
 
				-to registers (there are only 16 registers) and the remaining variables
			
 
				-to locations on the stack (which is unbounded). But recall that x86
			
 
				-instructions have restrictions about which of their arguments can be
			
 
				-registers versus memory accesses (stack locations).  So to make good
			
 
				-decisions regarding this mapping, it is helpful to know which
			
 
				-instructions use which variables. On the other hand, 
			
 
				-
			
 
				-
			
 
				-
			
 
				-We cut this knot by doing an optimistic selection of instructions in
			
 
				-the \key{select-instructions} pass, followed by the \key{assign-homes}
			
 
				-pass to map variables to registers or stack locations, and conclude by
			
 
				-finalizing the instruction selection in the \key{patch-instructions}
			
 
				-pass.
			
 
				 
			
 
				+\begin{figure}[tbp]
			
 
				+\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				+\node (R1) at (0,2)  {\large $R_1$};
			
 
				+\node (R1-2) at (3,2)  {\large $R_1$};
			
 
				+\node (R1-3) at (6,2)  {\large $R_1$};
			
 
				+\node (C0-1) at (6,0)  {\large $C_0$};
			
 
				+\node (C0-2) at (3,0)  {\large $C_0$};
			
 
				 
			
 
				+\node (x86-2) at (3,-2)  {\large $\text{x86}^{*}_0$};
			
 
				+\node (x86-3) at (6,-2)  {\large $\text{x86}^{*}_0$};
			
 
				+\node (x86-4) at (9,-2) {\large $\text{x86}_0$};
			
 
				+\node (x86-5) at (12,-2) {\large $\text{x86}^{\dagger}_0$};
			
 
				 
			
 
				-[ordering of reg. alloc versus instr. sel? -jeremy]
			
 
				-\[
			
 
				-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\foreach \i/\p in {1/1,2/2,3/3,4/4,5/5,6/6}
			
 
				-{
			
 
				-  \node (\i) at (\p*1.5,0) {$\bullet$};
			
 
				-}
			
 
				-\foreach \x/\y/\lbl in {1/2/e,2/3/b,3/4/c,4/5/a,5/6/d}
			
 
				-{
			
 
				-  \path[->,bend left=15] (\x) edge [above] node {\small\lbl} (\y);
			
 
				-}
			
 
				+\path[->,bend left=15] (R1) edge [above] node {\ttfamily\footnotesize uniquify} (R1-2);
			
 
				+\path[->,bend left=15] (R1-2) edge [above] node {\ttfamily\footnotesize remove-complex.} (R1-3);
			
 
				+\path[->,bend left=15] (R1-3) edge [right] node {\ttfamily\footnotesize explicate-control} (C0-1);
			
 
				+\path[->,bend right=15] (C0-1) edge [above] node {\ttfamily\footnotesize uncover-locals} (C0-2);
			
 
				+\path[->,bend right=15] (C0-2) edge [left] node {\ttfamily\footnotesize select-instr.} (x86-2);
			
 
				+\path[->,bend left=15] (x86-2) edge [above] node {\ttfamily\footnotesize assign-homes} (x86-3);
			
 
				+\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch-instr.} (x86-4);
			
 
				+\path[->,bend left=15] (x86-4) edge [above] node {\ttfamily\footnotesize print-x86} (x86-5);
			
 
				 \end{tikzpicture}
			
 
				-\]
			
 
				+
			
 
				+\caption{Overview of the passes for compiling $R_1$. }
			
 
				+\label{fig:R1-passes}
			
 
				+\end{figure}
			
 
				+
			
 
				+
			
 
				+%% \[
			
 
				+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				+%% \foreach \i/\p in {1/1,2/2,3/3,4/4,5/5,6/6}
			
 
				+%% {
			
 
				+%%   \node (\i) at (\p*1.5,0) {$\bullet$};
			
 
				+%% }
			
 
				+%% \foreach \x/\y/\lbl in {1/2/e,2/3/b,3/4/c,4/5/a,5/6/d}
			
 
				+%% {
			
 
				+%%   \path[->,bend left=15] (\x) edge [above] node {\small\lbl} (\y);
			
 
				+%% }
			
 
				+%% \end{tikzpicture}
			
 
				+%% \]
			
 
				 We further simplify the translation from $R_1$ to x86 by identifying
			
 
				 an intermediate language named $C_0$, roughly half-way between $R_1$
			
 
				 and x86, to provide a rest stop along the way. We name the language
			
@@ -1484,22 +1509,18 @@ language~\citep{Kernighan:1988nx}. The differences (e) and (a),
 
				 regarding variables and nested expressions, will be handled by two
			
 
				 steps, \key{uniquify} and \key{flatten}, which bring us to
			
 
				 $C_0$.
			
 
				-\[
			
 
				-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\foreach \i/\p in {R_1/1,R_1/2,C_0/3}
			
 
				-{
			
 
				-  \node (\p) at (\p*3,0) {\large $\i$};
			
 
				-}
			
 
				-\foreach \x/\y/\lbl in {1/2/uniquify,2/3/flatten}
			
 
				-{
			
 
				- \path[->,bend left=15] (\x) edge [above] node {\ttfamily\footnotesize \lbl} (\y);
			
 
				-}
			
 
				-\end{tikzpicture}
			
 
				-\]
			
 
				-Each of these steps in the compiler is implemented by a function,
			
 
				-typically a structurally recursive function that translates an input
			
 
				-AST into an output AST. We refer to such a function as a \emph{pass}
			
 
				-because it makes a pass over, i.e. it traverses, the entire AST.
			
 
				+%% \[
			
 
				+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				+%% \foreach \i/\p in {R_1/1,R_1/2,C_0/3}
			
 
				+%% {
			
 
				+%%   \node (\p) at (\p*3,0) {\large $\i$};
			
 
				+%% }
			
 
				+%% \foreach \x/\y/\lbl in {1/2/uniquify,2/3/flatten}
			
 
				+%% {
			
 
				+%%  \path[->,bend left=15] (\x) edge [above] node {\ttfamily\footnotesize \lbl} (\y);
			
 
				+%% }
			
 
				+%% \end{tikzpicture}
			
 
				+%% \]
			
 
				 
			
 
				 The syntax for $C_0$ is defined in Figure~\ref{fig:c0-syntax}.  The
			
 
				 $C_0$ language supports the same operators as $R_1$ but the arguments
			
@@ -1551,18 +1572,18 @@ this knot by doing an optimistic selection of instructions in the
 
				 pass to map variables to registers or stack locations, and conclude by
			
 
				 finalizing the instruction selection in the \key{patch-instructions}
			
 
				 pass.
			
 
				-\[
			
 
				-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\node (1) at (0,0)  {\large $C_0$};
			
 
				-\node (2) at (3,0)  {\large $\text{x86}^{*}_0$};
			
 
				-\node (3) at (6,0)  {\large $\text{x86}^{*}_0$};
			
 
				-\node (4) at (9,0) {\large $\text{x86}_0$};
			
 
				-
			
 
				-\path[->,bend left=15] (1) edge [above] node {\ttfamily\footnotesize select-instr.} (2);
			
 
				-\path[->,bend left=15] (2) edge [above] node {\ttfamily\footnotesize assign-homes} (3);
			
 
				-\path[->,bend left=15] (3) edge [above] node {\ttfamily\footnotesize patch-instr.} (4);
			
 
				-\end{tikzpicture}
			
 
				-\]
			
 
				+%% \[
			
 
				+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				+%% \node (1) at (0,0)  {\large $C_0$};
			
 
				+%% \node (2) at (3,0)  {\large $\text{x86}^{*}_0$};
			
 
				+%% \node (3) at (6,0)  {\large $\text{x86}^{*}_0$};
			
 
				+%% \node (4) at (9,0) {\large $\text{x86}_0$};
			
 
				+
			
 
				+%% \path[->,bend left=15] (1) edge [above] node {\ttfamily\footnotesize select-instr.} (2);
			
 
				+%% \path[->,bend left=15] (2) edge [above] node {\ttfamily\footnotesize assign-homes} (3);
			
 
				+%% \path[->,bend left=15] (3) edge [above] node {\ttfamily\footnotesize patch-instr.} (4);
			
 
				+%% \end{tikzpicture}
			
 
				+%% \]
			
 
				 
			
 
				 The \key{select-instructions} pass is optimistic in the sense that it
			
 
				 treats variables as if they were all mapped to registers. The
			
@@ -1589,6 +1610,10 @@ instructions that use the \key{rax} register. Once we have implemented
 
				 a good register allocator (Chapter~\ref{ch:register-allocation}), the
			
 
				 need to patch instructions will be relatively rare.
			
 
				 
			
 
				+The x86$^{*}$ language extends x86
			
 
				+with variables and looser rules regarding instruction arguments. The
			
 
				+x86$^{\dagger}$ language is the concrete syntax (string) for x86.
			
 
				+
			
 
				 
			
 
				 \section{Uniquify Variables}
			
 
				 \label{sec:uniquify-s0}
			
@@ -2102,37 +2127,6 @@ programs.
 
				 %valid code for Unix machines.
			
 
				 \end{exercise}
			
 
				 
			
 
				-\begin{figure}[p]
			
 
				-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\node (R1) at (0,2)  {\large $R_1$};
			
 
				-\node (R1-2) at (3,2)  {\large $R_1$};
			
 
				-\node (R1-3) at (6,2)  {\large $R_1$};
			
 
				-\node (C0-1) at (6,0)  {\large $C_0$};
			
 
				-\node (C0-2) at (3,0)  {\large $C_0$};
			
 
				-
			
 
				-\node (x86-2) at (3,-2)  {\large $\text{x86}^{*}_0$};
			
 
				-\node (x86-3) at (6,-2)  {\large $\text{x86}^{*}_0$};
			
 
				-\node (x86-4) at (9,-2) {\large $\text{x86}_0$};
			
 
				-\node (x86-5) at (12,-2) {\large $\text{x86}^{\dagger}_0$};
			
 
				-
			
 
				-\path[->,bend left=15] (R1) edge [above] node {\ttfamily\footnotesize uniquify} (R1-2);
			
 
				-\path[->,bend left=15] (R1-2) edge [above] node {\ttfamily\footnotesize remove-complex.} (R1-3);
			
 
				-\path[->,bend left=15] (R1-3) edge [right] node {\ttfamily\footnotesize explicate-control} (C0-1);
			
 
				-\path[->,bend right=15] (C0-1) edge [above] node {\ttfamily\footnotesize uncover-locals} (C0-2);
			
 
				-\path[->,bend right=15] (C0-2) edge [left] node {\ttfamily\footnotesize select-instr.} (x86-2);
			
 
				-\path[->,bend left=15] (x86-2) edge [above] node {\ttfamily\footnotesize assign-homes} (x86-3);
			
 
				-\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch-instr.} (x86-4);
			
 
				-\path[->,bend left=15] (x86-4) edge [above] node {\ttfamily\footnotesize print-x86} (x86-5);
			
 
				-\end{tikzpicture}
			
 
				-
			
 
				-\caption{Overview of the passes for compiling $R_1$. }
			
 
				-\label{fig:R1-passes}
			
 
				-\end{figure}
			
 
				-
			
 
				-Figure~\ref{fig:R1-passes} provides an overview of all the compiler
			
 
				-passes described in this Chapter.  The x86$^{*}$ language extends x86
			
 
				-with variables and looser rules regarding instruction arguments. The
			
 
				-x86$^{\dagger}$ language is the concrete syntax (string) for x86.
			
 
				 
			
 
				 \margincomment{\footnotesize To do: add a challenge section. Perhaps
			
 
				   extending the partial evaluation to $R_0$? \\ --Jeremy}