Jeremy Siek 6 лет назад
Родитель
Сommit
0a14e37bac
1 измененных файлов с 156 добавлено и 162 удалено
  1. 156 162
      book.tex

+ 156 - 162
book.tex

@@ -1326,156 +1326,181 @@ x86_0 &::= & (\key{program} \;\itm{info} \; ((\itm{label} \,\key{.}\, \Block)^{+
 \label{sec:plan-s0-x86}
 
 To compile one language to another it helps to focus on the
-differences between the two languages. It is these differences that
-the compiler will need to bridge. What are the differences between
-$R_1$ and x86 assembly? Here we list some of the most important ones.
+differences between the two languages because the compiler will need
+to bridge them. What are the differences between $R_1$ and x86
+assembly? Here we list some of the most important ones.
 
 \begin{enumerate}
 \item[(a)] x86 arithmetic instructions typically have two arguments
   and update the second argument in place. In contrast, $R_1$
   arithmetic operations take two arguments and produce a new value.
   An x86 instruction may have at most one memory-accessing argument.
-  Some instructions place custom restrictions on the kinds of
-  their arguments.
+  Furthermore, some instructions place special restrictions on their
+  arguments.
 
 \item[(b)] An argument to an $R_1$ operator can be any expression,
   whereas x86 instructions restrict their arguments to be \emph{simple
     expressions} like integers, registers, and memory locations.  (All
-  other kinds of expressions are called \emph{complex}.)
+  the other kinds are called \emph{complex expressions}.)
 
 \item[(c)] The order of execution in x86 is explicit in the syntax: a
-  sequence of instructions, whereas in $R_1$ it is a left-to-right
-  depth-first traversal of the abstract syntax tree.
+  sequence of instructions and jumps to labeled positions, whereas in
+  $R_1$ it is a left-to-right depth-first traversal of the abstract
+  syntax tree.
 
 \item[(d)] An $R_1$ program can have any number of variables whereas
-  x86 has only 16 registers.
+  x86 has 16 registers and the procedure calls stack.
 
 \item[(e)] Variables in $R_1$ can overshadow other variables with the
   same name. The registers and memory locations of x86 all have unique
-  names.
+  names or addresses.
 \end{enumerate}
 
 We ease the challenge of compiling from $R_1$ to x86 by breaking down
 the problem into several steps, dealing with the above differences one
-at a time. We begin by giving a sketch about how we might accomplish
-each of these steps, and give each step a name.  We shall then figure
-out an ordering of the steps. Finally, to implement the compiler, we
-shall write one function, typically recursive, per step. Each function
-is called a \emph{pass} of the compiler, because it traverses (passes
-over) the entire AST of the program.
+at a time.  Each of these steps is called a \emph{pass} of the
+compiler, because step traverses (passes over) the AST of the program.
+%
+We begin by giving a sketch about how we might implement each pass,
+and give them names.  We shall then figure out an ordering of the
+passes and the input/output language for each pass. The very first
+pass has $R_1$ as its input language and the last pass has x86 as its
+output language. In between we can choose whichever language is most
+convenient for expressing the output of each pass, whether that be
+$R_1$, x86, or new \emph{intermediate languages} of our own design.
+Finally, to implement the compiler, we shall write one function,
+typically a structural recursive function, per pass.
 
 \begin{description}
-\item[\key{select-instructions}] To handle the difference between
+\item[Pass \key{select-instructions}] To handle the difference between
   $R_1$ operations and x86 instructions we shall convert each $R_1$
   operation to a short sequence of instructions that accomplishes the
   same task.
 
-\item[\key{remove-complex-opera*}] To ensure that each argument of an
-  operation is a simple expression, we shall introduce temporary
-  variables to hold the results of complex subexpressions.
+\item[Pass \key{remove-complex-opera*}] To ensure that each
+  subexpression (i.e. operator and operand, and hence \key{opera*}) is
+  a simple expression, we shall introduce temporary variables to hold
+  the results of subexpressions.
   
-\item[\key{explicate-control}] To make the execution order of the
+\item[Pass \key{explicate-control}] To make the execution order of the
   program explicit, we shall convert from the abstract syntax tree
   representation into a graph representation in which each node
   contains a sequence of actions and the edges say where to go after
   the sequence is complete.
 
-\item[\key{assign-homes}] To handle the difference between the
+\item[Pass \key{assign-homes}] To handle the difference between the
   variables in $R_1$ versus the registers and stack location in x86,
   we shall come up with an assignment of each variable to its
-  ``home'', that is, to a register or stack location.
+  \emph{home}, that is, to a register or stack location.
 
-\item[\key{uniquify}] This pass deals with the shadowing of variables
+\item[Pass \key{uniquify}] This pass deals with the shadowing of variables
   by renaming every variable to a unique name, so that shadowing no
   longer occurs.
 
 \end{description}
 
+The next question is: in what order should we apply these passes? This
+question can be a challenging one to answer because it is difficult to
+know ahead of time which orders will be better (easier to implement,
+produce more efficient code, etc.) so often some trial-and-error is
+involved. Nevertheless, we can try to plan ahead and make educated
+choices regarding the orderings.
+
+Let us consider the ordering of \key{uniquify} and
+\key{remove-complex-opera*}. The assignment of subexpressions to
+temporary variables involving moving subexpressions, which might
+change the shadowing of variables an inadvertently change the program.
+But if we apply \key{uniquify} first, this will not be an issue. Of
+course, this means that in \key{remove-complex-opera*}, we need to
+ensure that the new temporary variables are unique.
+
+Next we shall consider the ordering of the \key{explicate-control}
+pass and \key{select-instructions}. It is clear that
+\key{explicate-control} must come first because the control-flow graph
+that it generates is needed when determing where to place the x86
+label and jump instructions.
+%
+Regarding the ordering of \key{explicate-control} with respect to
+\key{uniquify} and \key{remove-complex-opera*}, it perhaps does not
+matter very much, but it seems to work well to place
+\key{explicate-control} after these other two passes.
+
+The \key{assign-homes} pass should come after
+\key{remove-complex-opera*} and \key{explicate-control}.  The
+\key{remove-complex-opera*} pass generates temporary variables, which
+also need to be assigned homes, so \key{assign-homes} needs to come
+after. Regarding \key{explicate-control}, this pass deletes \emph{dead
+  code} (branches that will never be executed), which can remove
+variables. Thus it is beneficial to place \key{explicate-control}
+prior to \key{assign-homes} so that there are fewer variables that
+need to be assigned homes. This is important because the
+\key{assign-homes} pass has the highest time complexity.
+
+Last, we need to decide on the ordering of \key{select-instructions}
+and \key{assign-homes}.  These two issues are intertwined, creating a
+bit of a Gordian Knot. To do a good job of assigning homes, it is
+helpful to have already determined which instructions will be used,
+because x86 instructions have restrictions about which of their
+arguments can be registers versus stack locations. For example, one
+can give preferential treatment to variables that occur in
+register-argument positions. On the other hand, it may turn out to be
+impossible to make sure that all such variables are assigned to
+registers, and then one must redo the selection of instructions. Some
+compilers handle this problem by iteratively repeating these two
+passes until a good solution is found.  We shall suggest a simpler
+approach in which \key{select-instructions} come first, followed by
+the \key{assign-homes}, followed by a third pass, named
+\key{patch-instructions}, that uses a reserved register (\key{rax}) to
+patch-up any outstanding problems regarding instructions that involve
+too many memory accesses.
+
+Figure~\ref{fig:R1-passes} presents the ordering of the compiler
+passes in the form of a graph. Each pass is an edge and the
+input/output language of each pass is a node.
+
 UNDER CONSTRUCTION
 
-The main question then becomes: in what order do we tackle these
-differences? This can be a challenging question for a compiler writer
-to answer because some orderings may be much more difficult to
-implement than others. It is difficult to know ahead of time which
-orders will be better so often some trial-and-error is
-involved. However, we can try to plan ahead and choose the orderings
-based on this planning.
-
-% (e) uniquify
-% (b) rco
-% (c) explicate-control
-% (a) instr-sel.
-% (d) assign-homes (register allocation)
-
-% (e) -> (b)
-
-For example, to handle difference (b) (nested expressions), we shall
-introduce temporary variables to hold the intermediate results of each
-complex subexpression.  To deal with (e) (variable overshadowing) we
-shall renaming variables to make sure they have unique names.  The
-plan for (b) involves moving expressions, which could change the
-shadowing of variables.  However, if we deal with (e) first, then
-shadowing will not be an issue. Of course, this means that during (b),
-when we insert temporary variables, we need to make sure that they are
-unique.
-
-% (c) -> (a)
-To handle difference (c) (order of execution), we shall transform the
-program into a control flow graph: each vertex is a basic block,
-within which the order of execution is sequential. At the end of each
-block there is a jump to one or two other blocks, which form the edges
-of the graph. We need to handle this difference prior to (a)
-(operations vs. instructions) because it will determine where we need
-to generate x86 labels and jump instructions.
-% (e),(b) -> (c)
-With respect to (e) and (b), it perhaps does not matter very much
-whether (c) comes before or after them. We find it convenient to place
-(c) after (e) and (b).
-
-% (b) -> (d), (c) -> (d)
-To deal with difference (d) we replace variables with registers and
-stack locations. Thus, it makes sense to deal with (b) before (d) so
-that (d) can replace both the original variables and the temporary
-variables generated in dealing with (b). Also, it's good to handle (c)
-before (d) because while analyzing the control flow, we sometimes
-notice that some code and the variables it uses are unnecessary, so we
-can remove them which speeds up (d).
-
-% (a) -> (d)
-Last but not least, we need to decide on the ordering of (a)
-(selecting instructions) and (d) (mapping variables to stack locations
-and registers). These two issues are intertwined, creating a bit of a
-Gordian Knot. To handle difference (d), we need to map some variables
-to registers (there are only 16 registers) and the remaining variables
-to locations on the stack (which is unbounded). But recall that x86
-instructions have restrictions about which of their arguments can be
-registers versus memory accesses (stack locations).  So to make good
-decisions regarding this mapping, it is helpful to know which
-instructions use which variables. On the other hand, 
-
-
-
-We cut this knot by doing an optimistic selection of instructions in
-the \key{select-instructions} pass, followed by the \key{assign-homes}
-pass to map variables to registers or stack locations, and conclude by
-finalizing the instruction selection in the \key{patch-instructions}
-pass.
 
+\begin{figure}[tbp]
+\begin{tikzpicture}[baseline=(current  bounding  box.center)]
+\node (R1) at (0,2)  {\large $R_1$};
+\node (R1-2) at (3,2)  {\large $R_1$};
+\node (R1-3) at (6,2)  {\large $R_1$};
+\node (C0-1) at (6,0)  {\large $C_0$};
+\node (C0-2) at (3,0)  {\large $C_0$};
 
+\node (x86-2) at (3,-2)  {\large $\text{x86}^{*}_0$};
+\node (x86-3) at (6,-2)  {\large $\text{x86}^{*}_0$};
+\node (x86-4) at (9,-2) {\large $\text{x86}_0$};
+\node (x86-5) at (12,-2) {\large $\text{x86}^{\dagger}_0$};
 
-[ordering of reg. alloc versus instr. sel? -jeremy]
-\[
-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
-\foreach \i/\p in {1/1,2/2,3/3,4/4,5/5,6/6}
-{
-  \node (\i) at (\p*1.5,0) {$\bullet$};
-}
-\foreach \x/\y/\lbl in {1/2/e,2/3/b,3/4/c,4/5/a,5/6/d}
-{
-  \path[->,bend left=15] (\x) edge [above] node {\small\lbl} (\y);
-}
+\path[->,bend left=15] (R1) edge [above] node {\ttfamily\footnotesize uniquify} (R1-2);
+\path[->,bend left=15] (R1-2) edge [above] node {\ttfamily\footnotesize remove-complex.} (R1-3);
+\path[->,bend left=15] (R1-3) edge [right] node {\ttfamily\footnotesize explicate-control} (C0-1);
+\path[->,bend right=15] (C0-1) edge [above] node {\ttfamily\footnotesize uncover-locals} (C0-2);
+\path[->,bend right=15] (C0-2) edge [left] node {\ttfamily\footnotesize select-instr.} (x86-2);
+\path[->,bend left=15] (x86-2) edge [above] node {\ttfamily\footnotesize assign-homes} (x86-3);
+\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch-instr.} (x86-4);
+\path[->,bend left=15] (x86-4) edge [above] node {\ttfamily\footnotesize print-x86} (x86-5);
 \end{tikzpicture}
-\]
+
+\caption{Overview of the passes for compiling $R_1$. }
+\label{fig:R1-passes}
+\end{figure}
+
+
+%% \[
+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
+%% \foreach \i/\p in {1/1,2/2,3/3,4/4,5/5,6/6}
+%% {
+%%   \node (\i) at (\p*1.5,0) {$\bullet$};
+%% }
+%% \foreach \x/\y/\lbl in {1/2/e,2/3/b,3/4/c,4/5/a,5/6/d}
+%% {
+%%   \path[->,bend left=15] (\x) edge [above] node {\small\lbl} (\y);
+%% }
+%% \end{tikzpicture}
+%% \]
 We further simplify the translation from $R_1$ to x86 by identifying
 an intermediate language named $C_0$, roughly half-way between $R_1$
 and x86, to provide a rest stop along the way. We name the language
@@ -1484,22 +1509,18 @@ language~\citep{Kernighan:1988nx}. The differences (e) and (a),
 regarding variables and nested expressions, will be handled by two
 steps, \key{uniquify} and \key{flatten}, which bring us to
 $C_0$.
-\[
-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
-\foreach \i/\p in {R_1/1,R_1/2,C_0/3}
-{
-  \node (\p) at (\p*3,0) {\large $\i$};
-}
-\foreach \x/\y/\lbl in {1/2/uniquify,2/3/flatten}
-{
- \path[->,bend left=15] (\x) edge [above] node {\ttfamily\footnotesize \lbl} (\y);
-}
-\end{tikzpicture}
-\]
-Each of these steps in the compiler is implemented by a function,
-typically a structurally recursive function that translates an input
-AST into an output AST. We refer to such a function as a \emph{pass}
-because it makes a pass over, i.e. it traverses, the entire AST.
+%% \[
+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
+%% \foreach \i/\p in {R_1/1,R_1/2,C_0/3}
+%% {
+%%   \node (\p) at (\p*3,0) {\large $\i$};
+%% }
+%% \foreach \x/\y/\lbl in {1/2/uniquify,2/3/flatten}
+%% {
+%%  \path[->,bend left=15] (\x) edge [above] node {\ttfamily\footnotesize \lbl} (\y);
+%% }
+%% \end{tikzpicture}
+%% \]
 
 The syntax for $C_0$ is defined in Figure~\ref{fig:c0-syntax}.  The
 $C_0$ language supports the same operators as $R_1$ but the arguments
@@ -1551,18 +1572,18 @@ this knot by doing an optimistic selection of instructions in the
 pass to map variables to registers or stack locations, and conclude by
 finalizing the instruction selection in the \key{patch-instructions}
 pass.
-\[
-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
-\node (1) at (0,0)  {\large $C_0$};
-\node (2) at (3,0)  {\large $\text{x86}^{*}_0$};
-\node (3) at (6,0)  {\large $\text{x86}^{*}_0$};
-\node (4) at (9,0) {\large $\text{x86}_0$};
-
-\path[->,bend left=15] (1) edge [above] node {\ttfamily\footnotesize select-instr.} (2);
-\path[->,bend left=15] (2) edge [above] node {\ttfamily\footnotesize assign-homes} (3);
-\path[->,bend left=15] (3) edge [above] node {\ttfamily\footnotesize patch-instr.} (4);
-\end{tikzpicture}
-\]
+%% \[
+%% \begin{tikzpicture}[baseline=(current  bounding  box.center)]
+%% \node (1) at (0,0)  {\large $C_0$};
+%% \node (2) at (3,0)  {\large $\text{x86}^{*}_0$};
+%% \node (3) at (6,0)  {\large $\text{x86}^{*}_0$};
+%% \node (4) at (9,0) {\large $\text{x86}_0$};
+
+%% \path[->,bend left=15] (1) edge [above] node {\ttfamily\footnotesize select-instr.} (2);
+%% \path[->,bend left=15] (2) edge [above] node {\ttfamily\footnotesize assign-homes} (3);
+%% \path[->,bend left=15] (3) edge [above] node {\ttfamily\footnotesize patch-instr.} (4);
+%% \end{tikzpicture}
+%% \]
 
 The \key{select-instructions} pass is optimistic in the sense that it
 treats variables as if they were all mapped to registers. The
@@ -1589,6 +1610,10 @@ instructions that use the \key{rax} register. Once we have implemented
 a good register allocator (Chapter~\ref{ch:register-allocation}), the
 need to patch instructions will be relatively rare.
 
+The x86$^{*}$ language extends x86
+with variables and looser rules regarding instruction arguments. The
+x86$^{\dagger}$ language is the concrete syntax (string) for x86.
+
 
 \section{Uniquify Variables}
 \label{sec:uniquify-s0}
@@ -2102,37 +2127,6 @@ programs.
 %valid code for Unix machines.
 \end{exercise}
 
-\begin{figure}[p]
-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
-\node (R1) at (0,2)  {\large $R_1$};
-\node (R1-2) at (3,2)  {\large $R_1$};
-\node (R1-3) at (6,2)  {\large $R_1$};
-\node (C0-1) at (6,0)  {\large $C_0$};
-\node (C0-2) at (3,0)  {\large $C_0$};
-
-\node (x86-2) at (3,-2)  {\large $\text{x86}^{*}_0$};
-\node (x86-3) at (6,-2)  {\large $\text{x86}^{*}_0$};
-\node (x86-4) at (9,-2) {\large $\text{x86}_0$};
-\node (x86-5) at (12,-2) {\large $\text{x86}^{\dagger}_0$};
-
-\path[->,bend left=15] (R1) edge [above] node {\ttfamily\footnotesize uniquify} (R1-2);
-\path[->,bend left=15] (R1-2) edge [above] node {\ttfamily\footnotesize remove-complex.} (R1-3);
-\path[->,bend left=15] (R1-3) edge [right] node {\ttfamily\footnotesize explicate-control} (C0-1);
-\path[->,bend right=15] (C0-1) edge [above] node {\ttfamily\footnotesize uncover-locals} (C0-2);
-\path[->,bend right=15] (C0-2) edge [left] node {\ttfamily\footnotesize select-instr.} (x86-2);
-\path[->,bend left=15] (x86-2) edge [above] node {\ttfamily\footnotesize assign-homes} (x86-3);
-\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch-instr.} (x86-4);
-\path[->,bend left=15] (x86-4) edge [above] node {\ttfamily\footnotesize print-x86} (x86-5);
-\end{tikzpicture}
-
-\caption{Overview of the passes for compiling $R_1$. }
-\label{fig:R1-passes}
-\end{figure}
-
-Figure~\ref{fig:R1-passes} provides an overview of all the compiler
-passes described in this Chapter.  The x86$^{*}$ language extends x86
-with variables and looser rules regarding instruction arguments. The
-x86$^{\dagger}$ language is the concrete syntax (string) for x86.
 
 \margincomment{\footnotesize To do: add a challenge section. Perhaps
   extending the partial evaluation to $R_0$? \\ --Jeremy}