Jeremy Siek 4 years ago
parent
commit
9d684238ca
1 changed file with 116 additions and 86 deletions
+ 116 - 86
book.tex

@@ -2127,18 +2127,18 @@ followed by a move from \code{rax} to the left-hand side.  The move
 from \code{rax} is needed because the return value from
 \code{read\_int} goes into \code{rax}, as is the case in general.  \\
 \begin{tabular}{lll}
-\begin{minipage}{0.4\textwidth}
+\begin{minipage}{0.3\textwidth}
 \begin{lstlisting}
-|$\itm{lhs}$| = (read);
+|$\itm{var}$| = (read);
 \end{lstlisting}
 \end{minipage}
 &
 $\Rightarrow$
 &
-\begin{minipage}{0.4\textwidth}
+\begin{minipage}{0.3\textwidth}
 \begin{lstlisting}
 callq read_int
-movq %rax, |$\itm{lhs}$|
+movq %rax, |$\itm{var}$|
 \end{lstlisting}
 \end{minipage}
 \end{tabular} \\
@@ -3800,9 +3800,9 @@ One small challenge is that x86 does not provide an instruction that
 directly implements logical negation (\code{not} in $R_2$ and $C_1$).
 However, the \code{xorq} instruction can be used to encode \code{not}.
 The \key{xorq} instruction takes two arguments, performs a pairwise
-exclusive-or operation on each bit of its arguments, and writes the
-results into its second argument.  Recall the truth table for
-exclusive-or:
+exclusive-or ($\mathrm{XOR}$) operation on each bit of its arguments,
+and writes the results into its second argument.  Recall the truth
+table for exclusive-or:
 \begin{center}
 \begin{tabular}{l|cc}
    & 0 & 1 \\ \hline
@@ -3810,14 +3810,19 @@ exclusive-or:
 1  & 1 & 0
 \end{tabular}
 \end{center}
-For example, $0011 \mathrel{\mathrm{XOR}} 0101 = 0110$.  Notice that
-in the row of the table for the bit $1$, the result is the opposite of the
-second bit.  Thus, the \code{not} operation can be implemented by
-\code{xorq} with $1$ as the first argument:
-\begin{align*}
-  0001 \mathrel{\mathrm{XOR}} 0000 &= 0001\\
-  0001 \mathrel{\mathrm{XOR}} 0001 &= 0000
-\end{align*}
+For example, applying $\mathrm{XOR}$ to each bit of the binary numbers
+$0011$ and $0101$ yields $0110$. Notice that in the row of the table
+for the bit $1$, the result is the opposite of the second bit.  Thus,
+the \code{not} operation can be implemented by \code{xorq} with $1$ as
+the first argument:
+\[
+\Var~ \key{=}~ \LP\key{not}~\Arg\RP\key{;}
+\qquad\Rightarrow\qquad
+\begin{array}{l}
+\key{movq}~ \Arg\key{,} \Var\\
+\key{xorq}~ \key{\$1,} \Var
+\end{array}
+\]
 
 \begin{figure}[tp]
 \fbox{
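Note on the hunk above: the \code{xorq} encoding of \code{not} can be checked with a small Python sketch. It is not part of book.tex. It assumes that \code{\#f} and \code{\#t} are represented by the integers 0 and 1, and the names xor_not and MASK64 are invented for this illustration.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits, like an x86-64 register


def xor_not(arg: int) -> int:
    """Mimic `movq arg, var` followed by `xorq $1, var`.

    Assumes Booleans are encoded as 0 (#f) and 1 (#t).
    """
    var = arg & MASK64        # movq arg, var
    var = (1 ^ var) & MASK64  # xorq $1, var  (result lands in the second argument)
    return var


# The bitwise example from the text: XOR applied to each bit of 0011 and 0101.
bitwise_example = 0b0011 ^ 0b0101
```

With this encoding, xor_not maps 0 to 1 and 1 to 0, which is the behaviour that the \code{movq}/\code{xorq} pair is meant to have.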
@@ -3828,12 +3833,12 @@ second bit.  Thus, the \code{not} operation can be implemented by
 \Arg &::=&  \gray{\IMM{\Int} \mid \REG{\code{'}\Reg} \mid \DEREF{\Reg}{\Int}} 
      \mid \BYTEREG{\code{'}\Reg} \\
 \itm{cc} & ::= & \key{e} \mid \key{l} \mid \key{le} \mid \key{g} \mid \key{ge} \\
-\Instr &::=& \gray{\BININSTR{\code{'addq}}{\Arg}{\Arg}} 
-       \mid \gray{\BININSTR{\code{'subq}}{\Arg}{\Arg}} \\
-       &\mid& \gray{\BININSTR{\code{'movq}}{\Arg}{\Arg}} 
-       \mid \gray{\UNIINSTR{\code{'negq}}{\Arg}} \\
-       &\mid& \gray{\CALLQ{\itm{label}} \mid \RETQ{}} 
-       \mid \gray{\PUSHQ{\Arg} \mid \POPQ{\Arg}} \\
+\Instr &::=& \gray{ \BININSTR{\code{'addq}}{\Arg}{\Arg} 
+       \mid \BININSTR{\code{'subq}}{\Arg}{\Arg} } \\
+       &\mid& \gray{ \BININSTR{\code{'movq}}{\Arg}{\Arg} 
+       \mid \UNIINSTR{\code{'negq}}{\Arg} } \\
+       &\mid& \gray{ \CALLQ{\itm{label}} \mid \RETQ{} 
+       \mid \PUSHQ{\Arg} \mid \POPQ{\Arg} } \\
        &\mid& \BININSTR{\code{'xorq}}{\Arg}{\Arg}
        \mid \BININSTR{\code{'cmpq}}{\Arg}{\Arg}\\
        &\mid& \BININSTR{\code{'set}}{\code{'}\itm{cc}}{\Arg} 
@@ -3871,40 +3876,65 @@ which is part of the \code{rax} register.  Thankfully, the
 \key{movzbq} instruction can then be used to move from a single byte
 register to a normal 64-bit register.
 
-For compiling the \key{if} expression, the x86 instructions for
-jumping are relevant. The \key{Jmp} instruction updates the program
+The x86 instructions for jumping are relevant to the compilation of
+\key{if} expressions. The \key{Jmp} instruction updates the program
 counter to point to the instruction after the indicated label.  The
 \key{JmpIf} instruction updates the program counter to point to the
 instruction after the indicated label depending on whether the result
 in the EFLAGS register matches the condition code \itm{cc}, otherwise
-the \key{JmpIf} instruction falls through to the next
-instruction. Because the \key{JmpIf} instruction relies on the EFLAGS
-register, it is quite common for the \key{JmpIf} to be immediately
-preceded by a \key{cmpq} instruction, to set the EFLAGS register.
-Our abstract syntax for \key{JmpIf} differs from the concrete syntax
-for x86 to separate the instruction name from the condition code. For
-example, \code{(JmpIf le foo)} corresponds to \code{jle foo}.
+the \key{JmpIf} instruction falls through to the next
+instruction.\footnote{The abstract syntax for \key{JmpIf} differs from the
+  concrete syntax for x86 in that it separates the instruction name
+  from the condition code. For example, \code{(JmpIf le foo)}
+  corresponds to \code{jle foo}.}  Because the \key{JmpIf}
+instruction relies on the EFLAGS register, it is common for the
+\key{JmpIf} to be immediately preceded by a \key{cmpq} instruction to
+set the EFLAGS register.
 
 
 \section{The $C_1$ Intermediate Language}
 \label{sec:c1}
 
-As with $R_1$, we shall compile $R_2$ to a C-like intermediate
-language, but we need to grow that intermediate language to handle the
-new features in $R_2$: Booleans and conditional expressions.
-Figure~\ref{fig:c1-syntax} shows the new features of $C_1$; we add
-logic and comparison operators to the $\Exp$ non-terminal, the
-literals \key{\#t} and \key{\#f} to the $\Arg$ non-terminal.
-Regarding control flow, $C_1$ differs considerably from $R_2$.
-Instead of \key{if} expressions, $C_1$ has goto's and conditional
-goto's in the grammar for $\Tail$. This means that a sequence of
-statements may now end with a \code{goto} or a conditional
-\code{goto}, which jumps to one of two labeled pieces of code
+As with $R_1$, we compile $R_2$ to a C-like intermediate language, but
+we need to grow that intermediate language to handle the new features
+in $R_2$: Booleans and conditional expressions.
+Figure~\ref{fig:c1-concrete-syntax} defines the concrete syntax of
+$C_1$ and Figure~\ref{fig:c1-syntax} defines the abstract syntax.  In
+particular, we add logical and comparison operators to the $\Exp$
+non-terminal and the literals \key{\#t} and \key{\#f} to the $\Arg$
+non-terminal.  Regarding control flow, $C_1$ differs considerably from
+$R_2$.  Instead of \key{if} expressions, $C_1$ has \key{goto} and
+conditional \key{goto} in the grammar for $\Tail$. This means that a
+sequence of statements may now end with a \code{goto} or a conditional
+\code{goto}. The conditional \code{goto} jumps to one of two labels
 depending on the outcome of the comparison. In
 Section~\ref{sec:explicate-control-r2} we discuss how to translate
 from $R_2$ to $C_1$, bridging this gap between \key{if} expressions
 and \key{goto}'s.
 
+\begin{figure}[tbp]
+\fbox{
+\begin{minipage}{0.96\textwidth}
+\small    
+\[
+\begin{array}{lcl}
+\Atm &::=& \gray{ \Int \mid \Var } \mid \itm{bool} \\
+\itm{cmp} &::= & \key{eq?} \mid \key{<}  \\
+\Exp &::=& \gray{ \Atm \mid \key{(read)} \mid \key{(-}~\Atm\key{)} \mid \key{(+}~\Atm~\Atm\key{)} } \\
+   &\mid& \LP \key{not}~\Atm \RP \mid \LP \itm{cmp}~\Atm~\Atm\RP \\
+\Stmt &::=& \gray{ \Var~\key{=}~\Exp\key{;} } \\
+\Tail &::= & \gray{ \key{return}~\Exp\key{;} \mid \Stmt~\Tail } \\
+   &\mid& \key{goto}~\itm{label}\key{;}\\
+   &\mid& \key{if}~\LP \itm{cmp}~\Atm~\Atm \RP~ \key{goto}~\itm{label}\key{;} ~\key{else}~\key{goto}~\itm{label}\key{;} \\
+C_1 & ::= & \gray{ (\itm{label}\key{:}~ \Tail)^{+} }
+\end{array}
+\]
+\end{minipage}
+}
+\caption{The concrete syntax of the $C_1$ intermediate language.}
+\label{fig:c1-concrete-syntax}
+\end{figure}
+
 \begin{figure}[tp]
 \fbox{
 \begin{minipage}{0.96\textwidth}
@@ -3913,10 +3943,10 @@ and \key{goto}'s.
 \begin{array}{lcl}
 \Atm &::=& \gray{\INT{\Int} \mid \VAR{\Var}} \mid \BOOL{\itm{bool}} \\
 \itm{cmp} &::= & \key{eq?} \mid \key{<}  \\
-\Exp &::= & \gray{\Atm \mid \READ{} \mid \NEG{\Atm} }\\
-     &\mid& \gray{ \ADD{\Atm}{\Atm} } 
-     \mid \UNIOP{\key{'not}}{\Atm} \\
-     &\mid& \BINOP{'\itm{cmp}}{\Atm}{\Atm} \\
+\Exp &::= & \gray{ \Atm \mid \READ{} }\\
+     &\mid& \gray{ \NEG{\Atm} \mid \ADD{\Atm}{\Atm} } \\
+     &\mid& \UNIOP{\key{'not}}{\Atm} 
+     \mid \BINOP{\key{'}\itm{cmp}}{\Atm}{\Atm} \\
 \Stmt &::=& \gray{ \ASSIGN{\VAR{\Var}}{\Exp} } \\
 \Tail &::= & \gray{\RETURN{\Exp} \mid \SEQ{\Stmt}{\Tail} } \\
     &\mid& \GOTO{\itm{label}} \\
@@ -3936,7 +3966,7 @@ C_1 & ::= & \gray{\PROGRAM{\itm{info}}{\CFG{\key{(}\itm{label}\,\key{.}\,\Tail\k
 
 Recall that the purpose of \code{explicate-control} is to make the
 order of evaluation explicit in the syntax of the program.  With the
-addition of \key{if} in $R_2$, things get more interesting.
+addition of \key{if} in $R_2$, this gets more interesting.
 
 As a motivating example, consider the following program that has an
 \key{if} expression nested in the predicate of another \key{if}.
@@ -4004,8 +4034,8 @@ Following the order of evaluation in the output of
 \code{remove-complex-opera*}, we first have the \code{(read)} and
 comparison to \code{1} from the predicate of the inner \key{if}.  In
 the output of \code{explicate-control}, in the \code{start} block,
-this becomes a \code{(read)} followed by a conditional goto to either
-\code{block61} or \code{block62}. Each of these contains the
+this becomes a \code{(read)} followed by a conditional \key{goto} to
+either \code{block61} or \code{block62}. Each of these contains the
 translations of the code \code{(eq? (read) 0)} and \code{(eq? (read)
   1)}, respectively. Regarding \code{block61}, we start with the
 \code{(read)} and comparison to \code{0} and then have a conditional
@@ -4098,11 +4128,14 @@ new kind of context to deal with: the predicate position of the
 an $R_2$ expression and two pieces of $C_1$ code (two $\Tail$'s) for
 the then-branch and else-branch. The output of \code{explicate-pred}
 is a $C_1$ $\Tail$ and a list of formerly \key{let}-bound variables.
-However, these three functions also need to
-construct the control-flow graph, which we recommend they do via
-updates to a global variable (be careful!). Next we consider the
-specific additions to the tail and assign functions, and some of cases
-for the pred function.
+
+Note that the three explicate functions need to construct a
+control-flow graph, which we recommend they do via updates to a global
+variable.
+
+In the following paragraphs we consider the specific additions to the
+\code{explicate-tail} and \code{explicate-assign} functions, and some
+of the cases for the \code{explicate-pred} function.
 
 The \code{explicate-tail} function needs an additional case for
 \key{if}. The branches of the \key{if} inherit the current context, so
@@ -4119,17 +4152,16 @@ $\itm{cnd}$ and the blocks $B_1$ and $B_2$.
 Next we consider the case for \key{if} in the \code{explicate-assign}
 function. The context of the \key{if} is an assignment to some
 variable $x$ and then the control continues to some block $B_1$.  The
-code that we generate for the $\itm{thn}$ and $\itm{els}$ branches
-needs to continue to $B_1$, so we add $B_1$ to the control flow graph
-with a fresh label $\ell_1$.  Again, the branches of the \key{if}
-inherit the current context, so that are in assignment positions.  Let
-$B_2$ be the result of applying \code{explicate-assign} to the
-$\itm{thn}$ branch, variable $x$, and the block \GOTO{$\ell_1$}.  Let
-$B_3$ be the result of applying \code{explicate-assign} to the
-$\itm{else}$ branch, variable $x$, and the block \GOTO{$\ell_1$}. The
-\key{if} translates to the block $B_4$ which is the result of applying
-\code{explicate-pred} to the predicate $\itm{cnd}$ and the blocks
-$B_2$ and $B_3$.
+code that we generate for the ``then'' and ``else'' branches needs to
+continue to $B_1$, so we add $B_1$ to the control flow graph with a
+fresh label $\ell_1$.  Again, the branches of the \key{if} inherit the
+current context, so they are in assignment positions.  Let $B_2$ be
+the result of applying \code{explicate-assign} to the ``then'' branch,
+variable $x$, and the block \GOTO{$\ell_1$}.  Let $B_3$ be the result
+of applying \code{explicate-assign} to the ``else'' branch, variable
+$x$, and the block \GOTO{$\ell_1$}. The \key{if} translates to the
+block $B_4$ which is the result of applying \code{explicate-pred} to
+the predicate $\itm{cnd}$ and the blocks $B_2$ and $B_3$.
 \[
 (\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els}) \quad\Rightarrow\quad B_4
 \]
@@ -4137,22 +4169,21 @@ $B_2$ and $B_3$.
 The function \code{explicate-pred} will need a case for every
 expression that can have type \code{Boolean}. We detail a few cases
 here and leave the rest for the reader. The input to this function is
-an expression and two blocks, $B_1$ and $B_2$, for the branches of the
-enclosing \key{if}. Suppose the expression is the Boolean \code{\#t}.
-Then we can perform a kind of partial evaluation and translate it to the
-``then'' branch $B_1$. Likewise, we translate
+an expression and two blocks, $B_1$ and $B_2$, for the two branches of
+the enclosing \key{if}. Suppose the expression is the Boolean
+\code{\#t}.  Then we can perform a kind of partial evaluation and
+translate it to the ``then'' branch $B_1$. Likewise, we translate
 \code{\#f} to the ``else'' branch $B_2$.
 \[
 \key{\#t} \quad\Rightarrow\quad B_1,
 \qquad\qquad\qquad
 \key{\#f} \quad\Rightarrow\quad B_2
 \]
-Next, suppose the
-expression is a less-than comparison. We translate it to a conditional
-goto. We need labels for the two branches $B_1$ and $B_2$, so we add
-those blocks to the control flow graph and obtain some labels $\ell_1$
-and $\ell_2$. The translation of the less-than comparison is as
-follows.
+Next, suppose the expression is a less-than comparison. We translate
+it to a conditional \code{goto}. We need labels for the two branches
+$B_1$ and $B_2$, so we add those blocks to the control flow graph and
+obtain some labels $\ell_1$ and $\ell_2$. The translation of the
+less-than comparison is as follows.
 \[
 (\key{<}~e_1~e_2) \quad\Rightarrow\quad
 \begin{array}{l}
@@ -4164,18 +4195,17 @@ follows.
 \]
 
 The case for \key{if} in \code{explicate-pred} is particularly
-illuminating, as it deals with the challenges that we discussed above
+illuminating as it deals with the challenges that we discussed above
 regarding the example of the nested \key{if} expressions.  Again, we
-add the two input branches $B_1$ and $B_2$ to the control flow graph
-and obtain the labels $\ell_1$ and $\ell_2$.  The branches $\itm{thn}$
-and $\itm{els}$ of the current \key{if} inherit their context from the
-current one, i.e., predicate context. So we apply
-\code{explicate-pred} to $\itm{thn}$ with the two blocks
-\GOTO{$\ell_1$} and \GOTO{$\ell_2$}, to obtain $B_3$.
-Proceed in a similar way with the $\itm{els}$ branch, to obtain $B_4$.
-Finally, we apply \code{explicate-pred} to
-the predicate $\itm{cnd}$ and the blocks $B_3$ and $B_4$
-to obtain the result $B_5$.
+add the two branches $B_1$ and $B_2$ to the control flow graph and
+obtain the labels $\ell_1$ and $\ell_2$.  The ``then'' and ``else''
+branches of the current \key{if} inherit their context from the
+current one, that is, predicate context. So we apply
+\code{explicate-pred} to the ``then'' branch with the two blocks
+\GOTO{$\ell_1$} and \GOTO{$\ell_2$} to obtain $B_3$.  Proceed in a
+similar way with the ``else'' branch to obtain $B_4$.  Finally, we
+apply \code{explicate-pred} to the predicate of the \code{if} and the
+blocks $B_3$ and $B_4$ to obtain the result $B_5$.
 \[
 (\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els})
 \quad\Rightarrow\quad
@@ -4523,7 +4553,7 @@ the trivial blocks on the right. Let us focus on \code{block61}.  The
 \code{block55}. The optimized code on the right of
 Figure~\ref{fig:optimize-jumps} bypasses \code{block57}, with the
 \code{then} branch jumping directly to \code{block55}. The story is
-similar for the \code{else} branch, as well as for the two branchs in
+similar for the \code{else} branch, as well as for the two branches in
 \code{block62}. After the jumps in \code{block61} and \code{block62}
 have been optimized in this way, there are no longer any jumps to
 blocks \code{block57} through \code{block60}, so they can be removed.
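Note on the \code{explicate-pred} hunks above: the three cases described there (Boolean literals, less-than, and nested \key{if}) can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's Racket code. Expressions and tails are encoded as tuples, the names cfg, add_block and the generated block labels are invented here, the operands of \code{<} are assumed to be atoms already, and the list of formerly \key{let}-bound variables that the real function also returns is left out.

```python
from itertools import count

cfg = {}            # label -> tail; stands in for the global control-flow graph
_fresh = count(1)   # source of fresh label numbers (hypothetical naming scheme)


def add_block(tail):
    """Store `tail` in the control-flow graph under a fresh label; return the label."""
    label = f"block{next(_fresh)}"
    cfg[label] = tail
    return label


def explicate_pred(exp, b1, b2):
    """Translate predicate `exp` given the 'then' block b1 and the 'else' block b2."""
    tag = exp[0]
    if tag == "bool":
        # Partial evaluation: #t selects the 'then' block, #f the 'else' block.
        return b1 if exp[1] else b2
    if tag == "<":
        # Label both branches, then emit a conditional goto between them.
        l1, l2 = add_block(b1), add_block(b2)
        return ("if", exp, ("goto", l1), ("goto", l2))
    if tag == "if":
        _, cnd, thn, els = exp
        l1, l2 = add_block(b1), add_block(b2)
        # Both branches of the inner `if` are themselves in predicate context.
        b3 = explicate_pred(thn, ("goto", l1), ("goto", l2))
        b4 = explicate_pred(els, ("goto", l1), ("goto", l2))
        return explicate_pred(cnd, b3, b4)
    raise NotImplementedError(f"no case for {tag!r} in this sketch")


# Usage: (if (if (< x 1) #t #f) ... ...) with two placeholder branch blocks.
then_block, else_block = ("return", 1), ("return", 0)
nested = ("if", ("<", "x", 1), ("bool", True), ("bool", False))
result = explicate_pred(nested, then_block, else_block)
```

In this sketch the nested \key{if} yields a conditional goto whose targets are blocks containing only another goto. Blocks of that shape are the trivial blocks that the jump optimization discussed in the last hunk bypasses.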