Jeremy Siek 4 years ago
Parent
Current commit
9d684238ca
1 file changed, with 116 insertions and 86 deletions
book.tex

@@ -2127,18 +2127,18 @@ followed by a move from \code{rax} to the left-hand side.  The move
 from \code{rax} is needed because the return value from
 \code{read\_int} goes into \code{rax}, as is the case in general.  \\
 \begin{tabular}{lll}
-\begin{minipage}{0.4\textwidth}
+\begin{minipage}{0.3\textwidth}
 \begin{lstlisting}
-|$\itm{lhs}$| = (read);
+|$\itm{var}$| = (read);
 \end{lstlisting}
 \end{minipage}
 &
 $\Rightarrow$
 &
-\begin{minipage}{0.4\textwidth}
+\begin{minipage}{0.3\textwidth}
 \begin{lstlisting}
 callq read_int
-movq %rax, |$\itm{lhs}$|
+movq %rax, |$\itm{var}$|
 \end{lstlisting}
 \end{minipage}
 \end{tabular} \\
@@ -3800,9 +3800,9 @@ One small challenge is that x86 does not provide an instruction that
 directly implements logical negation (\code{not} in $R_2$ and $C_1$).
 However, the \code{xorq} instruction can be used to encode \code{not}.
 The \key{xorq} instruction takes two arguments, performs a pairwise
-exclusive-or operation on each bit of its arguments, and writes the
-results into its second argument.  Recall the truth table for
-exclusive-or:
+exclusive-or ($\mathrm{XOR}$) operation on each bit of its arguments,
+and writes the results into its second argument.  Recall the truth
+table for exclusive-or:
 \begin{center}
 \begin{tabular}{l|cc}
    & 0 & 1 \\ \hline
@@ -3810,14 +3810,19 @@ exclusive-or:
 1  & 1 & 0
 \end{tabular}
 \end{center}
-For example, $0011 \mathrel{\mathrm{XOR}} 0101 = 0110$.  Notice that
-in the row of the table for the bit $1$, the result is the opposite of the
-second bit.  Thus, the \code{not} operation can be implemented by
-\code{xorq} with $1$ as the first argument:
-\begin{align*}
-  0001 \mathrel{\mathrm{XOR}} 0000 &= 0001\\
-  0001 \mathrel{\mathrm{XOR}} 0001 &= 0000
-\end{align*}
+For example, applying $\mathrm{XOR}$ to each bit of the binary numbers
+$0011$ and $0101$ yields $0110$. Notice that in the row of the table
+for the bit $1$, the result is the opposite of the second bit.  Thus,
+the \code{not} operation can be implemented by \code{xorq} with $1$ as
+the first argument:
+\[
+\Var~ \key{=}~ \LP\key{not}~\Arg\RP\key{;}
+\qquad\Rightarrow\qquad
+\begin{array}{l}
+\key{movq}~ \Arg\key{,} \Var\\
+\key{xorq}~ \key{\$1,} \Var
+\end{array}
+\]
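As a quick check of this encoding, here is a minimal Python sketch. It is not part of the book's compiler: it assumes Booleans are represented as the integers 0 and 1 and uses Python's `^` operator as a stand-in for \code{xorq}. The helper names `logical_not` and `select_not` are hypothetical.

```python
def logical_not(bit: int) -> int:
    """Flip a Boolean encoded as 0 or 1, as `xorq $1, var` would."""
    return bit ^ 1


def select_not(var: str, arg: str) -> list[str]:
    """Emit the two-instruction sequence for `var = (not arg);`.

    Assumption: operands are already rendered as assembly operand strings.
    """
    return [f"movq {arg}, {var}", f"xorq $1, {var}"]


if __name__ == "__main__":
    # Bitwise XOR of the two 4-bit numbers used in the text.
    print(format(0b0011 ^ 0b0101, "04b"))
    print(logical_not(0), logical_not(1))
    print("\n".join(select_not("x", "y")))
```

The `movq` comes first because `xorq` overwrites its second argument, so the value being negated has to be in the destination before the flip.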
 
 
 \begin{figure}[tp]
 \fbox{
@@ -3828,12 +3833,12 @@ second bit.  Thus, the \code{not} operation can be implemented by
 \Arg &::=&  \gray{\IMM{\Int} \mid \REG{\code{'}\Reg} \mid \DEREF{\Reg}{\Int}} 
      \mid \BYTEREG{\code{'}\Reg} \\
 \itm{cc} & ::= & \key{e} \mid \key{l} \mid \key{le} \mid \key{g} \mid \key{ge} \\
-\Instr &::=& \gray{\BININSTR{\code{'addq}}{\Arg}{\Arg}} 
-       \mid \gray{\BININSTR{\code{'subq}}{\Arg}{\Arg}} \\
-       &\mid& \gray{\BININSTR{\code{'movq}}{\Arg}{\Arg}} 
-       \mid \gray{\UNIINSTR{\code{'negq}}{\Arg}} \\
-       &\mid& \gray{\CALLQ{\itm{label}} \mid \RETQ{}} 
-       \mid \gray{\PUSHQ{\Arg} \mid \POPQ{\Arg}} \\
+\Instr &::=& \gray{ \BININSTR{\code{'addq}}{\Arg}{\Arg} 
+       \mid \BININSTR{\code{'subq}}{\Arg}{\Arg} } \\
+       &\mid& \gray{ \BININSTR{\code{'movq}}{\Arg}{\Arg} 
+       \mid \UNIINSTR{\code{'negq}}{\Arg} } \\
+       &\mid& \gray{ \CALLQ{\itm{label}} \mid \RETQ{} 
+       \mid \PUSHQ{\Arg} \mid \POPQ{\Arg} } \\
        &\mid& \BININSTR{\code{'xorq}}{\Arg}{\Arg}
        \mid \BININSTR{\code{'cmpq}}{\Arg}{\Arg}\\
        &\mid& \BININSTR{\code{'set}}{\code{'}\itm{cc}}{\Arg} 
@@ -3871,40 +3876,65 @@ which is part of the \code{rax} register.  Thankfully, the
 \key{movzbq} instruction can then be used to move from a single byte
 register to a normal 64-bit register.
 
 
-For compiling the \key{if} expression, the x86 instructions for
-jumping are relevant. The \key{Jmp} instruction updates the program
+The x86 instructions for jumping are relevant to the compilation of
+\key{if} expressions. The \key{Jmp} instruction updates the program
 counter to point to the instruction after the indicated label.  The
 \key{JmpIf} instruction updates the program counter to point to the
 instruction after the indicated label depending on whether the result
 in the EFLAGS register matches the condition code \itm{cc}, otherwise
-the \key{JmpIf} instruction falls through to the next
-instruction. Because the \key{JmpIf} instruction relies on the EFLAGS
-register, it is quite common for the \key{JmpIf} to be immediately
-preceded by a \key{cmpq} instruction, to set the EFLAGS register.
-Our abstract syntax for \key{JmpIf} differs from the concrete syntax
-for x86 to separate the instruction name from the condition code. For
-example, \code{(JmpIf le foo)} corresponds to \code{jle foo}.
+the \key{JmpIf} instruction falls through to the next instruction
+\footnote{The abstract syntax for \key{JmpIf} differs from the
+  concrete syntax for x86 in that it separates the instruction name
+  from the condition code. For example, \code{(JmpIf le foo)}
+  corresponds to \code{jle foo}.}.  Because the \key{JmpIf}
+instruction relies on the EFLAGS register, it is common for the
+\key{JmpIf} to be immediately preceded by a \key{cmpq} instruction to
+set the EFLAGS register.
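To illustrate the correspondence between the abstract and concrete syntax of \key{JmpIf}, here is a small Python sketch. It is an illustration only: the function name `print_jmp_if` is hypothetical, and the set of condition codes is taken from the \itm{cc} non-terminal in the grammar above.

```python
# Condition codes listed for the cc non-terminal: e, l, le, g, ge.
CONDITION_CODES = {"e", "l", "le", "g", "ge"}


def print_jmp_if(cc: str, label: str) -> str:
    """Render the abstract (JmpIf cc label) as concrete x86 syntax.

    The concrete instruction name fuses "j" with the condition code.
    """
    if cc not in CONDITION_CODES:
        raise ValueError(f"unknown condition code: {cc}")
    return f"j{cc} {label}"


if __name__ == "__main__":
    print(print_jmp_if("le", "foo"))
```

Keeping the condition code as a separate field lets a compiler pass inspect or change it without parsing instruction names.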
 
 
 
 
 \section{The $C_1$ Intermediate Language}
 \label{sec:c1}
 
 
-As with $R_1$, we shall compile $R_2$ to a C-like intermediate
-language, but we need to grow that intermediate language to handle the
-new features in $R_2$: Booleans and conditional expressions.
-Figure~\ref{fig:c1-syntax} shows the new features of $C_1$; we add
-logic and comparison operators to the $\Exp$ non-terminal, the
-literals \key{\#t} and \key{\#f} to the $\Arg$ non-terminal.
-Regarding control flow, $C_1$ differs considerably from $R_2$.
-Instead of \key{if} expressions, $C_1$ has goto's and conditional
-goto's in the grammar for $\Tail$. This means that a sequence of
-statements may now end with a \code{goto} or a conditional
-\code{goto}, which jumps to one of two labeled pieces of code
+As with $R_1$, we compile $R_2$ to a C-like intermediate language, but
+we need to grow that intermediate language to handle the new features
+in $R_2$: Booleans and conditional expressions.
+Figure~\ref{fig:c1-concrete-syntax} defines the concrete syntax of
+$C_1$ and Figure~\ref{fig:c1-syntax} defines the abstract syntax.  In
+particular, we add logical and comparison operators to the $\Exp$
+non-terminal and the literals \key{\#t} and \key{\#f} to the $\Arg$
+non-terminal.  Regarding control flow, $C_1$ differs considerably from
+$R_2$.  Instead of \key{if} expressions, $C_1$ has \key{goto} and
+conditional \key{goto} in the grammar for $\Tail$. This means that a
+sequence of statements may now end with a \code{goto} or a conditional
+\code{goto}. The conditional \code{goto} jumps to one of two labels
 depending on the outcome of the comparison. In
 Section~\ref{sec:explicate-control-r2} we discuss how to translate
 from $R_2$ to $C_1$, bridging this gap between \key{if} expressions
 and \key{goto}'s.
 
 
+\begin{figure}[tbp]
+\fbox{
+\begin{minipage}{0.96\textwidth}
+\small    
+\[
+\begin{array}{lcl}
+\Atm &::=& \gray{ \Int \mid \Var } \mid \itm{bool} \\
+\itm{cmp} &::= & \key{eq?} \mid \key{<}  \\
+\Exp &::=& \gray{ \Atm \mid \key{(read)} \mid \key{(-}~\Atm\key{)} \mid \key{(+}~\Atm~\Atm\key{)} } \\
+   &\mid& \LP \key{not}~\Atm \RP \mid \LP \itm{cmp}~\Atm~\Atm\RP \\
+\Stmt &::=& \gray{ \Var~\key{=}~\Exp\key{;} } \\
+\Tail &::= & \gray{ \key{return}~\Exp\key{;} \mid \Stmt~\Tail } \\
+   &\mid& \key{goto}~\itm{label}\key{;}\\
+   &\mid& \key{if}~\LP \itm{cmp}~\Atm~\Atm \RP~ \key{goto}~\itm{label}\key{;} ~\key{else}~\key{goto}~\itm{label}\key{;} \\
+C_1 & ::= & \gray{ (\itm{label}\key{:}~ \Tail)^{+} }
+\end{array}
+\]
+\end{minipage}
+}
+\caption{The concrete syntax of the $C_1$ intermediate language.}
+\label{fig:c1-concrete-syntax}
+\end{figure}
+
 \begin{figure}[tp]
 \fbox{
 \begin{minipage}{0.96\textwidth}
@@ -3913,10 +3943,10 @@ and \key{goto}'s.
 \begin{array}{lcl}
 \Atm &::=& \gray{\INT{\Int} \mid \VAR{\Var}} \mid \BOOL{\itm{bool}} \\
 \itm{cmp} &::= & \key{eq?} \mid \key{<}  \\
-\Exp &::= & \gray{\Atm \mid \READ{} \mid \NEG{\Atm} }\\
-     &\mid& \gray{ \ADD{\Atm}{\Atm} } 
-     \mid \UNIOP{\key{'not}}{\Atm} \\
-     &\mid& \BINOP{'\itm{cmp}}{\Atm}{\Atm} \\
+\Exp &::= & \gray{ \Atm \mid \READ{} }\\
+     &\mid& \gray{ \NEG{\Atm} \mid \ADD{\Atm}{\Atm} } \\
+     &\mid& \UNIOP{\key{'not}}{\Atm} 
+     \mid \BINOP{\key{'}\itm{cmp}}{\Atm}{\Atm} \\
 \Stmt &::=& \gray{ \ASSIGN{\VAR{\Var}}{\Exp} } \\
 \Tail &::= & \gray{\RETURN{\Exp} \mid \SEQ{\Stmt}{\Tail} } \\
     &\mid& \GOTO{\itm{label}} \\
@@ -3936,7 +3966,7 @@ C_1 & ::= & \gray{\PROGRAM{\itm{info}}{\CFG{\key{(}\itm{label}\,\key{.}\,\Tail\k
 
 
 Recall that the purpose of \code{explicate-control} is to make the
 order of evaluation explicit in the syntax of the program.  With the
-addition of \key{if} in $R_2$, things get more interesting.
+addition of \key{if} in $R_2$, this gets more interesting.
 
 
 As a motivating example, consider the following program that has an
 \key{if} expression nested in the predicate of another \key{if}.
@@ -4004,8 +4034,8 @@ Following the order of evaluation in the output of
 \code{remove-complex-opera*}, we first have the \code{(read)} and
 comparison to \code{1} from the predicate of the inner \key{if}.  In
 the output of \code{explicate-control}, in the \code{start} block,
-this becomes a \code{(read)} followed by a conditional goto to either
-\code{block61} or \code{block62}. Each of these contains the
+this becomes a \code{(read)} followed by a conditional \key{goto} to
+either \code{block61} or \code{block62}. Each of these contains the
 translations of the code \code{(eq? (read) 0)} and \code{(eq? (read)
   1)}, respectively. Regarding \code{block61}, we start with the
 \code{(read)} and comparison to \code{0} and then have a conditional
@@ -4098,11 +4128,14 @@ new kind of context to deal with: the predicate position of the
 an $R_2$ expression and two pieces of $C_1$ code (two $\Tail$'s) for
 the then-branch and else-branch. The output of \code{explicate-pred}
 is a $C_1$ $\Tail$ and a list of formerly \key{let}-bound variables.
-However, these three functions also need to
-construct the control-flow graph, which we recommend they do via
-updates to a global variable (be careful!). Next we consider the
-specific additions to the tail and assign functions, and some of cases
-for the pred function.
+
+Note that the three explicate functions need to construct a
+control-flow graph, which we recommend they do via updates to a global
+variable.
+
+In the following paragraphs we consider the specific additions to the
+\code{explicate-tail} and \code{explicate-assign} functions, and some
+of the cases for the \code{explicate-pred} function.
 
 
 The \code{explicate-tail} function needs an additional case for
 \key{if}. The branches of the \key{if} inherit the current context, so
@@ -4119,17 +4152,16 @@ $\itm{cnd}$ and the blocks $B_1$ and $B_2$.
 Next we consider the case for \key{if} in the \code{explicate-assign}
 function. The context of the \key{if} is an assignment to some
 variable $x$ and then the control continues to some block $B_1$.  The
-code that we generate for the $\itm{thn}$ and $\itm{els}$ branches
-needs to continue to $B_1$, so we add $B_1$ to the control flow graph
-with a fresh label $\ell_1$.  Again, the branches of the \key{if}
-inherit the current context, so that are in assignment positions.  Let
-$B_2$ be the result of applying \code{explicate-assign} to the
-$\itm{thn}$ branch, variable $x$, and the block \GOTO{$\ell_1$}.  Let
-$B_3$ be the result of applying \code{explicate-assign} to the
-$\itm{else}$ branch, variable $x$, and the block \GOTO{$\ell_1$}. The
-\key{if} translates to the block $B_4$ which is the result of applying
-\code{explicate-pred} to the predicate $\itm{cnd}$ and the blocks
-$B_2$ and $B_3$.
+code that we generate for the ``then'' and ``else'' branches needs to
+continue to $B_1$, so we add $B_1$ to the control flow graph with a
+fresh label $\ell_1$.  Again, the branches of the \key{if} inherit the
+current context, so they are in assignment positions.  Let $B_2$ be
+the result of applying \code{explicate-assign} to the ``then'' branch,
+variable $x$, and the block \GOTO{$\ell_1$}.  Let $B_3$ be the result
+of applying \code{explicate-assign} to the ``else'' branch, variable
+$x$, and the block \GOTO{$\ell_1$}. The \key{if} translates to the
+block $B_4$ which is the result of applying \code{explicate-pred} to
+the predicate $\itm{cnd}$ and the blocks $B_2$ and $B_3$.
 \[
 (\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els}) \quad\Rightarrow\quad B_4
 \]
@@ -4137,22 +4169,21 @@ $B_2$ and $B_3$.
 The function \code{explicate-pred} will need a case for every
 expression that can have type \code{Boolean}. We detail a few cases
 here and leave the rest for the reader. The input to this function is
-an expression and two blocks, $B_1$ and $B_2$, for the branches of the
-enclosing \key{if}. Suppose the expression is the Boolean \code{\#t}.
-Then we can perform a kind of partial evaluation and translate it to the
-``then'' branch $B_1$. Likewise, we translate
+an expression and two blocks, $B_1$ and $B_2$, for the two branches of
+the enclosing \key{if}. Suppose the expression is the Boolean
+\code{\#t}.  Then we can perform a kind of partial evaluation and
+translate it to the ``then'' branch $B_1$. Likewise, we translate
 \code{\#f} to the ``else'' branch $B_2$.
 \[
 \key{\#t} \quad\Rightarrow\quad B_1,
 \qquad\qquad\qquad
 \key{\#f} \quad\Rightarrow\quad B_2
 \]
-Next, suppose the
-expression is a less-than comparison. We translate it to a conditional
-goto. We need labels for the two branches $B_1$ and $B_2$, so we add
-those blocks to the control flow graph and obtain some labels $\ell_1$
-and $\ell_2$. The translation of the less-than comparison is as
-follows.
+Next, suppose the expression is a less-than comparison. We translate
+it to a conditional \code{goto}. We need labels for the two branches
+$B_1$ and $B_2$, so we add those blocks to the control flow graph and
+obtain some labels $\ell_1$ and $\ell_2$. The translation of the
+less-than comparison is as follows.
 \[
 (\key{<}~e_1~e_2) \quad\Rightarrow\quad
 \begin{array}{l}
@@ -4164,18 +4195,17 @@ follows.
 \]
 
 
 The case for \key{if} in \code{explicate-pred} is particularly
-illuminating, as it deals with the challenges that we discussed above
+illuminating as it deals with the challenges that we discussed above
 regarding the example of the nested \key{if} expressions.  Again, we
-add the two input branches $B_1$ and $B_2$ to the control flow graph
-and obtain the labels $\ell_1$ and $\ell_2$.  The branches $\itm{thn}$
-and $\itm{els}$ of the current \key{if} inherit their context from the
-current one, i.e., predicate context. So we apply
-\code{explicate-pred} to $\itm{thn}$ with the two blocks
-\GOTO{$\ell_1$} and \GOTO{$\ell_2$}, to obtain $B_3$.
-Proceed in a similar way with the $\itm{els}$ branch, to obtain $B_4$.
-Finally, we apply \code{explicate-pred} to
-the predicate $\itm{cnd}$ and the blocks $B_3$ and $B_4$
-to obtain the result $B_5$.
+add the two branches $B_1$ and $B_2$ to the control flow graph and
+obtain the labels $\ell_1$ and $\ell_2$.  The ``then'' and ``else''
+branches of the current \key{if} inherit their context from the
+current one, that is, predicate context. So we apply
+\code{explicate-pred} to the ``then'' branch with the two blocks
+\GOTO{$\ell_1$} and \GOTO{$\ell_2$} to obtain $B_3$.  Proceed in a
+similar way with the ``else'' branch to obtain $B_4$.  Finally, we
+apply \code{explicate-pred} to the predicate of the \code{if} and the
+blocks $B_3$ and $B_4$ to obtain the result $B_5$.
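The cases of \code{explicate-pred} described so far (the Boolean literals, the less-than comparison, and the nested \key{if}) can be sketched in Python as follows. This is a rough illustration under several assumptions, not the book's Racket implementation: expressions and blocks are modeled as plain tuples, `True` and `False` stand for \code{\#t} and \code{\#f}, the operands of \code{<} are assumed to be atoms already, the list of formerly \key{let}-bound variables is omitted, and the names `cfg` and `add_block` are hypothetical.

```python
import itertools

cfg = {}  # label -> block; the global control-flow graph being built
_fresh = itertools.count(1)


def add_block(block):
    """Add a block to the control-flow graph and return its fresh label."""
    label = f"block{next(_fresh)}"
    cfg[label] = block
    return label


def explicate_pred(cnd, b1, b2):
    """Translate predicate cnd; control continues to b1 if true, b2 if false."""
    if cnd is True:   # #t: partial evaluation picks the "then" block
        return b1
    if cnd is False:  # #f: partial evaluation picks the "else" block
        return b2
    op = cnd[0]
    if op == "<":
        # Label both branches and emit a conditional goto.
        l1, l2 = add_block(b1), add_block(b2)
        return ("if", cnd, ("goto", l1), ("goto", l2))
    if op == "if":
        _, inner_cnd, thn, els = cnd
        l1, l2 = add_block(b1), add_block(b2)
        # Both branches of the inner if are themselves in predicate context.
        b3 = explicate_pred(thn, ("goto", l1), ("goto", l2))
        b4 = explicate_pred(els, ("goto", l1), ("goto", l2))
        return explicate_pred(inner_cnd, b3, b4)
    raise NotImplementedError(f"case left for the reader: {cnd!r}")
```

Passing \GOTO{$\ell_1$} and \GOTO{$\ell_2$} to the recursive calls, rather than $B_1$ and $B_2$ themselves, is what keeps the two incoming blocks from being duplicated. It can also produce blocks that contain only a \code{goto}.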
 \[
 (\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els})
 \quad\Rightarrow\quad
@@ -4523,7 +4553,7 @@ the trivial blocks on the right. Let us focus on \code{block61}.  The
 \code{block55}. The optimized code on the right of
 Figure~\ref{fig:optimize-jumps} bypasses \code{block57}, with the
 \code{then} branch jumping directly to \code{block55}. The story is
-similar for the \code{else} branch, as well as for the two branchs in
+similar for the \code{else} branch, as well as for the two branches in
 \code{block62}. After the jumps in \code{block61} and \code{block62}
 have been optimized in this way, there are no longer any jumps to
 blocks \code{block57} through \code{block60}, so they can be removed.
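The jump optimization described above can be sketched in Python as follows. This is a simplified illustration, not the book's implementation: it assumes blocks are tuples of the form `("goto", label)`, `("if", cmp, then_goto, else_goto)`, `("seq", stmt, tail)`, or `("return", exp)`, and the function names are hypothetical.

```python
def successors(block):
    """Labels that a block can jump to."""
    kind = block[0]
    if kind == "goto":
        return [block[1]]
    if kind == "if":
        return successors(block[2]) + successors(block[3])
    if kind == "seq":
        return successors(block[2])
    return []  # e.g. ("return", exp)


def optimize_jumps(cfg, start="start"):
    """Bypass trivial blocks (a lone goto) and drop unreachable blocks."""

    def target(label, seen=()):
        # Follow a chain of trivial blocks; `seen` guards against cycles.
        block = cfg[label]
        if block[0] == "goto" and label not in seen:
            return target(block[1], seen + (label,))
        return label

    def rewrite(block):
        kind = block[0]
        if kind == "goto":
            return ("goto", target(block[1]))
        if kind == "if":
            _, cmp, thn, els = block
            return ("if", cmp, rewrite(thn), rewrite(els))
        if kind == "seq":
            _, stmt, tail = block
            return ("seq", stmt, rewrite(tail))
        return block

    rewritten = {label: rewrite(block) for label, block in cfg.items()}

    # Keep only the blocks still reachable from the start block.
    reachable, worklist = set(), [start]
    while worklist:
        label = worklist.pop()
        if label not in reachable:
            reachable.add(label)
            worklist.extend(successors(rewritten[label]))
    return {label: blk for label, blk in rewritten.items() if label in reachable}
```

With a graph in which \code{block61} branches to \code{block57} and \code{block57} is just a \code{goto} to \code{block55}, the rewritten \code{block61} jumps straight to \code{block55}, and \code{block57} disappears from the result because nothing reaches it any more.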