|
@@ -2127,18 +2127,18 @@ followed by a move from \code{rax} to the left-hand side. The move
|
|
|
from \code{rax} is needed because the return value from
|
|
|
\code{read\_int} goes into \code{rax}, as is the case in general. \\
|
|
|
\begin{tabular}{lll}
|
|
|
-\begin{minipage}{0.4\textwidth}
|
|
|
+\begin{minipage}{0.3\textwidth}
|
|
|
\begin{lstlisting}
|
|
|
-|$\itm{lhs}$| = (read);
|
|
|
+|$\itm{var}$| = (read);
|
|
|
\end{lstlisting}
|
|
|
\end{minipage}
|
|
|
&
|
|
|
$\Rightarrow$
|
|
|
&
|
|
|
-\begin{minipage}{0.4\textwidth}
|
|
|
+\begin{minipage}{0.3\textwidth}
|
|
|
\begin{lstlisting}
|
|
|
callq read_int
|
|
|
-movq %rax, |$\itm{lhs}$|
|
|
|
+movq %rax, |$\itm{var}$|
|
|
|
\end{lstlisting}
|
|
|
\end{minipage}
|
|
|
\end{tabular} \\
|
|
@@ -3800,9 +3800,9 @@ One small challenge is that x86 does not provide an instruction that
|
|
|
directly implements logical negation (\code{not} in $R_2$ and $C_1$).
|
|
|
However, the \code{xorq} instruction can be used to encode \code{not}.
|
|
|
The \key{xorq} instruction takes two arguments, performs a pairwise
|
|
|
-exclusive-or operation on each bit of its arguments, and writes the
|
|
|
-results into its second argument. Recall the truth table for
|
|
|
-exclusive-or:
|
|
|
+exclusive-or ($\mathrm{XOR}$) operation on each bit of its arguments,
|
|
|
+and writes the results into its second argument. Recall the truth
|
|
|
+table for exclusive-or:
|
|
|
\begin{center}
|
|
|
\begin{tabular}{l|cc}
|
|
|
& 0 & 1 \\ \hline
|
|
@@ -3810,14 +3810,19 @@ exclusive-or:
|
|
|
1 & 1 & 0
|
|
|
\end{tabular}
|
|
|
\end{center}
|
|
|
-For example, $0011 \mathrel{\mathrm{XOR}} 0101 = 0110$. Notice that
|
|
|
-in the row of the table for the bit $1$, the result is the opposite of the
|
|
|
-second bit. Thus, the \code{not} operation can be implemented by
|
|
|
-\code{xorq} with $1$ as the first argument:
|
|
|
-\begin{align*}
|
|
|
- 0001 \mathrel{\mathrm{XOR}} 0000 &= 0001\\
|
|
|
- 0001 \mathrel{\mathrm{XOR}} 0001 &= 0000
|
|
|
-\end{align*}
|
|
|
+For example, applying $\mathrm{XOR}$ to each bit of the binary numbers
|
|
|
+$0011$ and $0101$ yields $0110$. Notice that in the row of the table
|
|
|
+for the bit $1$, the result is the opposite of the second bit. Thus,
|
|
|
+the \code{not} operation can be implemented by \code{xorq} with $1$ as
|
|
|
+the first argument:
|
|
|
+\[
|
|
|
+\Var~ \key{=}~ \LP\key{not}~\Arg\RP\key{;}
|
|
|
+\qquad\Rightarrow\qquad
|
|
|
+\begin{array}{l}
|
|
|
+\key{movq}~ \Arg\key{,} \Var\\
|
|
|
+\key{xorq}~ \key{\$1,} \Var
|
|
|
+\end{array}
|
|
|
+\]
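
As a quick sanity check of this encoding, the following Python sketch uses Python's \code{\^} operator as a stand-in for \code{xorq}. It assumes that Booleans are represented by the integers $0$ and $1$; the helper name \code{not\_via\_xor} is hypothetical and only serves the illustration.

```python
def not_via_xor(b: int) -> int:
    """Mimic `xorq $1, var` for a Boolean stored as 0 or 1 (an assumption)."""
    assert b in (0, 1), "this sketch assumes Booleans are encoded as 0 or 1"
    return b ^ 1  # XOR with 1 flips the low bit


# The worked example from the text: bitwise XOR of 0011 and 0101.
example = 0b0011 ^ 0b0101
```

Only the lowest bit is flipped, so the trick relies on the operand being exactly $0$ or $1$.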
|
|
|
|
|
|
\begin{figure}[tp]
|
|
|
\fbox{
|
|
@@ -3828,12 +3833,12 @@ second bit. Thus, the \code{not} operation can be implemented by
|
|
|
\Arg &::=& \gray{\IMM{\Int} \mid \REG{\code{'}\Reg} \mid \DEREF{\Reg}{\Int}}
|
|
|
\mid \BYTEREG{\code{'}\Reg} \\
|
|
|
\itm{cc} & ::= & \key{e} \mid \key{l} \mid \key{le} \mid \key{g} \mid \key{ge} \\
|
|
|
-\Instr &::=& \gray{\BININSTR{\code{'addq}}{\Arg}{\Arg}}
|
|
|
- \mid \gray{\BININSTR{\code{'subq}}{\Arg}{\Arg}} \\
|
|
|
- &\mid& \gray{\BININSTR{\code{'movq}}{\Arg}{\Arg}}
|
|
|
- \mid \gray{\UNIINSTR{\code{'negq}}{\Arg}} \\
|
|
|
- &\mid& \gray{\CALLQ{\itm{label}} \mid \RETQ{}}
|
|
|
- \mid \gray{\PUSHQ{\Arg} \mid \POPQ{\Arg}} \\
|
|
|
+\Instr &::=& \gray{ \BININSTR{\code{'addq}}{\Arg}{\Arg}
|
|
|
+ \mid \BININSTR{\code{'subq}}{\Arg}{\Arg} } \\
|
|
|
+ &\mid& \gray{ \BININSTR{\code{'movq}}{\Arg}{\Arg}
|
|
|
+ \mid \UNIINSTR{\code{'negq}}{\Arg} } \\
|
|
|
+ &\mid& \gray{ \CALLQ{\itm{label}} \mid \RETQ{}
|
|
|
+ \mid \PUSHQ{\Arg} \mid \POPQ{\Arg} } \\
|
|
|
&\mid& \BININSTR{\code{'xorq}}{\Arg}{\Arg}
|
|
|
\mid \BININSTR{\code{'cmpq}}{\Arg}{\Arg}\\
|
|
|
&\mid& \BININSTR{\code{'set}}{\code{'}\itm{cc}}{\Arg}
|
|
@@ -3871,40 +3876,65 @@ which is part of the \code{rax} register. Thankfully, the
|
|
|
\key{movzbq} instruction can then be used to move from a single byte
|
|
|
register to a normal 64-bit register.
|
|
|
|
|
|
-For compiling the \key{if} expression, the x86 instructions for
|
|
|
-jumping are relevant. The \key{Jmp} instruction updates the program
|
|
|
+The x86 instructions for jumping are relevant to the compilation of
|
|
|
+\key{if} expressions. The \key{Jmp} instruction updates the program
|
|
|
counter to point to the instruction after the indicated label. The
|
|
|
\key{JmpIf} instruction updates the program counter to point to the
|
|
|
instruction after the indicated label depending on whether the result
|
|
|
in the EFLAGS register matches the condition code \itm{cc}, otherwise
|
|
|
-the \key{JmpIf} instruction falls through to the next
|
|
|
-instruction. Because the \key{JmpIf} instruction relies on the EFLAGS
|
|
|
-register, it is quite common for the \key{JmpIf} to be immediately
|
|
|
-preceded by a \key{cmpq} instruction, to set the EFLAGS register.
|
|
|
-Our abstract syntax for \key{JmpIf} differs from the concrete syntax
|
|
|
-for x86 to separate the instruction name from the condition code. For
|
|
|
-example, \code{(JmpIf le foo)} corresponds to \code{jle foo}.
|
|
|
+the \key{JmpIf} instruction falls through to the next instruction%
|
|
|
+\footnote{The abstract syntax for \key{JmpIf} differs from the
|
|
|
+ concrete syntax for x86 in that it separates the instruction name
|
|
|
+ from the condition code. For example, \code{(JmpIf le foo)}
|
|
|
+ corresponds to \code{jle foo}.}. Because the \key{JmpIf}
|
|
|
+instruction relies on the EFLAGS register, it is common for the
|
|
|
+\key{JmpIf} to be immediately preceded by a \key{cmpq} instruction to
|
|
|
+set the EFLAGS register.
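
The correspondence between the abstract and concrete syntax of jumps can be sketched in a few lines of Python. The tuple representation of instructions and the function name \code{emit\_jump} are assumptions made for this illustration and are not the representation used elsewhere in the text.

```python
CONDITION_CODES = {"e", "l", "le", "g", "ge"}  # the cc non-terminal


def emit_jump(instr):
    """Render an abstract jump, given as a tuple, in concrete x86 syntax."""
    if instr[0] == "Jmp":
        _, label = instr
        return f"jmp {label}"
    if instr[0] == "JmpIf":
        _, cc, label = instr
        assert cc in CONDITION_CODES, f"unknown condition code: {cc}"
        # The condition code is fused into the instruction name.
        return f"j{cc} {label}"
    raise ValueError(f"not a jump instruction: {instr!r}")
```

For example, \code{emit\_jump(("JmpIf", "le", "foo"))} produces the string \code{jle foo}.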
|
|
|
|
|
|
|
|
|
\section{The $C_1$ Intermediate Language}
|
|
|
\label{sec:c1}
|
|
|
|
|
|
-As with $R_1$, we shall compile $R_2$ to a C-like intermediate
|
|
|
-language, but we need to grow that intermediate language to handle the
|
|
|
-new features in $R_2$: Booleans and conditional expressions.
|
|
|
-Figure~\ref{fig:c1-syntax} shows the new features of $C_1$; we add
|
|
|
-logic and comparison operators to the $\Exp$ non-terminal, the
|
|
|
-literals \key{\#t} and \key{\#f} to the $\Arg$ non-terminal.
|
|
|
-Regarding control flow, $C_1$ differs considerably from $R_2$.
|
|
|
-Instead of \key{if} expressions, $C_1$ has goto's and conditional
|
|
|
-goto's in the grammar for $\Tail$. This means that a sequence of
|
|
|
-statements may now end with a \code{goto} or a conditional
|
|
|
-\code{goto}, which jumps to one of two labeled pieces of code
|
|
|
+As with $R_1$, we compile $R_2$ to a C-like intermediate language, but
|
|
|
+we need to grow that intermediate language to handle the new features
|
|
|
+in $R_2$: Booleans and conditional expressions.
|
|
|
+Figure~\ref{fig:c1-concrete-syntax} defines the concrete syntax of
|
|
|
+$C_1$ and Figure~\ref{fig:c1-syntax} defines the abstract syntax. In
|
|
|
+particular, we add logical and comparison operators to the $\Exp$
|
|
|
+non-terminal and the literals \key{\#t} and \key{\#f} to the $\Arg$
|
|
|
+non-terminal. Regarding control flow, $C_1$ differs considerably from
|
|
|
+$R_2$. Instead of \key{if} expressions, $C_1$ has \key{goto} and
|
|
|
+conditional \key{goto} in the grammar for $\Tail$. This means that a
|
|
|
+sequence of statements may now end with a \code{goto} or a conditional
|
|
|
+\code{goto}. The conditional \code{goto} jumps to one of two labels
|
|
|
depending on the outcome of the comparison. In
|
|
|
Section~\ref{sec:explicate-control-r2} we discuss how to translate
|
|
|
from $R_2$ to $C_1$, bridging this gap between \key{if} expressions
|
|
|
and \key{goto}'s.
|
|
|
|
|
|
+\begin{figure}[tbp]
|
|
|
+\fbox{
|
|
|
+\begin{minipage}{0.96\textwidth}
|
|
|
+\small
|
|
|
+\[
|
|
|
+\begin{array}{lcl}
|
|
|
+\Atm &::=& \gray{ \Int \mid \Var } \mid \itm{bool} \\
|
|
|
+\itm{cmp} &::= & \key{eq?} \mid \key{<} \\
|
|
|
+\Exp &::=& \gray{ \Atm \mid \key{(read)} \mid \key{(-}~\Atm\key{)} \mid \key{(+}~\Atm~\Atm\key{)} } \\
|
|
|
+     &\mid& \LP \key{not}~\Atm \RP \mid \LP \itm{cmp}~\Atm~\Atm\RP \\
|
|
|
+\Stmt &::=& \gray{ \Var~\key{=}~\Exp\key{;} } \\
|
|
|
+\Tail &::= & \gray{ \key{return}~\Exp\key{;} \mid \Stmt~\Tail } \\
|
|
|
+ &\mid& \key{goto}~\itm{label}\key{;}\\
|
|
|
+ &\mid& \key{if}~\LP \itm{cmp}~\Atm~\Atm \RP~ \key{goto}~\itm{label}\key{;} ~\key{else}~\key{goto}~\itm{label}\key{;} \\
|
|
|
+C_1 & ::= & \gray{ (\itm{label}\key{:}~ \Tail)^{+} }
|
|
|
+\end{array}
|
|
|
+\]
|
|
|
+\end{minipage}
|
|
|
+}
|
|
|
+\caption{The concrete syntax of the $C_1$ intermediate language.}
|
|
|
+\label{fig:c1-concrete-syntax}
|
|
|
+\end{figure}
|
|
|
+
|
|
|
\begin{figure}[tp]
|
|
|
\fbox{
|
|
|
\begin{minipage}{0.96\textwidth}
|
|
@@ -3913,10 +3943,10 @@ and \key{goto}'s.
|
|
|
\begin{array}{lcl}
|
|
|
\Atm &::=& \gray{\INT{\Int} \mid \VAR{\Var}} \mid \BOOL{\itm{bool}} \\
|
|
|
\itm{cmp} &::= & \key{eq?} \mid \key{<} \\
|
|
|
-\Exp &::= & \gray{\Atm \mid \READ{} \mid \NEG{\Atm} }\\
|
|
|
- &\mid& \gray{ \ADD{\Atm}{\Atm} }
|
|
|
- \mid \UNIOP{\key{'not}}{\Atm} \\
|
|
|
- &\mid& \BINOP{'\itm{cmp}}{\Atm}{\Atm} \\
|
|
|
+\Exp &::= & \gray{ \Atm \mid \READ{} }\\
|
|
|
+ &\mid& \gray{ \NEG{\Atm} \mid \ADD{\Atm}{\Atm} } \\
|
|
|
+ &\mid& \UNIOP{\key{'not}}{\Atm}
|
|
|
+ \mid \BINOP{\key{'}\itm{cmp}}{\Atm}{\Atm} \\
|
|
|
\Stmt &::=& \gray{ \ASSIGN{\VAR{\Var}}{\Exp} } \\
|
|
|
\Tail &::= & \gray{\RETURN{\Exp} \mid \SEQ{\Stmt}{\Tail} } \\
|
|
|
&\mid& \GOTO{\itm{label}} \\
|
|
@@ -3936,7 +3966,7 @@ C_1 & ::= & \gray{\PROGRAM{\itm{info}}{\CFG{\key{(}\itm{label}\,\key{.}\,\Tail\k
|
|
|
|
|
|
Recall that the purpose of \code{explicate-control} is to make the
|
|
|
order of evaluation explicit in the syntax of the program. With the
|
|
|
-addition of \key{if} in $R_2$, things get more interesting.
|
|
|
+addition of \key{if} in $R_2$, things get more interesting.
|
|
|
|
|
|
As a motivating example, consider the following program that has an
|
|
|
\key{if} expression nested in the predicate of another \key{if}.
|
|
@@ -4004,8 +4034,8 @@ Following the order of evaluation in the output of
|
|
|
\code{remove-complex-opera*}, we first have the \code{(read)} and
|
|
|
comparison to \code{1} from the predicate of the inner \key{if}. In
|
|
|
the output of \code{explicate-control}, in the \code{start} block,
|
|
|
-this becomes a \code{(read)} followed by a conditional goto to either
|
|
|
-\code{block61} or \code{block62}. Each of these contains the
|
|
|
+this becomes a \code{(read)} followed by a conditional \key{goto} to
|
|
|
+either \code{block61} or \code{block62}. Each of these contains the
|
|
|
translations of the code \code{(eq? (read) 0)} and \code{(eq? (read)
|
|
|
1)}, respectively. Regarding \code{block61}, we start with the
|
|
|
\code{(read)} and comparison to \code{0} and then have a conditional
|
|
@@ -4098,11 +4128,14 @@ new kind of context to deal with: the predicate position of the
|
|
|
an $R_2$ expression and two pieces of $C_1$ code (two $\Tail$'s) for
|
|
|
the then-branch and else-branch. The output of \code{explicate-pred}
|
|
|
is a $C_1$ $\Tail$ and a list of formerly \key{let}-bound variables.
|
|
|
-However, these three functions also need to
|
|
|
-construct the control-flow graph, which we recommend they do via
|
|
|
-updates to a global variable (be careful!). Next we consider the
|
|
|
-specific additions to the tail and assign functions, and some of cases
|
|
|
-for the pred function.
|
|
|
+
|
|
|
+Note that the three explicate functions need to construct a
|
|
|
+control-flow graph, which we recommend they do via updates to a global
|
|
|
+variable.
|
|
|
+
|
|
|
+In the following paragraphs we consider the specific additions to the
|
|
|
+\code{explicate-tail} and \code{explicate-assign} functions, and some
|
|
|
+of the cases for the \code{explicate-pred} function.
|
|
|
|
|
|
The \code{explicate-tail} function needs an additional case for
|
|
|
\key{if}. The branches of the \key{if} inherit the current context, so
|
|
@@ -4119,17 +4152,16 @@ $\itm{cnd}$ and the blocks $B_1$ and $B_2$.
|
|
|
Next we consider the case for \key{if} in the \code{explicate-assign}
|
|
|
function. The context of the \key{if} is an assignment to some
|
|
|
variable $x$ and then the control continues to some block $B_1$. The
|
|
|
-code that we generate for the $\itm{thn}$ and $\itm{els}$ branches
|
|
|
-needs to continue to $B_1$, so we add $B_1$ to the control flow graph
|
|
|
-with a fresh label $\ell_1$. Again, the branches of the \key{if}
|
|
|
-inherit the current context, so that are in assignment positions. Let
|
|
|
-$B_2$ be the result of applying \code{explicate-assign} to the
|
|
|
-$\itm{thn}$ branch, variable $x$, and the block \GOTO{$\ell_1$}. Let
|
|
|
-$B_3$ be the result of applying \code{explicate-assign} to the
|
|
|
-$\itm{else}$ branch, variable $x$, and the block \GOTO{$\ell_1$}. The
|
|
|
-\key{if} translates to the block $B_4$ which is the result of applying
|
|
|
-\code{explicate-pred} to the predicate $\itm{cnd}$ and the blocks
|
|
|
-$B_2$ and $B_3$.
|
|
|
+code that we generate for the ``then'' and ``else'' branches needs to
|
|
|
+continue to $B_1$, so we add $B_1$ to the control flow graph with a
|
|
|
+fresh label $\ell_1$. Again, the branches of the \key{if} inherit the
|
|
|
+current context, so they are in assignment positions.  Let $B_2$ be
|
|
|
+the result of applying \code{explicate-assign} to the ``then'' branch,
|
|
|
+variable $x$, and the block \GOTO{$\ell_1$}. Let $B_3$ be the result
|
|
|
+of applying \code{explicate-assign} to the ``else'' branch, variable
|
|
|
+$x$, and the block \GOTO{$\ell_1$}. The \key{if} translates to the
|
|
|
+block $B_4$ which is the result of applying \code{explicate-pred} to
|
|
|
+the predicate $\itm{cnd}$ and the blocks $B_2$ and $B_3$.
|
|
|
\[
|
|
|
(\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els}) \quad\Rightarrow\quad B_4
|
|
|
\]
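
The steps above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the reference implementation: expressions and tails are modeled as plain tuples, the helper \code{add\_block} and the global dictionary \code{cfg} are hypothetical names, the list of \key{let}-bound variables is omitted, and \code{explicate\_pred} is a stand-in that handles only comparisons of atoms.

```python
import itertools

cfg = {}                       # label -> C1 tail (the global control-flow graph)
_labels = itertools.count(1)


def add_block(tail):
    """Store `tail` in the CFG under a fresh label; return a goto to it."""
    label = f"block{next(_labels)}"
    cfg[label] = tail
    return ("goto", label)


def explicate_pred(cnd, thn_block, els_block):
    """Stand-in: handles only comparisons whose operands are atoms."""
    return ("if", cnd, add_block(thn_block), add_block(els_block))


def explicate_assign(exp, x, cont):
    """Translate the assignment `x = exp` followed by the tail `cont`."""
    if isinstance(exp, tuple) and exp[0] == "if":
        _, cnd, thn, els = exp
        goto_b1 = add_block(cont)                 # B1 gets a fresh label l1
        b2 = explicate_assign(thn, x, goto_b1)    # "then" branch, then goto l1
        b3 = explicate_assign(els, x, goto_b1)    # "else" branch, then goto l1
        return explicate_pred(cnd, b2, b3)        # B4
    # Atoms and other simple expressions: assign, then continue with `cont`.
    return ("seq", ("assign", x, exp), cont)
```

Adding $B_1$ to the graph once and jumping to it from both branches avoids duplicating the code of the continuation.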
|
|
@@ -4137,22 +4169,21 @@ $B_2$ and $B_3$.
|
|
|
The function \code{explicate-pred} will need a case for every
|
|
|
expression that can have type \code{Boolean}. We detail a few cases
|
|
|
here and leave the rest for the reader. The input to this function is
|
|
|
-an expression and two blocks, $B_1$ and $B_2$, for the branches of the
|
|
|
-enclosing \key{if}. Suppose the expression is the Boolean \code{\#t}.
|
|
|
-Then we can perform a kind of partial evaluation and translate it to the
|
|
|
-``then'' branch $B_1$. Likewise, we translate
|
|
|
+an expression and two blocks, $B_1$ and $B_2$, for the two branches of
|
|
|
+the enclosing \key{if}. Suppose the expression is the Boolean
|
|
|
+\code{\#t}. Then we can perform a kind of partial evaluation and
|
|
|
+translate it to the ``then'' branch $B_1$. Likewise, we translate
|
|
|
\code{\#f} to the ``else'' branch $B_2$.
|
|
|
\[
|
|
|
\key{\#t} \quad\Rightarrow\quad B_1,
|
|
|
\qquad\qquad\qquad
|
|
|
\key{\#f} \quad\Rightarrow\quad B_2
|
|
|
\]
|
|
|
-Next, suppose the
|
|
|
-expression is a less-than comparison. We translate it to a conditional
|
|
|
-goto. We need labels for the two branches $B_1$ and $B_2$, so we add
|
|
|
-those blocks to the control flow graph and obtain some labels $\ell_1$
|
|
|
-and $\ell_2$. The translation of the less-than comparison is as
|
|
|
-follows.
|
|
|
+Next, suppose the expression is a less-than comparison. We translate
|
|
|
+it to a conditional \code{goto}. We need labels for the two branches
|
|
|
+$B_1$ and $B_2$, so we add those blocks to the control flow graph and
|
|
|
+obtain some labels $\ell_1$ and $\ell_2$. The translation of the
|
|
|
+less-than comparison is as follows.
|
|
|
\[
|
|
|
(\key{<}~e_1~e_2) \quad\Rightarrow\quad
|
|
|
\begin{array}{l}
|
|
@@ -4164,18 +4195,17 @@ follows.
|
|
|
\]
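
The cases discussed so far can be summarized in a short Python sketch. It makes some simplifying assumptions: Booleans are the Python values \code{True} and \code{False}, comparisons are tuples whose operands are already atoms, the returned list of \key{let}-bound variables is omitted, and \code{add\_block} and \code{cfg} are hypothetical names for the helper and the global control-flow graph.

```python
import itertools

cfg = {}                       # label -> C1 tail (the global control-flow graph)
_labels = itertools.count(1)


def add_block(tail):
    """Store `tail` in the CFG under a fresh label; return a goto to it."""
    label = f"block{next(_labels)}"
    cfg[label] = tail
    return ("goto", label)


def explicate_pred(cnd, b1, b2):
    """Translate `cnd` in predicate position with branches b1 and b2."""
    if cnd is True:            # #t: partial evaluation picks the "then" branch
        return b1
    if cnd is False:           # #f: partial evaluation picks the "else" branch
        return b2
    if isinstance(cnd, tuple) and cnd[0] == "<":
        # Label both branches, then emit a conditional goto.
        return ("if", cnd, add_block(b1), add_block(b2))
    raise NotImplementedError(f"case left for the reader: {cnd!r}")
```

Note that the cases for \code{\#t} and \code{\#f} add nothing to the control-flow graph, because the branch that is not taken is simply discarded.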
|
|
|
|
|
|
The case for \key{if} in \code{explicate-pred} is particularly
|
|
|
-illuminating, as it deals with the challenges that we discussed above
|
|
|
+illuminating as it deals with the challenges that we discussed above
|
|
|
regarding the example of the nested \key{if} expressions. Again, we
|
|
|
-add the two input branches $B_1$ and $B_2$ to the control flow graph
|
|
|
-and obtain the labels $\ell_1$ and $\ell_2$. The branches $\itm{thn}$
|
|
|
-and $\itm{els}$ of the current \key{if} inherit their context from the
|
|
|
-current one, i.e., predicate context. So we apply
|
|
|
-\code{explicate-pred} to $\itm{thn}$ with the two blocks
|
|
|
-\GOTO{$\ell_1$} and \GOTO{$\ell_2$}, to obtain $B_3$.
|
|
|
-Proceed in a similar way with the $\itm{els}$ branch, to obtain $B_4$.
|
|
|
-Finally, we apply \code{explicate-pred} to
|
|
|
-the predicate $\itm{cnd}$ and the blocks $B_3$ and $B_4$
|
|
|
-to obtain the result $B_5$.
|
|
|
+add the two branches $B_1$ and $B_2$ to the control flow graph and
|
|
|
+obtain the labels $\ell_1$ and $\ell_2$. The ``then'' and ``else''
|
|
|
+branches of the current \key{if} inherit their context from the
|
|
|
+current one, that is, predicate context. So we apply
|
|
|
+\code{explicate-pred} to the ``then'' branch with the two blocks
|
|
|
+\GOTO{$\ell_1$} and \GOTO{$\ell_2$} to obtain $B_3$. Proceed in a
|
|
|
+similar way with the ``else'' branch to obtain $B_4$. Finally, we
|
|
|
+apply \code{explicate-pred} to the predicate of the \code{if} and the
|
|
|
+blocks $B_3$ and $B_4$ to obtain the result $B_5$.
|
|
|
\[
|
|
|
(\key{if}\; \itm{cnd}\; \itm{thn}\; \itm{els})
|
|
|
\quad\Rightarrow\quad
|
|
@@ -4523,7 +4553,7 @@ the trivial blocks on the right. Let us focus on \code{block61}. The
|
|
|
\code{block55}. The optimized code on the right of
|
|
|
Figure~\ref{fig:optimize-jumps} bypasses \code{block57}, with the
|
|
|
\code{then} branch jumping directly to \code{block55}. The story is
|
|
|
-similar for the \code{else} branch, as well as for the two branchs in
|
|
|
+similar for the \code{else} branch, as well as for the two branches in
|
|
|
\code{block62}. After the jumps in \code{block61} and \code{block62}
|
|
|
have been optimized in this way, there are no longer any jumps to
|
|
|
blocks \code{block57} through \code{block60}, so they can be removed.
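
One possible way to implement this optimization is sketched below in Python. The sketch assumes that the control-flow graph is a dictionary from labels to tails encoded as tuples and that the entry block is labeled \code{start}; the function names are hypothetical. A block is treated as trivial when its tail is a single \code{goto}.

```python
def successors(tail):
    """Labels that a tail may jump to."""
    kind = tail[0]
    if kind == "goto":
        return [tail[1]]
    if kind == "if":
        return successors(tail[2]) + successors(tail[3])
    if kind == "seq":
        return successors(tail[2])
    return []                                  # e.g. ("return", exp)


def optimize_jumps(cfg, start="start"):
    """Bypass blocks that are a single goto, then drop unreachable blocks."""
    def target(label):
        seen = set()
        # Follow chains of trivial blocks; `seen` guards against goto cycles.
        while cfg[label][0] == "goto" and label not in seen:
            seen.add(label)
            label = cfg[label][1]
        return label

    def rewrite(tail):
        kind = tail[0]
        if kind == "goto":
            return ("goto", target(tail[1]))
        if kind == "if":
            _, cnd, thn, els = tail
            return ("if", cnd, rewrite(thn), rewrite(els))
        if kind == "seq":
            return ("seq", tail[1], rewrite(tail[2]))
        return tail

    new_cfg = {label: rewrite(tail) for label, tail in cfg.items()}

    # Keep only the blocks that are still reachable from the entry block.
    reachable, todo = set(), [start]
    while todo:
        label = todo.pop()
        if label not in reachable:
            reachable.add(label)
            todo.extend(successors(new_cfg[label]))
    return {l: t for l, t in new_cfg.items() if l in reachable}
```

Removing the unreachable blocks in a separate pass keeps the rewriting step simple, at the cost of one extra traversal of the graph.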
|