Jeremy Siek пре 6 година
родитељ
комит
3463b6b971
1 измењених фајлова са 127 додато и 80 уклоњено
  1. 127 80
      book.tex

+ 127 - 80
book.tex

@@ -1216,17 +1216,20 @@ $-16\key{(\%rbp)}$, etc.
 
 \begin{figure}[tbp]
 \begin{lstlisting}
+start:
+	movq	$10, -8(%rbp)
+	negq	-8(%rbp)
+	movq	-8(%rbp), %rax
+	addq	$52, %rax
+	jmp conclusion
+
 	.globl main
 main:
 	pushq	%rbp
 	movq	%rsp, %rbp
 	subq	$16, %rsp
-
-	movq	$10, -8(%rbp)
-	negq	-8(%rbp)
-	movq	$52, %rax
-	addq	-8(%rbp), %rax
-
+	jmp start
+conclusion:
 	addq	$16, %rsp
 	popq	%rbp
 	retq
@@ -1263,25 +1266,25 @@ room for storing variables.  This program just needs one variable ($8$
 bytes) but because the frame size is required to be a multiple of 16
 bytes, it rounds to 16 bytes.
 
-The next four instructions carry out the work of computing
-$\BINOP{+}{52}{\UNIOP{-}{10} }$. The first instruction \key{movq \$10,
-  -8(\%rbp)} stores $10$ in variable $1$. The instruction \key{negq
-  -8(\%rbp)} changes variable $1$ to $-10$. The \key{movq \$52, \%rax}
-places $52$ in the register \key{rax} and \key{addq -8(\%rbp), \%rax}
-adds the contents of variable $1$ to \key{rax}, at which point
-\key{rax} contains $42$.
-
-The last three instructions are the typical \emph{conclusion} of a
-procedure.  The first two are necessary to get the state of the
-machine back to where it was at the beginning of the procedure.  The
-\key{addq \$16, \%rsp} instruction moves the stack pointer back to
-point at the old base pointer. The amount added here needs to match
-the amount that was subtracted in the prelude of the procedure. Then
-\key{popq \%rbp} returns the old base pointer to \key{rbp} and adds
-$8$ to the stack pointer.  The final instruction, \key{retq}, jumps
-back to the procedure that called this one and adds 8 to the stack
-pointer, which returns the stack pointer to where it was prior to the
-procedure call.
+The four instructions under the label \code{start} carry out the work
+of computing $\BINOP{+}{52}{\UNIOP{-}{10} }$. The first instruction
+\key{movq \$10, -8(\%rbp)} stores $10$ in variable $1$. The
+instruction \key{negq -8(\%rbp)} changes variable $1$ to $-10$. The
+\key{movq \$52, \%rax} places $52$ in the register \key{rax} and
+\key{addq -8(\%rbp), \%rax} adds the contents of variable $1$ to
+\key{rax}, at which point \key{rax} contains $42$.
+
+The three instructions under the label \code{conclusion} are the
+typical finale of a procedure.  The first two are necessary to get the
+state of the machine back to where it was at the beginning of the
+procedure.  The \key{addq \$16, \%rsp} instruction moves the stack
+pointer back to point at the old base pointer. The amount added here
+needs to match the amount that was subtracted in the prelude of the
+procedure. Then \key{popq \%rbp} returns the old base pointer to
+\key{rbp} and adds $8$ to the stack pointer.  The final instruction,
+\key{retq}, jumps back to the procedure that called this one and adds
+8 to the stack pointer, which returns the stack pointer to where it
+was prior to the procedure call.
 
 The compiler will need a convenient representation for manipulating
 x86 programs, so we define an abstract syntax for x86 in
@@ -1716,6 +1719,7 @@ place of the complex expression, as shown in the output of
 \code{remove-complex-opera*} on the right.\\
 \begin{tabular}{lll}
 \begin{minipage}{0.4\textwidth}
+% s0_19.rkt
 \begin{lstlisting}
  (program ()
    (+ 52 (- 10)))
@@ -1734,12 +1738,14 @@ $\Rightarrow$
 \end{tabular}
 
 We recommend implementing this pass with two mutually recursive
-functions, \key{rco-arg} and \key{rco-exp}. The idea is to apply
-\key{rco-arg} to subexpressions that need to become simple and to
-apply \key{rco-exp} to subexpressions can stay complex.  Both
-functions take an expression in $R_1$ as input and return two things:
-the output expression and associatoin list mapping temporary variables
-to complex subexpressions.  You can return multiple things from a
+functions, \code{rco-arg} and \code{rco-exp}. The idea is to apply
+\code{rco-arg} to subexpressions that need to become simple and to
+apply \code{rco-exp} to subexpressions can stay complex.  
+Both functions take an expression in $R_1$ as input.
+The \code{rco-exp} function returns an expression.
+The \code{rco-arg} function returns two things:
+a simple expression and association list mapping temporary variables
+to complex subexpressions. You can return multiple things from a
 function using Racket's \key{values} form and you can receive multiple
 things from a function call using the \key{define-values} form. If you
 are not familiar with these constructs, the Racket documentation will
@@ -1750,7 +1756,7 @@ function returns multiple values.
 \begin{tabular}{lll}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-(rco-exp `(+ 52 (- 10)))
+(rco-arg `(- 10))
 \end{lstlisting}
 \end{minipage}
 &
@@ -1758,7 +1764,7 @@ $\Rightarrow$
 &
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-  (values `(+ 52 tmp.1)
+  (values `tmp.1
            `((tmp.1 . (- 10))))
 \end{lstlisting}
 \end{minipage}
@@ -1834,16 +1840,30 @@ your passes on the example programs.
 The \code{explicate-control} pass makes the order of execution
 explicit in the syntax of the program. For $R_1$, this amounts to
 flattening \key{let} constructs into a sequence of assignment
-statements. 
-
-UNDER CONSTRUCTION
-
+statements. For example, consider the following $R_1$ program.
+% s0_11.rkt
+\begin{lstlisting}
+(program ()
+  (let ([y (let ([x 20])
+	   (+ x (let ([x 22]) x)))])
+    y))
+\end{lstlisting}
+%
+The output of \code{remove-complex-opera*} is shown below, on the
+left.  The right-hand-side of a \key{let} executes before its body, so
+the order of evaluation for this program is to assign \code{20} to
+\code{x.1}, assign \code{22} to \code{x.2}, assign \code{(+ x.1 x.2)}
+to \code{y}, then return \code{y}. Indeed, the result of
+\code{explicate-control} produces code in the $C_0$ language that
+makes this explicit.\\
 \begin{tabular}{lll}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
 (program ()
-  (let ([tmp.1 (- 10)])
-    (+ 52 tmp.1)))
+  (let ([y (let ([x.1 20]) 
+             (let ([x.2 22])
+               (+ x.1 x.2)))])
+   y))
 \end{lstlisting}
 \end{minipage}
 &
@@ -1852,25 +1872,54 @@ $\Rightarrow$
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
 (program ()
-  ((start . (seq (assign tmp.1 (- 10))
-                 (return (+ 52 tmp1))))))
+  ((start . 
+   (seq (assign x.1 20)
+   (seq (assign x.2 22)
+   (seq (assign y (+ x.1 x.2))
+   (return y)))))))
 \end{lstlisting}
 \end{minipage}
 \end{tabular}
 
+We recommend implementing \code{explicate-control} using two mutually
+recursive functions: \code{explicate-control-tail} and
+\code{explicate-control-assign}.  The \code{explicate-control-tail}
+function should be applied to expressions in tail position, whereas
+\code{explicate-control-assign} should be applied to expressions that
+occur on the right-hand-side of a \code{let}.  The function
+\code{explicate-control-tail} takes an $R_1$ expression as input and
+produces a $C_0$ $\Tail$ (see the grammar in
+Figure~\ref{fig:c0-syntax}).  The \code{explicate-control-assign}
+function takes an $R_1$ expression, the variable that it is to be
+assigned to, and $C_0$ code (a $\Tail$) that should come after the
+assignment (e.g., the code generated for the body of the \key{let}).
+
+\section{Uncover Locals}
+\label{sec:uncover-locals-r1}
+
+The pass \code{uncover-locals} simply collects all of the variables in
+the program and places then in the $\itm{info}$ of the program
+construct. Here is the output for the example program of the last
+section.
 
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}
+(program ((locals . (x.1 x.2 y)))
+  ((start . 
+   (seq (assign x.1 20)
+   (seq (assign x.2 22)
+   (seq (assign y (+ x.1 x.2))
+   (return y)))))))
+\end{lstlisting}
+\end{minipage}
 
 \section{Select Instructions}
-\label{sec:select-s0}
+\label{sec:select-r1}
 
 In the \key{select-instructions} pass we begin the work of translating
 from $C_0$ to x86. The target language of this pass is a pseudo-x86
 language that still uses variables, so we add an AST node of the form
-$\VAR{\itm{var}}$ to the x86 abstract syntax. Also, the \key{program}
-form should still list the variables (similar to $C_0$):
-\[
-  (\key{program}\;(\Var^{*})\;\Instr^{+})
-\]
+$\VAR{\itm{var}}$ to the x86 abstract syntax. 
 The \key{select-instructions} pass deals with the differing format of
 arithmetic operations. For example, in $C_0$ an addition operation can
 take the form below.  To translate to x86, we need to use the
@@ -1945,10 +1994,9 @@ $\Rightarrow$
 \end{minipage}
 \end{tabular} \\
 
-Regarding the \RETURN{\Arg} statement of $C_0$, we recommend treating it
-as an assignment to the \key{rax} register and let the procedure
-conclusion handle the transfer of control back to the calling
-procedure.
+Regarding the \RETURN{\Arg} statement of $C_0$, we recommend treating
+it as an assignment to the \key{rax} register followed by a jump to
+the conclusion of the program (so the conclusion needs to be labeled).
 
 \begin{exercise}
 \normalfont
@@ -1961,7 +2009,7 @@ your passes on the example programs.
 \end{exercise}
 
 \section{Assign Homes}
-\label{sec:assign-s0}
+\label{sec:assign-r1}
 
 As discussed in Section~\ref{sec:plan-s0-x86}, the
 \key{assign-homes} pass places all of the variables on the stack.
@@ -1987,12 +2035,9 @@ the \code{assign-homes} pass translates the above to
 
 In the process of assigning stack locations to variables, it is
 convenient to compute and store the size of the frame (in bytes) in
-the first field of the \key{program} node which will be needed later
-to generate the procedure conclusion.
-\[
-  (\key{program}\;\Int\;\Instr^{+})
-\]
-Some operating systems place restrictions on
+the $\itm{info}$ field of the \key{program} node, with the key
+\code{stack-space}, which will be needed later to generate the
+procedure conclusion.  Some operating systems place restrictions on
 the frame size. For example, Mac OS X requires the frame size to be a
 multiple of 16 bytes.
 
@@ -2024,6 +2069,7 @@ After \key{assign-homes} pass, the above has been translated to
    (movq (int 42) (deref rbp -8))
    (movq (deref rbp -8) (deref rbp -16))
    (movq (deref rbp -16) (reg rax))
+   (jmp conclusion)
 \end{lstlisting}
 The second \key{movq} instruction is problematic because both
 arguments are stack locations. We suggest fixing this problem by
@@ -2059,33 +2105,34 @@ regard. The main work that this step needs to perform is to create the
 conclusion, as shown in Figure~\ref{fig:p1-x86} of
 Section~\ref{sec:x86}. You need to know the number of stack-allocated
 variables, so we suggest computing it in the \key{assign-homes} pass
-(Section~\ref{sec:assign-s0}) and storing it in the $\itm{info}$ field
+(Section~\ref{sec:assign-r1}) and storing it in the $\itm{info}$ field
 of the \key{program} node.
 
-Your compiled code should print the result of the program's execution
-by using the \code{print\_int} function provided in
-\code{runtime.c}. If your compiler has been implemented correctly so
-far, this final result should be stored in the \key{rax} register.
-We'll talk more about how to perform function calls with arguments in
-general later on, but for now, place the following after the compiled
-code for the $R_1$ program but before the conclusion:
+%% Your compiled code should print the result of the program's execution
+%% by using the \code{print\_int} function provided in
+%% \code{runtime.c}. If your compiler has been implemented correctly so
+%% far, this final result should be stored in the \key{rax} register.
+%% We'll talk more about how to perform function calls with arguments in
+%% general later on, but for now, place the following after the compiled
+%% code for the $R_1$ program but before the conclusion:
 
-\begin{lstlisting}
-    movq %rax, %rdi
-    callq print_int
-\end{lstlisting}
+%% \begin{lstlisting}
+%%     movq %rax, %rdi
+%%     callq print_int
+%% \end{lstlisting}
 
-These lines move the value in \key{rax} into the \key{rdi} register, which
-stores the first argument to be passed into \key{print\_int}.
+%% These lines move the value in \key{rax} into the \key{rdi} register, which
+%% stores the first argument to be passed into \key{print\_int}.
 
 If you want your program to run on Mac OS X, your code needs to
 determine whether or not it is running on a Mac, and prefix
 underscores to labels like \key{main}.  You can determine the platform
 with the Racket call \code{(system-type 'os)}, which returns
-\code{'macosx}, \code{'unix}, or \code{'windows}.  In addition to
-placing underscores on \key{main}, you need to put them in front of
-\key{callq} labels (so \code{callq print\_int} becomes \code{callq
-  \_print\_int}).
+\code{'macosx}, \code{'unix}, or \code{'windows}.  
+%% In addition to
+%% placing underscores on \key{main}, you need to put them in front of
+%% \key{callq} labels (so \code{callq print\_int} becomes \code{callq
+%%   \_print\_int}).
 
 \begin{exercise}
 \normalfont Implement the \key{print-x86} pass and test it on all of
@@ -2868,7 +2915,7 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
   after the \code{build-interference} pass. The three new passes,
   \code{uncover-live}, \code{build-interference}, and
   \code{allocate-registers} replace the \code{assign-homes} pass of
-  Section~\ref{sec:assign-s0}.  Just like \code{assign-homes}, the
+  Section~\ref{sec:assign-r1}.  Just like \code{assign-homes}, the
   output of \code{allocate-registers} should be in the form
   \[
   (\key{program}\;\Int\;\Instr^{+})
@@ -2885,7 +2932,7 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
   Once you have obtained the coloring from \code{color-graph}, you can
   assign the variables to registers or stack locations and then reuse
   code from the \code{assign-homes} pass from
-  Section~\ref{sec:assign-s0} to replace the variables with their
+  Section~\ref{sec:assign-r1} to replace the variables with their
   assigned location.
   
   Test your updated compiler by creating new example programs that
@@ -3796,7 +3843,7 @@ discard the live after sets, as they are no longer needed.
 \subsection{Assign Homes}
 \label{sec:assign-homes-r2}
 
-The \code{assign-homes} function (Section~\ref{sec:assign-s0}) needs
+The \code{assign-homes} function (Section~\ref{sec:assign-r1}) needs
 to be updated to handle the \key{if} statement, simply by recursively
 processing the child nodes.  Hopefully your code already handles the
 other new instructions, but if not, you can generalize your code.
@@ -5361,7 +5408,7 @@ address of the \code{add1} label into the \code{rbx} register.
    leaq add1(%rip), %rbx
 \end{lstlisting}
 
-In Sections~\ref{sec:x86} and \ref{sec:select-s0} we saw the use of
+In Sections~\ref{sec:x86} and \ref{sec:select-r1} we saw the use of
 the \code{callq} instruction for jumping to a function as specified by
 a label. The use of the instruction changes slightly if the function
 is specified by an address in a register, that is, an \emph{indirect