Jeremy Siek, 9 years ago
parent
commit
18d91fce83
1 changed file with 187 additions and 104 deletions
+ 187 - 104
book.tex

@@ -374,7 +374,7 @@ called an ``alternative''.
 
 \begin{figure}[tbp]
 \fbox{
-\begin{minipage}{\textwidth}
+\begin{minipage}{0.96\textwidth}
 \[
 R_0 ::= \Int \mid ({\tt \key{read}}) \mid (\key{-} \; R_0) \mid
    (\key{+} \; R_0 \; R_0) 
@@ -772,25 +772,29 @@ some fun and creativity.
 The $R_1$ language extends the $R_0$ language
 (Figure~\ref{fig:r0-syntax}) with variable definitions.  The syntax of
 the $R_1$ language is defined by the grammar in
-Figure~\ref{fig:r1-syntax}. In addition to variable definitions, the
-$R_1$ language includes the \key{program} form to mark the top of the
-program, which is helpful in some of the compiler passes.  The $R_1$
-language is rich enough to exhibit several compilation techniques but
-simple enough so that the reader can implement a compiler for it in a
-couple weeks of part-time work.  To give the reader a feeling for the
-scale of this first compiler, the instructor solution for the $R_1$
-compiler consists of 6 recursive functions and a few small helper
-functions that together span 256 lines of code.
+Figure~\ref{fig:r1-syntax}. As in $R_0$, \key{read} is a nullary
+operator, \key{-} is a unary operator, and \key{+} is a binary
+operator. In addition to variable definitions, the $R_1$ language
+includes the \key{program} form to mark the top of the program, which
+is helpful in some of the compiler passes.  The $R_1$ language is rich
+enough to exhibit several compilation techniques but simple enough so
+that the reader can implement a compiler for it in a week of part-time
+work.  To give the reader a feeling for the scale of this first
+compiler, the instructor solution for the $R_1$ compiler consists of 6
+recursive functions and a few small helper functions that together
+span 256 lines of code.
 
 \begin{figure}[btp]
 \centering
 \fbox{
-\begin{minipage}{\textwidth}
-\begin{align*}
-\Exp &::= \Int \mid ({\tt \key{read}}) \mid (\key{-} \; \Exp) \mid
-   (\key{+} \; \Exp \; \Exp)  \mid  \Var \mid \LET{\Var}{\Exp}{\Exp} \\
-R_1 &::= (\key{program} \; () \; \Exp)
-\end{align*}
+\begin{minipage}{0.96\textwidth}
+\[
+\begin{array}{rcl}
+\Op  &::=& \key{read} \mid \key{-} \mid \key{+} \\
+\Exp &::=& \Int \mid (\Op \; \Exp^{*})  \mid  \Var \mid \LET{\Var}{\Exp}{\Exp} \\
+R_1  &::=& (\key{program} \; \Exp)
+\end{array}
+\]
 \end{minipage}
 }
 \caption{The syntax of the $R_1$ language. 
@@ -803,7 +807,7 @@ and initializes the variable with the value of an expression.  So the
 following program initializes \code{x} to \code{32} and then evaluates
 the body \code{(+ 10 x)}, producing \code{42}.
 \begin{lstlisting}
-   (program ()
+   (program
       (let ([x (+ 12 20)]) (+ 10 x)))
 \end{lstlisting}
 When there are multiple \key{let}'s for the same variable, the closest
@@ -811,7 +815,7 @@ enclosing \key{let} is used. That is, variable definitions overshadow
 prior definitions. Consider the following program with two \key{let}'s
 that define variables named \code{x}. Can you figure out the result?
 \begin{lstlisting}
-   (program ()
+   (program
       (let ([x 32]) (+ (let ([x 10]) x) x)))
 \end{lstlisting}
 For the purposes of showing which variable uses correspond to which
@@ -819,7 +823,7 @@ definitions, the following shows the \code{x}'s annotated with subscripts
 to distinguish them. Double check that your answer for the above is
 the same as your answer for this annotated version of the program.
 \begin{lstlisting}
-   (program ()
+   (program
       (let ([x|$_1$| 32]) (+ (let ([x|$_2$| 10]) x|$_2$|) x|$_1$|)))
 \end{lstlisting}
 The initializing expression is always evaluated before the body of the
@@ -828,7 +832,7 @@ performed before the \key{read} for \code{y}. Given the input
 \code{52} then \code{10}, the following produces \code{42} (and not
 \code{-42}).
 \begin{lstlisting}
-   (program ()
+   (program
      (let ([x (read)]) (let ([y (read)]) (- x y))))
 \end{lstlisting}
 
@@ -865,7 +869,7 @@ to the variable, then evaluates the body of the \key{let}.
         (fx- 0 (interp-R1 env e))]
        [`(+ ,e1 ,e2)
         (fx+ (interp-R1 env e1) (interp-R1 env e2))]
-       [`(program () ,e) (interp-R1 '() e)]
+       [`(program ,e) (interp-R1 '() e)]
        ))
 \end{lstlisting}
 \caption{Interpreter for the $R_1$ language.}
@@ -1287,7 +1291,7 @@ translate the program on the left into the program on the right. \\
 \begin{tabular}{lll}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
- (program ()
+ (program
    (let ([x 32])
      (+ (let ([x 10]) x) x)))
 \end{lstlisting}
@@ -1297,7 +1301,7 @@ $\Rightarrow$
 &
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-(program ()
+(program
   (let ([x.1 32])
     (+ (let ([x.2 10]) x.2) x.1)))
 \end{lstlisting}
@@ -1310,7 +1314,7 @@ with a \key{let} nested inside the initializing expression of another
 \begin{tabular}{lll}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-(program ()
+(program
   (let ([x (let ([x 4])
              (+ x 1))])
     (+ x 2)))
@@ -1321,7 +1325,7 @@ $\Rightarrow$
 &
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-(program ()
+(program
   (let ([x.2 (let ([x.1 4])
                (+ x.1 1))])
     (+ x.2 2)))
@@ -1366,8 +1370,8 @@ implement the clauses for variables and for the \key{let} construct.
            [(? symbol?) ___]
            [(? integer?) e]
            [`(let ([,x ,e]) ,body) ___]
-           [`(program ,info ,e)
-            `(program ,info ,((uniquify alist) e))]
+           [`(program ,e)
+            `(program ,((uniquify alist) e))]
            [`(,op ,es ...)
             `(,op ,@(map (uniquify alist) es))]
            ))))
@@ -1401,33 +1405,68 @@ your \key{uniquify} pass on the example programs.
 \section{Flatten Expressions}
 \label{sec:flatten-s0}
 
-The \key{flatten} pass will transform $R_1$ programs into $C_0$
-programs. In particular, the purpose of the \key{flatten} pass is to
-get rid of nested expressions, such as the $\UNIOP{-}{10}$ in the
-following program.
+The \code{flatten} pass will transform $R_1$ programs into $C_0$
+programs. In particular, the purpose of the \code{flatten} pass is to
+get rid of nested expressions, such as the \code{(- 10)} in the
+program below. This can be accomplished by introducing a new variable,
+assigning the nested expression to the new variable, and then using
+the new variable in place of the nested expression, as shown in the
+output of \code{flatten} on the right.\\
+\begin{tabular}{lll}
+\begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-   (program ()
-     (+ 52 (- 10)))
+ (program
+   (+ 52 (- 10)))
 \end{lstlisting}
-This can be accomplished by introducing a new variable, assigning the
-nested expression to the new variable, and then using the new variable
-in place of the nested expressions. For example, the above program is
-translated to the following one.
+\end{minipage}
+&
+$\Rightarrow$
+&
+\begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-   (program (tmp.1 tmp.2)
-     (assign tmp.1 (- 10))
-     (assign tmp.2 (+ 52 tmp.1))
-     (return tmp.2))
+(program (tmp.1 tmp.2)
+  (assign tmp.1 (- 10))
+  (assign tmp.2 (+ 52 tmp.1))
+  (return tmp.2))
 \end{lstlisting}
+\end{minipage}
+\end{tabular}
+
+The clause of \code{flatten} for \key{let} is straightforward to
+implement as it just requires the generation of an assignment
+statement for the \key{let}-bound variable. The following shows the
+result of \code{flatten} for a \key{let}. \\
+\begin{tabular}{lll}
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}
+ (program
+   (let ([x (+ (- 10) 11)])
+     (+ x 41)))
+\end{lstlisting}
+\end{minipage}
+&
+$\Rightarrow$
+&
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}
+(program (tmp.1 x tmp.2)
+  (assign tmp.1 (- 10))
+  (assign x (+ tmp.1 11))
+  (assign tmp.2 (+ x 41))
+  (return tmp.2))
+\end{lstlisting}
+\end{minipage}
+\end{tabular}
 
 We recommend implementing \key{flatten} as a structurally recursive
 function that returns two things, 1) the newly flattened expression,
 and 2) a list of assignment statements, one for each of the new
-variables introduced while flattening the expression. You can return
-multiple things from a function using the \key{values} form and you
-can receive multiple things from a function call using the
-\key{define-values} form. If you are not familiar with these
-constructs, the Racket documentation will be of help.
+variables introduced while flattening the expression.  The newly
+flattened expression should be a leaf node. You can return multiple
+things from a function using the \key{values} form and you can receive
+multiple things from a function call using the \key{define-values}
+form. If you are not familiar with these constructs, the Racket
+documentation will be of help.
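
The recommendation above can be sketched as follows. This is a minimal sketch and not the instructor solution: the names \code{fresh-tmp}, \code{flatten-rhs}, \code{flatten-atom}, and \code{flatten-program} are hypothetical, and the split into two mutually recursive helpers is just one way to make a \key{let}-bound variable receive its operation directly, as in the examples above.

```racket
#lang racket

;; Hypothetical helper (an assumption, not from the book's utilities):
;; generates tmp.1, tmp.2, ... in order.
(define counter 0)
(define (fresh-tmp)
  (set! counter (add1 counter))
  (string->symbol (format "tmp.~a" counter)))

;; Returns two values: an expression that is either a leaf or an
;; operator applied to leaves, and the assignments that must run first.
(define (flatten-rhs e)
  (match e
    [(or (? symbol?) (? integer?)) (values e '())]
    [`(let ([,x ,rhs]) ,body)
     (define-values (rhs-simple rhs-stmts) (flatten-rhs rhs))
     (define-values (body-simple body-stmts) (flatten-rhs body))
     (values body-simple
             (append rhs-stmts `((assign ,x ,rhs-simple)) body-stmts))]
    [`(,op ,es ...)
     ;; Flatten each argument to a leaf, collecting the statements.
     (define-values (atoms stmt-lists)
       (for/lists (as ss) ([arg es]) (flatten-atom arg)))
     (values `(,op ,@atoms) (append* stmt-lists))]))

;; Returns two values: a leaf (variable or integer) and the assignments
;; that compute it, introducing a temporary when the result is not a leaf.
(define (flatten-atom e)
  (define-values (simple stmts) (flatten-rhs e))
  (cond
    [(or (symbol? simple) (integer? simple)) (values simple stmts)]
    [else
     (define tmp (fresh-tmp))
     (values tmp (append stmts `((assign ,tmp ,simple))))]))

;; The program clause: flatten the body and list the assigned variables.
(define (flatten-program p)
  (match p
    [`(program ,e)
     (set! counter 0) ; restart numbering for each program
     (define-values (atom stmts) (flatten-atom e))
     (define vars (remove-duplicates (map second stmts)))
     `(program ,vars ,@stmts (return ,atom))]))
```

Calling \code{(flatten-program '(program (+ 52 (- 10))))} on the first example yields the \code{tmp.1}, \code{tmp.2} program shown above. The sketch assumes the input has already been through \code{uniquify}.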
 
 The clause of \key{flatten} for the \key{program} node needs to
 recursively flatten the body of the program and also compute the list
@@ -1492,61 +1531,90 @@ of the form $\VAR{\itm{var}}$ to the x86 abstract syntax.  The
 \key{select-instructions} pass deals with the differing format of
 arithmetic operations. For example, in $C_0$ an addition operation
 could take the following form:
-\[
-\ASSIGN{x}{ \BINOP{+}{10}{32} }
-\]
+\begin{lstlisting}
+   (assign x (+ 10 32))
+\end{lstlisting}
 To translate to x86, we need to express this addition using the
 \key{addq} instruction that does an in-place update. So we first move
-$10$ to $x$ then perform the \key{addq}.
-\[
-(\key{mov}\,\INT{10}\, \VAR{x})\; (\key{addq} \;\INT{32}\; \VAR{x})
-\]
+\code{10} to \code{x} then perform the \key{addq}.
+\begin{lstlisting}
+  (movq (int 10) (var x))
+  (addq (int 32) (var x))
+\end{lstlisting}
 
 There are some cases that require special care to avoid generating
 needlessly complicated code. If one of the arguments is the same as
 the left-hand side of the assignment, then there is no need for the
-extra move instruction.  For example, the following
-\[
-\ASSIGN{x}{ \BINOP{+}{10}{x} }
-\quad\text{should translate to}\quad
-(\key{addq} \; \INT{10}\; \VAR{x})
-\]
+extra move instruction.  For example, the following assignment
+statement can be translated into a single \key{addq} instruction.\\
+\begin{tabular}{lll}
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}
+ (assign x (+ 10 x))
+\end{lstlisting}
+\end{minipage}
+&
+$\Rightarrow$
+&
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}
+(addq (int 10) (var x))
+\end{lstlisting}
+\end{minipage}
+\end{tabular} \\
 
 Regarding the \RETURN{e} statement of $C_0$, we recommend treating it
 as an assignment to the \key{rax} register and letting the procedure
 conclusion handle the transfer of control back to the calling
 procedure.
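
The cases discussed above can be sketched for a single $C_0$ statement as follows. The function names \code{select-arg} and \code{select-stmt} are hypothetical, and the sketch deliberately leaves out \key{read}, whose translation is not covered in this section.

```racket
#lang racket

;; Translate a C0 leaf (integer or variable) to an x86 operand.
(define (select-arg a)
  (match a
    [(? integer?) `(int ,a)]
    [(? symbol?) `(var ,a)]))

;; Translate one C0 statement to a list of x86 instructions.
(define (select-stmt s)
  (match s
    [`(assign ,x (+ ,a ,b))
     (cond
       ;; One argument is already the left-hand side: no move needed.
       [(equal? a x) `((addq ,(select-arg b) (var ,x)))]
       [(equal? b x) `((addq ,(select-arg a) (var ,x)))]
       [else `((movq ,(select-arg a) (var ,x))
               (addq ,(select-arg b) (var ,x)))])]
    [`(assign ,x (- ,a))
     (if (equal? a x)
         `((negq (var ,x)))
         `((movq ,(select-arg a) (var ,x))
           (negq (var ,x))))]
    ;; Plain copy of a leaf; (read) is not handled in this sketch.
    [`(assign ,x ,a) `((movq ,(select-arg a) (var ,x)))]
    ;; return is treated as an assignment to rax.
    [`(return ,a) `((movq ,(select-arg a) (reg rax)))]))
```

For instance, \code{(select-stmt '(assign x (+ 10 x)))} produces the single \key{addq} instruction from the example above, while \code{(select-stmt '(assign x (+ 10 32)))} produces the \key{movq} followed by \key{addq}.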
 
+\begin{exercise}
+\normalfont
+Implement the \key{select-instructions} pass and test it on all of the
+example programs that you created for the previous passes and create
+three new example programs that are designed to exercise all of the
+interesting code in this pass. Use the \key{interp-tests} function
+(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test
+your passes on the example programs.
+\end{exercise}
+
 \section{Assign Homes}
 \label{sec:assign-s0}
 
 As discussed in Section~\ref{sec:plan-s0-x86}, the
 \key{assign-homes} pass places all of the variables on the stack.
-Consider again the example $R_1$ program $\BINOP{+}{52}{ \UNIOP{-}{10} }$,
+Consider again the example $R_1$ program \code{(+ 52 (- 10))},
 which after \key{select-instructions} looks like the following.
-\[
-\begin{array}{l}
-(\key{movq}\;\INT{10}\; \VAR{x})\\
-(\key{negq}\; \VAR{x})\\
-(\key{movq}\; \INT{52}\; \REG{\itm{rax}})\\
-(\key{addq}\; \VAR{x} \REG{\itm{rax}})
-\end{array}
-\]
-The one and only variable $x$ is assigned to stack location
-\key{-8(\%rbp)}, so the \key{assign-homes} pass translates the
+\begin{lstlisting}
+   (movq (int 10) (var x))
+   (negq (var x))
+   (movq (int 52) (reg rax))
+   (addq (var x) (reg rax))
+\end{lstlisting}
+The one and only variable \code{x} is assigned to stack location
+\code{-8(\%rbp)}, so the \code{assign-homes} pass translates the
 above to
-\[
-\begin{array}{l}
-(\key{movq}\;\INT{10}\; \STACKLOC{{-}8})\\
-(\key{negq}\; \STACKLOC{{-}8})\\
-(\key{movq}\; \INT{52}\; \REG{\itm{rax}})\\
-(\key{addq}\; \STACKLOC{{-}8}\; \REG{\itm{rax}})
-\end{array}
-\]
+\begin{lstlisting}
+   (movq (int 10) (stack -8))
+   (negq (stack -8))
+   (movq (int 52) (reg rax))
+   (addq (stack -8) (reg rax))
+\end{lstlisting}
 
 In the process of assigning stack locations to variables, it is
-convenient to compute and store the size of the frame which will be
-needed later to generate the procedure conclusion.
+convenient to compute the size of the frame and store it in the
+$\itm{info}$ field of the \key{program} node, because it will be needed
+later to generate the procedure conclusion. Some operating systems
+place restrictions on the frame size. For example, Mac OS X requires
+the frame size to be a multiple of 16 bytes.
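
A minimal sketch of this bookkeeping follows, assuming each variable occupies 8 bytes and that homes are handed out in list order starting at \code{-8}. The names \code{make-homes}, \code{frame-size}, \code{assign-arg}, and \code{assign-instr} are hypothetical.

```racket
#lang racket

;; Map each variable to a stack location: -8, -16, -24, ...
(define (make-homes vars)
  (for/hash ([v vars] [i (in-naturals 1)])
    (values v `(stack ,(* -8 i)))))

;; Frame size in bytes, rounded up to a multiple of 16.
(define (frame-size vars)
  (define bytes (* 8 (length vars)))
  (* 16 (quotient (+ bytes 15) 16)))

;; Replace a variable operand by its home; leave other operands alone.
(define (assign-arg homes a)
  (match a
    [`(var ,x) (hash-ref homes x)]
    [_ a]))

;; Rewrite every operand of one instruction.
(define (assign-instr homes instr)
  (match instr
    [`(,op ,args ...)
     `(,op ,@(map (lambda (a) (assign-arg homes a)) args))]))
```

With \code{(make-homes '(x))}, mapping \code{assign-instr} over the four instructions above gives the \code{(stack -8)} version, and \code{(frame-size '(x))} is 16 rather than 8 because of the rounding.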
+
+\begin{exercise}
+\normalfont
+Implement the \key{assign-homes} pass and test it on all of the
+example programs that you created for the previous passes. Use
+the \key{interp-tests} function (Appendix~\ref{appendix:utilities})
+from \key{utilities.rkt} to test your passes on the example programs.
+\end{exercise}
 
 \section{Patch Instructions}
 \label{sec:patch-s0}
@@ -1557,32 +1625,38 @@ references. For most instructions, the rule is that at most one
 argument may be a memory reference.
 
 Consider again the following example.
-\[
-\LET{a}{42}{ \LET{b}{a}{ b }}
-\]
+\begin{lstlisting}
+   (let ([a 42])
+     (let ([b a])
+       b))
+\end{lstlisting}
 After the \key{assign-homes} pass, the above has been translated to
-\[
-\begin{array}{l}
-(\key{movq} \;\INT{42}\; \STACKLOC{{-}8})\\
-(\key{movq}\;\STACKLOC{{-}8}\; \STACKLOC{{-}16})\\
-(\key{movq}\;\STACKLOC{{-}16}\; \REG{\itm{rax}})
-\end{array}
-\]
+\begin{lstlisting}
+   (movq (int 42) (stack -8))
+   (movq (stack -8) (stack -16))
+   (movq (stack -16) (reg rax))
+\end{lstlisting}
 The second \key{movq} instruction is problematic because both arguments
 are stack locations. We suggest fixing this problem by moving from the
 source to \key{rax} and then from \key{rax} to the destination, as
 follows.
-\[
-\begin{array}{l}
-(\key{movq} \;\INT{42}\; \STACKLOC{{-}8})\\
-(\key{movq}\;\STACKLOC{{-}8}\; \REG{\itm{rax}})\\
-(\key{movq}\;\REG{\itm{rax}}\; \STACKLOC{{-}16})\\
-(\key{movq}\;\STACKLOC{{-}16}\; \REG{\itm{rax}})
-\end{array}
-\]
+\begin{lstlisting}
+   (movq (int 42) (stack -8))
+   (movq (stack -8) (reg rax))
+   (movq (reg rax) (stack -16))
+   (movq (stack -16) (reg rax))
+\end{lstlisting}
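
The suggested fix can be sketched as follows. The names \code{stack-loc?} and \code{patch-instr} are hypothetical, and the sketch assumes every two-operand instruction with two stack operands can be repaired the same way, by routing the source through \key{rax}.

```racket
#lang racket

;; Is this operand a stack location such as (stack -8)?
(define (stack-loc? a)
  (match a
    [`(stack ,_) #t]
    [_ #f]))

;; Returns a list of instructions: either the original instruction,
;; or a move of the source into rax followed by the operation on rax.
(define (patch-instr instr)
  (match instr
    [`(,op ,(? stack-loc? src) ,(? stack-loc? dst))
     `((movq ,src (reg rax))
       (,op (reg rax) ,dst))]
    [_ (list instr)]))

;; Patch a whole instruction sequence.
(define (patch-instructions instrs)
  (append-map patch-instr instrs))
```

Applied to the three instructions produced by \key{assign-homes} above, \code{patch-instructions} returns the four-instruction sequence shown in the last listing.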
+
+\begin{exercise}
+\normalfont
+Implement the \key{patch-instructions} pass and test it on all of the
+example programs that you created for the previous passes and create
+three new example programs that are designed to exercise all of the
+interesting code in this pass. Use the \key{interp-tests} function
+(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test
+your passes on the example programs.
+\end{exercise}
 
-%% The \key{imulq} instruction is a special case because the destination
-%% argument must be a register.
 
 \section{Print x86-64}
 \label{sec:print-x86}
@@ -1593,10 +1667,19 @@ representation (defined in Figure~\ref{fig:x86-a}). The Racket
 \key{format} and \key{string-append} functions are useful in this
 regard. The main work that this step needs to perform is to create the
 \key{\_main} function and the standard instructions for its prelude
-and conclusion, as described in Section~\ref{sec:x86-64}. You need to
-know the number of stack-allocated variables, which is convenient to
-compute in the \key{assign-homes} pass (Section~\ref{sec:assign-s0})
-and then store in the $\itm{info}$ field of the \key{program}.
+and conclusion, as shown in Figure~\ref{fig:p1-x86} of
+Section~\ref{sec:x86-64}. You need to know the number of
+stack-allocated variables, which we suggest that you compute in
+the \key{assign-homes} pass (Section~\ref{sec:assign-s0}) and store in
+the $\itm{info}$ field of the \key{program} node.
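
As a rough illustration of how \key{format} and \key{string-append}-style helpers fit this step, the following sketch prints a single instruction in AT\&T syntax. The names \code{print-arg} and \code{print-instr} are hypothetical, the tab-based layout is an arbitrary choice, and the prelude and conclusion of \key{\_main} are omitted.

```racket
#lang racket

;; Render one operand of the x86 abstract syntax as text.
;; Only the operand forms left after assign-homes are handled.
(define (print-arg a)
  (match a
    [`(int ,n) (format "$~a" n)]
    [`(reg ,r) (format "%~a" r)]
    [`(stack ,n) (format "~a(%rbp)" n)]))

;; Render one instruction as a line of assembly text.
(define (print-instr instr)
  (match instr
    [`(,op ,args ...)
     (format "\t~a\t~a\n" op (string-join (map print-arg args) ", "))]))
```

For example, \code{(print-instr '(movq (int 10) (stack -8)))} produces a line containing \code{movq}, \code{\$10}, and \code{-8(\%rbp)}.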
+
+\begin{exercise}
+\normalfont Implement the \key{print-x86} pass and test it on all of
+the example programs that you created for the previous passes. Use the
+\key{compiler-tests} function (Appendix~\ref{appendix:utilities}) from
+\key{utilities.rkt} to test your complete compiler on the example
+programs.
+\end{exercise}
 
 %% \section{Testing with Interpreters}