minor edits

Jeremy Siek 9 years ago
parent
commit
b47629fe9f
2 changed files with 63 additions and 54 deletions
  1. book.tex (62 additions, 53 deletions)
  2. defs.tex (1 addition, 1 deletion)

+ 62 - 53
book.tex

@@ -15,6 +15,7 @@
 \usepackage{semantic}
 \usepackage{wrapfig}
 \usepackage{tikz}
+\usetikzlibrary{arrows}
 
 % Computer Modern is already the default. -Jeremy
 %\renewcommand{\ttdefault}{cmtt}
@@ -78,7 +79,8 @@ columns=fullflexible
   Indiana University \\
   \\
   with contributions from: \\
-  Carl Factora
+  Carl Factora \\
+  Cameron Swords
    }
 
 \begin{document}
@@ -120,13 +122,14 @@ Talk about pre-requisites.
 
 Need to give thanks to 
 \begin{itemize}
+\item Bor-Yuh Evan Chang
 \item Kent Dybvig
 \item Daniel P. Friedman
+\item Ronald Garcia
 \item Abdulaziz Ghuloum
-\item Oscar Waddell
+\item Ryan Newton
 \item Dipanwita Sarkar
-\item Ronald Garcia
-\item Bor-Yuh Evan Chang
+\item Oscar Waddell
 \end{itemize}
 
 %\mbox{}\\
@@ -140,9 +143,9 @@ Need to give thanks to
 In this chapter, we review the basic tools that are needed for
 implementing a compiler. We use abstract syntax trees (ASTs) in the
 form of S-expressions to represent programs (Section~\ref{sec:ast})
-and pattern matching to inspect an AST node
+and pattern matching to inspect individual nodes in an AST
 (Section~\ref{sec:pattern-matching}).  We use recursion to construct
-and deconstruct entire ASTs (Section~\ref{sec:recursion}).
+and deconstruct ASTs (Section~\ref{sec:recursion}).
 
 \section{Abstract Syntax Trees}
 \label{sec:ast}
@@ -152,6 +155,7 @@ programs is the \emph{abstract syntax tree} (AST). When considering
 some part of a program, a compiler needs to ask what kind of part it
 is and what sub-parts it has. For example, the program on the left is
 represented by the AST on the right.
+\marginpar{\scriptsize The arrow heads need to be bigger. -JGS}
 \begin{center}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
@@ -174,7 +178,7 @@ represented by the AST on the right.
 \end{equation}
 \end{minipage}
 \end{center}
-We shall use the standard terminology for trees: each square above is
+We shall use the standard terminology for trees: each circle above is
 called a \emph{node}. The arrows connect a node to its \emph{children}
 (which are also nodes). The top-most node is the \emph{root}.  Every
 node except for the root has a \emph{parent} (the node it is the child
@@ -182,14 +186,14 @@ of). If a node has no children, it is a \emph{leaf} node.  Otherwise
 it is an \emph{internal} node.
 
 When deciding how to compile the above program, we need to know that
-the root node an addition and that it has two children: \texttt{read}
-and the negation of \texttt{8}. The abstract syntax tree data
-structure directly supports these queries and hence is a good
+the root node operation is addition and that it has two children:
+\texttt{read} and the negation of \texttt{8}. The abstract syntax tree
+data structure directly supports these queries and hence is a good
 choice. In this book, we will often write down the textual
 representation of a program even when we really have in mind the AST,
-simply because the textual representation is easier to typeset.  We
-recommend that, in your mind, you should alway interpret programs as
-abstract syntax trees.
+because the textual representation is more concise.  We recommend
+that, in your mind, you alway interpret programs as abstract syntax
+trees.
 
 \section{Grammars}
 \label{sec:grammar}
@@ -214,7 +218,7 @@ reader already knows what an integer is.) A name such as $\itm{arith}$
 that is defined by the rules, is a \emph{non-terminal}.
 
 The second rule for the $\itm{arith}$ language is the \texttt{read}
-function to receive an input integer from the user of the program.
+operation that receives an input integer from the user of the program.
 \begin{equation}
   \itm{arith} ::= (\key{read}) \label{eq:arith-read}
 \end{equation}
@@ -227,8 +231,10 @@ another arith by negating it.
 Symbols such as \key{-} that play an auxilliary role in the abstract
 syntax are called \emph{terminal} symbols.
 
-By rule \eqref{eq:arith-int}, \texttt{8} is an $\itm{arith}$, then by
-rule \eqref{eq:arith-neg}, the following AST is an $\itm{arith}$.
+We can apply the rules to build ASTs in the $\itm{arith}$
+language. For example, by rule \eqref{eq:arith-int}, \texttt{8} is an
+$\itm{arith}$, then by rule \eqref{eq:arith-neg}, the following AST is
+an $\itm{arith}$.
 \begin{center}
 \begin{minipage}{0.25\textwidth}
 \begin{lstlisting}
@@ -259,18 +265,17 @@ $\itm{arith}$, so we can apply rule \eqref{eq:arith-add} to show that
 \texttt{(+ (read) (- 8))} is in the $\itm{arith}$ language.
 
 If you have an AST for which the above four rules do not apply, then
-the AST is not in $\itm{arith}$. For example, the AST \texttt{(- (read)
-  (+ 8))} is not in $\itm{arith}$ because there are no rules for $+$
-with only one argument, nor for $-$ with two arguments.  Whenever we
-define a language through a grammar, we implicitly mean for the
-language to be the smallest set of programs that are justified by the
-rules. That is, the language only includes those programs that the
-rules allow.
+the AST is not in $\itm{arith}$. For example, the AST \texttt{(-
+  (read) (+ 8))} is not in $\itm{arith}$ because there are no rules
+for \key{+} with only one argument, nor for \key{-} with two
+arguments.  Whenever we define a language with a grammar, we
+implicitly mean for the language to be the smallest set of programs
+that are justified by the rules. That is, the language only includes
+those programs that the rules allow.
 
 It is common to have many rules with the same left-hand side, so the
-following vertical bar notation is used to gather several rules on one
-line.  We refer to each clause between a vertical bar as an
-``alternative''.
+following vertical bar notation is used to gather several rules.  We
+refer to each clause between a vertical bar as an ``alternative''.
 \[
 \itm{arith} ::= \Int \mid ({\tt \key{read}}) \mid (\key{-} \; \itm{arith}) \mid
    (\key{+} \; \itm{arith} \; \itm{arith}) 
@@ -280,13 +285,13 @@ line.  We refer to each clause between a vertical bar as an
 \label{sec:s-expr}
 
 Racket, as a descendant of Lisp~\citep{McCarthy:1960dz}, has
-particularly convenient support for creating and manipulating abstract
-syntax trees with its \emph{symbolic expression} feature, or
-S-expression for short. We can create an S-expression simply by
-writing a backquote followed by the textual representation of the
-AST. (Technically speaking, this is called a \emph{quasiquote} in
-Racket.)  For example, an S-expression to represent the AST
-\eqref{eq:arith-prog} is created by the following Racket expression:
+convenient support for creating and manipulating abstract syntax trees
+with its \emph{symbolic expression} feature, or S-expression for
+short. We can create an S-expression simply by writing a backquote
+followed by the textual representation of the AST. (Technically
+speaking, this is called a \emph{quasiquote} in Racket.)  For example,
+an S-expression to represent the AST \eqref{eq:arith-prog} is created
+by the following Racket expression:
 \begin{center}
 \texttt{`(+ (read) (- 8))}
 \end{center}
@@ -500,10 +505,10 @@ arbitrary S-expression, {\tt sexp}, and determines whether or not {\tt
 one grammar rule for $\itm{arith}$ and the body of each clause makes a
 recursive call for each child node. This pattern of recursive function
 is so common that it has a name, \emph{structural recursion}.  In
-general, when a recursive function is defined using a set of match
-clauses that correspond to a grammar, and each clause body makes a
-recursive call on each child node, then we say the function is defined
-by structural recursion.
+general, when a recursive function is defined using a sequence of
+match clauses that correspond to a grammar, and each clause body makes
+a recursive call on each child node, then we say the function is
+defined by structural recursion.
 
 \begin{center}
 \begin{minipage}{0.7\textwidth}
@@ -589,7 +594,7 @@ by structural recursion.
 
 %% \end{verbatim}
 
-\section{Interpreter}
+\section{Interpreters}
 \label{sec:interp-arith}
 
 The meaning, or semantics, of a program is typically defined in the
@@ -739,6 +744,7 @@ evaluator on several examples and tests the output program.  The
 \end{lstlisting}
 
 \begin{exercise}
+\normalfont % I don't like the italics for exercises. -Jeremy
 We challenge the reader to improve on the simple partial evaluator in
 Figure~\ref{fig:pe-arith} by replacing the \texttt{pe-neg} and
 \texttt{pe-add} helper functions with functions that know more about
@@ -755,7 +761,7 @@ output that takes the form of the $\itm{residual}$ non-terminal in the
 following grammar.
 \[
 \begin{array}{lcl}
-e &::=& (\TTKEY{read}) \mid (\key{-} \;({\tt \TTKEY{read}})) \mid (\key{+} \;e\; e)\\
+e &::=& (\key{read}) \mid (\key{-} \;(\key{read})) \mid (\key{+} \; e \; e)\\
 \itm{residual} &::=& \Int \mid (\key{+}\; \Int\; e) \mid e
 \end{array}
 \]
@@ -1108,7 +1114,7 @@ communicated from one step of the compiler to the next.
 \label{fig:x86-ast-a}
 \end{figure}
 
-\section{From $S_0$ to x86-64 via $C_0$}
+\section{Planning the trip from $S_0$ to x86-64}
 \label{sec:plan-s0-x86}
 
 To compile one language to another it helps to focus on the
@@ -1289,6 +1295,7 @@ it to different expressions, as in the last clause for primitive
 operations in Figure~\ref{fig:uniquify-s0}.
 
 \begin{exercise}
+\normalfont % I don't like the italics for exercises. -Jeremy
 Complete the \key{uniquify} pass by filling in the blanks, that is,
 implement the clauses for variables and for the \key{let} construct.
 \end{exercise}
@@ -1313,6 +1320,7 @@ implement the clauses for variables and for the \key{let} construct.
 \end{figure}
 
 \begin{exercise}
+\normalfont % I don't like the italics for exercises. -Jeremy
 Test your \key{uniquify} pass by creating three example $S_0$ programs
 and checking whether the output programs produce the same result as
 the input programs. The $S_0$ programs should be designed to test the
@@ -1322,8 +1330,8 @@ that overshadow eachother.  The three programs should be in a
 subdirectory named \key{tests} and they shoul have the same file name
 except for a different integer at the end of the name, followed by the
 ending \key{.scm}.  Use the \key{interp-tests} function
-(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test your
-\key{uniquify} pass on the example programs.
+(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test
+your \key{uniquify} pass on the example programs.
 
 %% You can use the interpreter \key{interpret-S0} defined in the
 %% \key{interp.rkt} file. The entire sequence of tests should be a short
@@ -1385,6 +1393,7 @@ of \key{flatten}.
 \]
 
 \begin{exercise}
+\normalfont
 Implement the \key{flatten} pass and test it on all of the example
 programs that you created to test the \key{uniquify} pass and create
 three new example programs that are designed to exercise all of the
@@ -1496,7 +1505,7 @@ follows.
 The \key{imulq} instruction is a special case because the destination
 argument must be a register.
 
-\section{Print x86}
+\section{Print x86-64}
 \label{sec:print-x86}
 
 The last step of the compiler from $S_0$ to x86-64 is to convert the
@@ -1532,7 +1541,7 @@ and then store in the $\itm{info}$ field of the \key{program}.
 \chapter{Register Allocation}
 \label{ch:register-allocation}
 
-In Chapter~\ref{ch:int-exp} we simplified the generation of x86
+In Chapter~\ref{ch:int-exp} we simplified the generation of x86-64
 assembly by placing all variables on the stack. We can improve the
 performance of the generated code considerably if we instead try to
 place as many variables as possible into registers.  The CPU can
@@ -1541,7 +1550,7 @@ take from several cycles (to go to cache) to hundreds of cycles (to go
 to main memory).  Figure~\ref{fig:reg-eg} shows a program with four
 variables that serves as a running example. We show the source program
 and also the output of instruction selection. At that point the
-program is almost x86 assembly but not quite; it still contains
+program is almost x86-64 assembly but not quite; it still contains
 variables instead of stack locations or registers.
 
 \begin{figure}
@@ -1995,7 +2004,7 @@ Applying this assignment to our running example
   (movq (stack-loc -16) (reg rax))
   (subq (reg rbx) (reg rax)))
 \end{lstlisting}
-This program is almost an x86 program. The remaining step is to apply
+This program is almost an x86-64 program. The remaining step is to apply
 the patch instructions pass. In this example, the trivial move of
 \key{-16(\%rbp)} to itself is deleted and the addition of
 \key{-8(\%rbp)} to \key{-16(\%rbp)} is fixed by going through
@@ -2013,11 +2022,11 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
 \[
 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
 \node (1) at (-2,0)     {$C_0$};
-\node (2)  at (0,0)     {$\text{x86}^{*}$};
-\node (3)  at (0,-1.5)  {$\text{x86}^{*}$};
-\node (4)  at (0,-3)    {$\text{x86}^{*}$};
-\node (5)  at (0,-4.5)  {$\text{x86}^{*}$};
-\node (6)  at (2,-4.5)  {$\text{x86}$};
+\node (2)  at (0,0)     {$\text{x86-64}^{*}$};
+\node (3)  at (0,-1.5)  {$\text{x86-64}^{*}$};
+\node (4)  at (0,-3)    {$\text{x86-64}^{*}$};
+\node (5)  at (0,-4.5)  {$\text{x86-64}^{*}$};
+\node (6)  at (2,-4.5)  {$\text{x86-64}$};
 
 \path[->,bend left=15] (1) edge [above] node {\ttfamily\scriptsize select-instr.}      (2);
 \path[->,            ] (2) edge [right] node {\ttfamily\scriptsize uncover-live}       (3);
@@ -2240,7 +2249,7 @@ languages considered in this book ($S_0, S_1, \ldots$) and interprets
 the program, returning the result value.  The \key{interp-C} function
 interprets an AST for a program in one of the C-like languages ($C_0,
 C_1, \ldots$), and the \key{interp-x86} function interprets an AST for
-an x86 program.
+an x86-64 program.
 
 \section{Utility Functions}
 \label{appendix:utilities}
@@ -2276,7 +2285,7 @@ the input for the Scheme program.
 The compiler-tests function takes a compiler name (a string) a
 description of the passes (see the comment for \key{interp-tests}) a
 test family name (a string), and a list of test numbers (see the
-comment for interp-tests), and runs the compiler to generate x86 (a
+comment for interp-tests), and runs the compiler to generate x86-64 (a
 \key{.s} file) and then runs gcc to generate machine code.  It runs
 the machine code and checks that the output is 42.
 \begin{lstlisting}

+ 1 - 1
defs.tex

@@ -25,5 +25,5 @@
 
 \newcommand{\IF}[3]{(\key{if}\,#1\;#2\;#3)}
 
-\newcommand{\TTKEY}[1]{\normalfont\tt\key{#1}}
+\newcommand{\TTKEY}[1]{{\normalfont\tt #1}}
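
Note on the defs.tex change: the extra pair of braces in the new \TTKEY definition confines the font switch to the macro's argument. The sketch below is a standalone illustration, not part of the commit. The names \OldTTKEY and \NewTTKEY are hypothetical, and the old definition is simplified by leaving out the \key wrapper so that only the grouping differs.

```latex
\documentclass{article}

% Hypothetical names for illustration only; neither appears in defs.tex.
% Old style: no group, so the font switch stays in effect after the macro.
\newcommand{\OldTTKEY}[1]{\normalfont\tt #1}
% New style: the inner braces end the font switch together with the argument.
\newcommand{\NewTTKEY}[1]{{\normalfont\tt #1}}

\begin{document}

% The words after the macro are still set in typewriter type.
\OldTTKEY{read} and this trailing text

% The words after the macro return to the body font.
\NewTTKEY{read} and this trailing text

\end{document}
```

Compiling this with pdflatex should show the difference between the two paragraphs, which is presumably why the book.tex hunk could also drop the extra {\tt ...} wrapper around the old \TTKEY{read} call.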