4 years ago · 97317caa66
--- a/book.tex
+++ b/book.tex
@@ -78,6 +78,7 @@
 
															 language=Lisp,
														
 
															 basicstyle=\ttfamily\small,
														
 
															 morekeywords={seq,assign,program,block,define,lambda,match},
														
 
															+deletekeywords={read},
														
 
															 escapechar=|,
														
 
															 columns=flexible,
														
 
															 moredelim=[is][\color{red}]{~}{~}
														
@@ -302,8 +303,7 @@ typically stored in text files on disk, as \emph{concrete syntax}.
 
															 ASTs can be represented in many different ways, depending on the programming
														
 
															 language used to write the compiler.
														
 
															 %
														
 
															-Because this book uses Racket (\url{http://racket-lang.org}), a
														
 
															-descendant of Lisp, we can use S-expressions to conveniently represent
														
 
															+We use Racket's \code{struct} feature to conveniently represent
														
 
															 ASTs (Section~\ref{sec:ast}). We use grammars to define the abstract
														
 
															 syntax of programming languages (Section~\ref{sec:grammar}) and
														
 
															 pattern matching to inspect individual nodes in an AST
														
@@ -311,15 +311,14 @@ pattern matching to inspect individual nodes in an AST
 
															 and deconstruct entire ASTs (Section~\ref{sec:recursion}).  This
														
 
															 chapter provides a brief introduction to these ideas.
														
 
															-\section{Abstract Syntax Trees and S-expressions}
														
 
															+\section{Abstract Syntax Trees and Racket Structures}
														
 
															 \label{sec:ast}
														
 
															 The primary data structure that is commonly used for representing
														
 
															 programs is the \emph{abstract syntax tree} (AST). When considering
														
 
															 some part of a program, a compiler needs to ask what kind of thing it
														
 
															 is and what sub-parts it contains. For example, the program on the
														
 
															-left, represented by an S-expression, corresponds to the AST on the
														
 
															-right.
														
 
															+left corresponds to the AST on the right.
														
 
															 \begin{center}
														
 
															 \begin{minipage}{0.4\textwidth}
														
 
															 \begin{lstlisting}
														
@@ -349,64 +348,98 @@ node except for the root has a \emph{parent} (the node it is the child
 
															 of). If a node has no children, it is a \emph{leaf} node.  Otherwise
														
 
															 it is an \emph{internal} node.
														
 
															-Recall that an \emph{symbolic expression} (S-expression) is either
														
 
															-\begin{enumerate}
														
 
															-\item an atom, or
														
 
															-\item a pair of two S-expressions, written $(e_1 \key{.} e_2)$,
														
 
															-    where $e_1$ and $e_2$ are each an S-expression.
														
 
															-\end{enumerate}
														
 
															-An \emph{atom} can be a symbol, such as \code{`hello}, a number, the
														
 
															-null value \code{'()}, etc.  We can create an S-expression in Racket
														
 
															-simply by writing a backquote (called a quasi-quote in Racket)
														
 
															-followed by the textual representation of the S-expression.  It is
														
 
															-quite common to use S-expressions to represent a list, such as $a, b
														
 
															-,c$ in the following way:
														
 
															-\begin{lstlisting}
														
 
															-`(a . (b . (c . ())))
														
 
															-\end{lstlisting}
														
 
															-Each element of the list is in the first slot of a pair, and the
														
 
															-second slot is either the rest of the list or the null value, to mark
														
 
															-the end of the list. Such lists are so common that Racket provides
														
 
															-special notation for them that removes the need for the periods
														
 
															-and so many parenthesis:
														
 
															-\begin{lstlisting}
														
 
															-`(a b c)
														
 
															-\end{lstlisting}
														
 
															-The following expression creates an S-expression that represents AST
														
 
															-\eqref{eq:arith-prog}.
														
 
															-\begin{lstlisting}
														
 
															-`(+ (read) (- 8))
														
 
															-\end{lstlisting}
														
 
															-When using S-expressions to represent ASTs, the convention is to
														
 
															-represent each AST node as a list and to put the operation symbol at
														
 
															-the front of the list. The rest of the list contains the children.  So
														
 
															-in the above case, the root AST node has operation \code{`+} and its
														
 
															-two children are \code{`(read)} and \code{`(- 8)}, just as in the
														
 
															-diagram \eqref{eq:arith-prog}.
														
 
															-
														
 
															-To build larger S-expressions one often needs to splice together
														
 
															-several smaller S-expressions. Racket provides the comma operator to
														
 
															-splice an S-expression into a larger one. For example, instead of
														
 
															-creating the S-expression for AST \eqref{eq:arith-prog} all at once,
														
 
															-we could have first created an S-expression for AST
														
 
															-\eqref{eq:arith-neg8} and then spliced that into the addition
														
 
															-S-expression.
														
 
															+%% Recall that an \emph{symbolic expression} (S-expression) is either
														
 
															+%% \begin{enumerate}
														
 
															+%% \item an atom, or
														
 
															+%% \item a pair of two S-expressions, written $(e_1 \key{.} e_2)$,
														
 
															+%%     where $e_1$ and $e_2$ are each an S-expression.
														
 
															+%% \end{enumerate}
														
 
															+%% An \emph{atom} can be a symbol, such as \code{`hello}, a number, the
														
 
															+%% null value \code{'()}, etc.  We can create an S-expression in Racket
														
 
															+%% simply by writing a backquote (called a quasi-quote in Racket)
														
 
															+%% followed by the textual representation of the S-expression.  It is
														
 
															+%% quite common to use S-expressions to represent a list, such as $a, b
														
 
															+%% ,c$ in the following way:
														
 
															+%% \begin{lstlisting}
														
 
															+%% `(a . (b . (c . ())))
														
 
															+%% \end{lstlisting}
														
 
															+%% Each element of the list is in the first slot of a pair, and the
														
 
															+%% second slot is either the rest of the list or the null value, to mark
														
 
															+%% the end of the list. Such lists are so common that Racket provides
														
 
															+%% special notation for them that removes the need for the periods
														
 
															+%% and so many parenthesis:
														
 
															+%% \begin{lstlisting}
														
 
															+%% `(a b c)
														
 
															+%% \end{lstlisting}
														
 
															+%% The following expression creates an S-expression that represents AST
														
 
															+%% \eqref{eq:arith-prog}.
														
 
															+%% \begin{lstlisting}
														
 
															+%% `(+ (read) (- 8))
														
 
															+%% \end{lstlisting}
														
 
															+%% When using S-expressions to represent ASTs, the convention is to
														
 
															+%% represent each AST node as a list and to put the operation symbol at
														
 
															+%% the front of the list. The rest of the list contains the children.  So
														
 
															+%% in the above case, the root AST node has operation \code{`+} and its
														
 
															+%% two children are \code{`(read)} and \code{`(- 8)}, just as in the
														
 
															+%% diagram \eqref{eq:arith-prog}.
														
 
															+
														
 
															+%% To build larger S-expressions one often needs to splice together
														
 
															+%% several smaller S-expressions. Racket provides the comma operator to
														
 
															+%% splice an S-expression into a larger one. For example, instead of
														
 
															+%% creating the S-expression for AST \eqref{eq:arith-prog} all at once,
														
 
															+%% we could have first created an S-expression for AST
														
 
															+%% \eqref{eq:arith-neg8} and then spliced that into the addition
														
 
															+%% S-expression.
														
 
															+%% \begin{lstlisting}
														
 
															+%% (define ast1.4 `(- 8))
														
 
															+%% (define ast1.1 `(+ (read) ,ast1.4))
														
 
															+%% \end{lstlisting}
														
 
															+%% In general, the Racket expression that follows the comma (splice)
														
 
															+%% can be any expression that produces an S-expression.
														
 
															+
														
 
															+We define a Racket \code{struct} for each kind of node. For this
														
 
															+chapter we require just two kinds of nodes: one for integer constants
														
 
															+and one for primitive operations. The following is the \code{struct}
														
 
															+definition for integer constants.
														
 
															+\begin{lstlisting}
														
 
															+(struct Int (value))
														
 
															+\end{lstlisting}
														
 
															+An integer node includes just one thing: the integer value.
														
 
															+To create an AST node for the integer $8$, we write \code{(Int 8)}.
														
 
															+\begin{lstlisting}
														
 
															+(define eight (Int 8))
														
 
															+\end{lstlisting}
														
 
															+The following is the \code{struct} definition for primitive operations.
														
 
															+\begin{lstlisting}
														
 
															+(struct Prim (op arg*))
														
 
															+\end{lstlisting}
														
 
															+A primitive operation node includes an operator symbol \code{op}
														
 
															+and a list of children \code{arg*}. For example, to create
														
 
															+an AST that negates the number $8$, we write \code{(Prim '- (list eight))}.
														
 
															+\begin{lstlisting}
														
 
															+(define neg-eight (Prim '- (list eight)))
														
 
															+\end{lstlisting}
														
 
															+Primitive operations may have zero or more children. The \code{read}
														
 
															+operator has zero children:
														
 
															+\begin{lstlisting}
														
 
															+(define rd (Prim 'read '()))
														
 
															+\end{lstlisting}
														
 
															+whereas the addition operator has two children:
														
 
															 \begin{lstlisting}
														
 
															-(define ast1.4 `(- 8))
														
 
															-(define ast1.1 `(+ (read) ,ast1.4))
														
 
															+(define ast1.1 (Prim '+ (list rd neg-eight)))
														
 
															 \end{lstlisting}
														
 
															-In general, the Racket expression that follows the comma (splice)
														
 
															-can be any expression that produces an S-expression.
														
 
															 When deciding how to compile program \eqref{eq:arith-prog}, we need to
														
 
															 know that the operation associated with the root node is addition and
														
 
															 that it has two children: \texttt{read} and a negation. The AST data
														
 
															 structure directly supports these queries, as we shall see in
														
 
															 Section~\ref{sec:pattern-matching}, and hence is a good choice for use
														
 
															-in compilers. In this book, we often write down the S-expression
														
 
															-representation of a program even when we really have in mind the AST
														
 
															-because the S-expression is more concise.  We recommend that, in your
														
 
															-mind, you always think of programs as abstract syntax trees.
														
 
															+in compilers.
														
 
															+
														
 
															+In this book, we often write down the concrete syntax of a program
														
 
															+even when we really have in mind the AST because the concrete syntax
														
 
															+is more concise.  We recommend that, in your mind, you always think of
														
 
															+programs as abstract syntax trees.
														
 
															 \section{Grammars}
														
 
															 \label{sec:grammar}
														
@@ -415,13 +448,17 @@ A programming language can be thought of as a \emph{set} of programs.
 
															 The set is typically infinite (one can always create larger and larger
														
 
															 programs), so one cannot simply describe a language by listing all of
														
 
															 the programs in the language. Instead we write down a set of rules, a
														
 
															-\emph{grammar}, for building programs. We shall write our rules in a
														
 
															-variant of Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
														
 
															-As an example, we describe a small language, named $R_0$, that
														
 
															-consists of integers and arithmetic operations. The first grammar rule
														
 
															-says that any integer is an expression:
														
 
															+\emph{grammar}, for building programs. Grammars are often used to
														
 
															+define the concrete syntax of a language, but they can also be used to
														
 
															+describe the abstract syntax. We shall write our rules in a variant of
														
 
															+Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.  As an
														
 
															+example, we describe a small language, named $R_0$, that consists of
														
 
															+integers and arithmetic operations.
														
 
															+
														
 
															+The first grammar rule says that given any integer $n$, an integer
														
 
															+node $\INT{n}$ is an expression:
														
 
															 \begin{equation}
														
 
															-\Exp ::= \Int  \label{eq:arith-int}
														
 
															+\Exp ::= \INT{n}  \label{eq:arith-int}
														
 
															 \end{equation}
														
 
															 %
														
 
															 Each rule has a left-hand-side and a right-hand-side. The way to read
														
@@ -432,40 +469,39 @@ according to the left-hand-side.
 
															 A name such as $\Exp$ that is
														
 
															 defined by the grammar rules is a \emph{non-terminal}.
														
 
															 %
														
 
															-The name $\Int$ is a also a non-terminal, however, we do not define
														
 
															-$\Int$ because the reader already knows what an integer is.
														
 
															-%
														
 
															-Further, we make the simplifying design decision that all of the languages in
														
 
															-this book only handle machine-representable integers.  On most modern machines
														
 
															-this corresponds to integers represented with 64-bits, i.e., the in range
														
 
															-$-2^{63}$ to $2^{63}-1$.
														
 
															-%
														
 
															-However, we restrict this range further to match the Racket \texttt{fixnum}
														
 
															-datatype, which allows 63-bit integers on a 64-bit machine.
														
 
															+%% The name $\Int$ is a also a non-terminal, however, we do not define
														
 
															+%% $\Int$ because the reader already knows what an integer is.
														
 
															+
														
 
															+We make the simplifying design decision that all of the languages in
														
 
															+this book only handle machine-representable integers.  On most modern
														
 
															+machines this corresponds to integers represented with 64 bits, i.e.,
														
 
															+in the range $-2^{63}$ to $2^{63}-1$.  We restrict this range further
														
 
															+to match the Racket \texttt{fixnum} datatype, which allows 63-bit
														
 
															+integers on a 64-bit machine.
														
 
															 The second grammar rule is the \texttt{read} operation that receives
														
 
															 an input integer from the user of the program.
														
 
															 \begin{equation}
														
 
															-  \Exp ::= (\key{read}) \label{eq:arith-read}
														
 
															+  \Exp ::= \READ{} \label{eq:arith-read}
														
 
															 \end{equation}
														
 
															 The third rule says that, given an $\Exp$ node, you can build another
														
 
															 $\Exp$ node by negating it.
														
 
															 \begin{equation}
														
 
															-  \Exp ::= (\key{-} \; \Exp)  \label{eq:arith-neg}
														
 
															+  \Exp ::= \NEG{\Exp}  \label{eq:arith-neg}
														
 
															 \end{equation}
														
 
															 Symbols in typewriter font such as \key{-} and \key{read} are
														
 
															 \emph{terminal} symbols and must literally appear in the program for
														
 
															 the rule to be applicable.
														
 
															 We can apply the rules to build ASTs in the $R_0$
														
 
															-language. For example, by rule \eqref{eq:arith-int}, \texttt{8} is an
														
 
															+language. For example, by rule \eqref{eq:arith-int}, \texttt{(Int 8)} is an
														
 
															 $\Exp$, then by rule \eqref{eq:arith-neg}, the following AST is
														
 
															 an $\Exp$.
														
 
															 \begin{center}
														
 
															-\begin{minipage}{0.25\textwidth}
														
 
															+\begin{minipage}{0.4\textwidth}
														
 
															 \begin{lstlisting}
														
 
															-(- 8)
														
 
															+(Prim '- (list (Int 8)))
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \begin{minipage}{0.25\textwidth}
														
@@ -483,26 +519,37 @@ an $\Exp$.
 
															 The next grammar rule defines addition expressions:
														
 
															 \begin{equation}
														
 
															-  \Exp ::= (\key{+} \; \Exp \; \Exp) \label{eq:arith-add}
														
 
															+  \Exp ::= \ADD{\Exp}{\Exp} \label{eq:arith-add}
														
 
															 \end{equation}
														
 
															-We can now see that the AST \eqref{eq:arith-prog} is an $\Exp$ in
														
 
															-$R_0$.  We know that \lstinline{(read)} is an $\Exp$ by rule
														
 
															-\eqref{eq:arith-read} and we have shown that \texttt{(- 8)} is an
														
 
															-$\Exp$, so we can apply rule \eqref{eq:arith-add} to show that
														
 
															-\texttt{(+ (read) (- 8))} is an $\Exp$ in the $R_0$ language.
														
 
															+We can now justify that the AST \eqref{eq:arith-prog} is an $\Exp$ in
														
 
															+$R_0$.  We know that \lstinline{(Prim 'read '())} is an $\Exp$ by rule
														
 
															+\eqref{eq:arith-read} and we have already shown that \code{(Prim '-
														
 
															+  (list (Int 8)))} is an $\Exp$, so we apply rule \eqref{eq:arith-add}
														
 
															+to show that
														
 
															+\begin{lstlisting}
														
 
															+(Prim '+ (list (Prim 'read '()) (Prim '- (list (Int 8)))))
														
 
															+\end{lstlisting}
														
 
															+is an $\Exp$ in the $R_0$ language.
														
 
															 If you have an AST for which the above rules do not apply, then the
														
 
															-AST is not in $R_0$. For example, the AST \texttt{(- (read) (+ 8))} is
														
 
															-not in $R_0$ because there are no rules for \key{+} with only one
														
 
															-argument, nor for \key{-} with two arguments.  Whenever we define a
														
 
															-language with a grammar, we mean for the language to only include
														
 
															-those programs that are justified by the rules.
														
 
															+AST is not in $R_0$. For example, the program \code{(- (read) (+ 8))}
														
 
															+is not in $R_0$ because there are no rules for \code{+} with only one
														
 
															+argument, nor for \key{-} with two arguments. Whenever we define a
														
 
															+language with a grammar, the language only includes those programs
														
 
															+that are justified by the rules.
														
 
															-The last grammar rule for $R_0$ states that there is a \key{program}
														
 
															+The last grammar rule for $R_0$ states that there is a \code{Program}
														
 
															 node to mark the top of the whole program:
														
 
															 \[
														
 
															-  R_0 ::= (\key{program} \; \Exp)
														
 
															+  R_0 ::= \PROGRAM{\code{'()}}{\Exp}
														
 
															 \]
														
 
															+The \code{Program} structure is defined as follows:
														
 
															+\begin{lstlisting}
														
 
															+(struct Program (info body))
														
 
															+\end{lstlisting}
														
 
															+where \code{body} is an expression. In later chapters, the \code{info}
														
 
															+part will be used to store auxiliary information, but for now it is
														
 
															+just the empty list.
														
 
															 The \code{read-program} function provided in \code{utilities.rkt}
														
 
															 reads programs in from a file (the sequence of characters in the
														
@@ -523,9 +570,9 @@ called an {\em alternative}.
 
															 \begin{minipage}{0.96\textwidth}
														
 
															 \[
														
 
															 \begin{array}{rcl}
														
 
															-\Exp &::=& \Int \mid ({\tt \key{read}}) \mid (\key{-} \; \Exp) \mid
														
 
															-   (\key{+} \; \Exp \; \Exp)  \\
														
 
															-R_0  &::=& (\key{program} \; \Exp)
														
 
															+\Exp &::=& \INT{n} \mid \READ{} \mid \NEG{\Exp} \\
														
 
															+     &\mid&  \ADD{\Exp}{\Exp}  \\
														
 
															+R_0  &::=& \code{(Program} \; \code{'()}\; \Exp \code{)}
														
 
															 \end{array}
														
 
															 \]
														
 
															 \end{minipage}
														
@@ -542,16 +589,14 @@ R_0  &::=& (\key{program} \; \Exp)
 
															 As mentioned above, compilers often need to access the children of an
														
 
															 AST node. Racket provides the \texttt{match} form to access the parts
														
 
															-of an S-expression. Consider the following example and the output on
														
 
															-the right.
														
 
															+of a structure. Consider the following example and the output on the
														
 
															+right.
														
 
															 \begin{center}
														
 
															 \begin{minipage}{0.5\textwidth}
														
 
															 \begin{lstlisting}
														
 
															 (match ast1.1
														
 
															-  [`(,op ,child1 ,child2)
														
 
															-    (print op) (newline)
														
 
															-    (print child1) (newline)
														
 
															-    (print child2)])
														
 
															+  [(Prim op (list child1 child2))
														
 
															+    (print op)])
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \vrule
														
@@ -560,8 +605,6 @@ the right.
 
															    '+
														
 
															-   '(read)
														
 
															-   '(- 8)
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \end{center}
														
@@ -581,25 +624,22 @@ clause may contain any Racket code whatsoever.
 
															 A \code{match} form may contain several clauses, as in the following
														
 
															 function \code{leaf?} that recognizes when an $R_0$ node is
														
 
															 a leaf. The \code{match} proceeds through the clauses in order,
														
 
															-checking whether the pattern can match the input S-expression. The
														
 
															+checking whether the pattern can match the input AST. The
														
 
															 body of the first clause that matches is executed. The output of
														
 
															-\code{leaf?} for several S-expressions is shown on the right. In the
														
 
															-below \code{match}, we see another form of pattern: the
														
 
															-pattern \code{(? fixnum?)} applies the predicate \code{fixnum?} to the input
														
 
															-S-expression to see if it is a machine-representable integer.
														
 
															+\code{leaf?} for several ASTs is shown on the right.
														
 
															 \begin{center}
														
 
															-\begin{minipage}{0.5\textwidth}
														
 
															+\begin{minipage}{0.6\textwidth}
														
 
															 \begin{lstlisting}
														
 
															 (define (leaf? arith)
														
 
															   (match arith
														
 
															-    [(? fixnum?) #t]
														
 
															-    [`(read) #t]
														
 
															-    [`(- ,c1) #f]
														
 
															-    [`(+ ,c1 ,c2) #f]))
														
 
															+    [(Int n) #t]
														
 
															+    [(Prim 'read '()) #t]
														
 
															+    [(Prim '- (list c1)) #f]
														
 
															+    [(Prim '+ (list c1 c2)) #f]))
														
 
															-(leaf? `(read))
														
 
															-(leaf? `(- 8))
														
 
															-(leaf? `(+ (read) (- 8)))
														
 
															+(leaf? (Prim 'read '()))
														
 
															+(leaf? (Prim '- (list (Int 8))))
														
 
															+(leaf? (Int 8))
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \vrule
														
@@ -611,10 +651,10 @@ S-expression to see if it is a machine-representable integer.
 
															-
														
 
															+    
														
 
															    #t
														
 
															    #f
														
 
															-   #f
														
 
															+   #t
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \end{center}
														
@@ -626,14 +666,13 @@ match against, then we make sure that 1) we have one clause for each
 
															 alternative of that non-terminal and 2) that the pattern in each
														
 
															 clause corresponds to the corresponding right-hand side of a grammar
														
 
															 rule. For the \code{match} in the \code{leaf?} function, we refer to
														
 
															-the grammar for $R\_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
														
 
															+the grammar for $R_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
														
 
															 non-terminal has 4 alternatives, so the \code{match} has 4 clauses.
														
 
															 The pattern in each clause corresponds to the right-hand side of a
														
 
															-grammar rule. For example, the pattern \code{`(+ ,c1 ,c2)} corresponds
														
 
															-to the right-hand side $(\key{+} \; \Exp \; \Exp)$. When translating
														
 
															+grammar rule. For example, the pattern \code{(Prim '+ (list c1 c2))}
														
 
															+corresponds to the right-hand side $\ADD{\Exp}{\Exp}$. When translating
														
 
															 from grammars to patterns, replace non-terminals such as $\Exp$ with
														
 
															-pattern variables (a comma followed by a variable name of your
														
 
															-choice).
														
 
															+pattern variables (e.g., \code{c1} and \code{c2}).
														
 
															 \section{Recursion}
														
@@ -662,22 +701,24 @@ one recursive function to handle each non-terminal in the grammar.
 
															 \begin{center}
														
 
															 \begin{minipage}{0.7\textwidth}
														
 
															 \begin{lstlisting}
														
 
															-(define (exp? sexp)
														
 
															-  (match sexp
														
 
															-    [(? fixnum?) #t]
														
 
															-    [`(read) #t]
														
 
															-    [`(- ,e) (exp? e)]
														
 
															-    [`(+ ,e1 ,e2)
														
 
															+(define (exp? ast)
														
 
															+  (match ast
														
 
															+    [(Int n) #t]
														
 
															+    [(Prim 'read '()) #t]
														
 
															+    [(Prim '- (list e)) (exp? e)]
														
 
															+    [(Prim '+ (list e1 e2))
														
 
															       (and (exp? e1) (exp? e2))]
														
 
															     [else #f]))
														
 
															-(define (R0? sexp)
														
 
															-  (match sexp
														
 
															-    [`(program ,e) (exp? e)]
														
 
															+(define (R0? ast)
														
 
															+  (match ast
														
 
															+    [(Program '() e) (exp? e)]
														
 
															     [else #f]))
														
 
															-(R0? `(program (+ (read) (- 8))))
														
 
															-(R0? `(program (- (read) (+ 8))))
														
 
															+(R0? (Program '() ast1.1))
														
 
															+(R0? (Program '()
														
 
															+       (Prim '- (list (Prim 'read '())
														
 
															+                      (Prim '+ (list (Int 8)))))))
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \vrule
														
@@ -696,7 +737,6 @@ one recursive function to handle each non-terminal in the grammar.
 
															-
														
 
															    #t
														
 
															    #f
														
 
															 \end{lstlisting}
														
@@ -708,13 +748,13 @@ You may be tempted to merge the two functions into one, like this:
 
															 \begin{center}
														
 
															 \begin{minipage}{0.5\textwidth}
														
 
															 \begin{lstlisting}
														
 
															-(define (R0? sexp)
														
 
															-  (match sexp
														
 
															-    [(? fixnum?) #t]
														
 
															-    [`(read) #t]
														
 
															-    [`(- ,e) (R0? e)]
														
 
															-    [`(+ ,e1 ,e2) (and (R0? e1) (R0? e2))]
														
 
															-    [`(program ,e) (R0? e)]
														
 
															+(define (R0? ast)
														
 
															+  (match ast
														
 
															+    [(Int n) #t]
														
 
															+    [(Prim 'read '()) #t]
														
 
															+    [(Prim '- (list e)) (R0? e)]
														
 
															+    [(Prim '+ (list e1 e2)) (and (R0? e1) (R0? e2))]
														
 
															+    [(Program '() e) (R0? e)]
														
 
															     [else #f]))
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
@@ -725,7 +765,7 @@ to the {\tt program} wrapper.  Yet this style is generally \emph{not}
 
															 recommended because it can get you into trouble.
														
 
															 %
														
 
															 For instance, the above function is subtly wrong:
														
 
															-\lstinline{(R0? `(program (program 3)))} will return true, when it
														
 
															+\lstinline{(R0? (Program '() (Program '() (Int 3))))} will return true, when it
														
 
															 should return false.
														
 
															 %% NOTE FIXME - must check for consistency on this issue throughout.
														
@@ -754,18 +794,24 @@ clause per grammar rule for $R_0$ expressions.
 
															 \begin{lstlisting}
														
 
															 (define (interp-exp e)
														
 
															   (match e
														
 
															-    [(? fixnum?) e]
														
 
															-    [`(read)
														
 
															-     (let ([r (read)])
														
 
															-       (cond [(fixnum? r) r]
														
 
															-             [else (error 'interp-R0 "input not an integer" r)]))]
														
 
															-    [`(- ,e1)     (fx- 0 (interp-exp e1))]
														
 
															-    [`(+ ,e1 ,e2) (fx+ (interp-exp e1) (interp-exp e2))]
														
 
															-    ))
														
 
															+    [(Int n) n]
														
 
															+    [(Prim 'read '())
														
 
															+     (define r (read))
														
 
															+     (cond [(fixnum? r) r]
														
 
															+           [else (error 'interp-R0 "expected an integer" r)])]
														
 
															+    [(Prim '- (list e))
														
 
															+     (define v (interp-exp e))
														
 
															+     (fx- 0 v)]
														
 
															+    [(Prim '+ (list e1 e2))
														
 
															+     (define v1 (interp-exp e1))
														
 
															+     (define v2 (interp-exp e2))
														
 
															+     (fx+ v1 v2)]
														
 
															+    ))
														
 
															 (define (interp-R0 p)
														
 
															   (match p
														
 
															-    [`(program ,e) (interp-exp e)]))
														
 
															+    [(Program '() e) (interp-exp e)]
														
 
															+    ))
														
 
															 \end{lstlisting}
														
 
															 \caption{Interpreter for the $R_0$ language.}
														
 
															 \label{fig:interp-R0}
														
@@ -776,8 +822,11 @@ following program adds two integers.
 
															 \begin{lstlisting}
														
 
															 (+ 10 32)
														
 
															 \end{lstlisting}
														
 
															-The result is \key{42}.  (We wrote the above program in concrete syntax,
														
 
															-whereas the parsed abstract syntax is \lstinline{(program (+ 10 32))}.)
														
 
															+The result is \key{42}.  We wrote the above program in concrete syntax,
														
 
															+whereas the parsed abstract syntax is:
														
 
															+\begin{lstlisting}
														
 
															+(Program '() (Prim '+ (list (Int 10) (Int 32))))
														
 
															+\end{lstlisting}
														
 
															 The next example demonstrates that expressions may be nested within
														
 
															 each other, in this case nesting several additions and negations.
														
@@ -789,11 +838,11 @@ What is the result of the above program?
 
															 As mentioned previously, the $R_0$ language does not support
														
 
															 arbitrarily large integers, but only $63$-bit integers, so we
														
 
															 interpret the arithmetic operations of $R_0$ using fixnum arithmetic
														
 
															-in Racket.  What happens when we run the following program?
														
 
															+in Racket.
														
 
															+Suppose $n = 999999999999999999$, which indeed fits in $63$ bits.
														
 
															+What happens when we run the following program in our interpreter?
														
 
															 \begin{lstlisting}
														
 
															-(define large 999999999999999999)
														
 
															-(interp-R0 `(program (+ (+ (+ ,large ,large) (+ ,large ,large))
														
 
															-                        (+ (+ ,large ,large) (+ ,large ,large)))))
														
 
															+(+ (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)) (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)))
														
 
															 \end{lstlisting}
														
 
															 It produces an error:
														
 
															 \begin{lstlisting}
														
@@ -816,7 +865,8 @@ program \eqref{eq:arith-prog} performs a \key{read} and then subtracts
 
															 (interp-R0 ast1.1)
														
 
															 \end{lstlisting}
														
 
															 and the input is the integer \code{50}, we get the answer to life, the
														
 
															-universe, and everything: \code{42}.
														
 
															+universe, and everything: \code{42}!\footnote{\emph{The Hitchhiker's
														
 
															+    Guide to the Galaxy} by Douglas Adams.}
														
 
															 We include the \key{read} operation in $R_0$ so a clever student
														
 
															 cannot implement a compiler for $R_0$ that simply runs the interpreter
														
@@ -876,24 +926,26 @@ functions is the output of partially evaluating the children.
 
															 \begin{figure}[tbp]
														
 
															 \begin{lstlisting}
														
 
															 (define (pe-neg r)
														
 
															-  (cond [(fixnum? r) (fx- 0 r)]
														
 
															-        [else `(- ,r)]))
														
 
															+  (match r
														
 
															+    [(Int n) (Int (fx- 0 n))]
														
 
															+    [else (Prim '- (list r))]))
														
 
															 (define (pe-add r1 r2)
														
 
															-  (cond [(and (fixnum? r1) (fixnum? r2)) (fx+ r1 r2)]
														
 
															-        [else `(+ ,r1 ,r2)]))
														
 
															+  (match* (r1 r2)
														
 
															+    [((Int n1) (Int n2)) (Int (fx+ n1 n2))]
														
 
															+    [(_ _) (Prim '+ (list r1 r2))]))
														
 
															 (define (pe-exp e)
														
 
															   (match e
														
 
															-    [(? fixnum?) e]
														
 
															-    [`(read) `(read)]
														
 
															-    [`(- ,e1) (pe-neg (pe-exp e1))]
														
 
															-    [`(+ ,e1 ,e2) (pe-add (pe-exp e1) (pe-exp e2))]
														
 
															+    [(Int n) (Int n)]
														
 
															+    [(Prim 'read '()) (Prim 'read '())]
														
 
															+    [(Prim '- (list e1)) (pe-neg (pe-exp e1))]
														
 
															+    [(Prim '+ (list e1 e2)) (pe-add (pe-exp e1) (pe-exp e2))]
														
 
															     ))
														
 
															 (define (pe-R0 p)
														
 
															   (match p
														
 
															-    [`(program ,e) `(program ,(pe-exp e))]
														
 
															+    [(Program info e) (Program info (pe-exp e))]
														
 
															     ))
														
 
															 \end{lstlisting}
														
 
															 \caption{A partial evaluator for $R_0$ expressions.}
														
@@ -911,16 +963,17 @@ test whether it produces programs that get the same result as the
 
															 input programs. That is, we can test whether it satisfies Diagram
														
 
															 \eqref{eq:compile-correct}. The following code runs the partial
														
 
															 evaluator on several examples and tests the output program.  The
														
 
															-\texttt{assert} function is defined in Appendix~\ref{appendix:utilities}.\\
														
 
															+\texttt{parse-program} and \texttt{assert} functions are defined in
														
 
															+Appendix~\ref{appendix:utilities}.\\
														
 
															 \begin{minipage}{1.0\textwidth}
														
 
															 \begin{lstlisting}
														
 
															 (define (test-pe p)
														
 
															   (assert "testing pe-R0"
														
 
															      (equal? (interp-R0 p) (interp-R0 (pe-R0 p)))))
														
 
															-(test-pe `(+ (read) (- (+ 5 3))))
														
 
															-(test-pe `(+ 1 (+ (read) 1)))
														
 
															-(test-pe `(- (+ (read) (- 5))))
														
 
															+(test-pe (parse-program `(program () (+ 10 (- (+ 5 3))))))
														
 
															+(test-pe (parse-program `(program () (+ 1 (+ 3 1)))))
														
 
															+(test-pe (parse-program `(program () (- (+ 3 (- 5))))))
														
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
@@ -7778,7 +7831,7 @@ registers.
 
															 %%  LocalWords:  Sarkar lcl Matz aa representable Chez Ph Dan's nano
														
 
															 %%  LocalWords:  fk bh Siek plt uq Felleisen Bor Yuh ASTs AST Naur eq
														
 
															 %%  LocalWords:  BNF fixnum datatype arith prog backquote quasiquote
														
 
															-%%  LocalWords:  ast sexp Reynold's reynolds interp cond fx evaluator
														
 
															+%%  LocalWords:  ast Reynold's reynolds interp cond fx evaluator
														
 
															 %%  LocalWords:  quasiquotes pe nullary unary rcl env lookup gcc rax
														
 
															 %%  LocalWords:  addq movq callq rsp rbp rbx rcx rdx rsi rdi subq nx
														
 
															 %%  LocalWords:  negq pushq popq retq globl Kernighan uniquify lll ve
														
--- a/defs.tex
+++ b/defs.tex
@@ -17,7 +17,10 @@
 
															 \newcommand{\Op}{\itm{op}}
														
 
															 \newcommand{\key}[1]{\texttt{#1}}
														
 
															 \newcommand{\code}[1]{\texttt{#1}}
														
 
															-\newcommand{\READ}{(\key{read})}
														
 
															+\newcommand{\READ}{\key{(Prim}\;\code{'read}\;\key{'())}}
														
 
															+\newcommand{\NEG}[1]{\key{(Prim}\;\code{'-}\;\code{(list}\;#1\;\code{))}}
														
 
															+\newcommand{\PROGRAM}[2]{\code{(Program}\;#1\;#2\code{)}}
														
 
															+\newcommand{\ADD}[2]{\key{(Prim}\;\code{'+}\;\code{(list}\;#1\;#2\code{))}}
														
 
															 \newcommand{\UNIOP}[2]{(\key{#1}~#2)}
														
 
															 \newcommand{\BINOP}[3]{(\key{#1}~#2~#3)}
														
 
															 \newcommand{\LET}[3]{(\key{let}~([#1\;#2])~#3)}
														
@@ -25,9 +28,9 @@
 
															 \newcommand{\ASSIGN}[2]{(\key{assign}~#1\;#2)}
														
 
															 \newcommand{\RETURN}[1]{(\key{return}~#1)}
														
 
															-\newcommand{\INT}[1]{(\key{int}\;#1)}
														
 
															-\newcommand{\REG}[1]{(\key{reg}\;#1)}
														
 
															-\newcommand{\VAR}[1]{(\key{var}\;#1)}
														
 
															+\newcommand{\INT}[1]{\key{(Int}\;#1\key{)}}
														
 
															+\newcommand{\REG}[1]{\key{(Reg}\;#1\key{)}}
														
 
															+\newcommand{\VAR}[1]{\key{(Var}\;#1\key{)}}
														
 
															 \newcommand{\STACKLOC}[1]{(\key{stack}\;#1)}
														
 
															 \newcommand{\IF}[3]{(\key{if}\,#1\;#2\;#3)}