updated chapter 1 prelims

Jeremy Siek 4 years ago
parent
commit
97317caa66
2 changed files with 230 additions and 174 deletions
  1. 223 170
      book.tex
  2. 7 4
      defs.tex

+ 223 - 170
book.tex

@@ -78,6 +78,7 @@
 language=Lisp,
 basicstyle=\ttfamily\small,
 morekeywords={seq,assign,program,block,define,lambda,match},
+deletekeywords={read},
 escapechar=|,
 columns=flexible,
 moredelim=[is][\color{red}]{~}{~}
@@ -302,8 +303,7 @@ typically stored in text files on disk, as \emph{concrete syntax}.
 ASTs can be represented in many different ways, depending on the programming
 language used to write the compiler.
 %
-Because this book uses Racket (\url{http://racket-lang.org}), a
-descendant of Lisp, we can use S-expressions to conveniently represent
+We use Racket's \code{struct} feature to conveniently represent
 ASTs (Section~\ref{sec:ast}). We use grammars to define the abstract
 syntax of programming languages (Section~\ref{sec:grammar}) and
 pattern matching to inspect individual nodes in an AST
@@ -311,15 +311,14 @@ pattern matching to inspect individual nodes in an AST
 and deconstruct entire ASTs (Section~\ref{sec:recursion}).  This
 chapter provides a brief introduction to these ideas.
 
-\section{Abstract Syntax Trees and S-expressions}
+\section{Abstract Syntax Trees and Racket Structures}
 \label{sec:ast}
 
 The primary data structure that is commonly used for representing
 programs is the \emph{abstract syntax tree} (AST). When considering
 some part of a program, a compiler needs to ask what kind of thing it
 is and what sub-parts it contains. For example, the program on the
-left, represented by an S-expression, corresponds to the AST on the
-right.
+left corresponds to the AST on the right.
 \begin{center}
 \begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
@@ -349,64 +348,98 @@ node except for the root has a \emph{parent} (the node it is the child
 of). If a node has no children, it is a \emph{leaf} node.  Otherwise
 it is an \emph{internal} node.
 
-Recall that an \emph{symbolic expression} (S-expression) is either
-\begin{enumerate}
-\item an atom, or
-\item a pair of two S-expressions, written $(e_1 \key{.} e_2)$,
-    where $e_1$ and $e_2$ are each an S-expression.
-\end{enumerate}
-An \emph{atom} can be a symbol, such as \code{`hello}, a number, the
-null value \code{'()}, etc.  We can create an S-expression in Racket
-simply by writing a backquote (called a quasi-quote in Racket)
-followed by the textual representation of the S-expression.  It is
-quite common to use S-expressions to represent a list, such as $a, b
-,c$ in the following way:
-\begin{lstlisting}
-`(a . (b . (c . ())))
-\end{lstlisting}
-Each element of the list is in the first slot of a pair, and the
-second slot is either the rest of the list or the null value, to mark
-the end of the list. Such lists are so common that Racket provides
-special notation for them that removes the need for the periods
-and so many parenthesis:
-\begin{lstlisting}
-`(a b c)
-\end{lstlisting}
-The following expression creates an S-expression that represents AST
-\eqref{eq:arith-prog}.
-\begin{lstlisting}
-`(+ (read) (- 8))
-\end{lstlisting}
-When using S-expressions to represent ASTs, the convention is to
-represent each AST node as a list and to put the operation symbol at
-the front of the list. The rest of the list contains the children.  So
-in the above case, the root AST node has operation \code{`+} and its
-two children are \code{`(read)} and \code{`(- 8)}, just as in the
-diagram \eqref{eq:arith-prog}.
-
-To build larger S-expressions one often needs to splice together
-several smaller S-expressions. Racket provides the comma operator to
-splice an S-expression into a larger one. For example, instead of
-creating the S-expression for AST \eqref{eq:arith-prog} all at once,
-we could have first created an S-expression for AST
-\eqref{eq:arith-neg8} and then spliced that into the addition
-S-expression.
+%% Recall that an \emph{symbolic expression} (S-expression) is either
+%% \begin{enumerate}
+%% \item an atom, or
+%% \item a pair of two S-expressions, written $(e_1 \key{.} e_2)$,
+%%     where $e_1$ and $e_2$ are each an S-expression.
+%% \end{enumerate}
+%% An \emph{atom} can be a symbol, such as \code{`hello}, a number, the
+%% null value \code{'()}, etc.  We can create an S-expression in Racket
+%% simply by writing a backquote (called a quasi-quote in Racket)
+%% followed by the textual representation of the S-expression.  It is
+%% quite common to use S-expressions to represent a list, such as $a, b
+%% ,c$ in the following way:
+%% \begin{lstlisting}
+%% `(a . (b . (c . ())))
+%% \end{lstlisting}
+%% Each element of the list is in the first slot of a pair, and the
+%% second slot is either the rest of the list or the null value, to mark
+%% the end of the list. Such lists are so common that Racket provides
+%% special notation for them that removes the need for the periods
+%% and so many parenthesis:
+%% \begin{lstlisting}
+%% `(a b c)
+%% \end{lstlisting}
+%% The following expression creates an S-expression that represents AST
+%% \eqref{eq:arith-prog}.
+%% \begin{lstlisting}
+%% `(+ (read) (- 8))
+%% \end{lstlisting}
+%% When using S-expressions to represent ASTs, the convention is to
+%% represent each AST node as a list and to put the operation symbol at
+%% the front of the list. The rest of the list contains the children.  So
+%% in the above case, the root AST node has operation \code{`+} and its
+%% two children are \code{`(read)} and \code{`(- 8)}, just as in the
+%% diagram \eqref{eq:arith-prog}.
+
+%% To build larger S-expressions one often needs to splice together
+%% several smaller S-expressions. Racket provides the comma operator to
+%% splice an S-expression into a larger one. For example, instead of
+%% creating the S-expression for AST \eqref{eq:arith-prog} all at once,
+%% we could have first created an S-expression for AST
+%% \eqref{eq:arith-neg8} and then spliced that into the addition
+%% S-expression.
+%% \begin{lstlisting}
+%% (define ast1.4 `(- 8))
+%% (define ast1.1 `(+ (read) ,ast1.4))
+%% \end{lstlisting}
+%% In general, the Racket expression that follows the comma (splice)
+%% can be any expression that produces an S-expression.
+
+We define a Racket \code{struct} for each kind of node. For this
+chapter we require just two kinds of nodes: one for integer constants
+and one for primitive operations. The following is the \code{struct}
+definition for integer constants.
+\begin{lstlisting}
+(struct Int (value))
+\end{lstlisting}
+An integer node includes just one thing: the integer value.
+To create an AST node for the integer $8$, we write \code{(Int 8)}.
+\begin{lstlisting}
+(define eight (Int 8))
+\end{lstlisting}
+The following is the \code{struct} definition for primitive operations.
+\begin{lstlisting}
+(struct Prim (op arg*))
+\end{lstlisting}
+A primitive operation node includes an operator symbol \code{op}
+and a list of children \code{arg*}. For example, to create
+an AST that negates the number $8$, we write \code{(Prim '- (list eight))}.
+\begin{lstlisting}
+(define neg-eight (Prim '- (list eight)))
+\end{lstlisting}
+Primitive operations may have zero or more children. The \code{read}
+operator has zero children:
+\begin{lstlisting}
+(define rd (Prim 'read '()))
+\end{lstlisting}
+whereas the addition operator has two children:
 \begin{lstlisting}
-(define ast1.4 `(- 8))
-(define ast1.1 `(+ (read) ,ast1.4))
+(define ast1.1 (Prim '+ (list rd neg-eight)))
 \end{lstlisting}
-In general, the Racket expression that follows the comma (splice)
-can be any expression that produces an S-expression.
 
 When deciding how to compile program \eqref{eq:arith-prog}, we need to
 know that the operation associated with the root node is addition and
 that it has two children: \texttt{read} and a negation. The AST data
 structure directly supports these queries, as we shall see in
 Section~\ref{sec:pattern-matching}, and hence is a good choice for use
-in compilers. In this book, we often write down the S-expression
-representation of a program even when we really have in mind the AST
-because the S-expression is more concise.  We recommend that, in your
-mind, you always think of programs as abstract syntax trees.
+in compilers.
+
+In this book, we often write down the concrete syntax of a program
+even when we really have in mind the AST because the concrete syntax
+is more concise.  We recommend that, in your mind, you always think of
+programs as abstract syntax trees.
 
 \section{Grammars}
 \label{sec:grammar}
@@ -415,13 +448,17 @@ A programming language can be thought of as a \emph{set} of programs.
 The set is typically infinite (one can always create larger and larger
 programs), so one cannot simply describe a language by listing all of
 the programs in the language. Instead we write down a set of rules, a
-\emph{grammar}, for building programs. We shall write our rules in a
-variant of Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
-As an example, we describe a small language, named $R_0$, that
-consists of integers and arithmetic operations. The first grammar rule
-says that any integer is an expression:
+\emph{grammar}, for building programs. Grammars are often used to
+define the concrete syntax of a language, but they can also be used to
+describe the abstract syntax. We shall write our rules in a variant of
+Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.  As an
+example, we describe a small language, named $R_0$, that consists of
+integers and arithmetic operations.
+
+The first grammar rule says that given any integer $n$, an integer
+node $\INT{n}$ is an expression:
 \begin{equation}
-\Exp ::= \Int  \label{eq:arith-int}
+\Exp ::= \INT{n}  \label{eq:arith-int}
 \end{equation}
 %
 Each rule has a left-hand-side and a right-hand-side. The way to read
@@ -432,40 +469,39 @@ according to the left-hand-side.
 A name such as $\Exp$ that is
 defined by the grammar rules is a \emph{non-terminal}.
 %
-The name $\Int$ is a also a non-terminal, however, we do not define
-$\Int$ because the reader already knows what an integer is.
-%
-Further, we make the simplifying design decision that all of the languages in
-this book only handle machine-representable integers.  On most modern machines
-this corresponds to integers represented with 64-bits, i.e., the in range
-$-2^{63}$ to $2^{63}-1$.
-%
-However, we restrict this range further to match the Racket \texttt{fixnum}
-datatype, which allows 63-bit integers on a 64-bit machine.
+%% The name $\Int$ is a also a non-terminal, however, we do not define
+%% $\Int$ because the reader already knows what an integer is.
+
+We make the simplifying design decision that all of the languages in
+this book only handle machine-representable integers.  On most modern
+machines this corresponds to integers represented with 64 bits, i.e.,
+in the range $-2^{63}$ to $2^{63}-1$.  We restrict this range further
+to match the Racket \texttt{fixnum} datatype, which allows 63-bit
+integers on a 64-bit machine.
 
 The second grammar rule is the \texttt{read} operation that receives
 an input integer from the user of the program.
 \begin{equation}
-  \Exp ::= (\key{read}) \label{eq:arith-read}
+  \Exp ::= \READ{} \label{eq:arith-read}
 \end{equation}
 
 The third rule says that, given an $\Exp$ node, you can build another
 $\Exp$ node by negating it.
 \begin{equation}
-  \Exp ::= (\key{-} \; \Exp)  \label{eq:arith-neg}
+  \Exp ::= \NEG{\Exp}  \label{eq:arith-neg}
 \end{equation}
 Symbols in typewriter font such as \key{-} and \key{read} are
 \emph{terminal} symbols and must literally appear in the program for
 the rule to be applicable.
 
 We can apply the rules to build ASTs in the $R_0$
-language. For example, by rule \eqref{eq:arith-int}, \texttt{8} is an
+language. For example, by rule \eqref{eq:arith-int}, \texttt{(Int 8)} is an
 $\Exp$, then by rule \eqref{eq:arith-neg}, the following AST is
 an $\Exp$.
 \begin{center}
-\begin{minipage}{0.25\textwidth}
+\begin{minipage}{0.4\textwidth}
 \begin{lstlisting}
-(- 8)
+(Prim '- (list (Int 8)))
 \end{lstlisting}
 \end{minipage}
 \begin{minipage}{0.25\textwidth}
@@ -483,26 +519,37 @@ an $\Exp$.
 
 The next grammar rule defines addition expressions:
 \begin{equation}
-  \Exp ::= (\key{+} \; \Exp \; \Exp) \label{eq:arith-add}
+  \Exp ::= \ADD{\Exp}{\Exp} \label{eq:arith-add}
 \end{equation}
-We can now see that the AST \eqref{eq:arith-prog} is an $\Exp$ in
-$R_0$.  We know that \lstinline{(read)} is an $\Exp$ by rule
-\eqref{eq:arith-read} and we have shown that \texttt{(- 8)} is an
-$\Exp$, so we can apply rule \eqref{eq:arith-add} to show that
-\texttt{(+ (read) (- 8))} is an $\Exp$ in the $R_0$ language.
+We can now justify that the AST \eqref{eq:arith-prog} is an $\Exp$ in
+$R_0$.  We know that \lstinline{(Prim 'read '())} is an $\Exp$ by rule
+\eqref{eq:arith-read} and we have already shown that \code{(Prim '-
+  (list (Int 8)))} is an $\Exp$, so we apply rule \eqref{eq:arith-add}
+to show that
+\begin{lstlisting}
+(Prim '+ (list (Prim 'read '()) (Prim '- (list (Int 8)))))
+\end{lstlisting}
+is an $\Exp$ in the $R_0$ language.
 
 If you have an AST for which the above rules do not apply, then the
-AST is not in $R_0$. For example, the AST \texttt{(- (read) (+ 8))} is
-not in $R_0$ because there are no rules for \key{+} with only one
-argument, nor for \key{-} with two arguments.  Whenever we define a
-language with a grammar, we mean for the language to only include
-those programs that are justified by the rules.
+AST is not in $R_0$. For example, the program \code{(- (read) (+ 8))}
+is not in $R_0$ because there are no rules for \code{+} with only one
+argument, nor for \key{-} with two arguments. Whenever we define a
+language with a grammar, the language only includes those programs
+that are justified by the rules.
 
-The last grammar rule for $R_0$ states that there is a \key{program}
+The last grammar rule for $R_0$ states that there is a \code{Program}
 node to mark the top of the whole program:
 \[
-  R_0 ::= (\key{program} \; \Exp)
+  R_0 ::= \PROGRAM{\code{'()}}{\Exp}
 \]
+The \code{Program} structure is defined as follows
+\begin{lstlisting}
+(struct Program (info body))
+\end{lstlisting}
+where \code{body} is an expression. In later chapters, the \code{info}
+part will be used to store auxiliary information, but for now it is
+just the empty list.
 
 The \code{read-program} function provided in \code{utilities.rkt}
 reads programs in from a file (the sequence of characters in the
@@ -523,9 +570,9 @@ called an {\em alternative}.
 \begin{minipage}{0.96\textwidth}
 \[
 \begin{array}{rcl}
-\Exp &::=& \Int \mid ({\tt \key{read}}) \mid (\key{-} \; \Exp) \mid
-   (\key{+} \; \Exp \; \Exp)  \\
-R_0  &::=& (\key{program} \; \Exp)
+\Exp &::=& \INT{n} \mid \READ{} \mid \NEG{\Exp} \\
+     &\mid&  \ADD{\Exp}{\Exp}  \\
+R_0  &::=& \PROGRAM{\code{'()}}{\Exp}
 \end{array}
 \]
 \end{minipage}
@@ -542,16 +589,14 @@ R_0  &::=& (\key{program} \; \Exp)
 
 As mentioned above, compilers often need to access the children of an
 AST node. Racket provides the \texttt{match} form to access the parts
-of an S-expression. Consider the following example and the output on
-the right.
+of a structure. Consider the following example and the output on the
+right.
 \begin{center}
 \begin{minipage}{0.5\textwidth}
 \begin{lstlisting}
 (match ast1.1
-  [`(,op ,child1 ,child2)
-    (print op) (newline)
-    (print child1) (newline)
-    (print child2)])
+  [(Prim op (list child1 child2))
+    (print op)])
 \end{lstlisting}
 \end{minipage}
 \vrule
@@ -560,8 +605,6 @@ the right.
 
 
    '+
-   '(read)
-   '(- 8)
 \end{lstlisting}
 \end{minipage}
 \end{center}
@@ -581,25 +624,22 @@ clause may contain any Racket code whatsoever.
 A \code{match} form may contain several clauses, as in the following
 function \code{leaf?} that recognizes when an $R_0$ node is
 a leaf. The \code{match} proceeds through the clauses in order,
-checking whether the pattern can match the input S-expression. The
+checking whether the pattern can match the input AST. The
 body of the first clause that matches is executed. The output of
-\code{leaf?} for several S-expressions is shown on the right. In the
-below \code{match}, we see another form of pattern: the
-pattern \code{(? fixnum?)} applies the predicate \code{fixnum?} to the input
-S-expression to see if it is a machine-representable integer.
+\code{leaf?} for several ASTs is shown on the right.
 \begin{center}
-\begin{minipage}{0.5\textwidth}
+\begin{minipage}{0.6\textwidth}
 \begin{lstlisting}
 (define (leaf? arith)
   (match arith
-    [(? fixnum?) #t]
-    [`(read) #t]
-    [`(- ,c1) #f]
-    [`(+ ,c1 ,c2) #f]))
+    [(Int n) #t]
+    [(Prim 'read '()) #t]
+    [(Prim '- (list c1)) #f]
+    [(Prim '+ (list c1 c2)) #f]))
 
-(leaf? `(read))
-(leaf? `(- 8))
-(leaf? `(+ (read) (- 8)))
+(leaf? (Prim 'read '()))
+(leaf? (Prim '- (list (Int 8))))
+(leaf? (Int 8))
 \end{lstlisting}
 \end{minipage}
 \vrule
@@ -611,10 +651,10 @@ S-expression to see if it is a machine-representable integer.
 
 
 
-
+    
    #t
    #f
-   #f
+   #t
 \end{lstlisting}
 \end{minipage}
 \end{center}
@@ -626,14 +666,13 @@ match against, then we make sure that 1) we have one clause for each
 alternative of that non-terminal and 2) that the pattern in each
 clause corresponds to the corresponding right-hand side of a grammar
 rule. For the \code{match} in the \code{leaf?} function, we refer to
-the grammar for $R\_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
+the grammar for $R_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
 non-terminal has 4 alternatives, so the \code{match} has 4 clauses.
 The pattern in each clause corresponds to the right-hand side of a
-grammar rule. For example, the pattern \code{`(+ ,c1 ,c2)} corresponds
-to the right-hand side $(\key{+} \; \Exp \; \Exp)$. When translating
+grammar rule. For example, the pattern \code{(Prim '+ (list c1 c2))}
+corresponds to the right-hand side $\ADD{\Exp}{\Exp}$. When translating
 from grammars to patterns, replace non-terminals such as $\Exp$ with
-pattern variables (a comma followed by a variable name of your
-choice).
+pattern variables (e.g., \code{c1} and \code{c2}).
 
 
 \section{Recursion}
@@ -662,22 +701,24 @@ one recursive function to handle each non-terminal in the grammar.
 \begin{center}
 \begin{minipage}{0.7\textwidth}
 \begin{lstlisting}
-(define (exp? sexp)
-  (match sexp
-    [(? fixnum?) #t]
-    [`(read) #t]
-    [`(- ,e) (exp? e)]
-    [`(+ ,e1 ,e2)
+(define (exp? ast)
+  (match ast
+    [(Int n) #t]
+    [(Prim 'read '()) #t]
+    [(Prim '- (list e)) (exp? e)]
+    [(Prim '+ (list e1 e2))
       (and (exp? e1) (exp? e2))]
     [else #f]))
 
-(define (R0? sexp)
-  (match sexp
-    [`(program ,e) (exp? e)]
+(define (R0? ast)
+  (match ast
+    [(Program '() e) (exp? e)]
     [else #f]))
 
-(R0? `(program (+ (read) (- 8))))
-(R0? `(program (- (read) (+ 8))))
+(R0? (Program '() ast1.1))
+(R0? (Program '()
+       (Prim '- (list (Prim 'read '())
+                      (Prim '+ (list (Int 8)))))))
 \end{lstlisting}
 \end{minipage}
 \vrule
@@ -696,7 +737,6 @@ one recursive function to handle each non-terminal in the grammar.
 
 
 
-
    #t
    #f
 \end{lstlisting}
@@ -708,13 +748,13 @@ You may be tempted to merge the two functions into one, like this:
 \begin{center}
 \begin{minipage}{0.5\textwidth}
 \begin{lstlisting}
-(define (R0? sexp)
-  (match sexp
-    [(? fixnum?) #t]
-    [`(read) #t]
-    [`(- ,e) (R0? e)]
-    [`(+ ,e1 ,e2) (and (R0? e1) (R0? e2))]
-    [`(program ,e) (R0? e)]
+(define (R0? ast)
+  (match ast
+    [(Int n) #t]
+    [(Prim 'read '()) #t]
+    [(Prim '- (list e)) (R0? e)]
+    [(Prim '+ (list e1 e2)) (and (R0? e1) (R0? e2))]
+    [(Program '() e) (R0? e)]
     [else #f]))
 \end{lstlisting}
 \end{minipage}
@@ -725,7 +765,7 @@ to the {\tt program} wrapper.  Yet this style is generally \emph{not}
 recommended because it can get you into trouble.
 %
 For instance, the above function is subtly wrong:
-\lstinline{(R0? `(program (program 3)))} will return true, when it
+\lstinline{(R0? (Program '() (Program '() (Int 3))))} will return true, when it
 should return false.
 
 %% NOTE FIXME - must check for consistency on this issue throughout.
@@ -754,18 +794,24 @@ clause per grammar rule for $R_0$ expressions.
 \begin{lstlisting}
 (define (interp-exp e)
   (match e
-    [(? fixnum?) e]
-    [`(read)
-     (let ([r (read)])
-       (cond [(fixnum? r) r]
-             [else (error 'interp-R0 "input not an integer" r)]))]
-    [`(- ,e1)     (fx- 0 (interp-exp e1))]
-    [`(+ ,e1 ,e2) (fx+ (interp-exp e1) (interp-exp e2))]
-    ))
+    [(Int n) n]
+    [(Prim 'read '())
+     (define r (read))
+     (cond [(fixnum? r) r]
+           [else (error 'interp-R0 "expected an integer" r)])]
+    [(Prim '- (list e))
+     (define v (interp-exp e))
+     (fx- 0 v)]
+    [(Prim '+ (list e1 e2))
+     (define v1 (interp-exp e1))
+     (define v2 (interp-exp e2))
+     (fx+ v1 v2)]
+    ))
 
 (define (interp-R0 p)
   (match p
-    [`(program ,e) (interp-exp e)]))
+    [(Program '() e) (interp-exp e)]
+    ))
 \end{lstlisting}
 \caption{Interpreter for the $R_0$ language.}
 \label{fig:interp-R0}
@@ -776,8 +822,11 @@ following program adds two integers.
 \begin{lstlisting}
 (+ 10 32)
 \end{lstlisting}
-The result is \key{42}.  (We wrote the above program in concrete syntax,
-whereas the parsed abstract syntax is \lstinline{(program (+ 10 32))}.)
+The result is \key{42}.  We wrote the above program in concrete syntax,
+whereas the parsed abstract syntax is:
+\begin{lstlisting}
+(Program '() (Prim '+ (list (Int 10) (Int 32))))
+\end{lstlisting}
 
 The next example demonstrates that expressions may be nested within
 each other, in this case nesting several additions and negations.
@@ -789,11 +838,11 @@ What is the result of the above program?
 As mentioned previously, the $R_0$ language does not support
 arbitrarily-large integers, but only $63$-bit integers, so we
 interpret the arithmetic operations of $R_0$ using fixnum arithmetic
-in Racket.  What happens when we run the following program?
+in Racket.
+Suppose $n = 999999999999999999$, which indeed fits in $63$ bits.
+What happens when we run the following program in our interpreter?
 \begin{lstlisting}
-(define large 999999999999999999)
-(interp-R0 `(program (+ (+ (+ ,large ,large) (+ ,large ,large))
-                        (+ (+ ,large ,large) (+ ,large ,large)))))
+(+ (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)) (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)))
 \end{lstlisting}
 It produces an error:
 \begin{lstlisting}
@@ -816,7 +865,8 @@ program \eqref{eq:arith-prog} performs a \key{read} and then subtracts
 (interp-R0 ast1.1)
 \end{lstlisting}
 and input the integer \code{50}, we get the answer to life, the
-universe, and everything: \code{42}.
+universe, and everything: \code{42}!\footnote{\emph{The Hitchhiker's
+    Guide to the Galaxy} by Douglas Adams.}
 
 We include the \key{read} operation in $R_0$ so a clever student
 cannot implement a compiler for $R_0$ that simply runs the interpreter
@@ -876,24 +926,26 @@ functions is the output of partially evaluating the children.
 \begin{figure}[tbp]
 \begin{lstlisting}
 (define (pe-neg r)
-  (cond [(fixnum? r) (fx- 0 r)]
-        [else `(- ,r)]))
+  (match r
+    [(Int n) (Int (fx- 0 n))]
+    [else (Prim '- (list r))]))
 
 (define (pe-add r1 r2)
-  (cond [(and (fixnum? r1) (fixnum? r2)) (fx+ r1 r2)]
-        [else `(+ ,r1 ,r2)]))
+  (match* (r1 r2)
+    [((Int n1) (Int n2)) (Int (fx+ n1 n2))]
+    [(_ _) (Prim '+ (list r1 r2))]))
 
 (define (pe-exp e)
   (match e
-    [(? fixnum?) e]
-    [`(read) `(read)]
-    [`(- ,e1) (pe-neg (pe-exp e1))]
-    [`(+ ,e1 ,e2) (pe-add (pe-exp e1) (pe-exp e2))]
+    [(Int n) (Int n)]
+    [(Prim 'read '()) (Prim 'read '())]
+    [(Prim '- (list e1)) (pe-neg (pe-exp e1))]
+    [(Prim '+ (list e1 e2)) (pe-add (pe-exp e1) (pe-exp e2))]
     ))
 
 (define (pe-R0 p)
   (match p
-    [`(program ,e) `(program ,(pe-exp e))]
+    [(Program info e) (Program info (pe-exp e))]
     ))
 \end{lstlisting}
 \caption{A partial evaluator for $R_0$ expressions.}
@@ -911,16 +963,17 @@ test whether it produces programs that get the same result as the
 input programs. That is, we can test whether it satisfies Diagram
 \eqref{eq:compile-correct}. The following code runs the partial
 evaluator on several examples and tests the output program.  The
-\texttt{assert} function is defined in Appendix~\ref{appendix:utilities}.\\
+\texttt{parse-program} and \texttt{assert} functions are defined in
+Appendix~\ref{appendix:utilities}.\\
 \begin{minipage}{1.0\textwidth}
 \begin{lstlisting}
 (define (test-pe p)
   (assert "testing pe-R0"
      (equal? (interp-R0 p) (interp-R0 (pe-R0 p)))))
 
-(test-pe `(+ (read) (- (+ 5 3))))
-(test-pe `(+ 1 (+ (read) 1)))
-(test-pe `(- (+ (read) (- 5))))
+(test-pe (parse-program `(program () (+ 10 (- (+ 5 3))))))
+(test-pe (parse-program `(program () (+ 1 (+ 3 1)))))
+(test-pe (parse-program `(program () (- (+ 3 (- 5))))))
 \end{lstlisting}
 \end{minipage}
 
@@ -7778,7 +7831,7 @@ registers.
 %%  LocalWords:  Sarkar lcl Matz aa representable Chez Ph Dan's nano
 %%  LocalWords:  fk bh Siek plt uq Felleisen Bor Yuh ASTs AST Naur eq
 %%  LocalWords:  BNF fixnum datatype arith prog backquote quasiquote
-%%  LocalWords:  ast sexp Reynold's reynolds interp cond fx evaluator
+%%  LocalWords:  ast Reynold's reynolds interp cond fx evaluator
 %%  LocalWords:  quasiquotes pe nullary unary rcl env lookup gcc rax
 %%  LocalWords:  addq movq callq rsp rbp rbx rcx rdx rsi rdi subq nx
 %%  LocalWords:  negq pushq popq retq globl Kernighan uniquify lll ve

+ 7 - 4
defs.tex

@@ -17,7 +17,10 @@
 \newcommand{\Op}{\itm{op}}
 \newcommand{\key}[1]{\texttt{#1}}
 \newcommand{\code}[1]{\texttt{#1}}
-\newcommand{\READ}{(\key{read})}
+\newcommand{\READ}{\key{(Prim}\;\code{'read}\;\key{'())}}
+\newcommand{\NEG}[1]{\key{(Prim}\;\code{'-}\;\code{(list}\;#1\;\code{))}}
+\newcommand{\PROGRAM}[2]{\code{(Program}\;#1\;#2\code{)}}
+\newcommand{\ADD}[2]{\key{(Prim}\;\code{'+}\;\code{(list}\;#1\;#2\code{))}}
 \newcommand{\UNIOP}[2]{(\key{#1}~#2)}
 \newcommand{\BINOP}[3]{(\key{#1}~#2~#3)}
 \newcommand{\LET}[3]{(\key{let}~([#1\;#2])~#3)}
@@ -25,9 +28,9 @@
 \newcommand{\ASSIGN}[2]{(\key{assign}~#1\;#2)}
 \newcommand{\RETURN}[1]{(\key{return}~#1)}
 
-\newcommand{\INT}[1]{(\key{int}\;#1)}
-\newcommand{\REG}[1]{(\key{reg}\;#1)}
-\newcommand{\VAR}[1]{(\key{var}\;#1)}
+\newcommand{\INT}[1]{\key{(Int}\;#1\key{)}}
+\newcommand{\REG}[1]{\key{(Reg}\;#1\key{)}}
+\newcommand{\VAR}[1]{\key{(Var}\;#1\key{)}}
 \newcommand{\STACKLOC}[1]{(\key{stack}\;#1)}
 
 \newcommand{\IF}[3]{(\key{if}\,#1\;#2\;#3)}