Jeremy Siek 4 years ago
parent
commit
53cb113cab
1 changed file with 64 additions and 53 deletions

+ 64 - 53
book.tex

@@ -416,9 +416,9 @@ programs), so one cannot simply describe a language by listing all of
 the programs in the language. Instead we write down a set of rules, a
 \emph{grammar}, for building programs. We shall write our rules in a
 variant of Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
-As an example, we describe a small language, named $R_0$, of
-integers and arithmetic operations. The first rule says that any
-integer is an expression, $\Exp$, in the language:
+As an example, we describe a small language, named $R_0$, that
+consists of integers and arithmetic operations. The first grammar rule
+says that any integer ($\Int$) is an expression ($\Exp$):
 \begin{equation}
 \Exp ::= \Int  \label{eq:arith-int}
 \end{equation}
@@ -431,9 +431,8 @@ according to the left-hand-side.
 A name such as $\Exp$ that is
 defined by the grammar rules is a \emph{non-terminal}.
 %
-The name $\Int$ is a also a non-terminal, however,
-we do not define $\Int$ because the
-reader already knows what an integer is.
+The name $\Int$ is also a non-terminal; however, we do not define
+$\Int$ because the reader already knows what an integer is.
 %
 Further, we make the simplifying design decision that all of the languages in
 this book only handle machine-representable integers.  On most modern machines
@@ -454,9 +453,9 @@ $\Exp$ node by negating it.
 \begin{equation}
   \Exp ::= (\key{-} \; \Exp)  \label{eq:arith-neg}
 \end{equation}
-Symbols such as \key{-} in typewriter font are \emph{terminal} symbols
-and must literally appear in the program for the rule to be
-applicable.
+Symbols in typewriter font such as \key{-} and \key{read} are
+\emph{terminal} symbols and must literally appear in the program for
+the rule to be applicable.
 
 We can apply the rules to build ASTs in the $R_0$
 language. For example, by rule \eqref{eq:arith-int}, \texttt{8} is an
@@ -481,11 +480,11 @@ an $\Exp$.
 \end{minipage}
 \end{center}
 
-The following grammar rule defines addition expressions:
+The next grammar rule defines addition expressions:
 \begin{equation}
   \Exp ::= (\key{+} \; \Exp \; \Exp) \label{eq:arith-add}
 \end{equation}
-Now we can see that the AST \eqref{eq:arith-prog} is an $\Exp$ in
+We can now see that the AST \eqref{eq:arith-prog} is an $\Exp$ in
 $R_0$.  We know that \lstinline{(read)} is an $\Exp$ by rule
 \eqref{eq:arith-read} and we have shown that \texttt{(- 8)} is an
 $\Exp$, so we can apply rule \eqref{eq:arith-add} to show that
@@ -495,9 +494,8 @@ If you have an AST for which the above rules do not apply, then the
 AST is not in $R_0$. For example, the AST \texttt{(- (read) (+ 8))} is
 not in $R_0$ because there are no rules for \key{+} with only one
 argument, nor for \key{-} with two arguments.  Whenever we define a
-language with a grammar, we implicitly mean for the language to be the
-smallest set of programs that are justified by the rules. That is, the
-language only includes those programs that the rules allow.
+language with a grammar, we mean for the language to only include
+those programs that are justified by the rules.
 
 The last grammar rule for $R_0$ states that there is a \key{program}
 node to mark the top of the whole program:
@@ -541,11 +539,10 @@ R_0  &::=& (\key{program} \; \Exp)
 \section{Pattern Matching}
 \label{sec:pattern-matching}
 
-As mentioned above, one of the operations that a compiler needs to
-perform on an AST is to access the children of a node.  Racket
-provides the \texttt{match} form to access the parts of an
-S-expression. Consider the following example and the output on the
-right.
+As mentioned above, compilers often need to access the children of an
+AST node. Racket provides the \texttt{match} form to access the parts
+of an S-expression. Consider the following example and the output on
+the right.
 \begin{center}
 \begin{minipage}{0.5\textwidth}
 \begin{lstlisting}
@@ -571,24 +568,23 @@ The \texttt{match} form takes AST \eqref{eq:arith-prog} and binds its
 parts to the three variables \texttt{op}, \texttt{child1}, and
 \texttt{child2}. In general, a match clause consists of a
 \emph{pattern} and a \emph{body}. The pattern is a quoted S-expression
-that may contain pattern-variables (each one preceded by a comma).
+that may also contain pattern-variables (each one preceded by a comma).
 %
 The pattern is not the same thing as a quasiquote expression used to
-\emph{construct} ASTs, however, the similarity is intentional: constructing and
-deconstructing ASTs uses similar syntax.
+\emph{construct} ASTs; however, the similarity is intentional:
+constructing and deconstructing ASTs use similar syntax.
 %
-While the pattern uses a restricted syntax,
-the body of the match clause may contain any Racket code whatsoever.
-
+While the pattern uses a restricted syntax, the body of the match
+clause may contain any Racket code whatsoever.
 
-A \texttt{match} form may contain several clauses, as in the following
-function \texttt{leaf?} that recognizes when an $R_0$ node is
-a leaf. The \texttt{match} proceeds through the clauses in order,
+A \code{match} form may contain several clauses, as in the following
+function \code{leaf?} that recognizes when an $R_0$ node is
+a leaf. The \code{match} proceeds through the clauses in order,
 checking whether the pattern can match the input S-expression. The
 body of the first clause that matches is executed. The output of
-\texttt{leaf?} for several S-expressions is shown on the right. In the
-below \texttt{match}, we see another form of pattern: the \texttt{(?
-  fixnum?)} applies the predicate \texttt{fixnum?} to the input
+\code{leaf?} for several S-expressions is shown on the right. In the
+\code{match} below, we see another form of pattern: the
+pattern \code{(? fixnum?)} applies the predicate \code{fixnum?} to the input
 S-expression to see if it is a machine-representable integer.
 \begin{center}
 \begin{minipage}{0.5\textwidth}
@@ -623,24 +619,45 @@ S-expression to see if it is a machine-representable integer.
 \end{center}
 
 
+When writing a \code{match}, we always refer to the grammar definition
+for the language and identify which non-terminal we expect to match
+against. We then make sure that 1) there is one clause for each
+alternative of that non-terminal and 2) the pattern in each clause
+follows the right-hand side of the corresponding grammar
+rule. For the \code{match} in the \code{leaf?} function, we refer to
+the grammar for $R_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
+non-terminal has 4 alternatives, so the \code{match} has 4 clauses.
+The pattern in each clause corresponds to the right-hand side of a
+grammar rule. For example, the pattern \code{`(+ ,c1 ,c2)} corresponds
+to the right-hand side $(\key{+} \; \Exp \; \Exp)$. When translating
+from grammars to patterns, replace non-terminals such as $\Exp$ with
+pattern variables (a comma followed by a variable name of your
+choice).
+
+
 \section{Recursion}
 \label{sec:recursion}
 
-Programs are inherently recursive in that an $R_0$ expression ($\Exp$)
-is made up of smaller expressions. Thus, the natural way to process an
-entire program is with a recursive function.  As a first example of
-such a function, we define \texttt{exp?} below, which takes an
-arbitrary S-expression, {\tt sexp}, and determines whether or not {\tt
-  sexp} is an $R_0$ expression. Note that each match clause
-corresponds to one grammar rule the body of each clause makes a
-recursive call for each child node. This pattern of recursive function
-is so common that it has a name, \emph{structural recursion}.  In
-general, when a recursive function is defined using a sequence of
-match clauses that correspond to a grammar, and each clause body makes
-a recursive call on each child node, then we say the function is
-defined by structural recursion. Below we also define a second
-function, named \code{R0?}, determines whether an S-expression is an
-$R_0$ program.
+Programs are inherently recursive. For example, an $R_0$ expression
+($\Exp$) is often made of smaller expressions. Thus, the natural way
+to process an entire program is with a recursive function.  As a first
+example of such a recursive function, we define \texttt{exp?} below,
+which takes an arbitrary S-expression, {\tt sexp}, and determines
+whether or not {\tt sexp} is an $R_0$ expression. As discussed in the
+previous section, each match clause corresponds to one grammar rule.
+The body of each clause makes a recursive call for each child
+node. This kind of recursive function is so common that it has a name:
+\emph{structural recursion}.  In general, when a recursive function is
+defined using a sequence of match clauses that correspond to a
+grammar, and the body of each clause makes a recursive call on each
+child node, then we say the function is defined by structural
+recursion\footnote{This principle of structuring code according to the
+  data definition is advocated in the book \emph{How to Design
+    Programs}
+  \url{http://www.ccs.neu.edu/home/matthias/HtDP2e/}.}. Below we also
+define a second function, named \code{R0?}, that determines whether an
+S-expression is an $R_0$ program.  In general, we can expect to write
+one recursive function to handle each non-terminal in the grammar.
 %
 \begin{center}
 \begin{minipage}{0.7\textwidth}
@@ -686,14 +703,8 @@ $R_0$ program.
 \end{minipage}
 \end{center}
 
-Indeed, the structural recursion follows the grammar itself.  We can
-generally expect to write a recursive function to handle each
-non-terminal in the grammar.\footnote{This principle of structuring
-  code according to the data definition is advocated in the book
-  \emph{How to Design Programs}
-  \url{http://www.ccs.neu.edu/home/matthias/HtDP2e/}.}
 
-You may be tempted to write the program with just one function, like this:
+You may be tempted to merge the two functions into one, like this:
 \begin{center}
 \begin{minipage}{0.5\textwidth}
 \begin{lstlisting}