4 years ago · b5730df99b
--- a/book.tex
+++ b/book.tex
@@ -296,29 +296,44 @@ following people.
 
															 \label{ch:trees-recur}
														
 
															 In this chapter we review the basic tools that are needed to implement
														
 
															-a compiler. We use \emph{abstract syntax trees} (ASTs), which are data
														
 
															-structures in computer memory, in contrast to how programs are
														
 
															-typically stored in text files on disk, as \emph{concrete syntax}.
														
 
															+a compiler. Programs are typically input by a programmer as text,
														
 
															+i.e., a sequence of characters. The program-as-text representation is
														
 
															+called \emph{concrete syntax}. We use concrete syntax to concisely
														
 
															+write down and talk about programs. Inside the compiler, we use
														
 
															+\emph{abstract syntax trees} (ASTs) to represent programs in a way
														
 
															+that efficiently supports the operations that the compiler needs to
														
 
															+perform.
														
 
															 %
														
 
															-ASTs can be represented in many different ways, depending on the programming
														
 
															-language used to write the compiler.
														
 
															+The translation from concrete syntax to abstract syntax is a process
														
 
															+called \emph{parsing}~\cite{Aho:1986qf}. We do not cover the theory
														
 
															+and implementation of parsing in this book. A parser is provided in
														
 
															+the supporting materials for translating from concrete syntax to
														
 
															+abstract syntax for the languages used in this book.
														
 
															+
														
 
															+ASTs can be represented in many different ways inside the compiler,
														
 
															+depending on the programming language used to write the compiler.
														
 
															 %
														
 
															-We use Racket's \code{struct} feature to conveniently represent
														
 
															-ASTs (Section~\ref{sec:ast}). We use grammars to defined the abstract
														
 
															-syntax of programming languages (Section~\ref{sec:grammar}) and
														
 
															-pattern matching to inspect individual nodes in an AST
														
 
															+We use Racket's \code{struct} feature to represent ASTs
														
 
															+(Section~\ref{sec:ast}). We use grammars to define the abstract syntax
														
 
															+of programming languages (Section~\ref{sec:grammar}) and pattern
														
 
															+matching to inspect individual nodes in an AST
														
 
															 (Section~\ref{sec:pattern-matching}).  We use recursion to construct
														
 
															 and deconstruct entire ASTs (Section~\ref{sec:recursion}).  This
														
 
															 chapter provides a brief introduction to these ideas.
														
 
															+
														
 
															 \section{Abstract Syntax Trees and Racket Structures}
														
 
															 \label{sec:ast}
														
 
															-The primary data structure that is commonly used for representing
														
 
															-programs is the \emph{abstract syntax tree} (AST). When considering
														
 
															-some part of a program, a compiler needs to ask what kind of thing it
														
 
															-is and what sub-parts it contains. For example, the program on the
														
 
															-left corresponds to the AST on the right.
														
 
															+Compilers use abstract syntax trees to represent programs because
														
 
															+compilers often need to ask questions like: for a given part of a
														
 
															+program, what kind of language feature is it? What are the sub-parts
														
 
															+of this part of the program? Consider the program on the left and its
														
 
															+AST on the right. This program is an addition and it has two
														
 
															+sub-parts, a read operation and a negation. The negation has another
														
 
															+sub-part, the integer constant \code{8}. By using a tree to represent
														
 
															+the program, we can easily follow the links to go from one part of a
														
 
															+program to its sub-parts.
														
 
															 \begin{center}
														
 
															 \begin{minipage}{0.4\textwidth}
														
 
															 \begin{lstlisting}
														
@@ -341,12 +356,12 @@ left corresponds to the AST on the right.
 
															 \end{equation}
														
 
															 \end{minipage}
														
 
															 \end{center}
														
 
															-We shall use the standard terminology for trees: each circle above is
														
 
															-called a \emph{node}. The arrows connect a node to its \emph{children}
														
 
															-(which are also nodes). The top-most node is the \emph{root}.  Every
														
 
															-node except for the root has a \emph{parent} (the node it is the child
														
 
															-of). If a node has no children, it is a \emph{leaf} node.  Otherwise
														
 
															-it is an \emph{internal} node.
														
 
															+We use the standard terminology for trees to describe ASTs: each
														
 
															+circle above is called a \emph{node}. The arrows connect a node to its
														
 
															+\emph{children} (which are also nodes). The top-most node is the
														
 
															+\emph{root}.  Every node except for the root has a \emph{parent} (the
														
 
															+node it is the child of). If a node has no children, it is a
														
 
															+\emph{leaf} node.  Otherwise it is an \emph{internal} node.
														
 
															 %% Recall that an \emph{symbolic expression} (S-expression) is either
														
 
															 %% \begin{enumerate}
														
@@ -442,16 +457,15 @@ structure for each operation, as follows.
 
															 (struct Neg (value))
														
 
															 \end{lstlisting}
														
 
															 The reason we choose to use just one structure is that in many parts
														
 
															-of the compiler, the code for the different primitive operators is the
														
 
															+of the compiler the code for the different primitive operators is the
														
 
															 same, so we might as well just write that code once, which is enabled
														
 
															 by using a single structure.
														
 
															 When compiling a program such as \eqref{eq:arith-prog}, we need to
														
 
															 know that the operation associated with the root node is addition and
														
 
															-that it has two children: \texttt{read} and a negation. The AST data
														
 
															-structure directly supports these queries, as we shall see in
														
 
															-Section~\ref{sec:pattern-matching}, and hence is a good choice for use
														
 
															-in compilers.
														
 
															+we need to be able to access its two children. Racket provides pattern
														
 
															+matching over structures to support these kinds of queries, as we
														
 
															+shall see in Section~\ref{sec:pattern-matching}.
														
 
															 In this book, we often write down the concrete syntax of a program
														
 
															 even when we really have in mind the AST because the concrete syntax
														
@@ -472,8 +486,8 @@ Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.  As an
 
															 example, we describe a small language, named $R_0$, that consists of
														
 
															 integers and arithmetic operations.
														
 
															-The first grammar rule says that an instance of the \code{Int}
														
 
															-structure is an expression:
														
 
															+The first grammar rule for the abstract syntax of $R_0$ says that an
														
 
															+instance of the \code{Int} structure is an expression:
														
 
															 \begin{equation}
														
 
															 \Exp ::= \INT{\Int}  \label{eq:arith-int}
														
 
															 \end{equation}
														
@@ -570,19 +584,39 @@ where \code{body} is an expression. In later chapters, the \code{info}
 
															 part will be used to store auxiliary information but for now it is
														
 
															 just the empty list.
														
 
															-The \code{read-program} function provided in \code{utilities.rkt}
														
 
															-reads programs in from a file (the sequence of characters in the
														
 
															-concrete syntax of Racket) and parses them into the abstract syntax
														
 
															-tree. The concrete syntax does not include a \key{program} form; that
														
 
															-is added by the \code{read-program} function as it creates the
														
 
															-AST. See the description of \code{read-program} in
														
 
															+It is common to have many grammar rules with the same left-hand side
														
 
															+but different right-hand sides, such as the rules for $\Exp$ in the
														
 
															+grammar of $R_0$. As a short-hand, a vertical bar can be used to
														
 
															+combine several right-hand sides into a single rule.
														
 
															+
														
 
															+We collect all of the grammar rules for the abstract syntax of $R_0$
														
 
															+in Figure~\ref{fig:r0-syntax}. The concrete syntax for $R_0$ is
														
 
															+defined in Figure~\ref{fig:r0-concrete-syntax}.
														
 
															+
														
 
															+The \code{read-program} function provided in \code{utilities.rkt} of
														
 
															+the support materials reads a program in from a file (the sequence of
														
 
															+characters in the concrete syntax of Racket) and parses it into an
														
 
															+abstract syntax tree. See the description of \code{read-program} in
														
 
															 Appendix~\ref{appendix:utilities} for more details.
														
 
															-It is common to have many rules with the same left-hand side, such as
														
 
															-$\Exp$ in the grammar for $R_0$, so there is a vertical bar notation
														
 
															-for gathering several rules, as shown in
														
 
															-Figure~\ref{fig:r0-syntax}. Each clause between a vertical bar is
														
 
															-called an {\em alternative}.
														
 
															+
														
 
															+\begin{figure}[tp]
														
 
															+\fbox{
														
 
															+\begin{minipage}{0.96\textwidth}
														
 
															+\[
														
 
															+\begin{array}{rcl}
														
 
															+\begin{array}{rcl}
														
 
															+  \Exp &::=& \Int \mid (\key{read}) \mid (\key{-}\;\Exp) \mid (\key{+} \; \Exp\;\Exp)
														
 
															+        \mid (\key{-}\;\Exp\;\Exp) \\
														
 
															+  R_0 &::=& \Exp
														
 
															+\end{array}
														
 
															+\end{array}
														
 
															+\]
														
 
															+\end{minipage}
														
 
															+}
														
 
															+\caption{The concrete syntax of $R_0$.}
														
 
															+\label{fig:r0-concrete-syntax}
														
 
															+\end{figure}
														
 
															 \begin{figure}[tp]
														
 
															 \fbox{
														
@@ -596,20 +630,18 @@ R_0  &::=& \PROGRAM{\code{'()}}{\Exp}
 
															 \]
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{The abstract syntax of $R_0$, a language of integer arithmetic.}
														
 
															+\caption{The abstract syntax of $R_0$.}
														
 
															 \label{fig:r0-syntax}
														
 
															 \end{figure}
														
 
															-
														
 
															-
														
 
															 \section{Pattern Matching}
														
 
															 \label{sec:pattern-matching}
														
 
															-As mentioned above, compilers often need to access the children of an
														
 
															-AST node. Racket provides the \texttt{match} form to access the parts
														
 
															-of a structure. Consider the following example and the output on the
														
 
															-right.
														
 
															+As mentioned in Section~\ref{sec:ast}, compilers often need to access
														
 
															+the parts of an AST node. Racket provides the \texttt{match} form to
														
 
															+access the parts of a structure. Consider the following example and
														
 
															+the output on the right.
														
 
															 \begin{center}
														
 
															 \begin{minipage}{0.5\textwidth}
														
 
															 \begin{lstlisting}
														
@@ -627,18 +659,21 @@ right.
 
															 \end{lstlisting}
														
 
															 \end{minipage}
														
 
															 \end{center}
														
 
															-The \texttt{match} form takes AST \eqref{eq:arith-prog} and binds its
														
 
															-parts to the three variables \texttt{op}, \texttt{child1}, and
														
 
															-\texttt{child2}. In general, a match clause consists of a
														
 
															-\emph{pattern} and a \emph{body}. The pattern is a quoted S-expression
														
 
															-that may also contain pattern-variables (each one preceded by a comma).
														
 
															+In the above example, the \texttt{match} form takes the AST
														
 
															+\eqref{eq:arith-prog} and binds its parts to the three pattern
														
 
															+variables \texttt{op}, \texttt{child1}, and \texttt{child2}. In
														
 
															+general, a match clause consists of a \emph{pattern} and a
														
 
															+\emph{body}. Patterns are recursively defined to be either a pattern
														
 
															+variable, a structure name followed by a pattern for each of the
														
 
															+structure's arguments, or an S-expression (symbols, lists, etc.).
														
 
															+(See Chapter 12 of The Racket
														
 
															+Guide\footnote{\url{https://docs.racket-lang.org/guide/match.html}}
														
 
															+and Chapter 9 of The Racket
														
 
															+Reference\footnote{\url{https://docs.racket-lang.org/reference/match.html}}
														
 
															+for a complete description of \code{match}.)
														
 
															 %
														
 
															-The pattern is not the same thing as a quasiquote expression used to
														
 
															-\emph{construct} ASTs, however, the similarity is intentional:
														
 
															-constructing and deconstructing ASTs uses similar syntax.
														
 
															-%
														
 
															-While the pattern uses a restricted syntax, the body of the match
														
 
															-clause may contain any Racket code whatsoever.
														
 
															+The body of a match clause may contain arbitrary Racket code.  The
														
 
															+pattern variables can be used in the scope of the body.
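To illustrate how patterns nest, here is a minimal sketch. It assumes the \code{Int} and \code{Prim} structures are declared as shown (the field names are our assumption), and the variable \code{ast} is a name chosen only for this example. The pattern combines a structure pattern, a list pattern, and pattern variables, and the body uses the variables \code{op} and \code{n} that the pattern binds.
\begin{lstlisting}
#lang racket
;; Assumed structure definitions (field names are illustrative).
(struct Int (value))
(struct Prim (op args))

;; The AST for (+ (read) (- 8)).
(define ast
  (Prim '+ (list (Prim 'read '()) (Prim '- (list (Int 8))))))

;; A nested pattern: op, left, and n are pattern variables;
;; '- is a quoted symbol that must match exactly.
(match ast
  [(Prim op (list left (Prim '- (list (Int n)))))
   (list op n)])   ; evaluates to '(+ 8)
\end{lstlisting}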
														
 
															 A \code{match} form may contain several clauses, as in the following
														
 
															 function \code{leaf?} that recognizes when an $R_0$ node is
														
@@ -678,20 +713,19 @@ body of the first clause that matches is executed. The output of
 
															 \end{minipage}
														
 
															 \end{center}
														
 
															-
														
 
															-When writing a \code{match}, we always refer to the grammar definition
														
 
															-for the language and identify which non-terminal we're expecting to
														
 
															-match against, then we make sure that 1) we have one clause for each
														
 
															-alternative of that non-terminal and 2) that the pattern in each
														
 
															-clause corresponds to the corresponding right-hand side of a grammar
														
 
															-rule. For the \code{match} in the \code{leaf?} function, we refer to
														
 
															-the grammar for $R_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
														
 
															-non-terminal has 4 alternatives, so the \code{match} has 4 clauses.
														
 
															-The pattern in each clause corresponds to the right-hand side of a
														
 
															-grammar rule. For example, the pattern \code{(Prim '+ (list c1 c2))}
														
 
															-corresponds to the right-hand side $\ADD{\Exp}{\Exp}$. When translating
														
 
															-from grammars to patterns, replace non-terminals such as $\Exp$ with
														
 
															-pattern variables (e.g. \code{c1} and \code{c2}).
														
 
															+When writing a \code{match}, we refer to the grammar definition to
														
 
															+identify which non-terminal we are expecting to match against, then we
														
 
															+make sure that 1) we have one clause for each alternative of that
														
 
															+non-terminal and 2) the pattern in each clause corresponds to the
														
 
															+right-hand side of the respective grammar rule. For the \code{match}
														
 
															+in the \code{leaf?} function, we refer to the grammar for $R_0$ in
														
 
															+Figure~\ref{fig:r0-syntax}. The $\Exp$ non-terminal has 4
														
 
															+alternatives, so the \code{match} has 4 clauses.  The pattern in each
														
 
															+clause corresponds to the right-hand side of a grammar rule. For
														
 
															+example, the pattern \code{(Prim '+ (list c1 c2))} corresponds to the
														
 
															+right-hand side $\ADD{\Exp}{\Exp}$. When translating from grammars to
														
 
															+patterns, replace non-terminals such as $\Exp$ with pattern variables
														
 
															+of your choice (e.g. \code{c1} and \code{c2}).
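The recipe above can be sketched as follows. This is only an illustration: the function \code{exp-kind} is a hypothetical helper that is not part of the support code, and the structure definitions are assumed to match those used in this chapter. Each of the four clauses corresponds to one alternative of the $\Exp$ non-terminal.
\begin{lstlisting}
#lang racket
;; Assumed structure definitions (field names are illustrative).
(struct Int (value))
(struct Prim (op args))

;; exp-kind: a hypothetical helper with one clause per
;; alternative of the Exp non-terminal.
(define (exp-kind e)
  (match e
    [(Int n) 'integer]                     ; integer constant
    [(Prim 'read '()) 'read]               ; read operation
    [(Prim '- (list c1)) 'negation]        ; unary negation
    [(Prim '+ (list c1 c2)) 'addition]))   ; binary addition

(exp-kind (Prim '+ (list (Prim 'read '())
                         (Prim '- (list (Int 8))))))   ; 'addition
\end{lstlisting}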
														
 
															 \section{Recursion}
														
@@ -701,21 +735,19 @@ Programs are inherently recursive. For example, an $R_0$ expression is
 
															 often made of smaller expressions. Thus, the natural way to process an
														
 
															 entire program is with a recursive function.  As a first example of
														
 
															 such a recursive function, we define \texttt{exp?} below, which takes
														
 
															-an arbitrary S-expression and determines whether or not it is an $R_0$
														
 
															-expression. As discussed in the previous section, each match clause
														
 
															-corresponds to one grammar rule.  The body of each clause makes a
														
 
															-recursive call for each child node. This kind of recursive function is
														
 
															-so common that it has a name: \emph{structural recursion}.  In
														
 
															-general, when a recursive function is defined using a sequence of
														
 
															-match clauses that correspond to a grammar, and the body of each
														
 
															-clause makes a recursive call on each child node, then we say the
														
 
															-function is defined by structural recursion\footnote{This principle of
														
 
															-  structuring code according to the data definition is advocated in
														
 
															-  the book \emph{How to Design Programs}
														
 
															+an arbitrary value and determines whether or not it is an $R_0$
														
 
															+expression.
														
 
															+%
														
 
															+When a recursive function is defined using a sequence of match clauses
														
 
															+that correspond to a grammar, and the body of each clause makes a
														
 
															+recursive call on each child node, then we say the function is defined
														
 
															+by \emph{structural recursion}\footnote{This principle of structuring
														
 
															+  code according to the data definition is advocated in the book
														
 
															+  \emph{How to Design Programs}
														
 
															   \url{http://www.ccs.neu.edu/home/matthias/HtDP2e/}.}. Below we also
														
 
															-define a second function, named \code{R0?}, that determines whether an
														
 
															-S-expression is an $R_0$ program.  In general we can expect to write
														
 
															-one recursive function to handle each non-terminal in the grammar.
														
 
															+define a second function, named \code{R0?}, that determines whether a
														
 
															+value is an $R_0$ program.  In general we can expect to write one
														
 
															+recursive function to handle each non-terminal in a grammar.
														
 
															 %
														
 
															 \begin{center}
														
 
															 \begin{minipage}{0.7\textwidth}
														
@@ -779,13 +811,13 @@ You may be tempted to merge the two functions into one, like this:
 
															 \end{minipage}
														
 
															 \end{center}
														
 
															 %
														
 
															-Sometimes such a trick will save a few lines of code, especially when it comes
														
 
															-to the {\tt program} wrapper.  Yet this style is generally \emph{not}
														
 
															-recommended because it can get you into trouble.
														
 
															+Sometimes such a trick will save a few lines of code, especially when
														
 
															+it comes to the \code{Program} wrapper.  Yet this style is generally
														
 
															+\emph{not} recommended because it can get you into trouble.
														
 
															 %
														
 
															-For instance, the above function is subtly wrong:
														
 
															-\lstinline{(R0? (Program '() (Program '() (Int 3))))} will return true, when it
														
 
															-should return false.
														
 
															+For example, the above function is subtly wrong:
														
 
															+\lstinline{(R0? (Program '() (Program '() (Int 3))))}
														
 
															+will return true, when it should return false.
														
 
															 %% NOTE FIXME - must check for consistency on this issue throughout.
														
@@ -798,16 +830,16 @@ specification of the language. For example, the Scheme language is
 
															 defined in the report by \cite{SPERBER:2009aa}. The Racket language is
														
 
															 defined in its reference manual~\citep{plt-tr}. In this book we use an
														
 
															 interpreter to define the meaning of each language that we consider,
														
 
															-following Reynolds' advice in this
														
 
															-regard~\citep{reynolds72:_def_interp}. An interpreter that is
														
 
															-designated (by some people) as the definition of a language is called
														
 
															-a \emph{definitional interpreter}.  Here we warm up by creating a
														
 
															-definitional interpreter for the $R_0$ language, which serves as a
														
 
															-second example of structural recursion. The \texttt{interp-R0}
														
 
															-function is defined in Figure~\ref{fig:interp-R0}. The body of the
														
 
															-function is a match on the input program followed by a call to the
														
 
															-\lstinline{interp-exp} helper function, which in turn has one match
														
 
															-clause per grammar rule for $R_0$ expressions.
														
 
															+following Reynolds' advice~\citep{reynolds72:_def_interp}. An
														
 
															+interpreter that is designated (by some people) as the definition of a
														
 
															+language is called a \emph{definitional interpreter}.  We warm up by
														
 
															+creating a definitional interpreter for the $R_0$ language, which
														
 
															+serves as a second example of structural recursion. The
														
 
															+\texttt{interp-R0} function is defined in
														
 
															+Figure~\ref{fig:interp-R0}. The body of the function is a match on the
														
 
															+input program followed by a call to the \lstinline{interp-exp} helper
														
 
															+function, which in turn has one match clause per grammar rule for
														
 
															+$R_0$ expressions.
														
 
															 \begin{figure}[tbp]
														
 
															 \begin{lstlisting}
														
@@ -858,8 +890,12 @@ As mentioned previously, the $R_0$ language does not support
 
															 arbitrarily-large integers, but only $63$-bit integers, so we
														
 
															 interpret the arithmetic operations of $R_0$ using fixnum arithmetic
														
 
															 in Racket.
														
 
															-Suppose $n = 999999999999999999$, which indeed fits in $63$-bits.
														
 
															-What happens when we run the following program in our interpreter?
														
 
															+Suppose
														
 
															+\[
														
 
															+  n = 999999999999999999
														
 
															+\]
														
 
															+which indeed fits in $63$ bits.  What happens when we run the
														
 
															+following program in our interpreter?
														
 
															 \begin{lstlisting}
														
 
															 (+ (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)) (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)))
														
 
															 \end{lstlisting}
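As a quick check of the magnitudes involved, assume a signed $63$-bit representation, whose largest value is $2^{62}-1$. The program adds eight copies of $n$, and
\[
  8n = 7999999999999999992 \;>\; 4611686018427387903 = 2^{62}-1,
\]
so the mathematical sum does not fit in $63$ bits even though $n$ itself does.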
														
@@ -883,15 +919,15 @@ program \eqref{eq:arith-prog} performs a \key{read} and then subtracts
 
															 \begin{lstlisting}
														
 
															 (interp-R0 ast1.1)
														
 
															 \end{lstlisting}
														
 
															-and the input the integer \code{50} we get the answer to life, the
														
 
															+and if the input is \code{50}, then we get the answer to life, the
														
 
															 universe, and everything: \code{42}!\footnote{\emph{The Hitchhiker's
														
 
															     Guide to the Galaxy} by Douglas Adams.}
														
 
															 We include the \key{read} operation in $R_0$ so a clever student
														
 
															 cannot implement a compiler for $R_0$ that simply runs the interpreter
														
 
															 during compilation to obtain the output and then generates the trivial
														
 
															-code to produce the output. (Yes, a clever student did this in a
														
 
															-previous version of the course.)
														
 
															+code to produce the output. (Yes, a clever student did this in the
														
 
															+first instance of this course.)
														
 
															 The job of a compiler is to translate a program in one language into a
														
 
															 program in another language so that the output program behaves the
														
@@ -1025,17 +1061,35 @@ code.
 
															 \label{sec:s0}
														
 
															 The $R_1$ language extends the $R_0$ language
														
 
															-(Figure~\ref{fig:r0-syntax}) with variable definitions.  The syntax of
														
 
															-the $R_1$ language is defined by the grammar in
														
 
															-Figure~\ref{fig:r1-syntax}.  The non-terminal \Var{} may be any Racket
														
 
															-identifier. As in $R_0$, \key{read} is a nullary operator, \key{-} is
														
 
															-a unary operator, and \key{+} is a binary operator.  Similar to $R_0$,
														
 
															-the $R_1$ language includes the \key{Program} struct to mark the top
														
 
															-of the program. The $\itm{info}$ field of the \key{Program} struct
														
 
															-contains an \emph{association list} (a list of key-value pairs) that
														
 
															-is used to communicate auxiliary data from one compiler pass the
														
 
															-next. Despite the simplicity of the $R_1$ language, it is rich enough
														
 
															-to exhibit several compilation techniques.
														
 
															+(Figures~\ref{fig:r0-concrete-syntax} and \ref{fig:r0-syntax}) with
														
 
															+variable definitions.  The syntax of the $R_1$ language is defined by
														
 
															+the grammar in Figure~\ref{fig:r1-syntax}.  The non-terminal \Var{}
														
 
															+may be any Racket identifier. As in $R_0$, \key{read} is a nullary
														
 
															+operator, \key{-} is a unary operator, and \key{+} is a binary
														
 
															+operator.  Similar to $R_0$, the $R_1$ language includes the
														
 
															+\key{Program} struct to mark the top of the program. The $\itm{info}$
														
 
															+field of the \key{Program} struct contains an \emph{association list}
														
 
															+(a list of key-value pairs) that is used to communicate auxiliary data
														
 
															+from one compiler pass to the next. Despite the simplicity of the $R_1$
														
 
															+language, it is rich enough to exhibit several compilation techniques.
														
 
															+
														
 
															+\begin{figure}[btp]
														
 
															+\centering
														
 
															+\fbox{
														
 
															+\begin{minipage}{0.96\textwidth}
														
 
															+\[
														
 
															+\begin{array}{rcl}
														
 
															+  \Exp &::=& \Int \mid (\key{read}) \mid (\key{-}\;\Exp) \mid (\key{+} \; \Exp\;\Exp)
														
 
															+        \mid (\key{-}\;\Exp\;\Exp) \\
														
 
															+     &\mid&  \Var \mid \key{(let}~\key{([}\Var ~\Exp \key{])}~ \Exp \key{)} \\
														
 
															+  R_1 &::=& \Exp
														
 
															+\end{array}
														
 
															+\]
														
 
															+\end{minipage}
														
 
															+}
														
 
															+\caption{The concrete syntax of $R_1$, a language of integers and variables.}
														
 
															+\label{fig:r1-concrete-syntax}
														
 
															+\end{figure}
														
 
															 \begin{figure}[btp]
														
 
															 \centering
														
@@ -3426,30 +3480,31 @@ programs to make sure that your move biasing is working properly.
 
															   live range splitting~\citep{Cooper:1998ly}. \\ --Jeremy}
														
 
															+
														
 
															 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
														
 
															 \chapter{Booleans and Control Flow}
														
 
															 \label{ch:bool-types}
														
 
															 The $R_0$ and $R_1$ languages only had a single kind of value, the
														
 
															-integers. In this Chapter we add a second kind of value, the Booleans,
														
 
															+integers. In this chapter we add a second kind of value, the Booleans,
														
 
															 to create the $R_2$ language. The Boolean values \emph{true} and
														
 
															 \emph{false} are written \key{\#t} and \key{\#f} respectively in
														
 
															-Racket.  We also introduce several operations that involve Booleans
														
 
															-(\key{and}, \key{not}, \key{eq?}, \key{<}, etc.) and the conditional
														
 
															-\key{if} expression. With the addition of \key{if} expressions,
														
 
															-programs can have non-trivial control flow which has an impact on
														
 
															-several parts of the compiler. Also, because we now have two kinds of
														
 
															-values, we need to worry about programs that apply an operation to the
														
 
															-wrong kind of value, such as \code{(not 1)}.
														
 
															+Racket.  The $R_2$ language includes several operations that involve
														
 
															+Booleans (\key{and}, \key{not}, \key{eq?}, \key{<}, etc.) and the
														
 
															+conditional \key{if} expression. With the addition of \key{if}
														
 
															+expressions, programs can have non-trivial control flow, which has an
														
 
															+impact on several parts of the compiler. Also, because we now have two
														
 
															+kinds of values, we need to worry about programs that apply an
														
 
															+operation to the wrong kind of value, such as \code{(not 1)}.
														
 
															 There are two language design options for such situations.  One option
														
 
															 is to signal an error and the other is to provide a wider
														
 
															 interpretation of the operation. The Racket language uses a mixture of
														
 
															 these two options, depending on the operation and the kind of
														
 
															 value. For example, the result of \code{(not 1)} in Racket is
														
 
															-\code{\#f} because Racket treats non-zero integers like \code{\#t}. On
														
 
															-the other hand, \code{(car 1)} results in a run-time error in Racket
														
 
															-stating that \code{car} expects a pair.
														
 
															+\code{\#f} because Racket treats every value other than \code{\#f} as if it were
														
 
															+\code{\#t}. On the other hand, \code{(car 1)} results in a run-time
														
 
															+error in Racket stating that \code{car} expects a pair.
														
 
															 The Typed Racket language makes design choices similar to Racket's,
														
 
															 except much of the error detection happens at compile time instead of
														
@@ -3480,14 +3535,15 @@ conditional control flow.
 
															 \section{The $R_2$ Language}
														
 
															 \label{sec:r2-lang}
														
 
															-The syntax of the $R_2$ language is defined in
														
 
															-Figure~\ref{fig:r2-syntax}. It includes all of $R_1$ (shown in gray),
														
 
															-the Boolean literals \code{\#t} and \code{\#f}, and the conditional
														
 
															-\code{if} expression. Also, we expand the operators to include
														
 
															-subtraction, \key{and}, \key{or} and \key{not}, the \key{eq?}
														
 
															-operations for comparing two integers or two Booleans, and the
														
 
															-\key{<}, \key{<=}, \key{>}, and \key{>=} operations for comparing
														
 
															-integers.
														
 
															+The concrete syntax of the $R_2$ language is defined in
														
 
															+Figure~\ref{fig:r2-concrete-syntax} and the abstract syntax is defined
														
 
															+in Figure~\ref{fig:r2-syntax}. The $R_2$ language includes all of
														
 
															+$R_1$ (shown in gray), the Boolean literals \code{\#t} and \code{\#f},
														
 
															+and the conditional \code{if} expression. Also, we expand the
														
 
															+operators to include subtraction, \key{and}, \key{or} and \key{not},
														
 
															+the \key{eq?} operation for comparing two integers or two Booleans,
														
 
															+and the \key{<}, \key{<=}, \key{>}, and \key{>=} operations for
														
 
															+comparing integers.
														
 
															 \begin{figure}[tp]
														
 
															 \centering
														
@@ -3502,12 +3558,36 @@ integers.
 
															       \mid (\key{and}\;\Exp\;\Exp) \mid (\key{or}\;\Exp\;\Exp)
														
 
															       \mid (\key{not}\;\Exp) \\
														
 
															       &\mid& (\itm{cmp}\;\Exp\;\Exp) \mid \IF{\Exp}{\Exp}{\Exp} \\
														
 
															-  R_2 &::=& (\key{program} \; \itm{info}\; \Exp)
														
 
															+  R_2 &::=& \Exp
														
 
															+\end{array}
														
 
															+\]
														
 
															+\end{minipage}
														
 
															+}
														
 
															+\caption{The concrete syntax of $R_2$, extending $R_1$
														
 
															+  (Figure~\ref{fig:r1-concrete-syntax}) with Booleans and conditionals.}
														
 
															+\label{fig:r2-concrete-syntax}
														
 
															+\end{figure}
														
 
															+
														
 
															+\begin{figure}[tp]
														
 
															+\centering
														
 
															+\fbox{
														
 
															+\begin{minipage}{0.96\textwidth}
														
 
															+\[
														
 
															+\begin{array}{lcl}
														
 
															+  \itm{cmp} &::= & \key{eq?} \mid \key{<} \mid \key{<=} \mid \key{>} \mid \key{>=} \\
														
 
															+\Exp &::=& \gray{\INT{\Int} \mid \READ{} \mid \NEG{\Exp}} \\
														
 
															+     &\mid& \gray{\ADD{\Exp}{\Exp}  
														
 
															+      \mid  \VAR{\Var} \mid \LET{\Var}{\Exp}{\Exp}} \\
														
 
															+     &\mid& \key{\#t} \mid \key{\#f} 
														
 
															+      \mid (\key{and}\;\Exp\;\Exp) \mid (\key{or}\;\Exp\;\Exp)
														
 
															+      \mid (\key{not}\;\Exp) \\
														
 
															+      &\mid& (\itm{cmp}\;\Exp\;\Exp) \mid \IF{\Exp}{\Exp}{\Exp} \\
														
 
															+  R_2 &::=& \PROGRAM{\key{'()}}{\Exp}
														
 
															 \end{array}
														
 
															 \]
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{The syntax of $R_2$, extending $R_1$
														
 
															+\caption{The abstract syntax of $R_2$, extending $R_1$
														
 
															   (Figure~\ref{fig:r1-syntax}) with Booleans and conditionals.}
														
 
															 \label{fig:r2-syntax}
														
 
															 \end{figure}