4 years ago · b5730df99b
--- a/book.tex
+++ b/book.tex
@@ -296,29 +296,44 @@ following people.
 
				 \label{ch:trees-recur}
			
 
				 
			
 
				 In this chapter we review the basic tools that are needed to implement
			
 
				-a compiler. We use \emph{abstract syntax trees} (ASTs), which are data
			
 
				-structures in computer memory, in contrast to how programs are
			
 
				-typically stored in text files on disk, as \emph{concrete syntax}.
			
 
				+a compiler. Programs are typically input by a programmer as text,
			
 
				+i.e., a sequence of characters. The program-as-text representation is
			
 
				+called \emph{concrete syntax}. We use concrete syntax to concisely
			
 
				+write down and talk about programs. Inside the compiler, we use
			
 
				+\emph{abstract syntax trees} (ASTs) to represent programs in a way
			
 
				+that efficiently supports the operations that the compiler needs to
			
 
				+perform.
			
 
				 %
			
 
				-ASTs can be represented in many different ways, depending on the programming
			
 
				-language used to write the compiler.
			
 
				+The translation from concrete syntax to abstract syntax is a process
			
 
				+called \emph{parsing}~\cite{Aho:1986qf}. We do not cover the theory
			
 
				+and implementation of parsing in this book. A parser is provided in
			
 
				+the supporting materials for translating from concrete syntax to
			
 
				+abstract syntax for the languages used in this book.
			
 
				+
			
 
				+ASTs can be represented in many different ways inside the compiler,
			
 
				+depending on the programming language used to write the compiler.
			
 
				 %
			
 
				-We use Racket's \code{struct} feature to conveniently represent
			
 
				-ASTs (Section~\ref{sec:ast}). We use grammars to defined the abstract
			
 
				-syntax of programming languages (Section~\ref{sec:grammar}) and
			
 
				-pattern matching to inspect individual nodes in an AST
			
 
				+We use Racket's \code{struct} feature to represent ASTs
			
 
				+(Section~\ref{sec:ast}). We use grammars to define the abstract syntax
			
 
				+of programming languages (Section~\ref{sec:grammar}) and pattern
			
 
				+matching to inspect individual nodes in an AST
			
 
				 (Section~\ref{sec:pattern-matching}).  We use recursion to construct
			
 
				 and deconstruct entire ASTs (Section~\ref{sec:recursion}).  This
			
 
				 chapter provides a brief introduction to these ideas.
			
 
				 
			
 
				+
			
 
				 \section{Abstract Syntax Trees and Racket Structures}
			
 
				 \label{sec:ast}
			
 
				 
			
 
				-The primary data structure that is commonly used for representing
			
 
				-programs is the \emph{abstract syntax tree} (AST). When considering
			
 
				-some part of a program, a compiler needs to ask what kind of thing it
			
 
				-is and what sub-parts it contains. For example, the program on the
			
 
				-left corresponds to the AST on the right.
			
 
				+Compilers use abstract syntax trees to represent programs because
			
 
				+compilers often need to ask questions like: for a given part of a
			
 
				+program, what kind of language feature is it? What are the sub-parts
			
 
				+of this part of the program? Consider the program on the left and its
			
 
				+AST on the right. This program is an addition and it has two
			
 
				+sub-parts, a read operation and a negation. The negation has another
			
 
				+sub-part, the integer constant \code{8}. By using a tree to represent
			
 
				+the program, we can easily follow the links to go from one part of a
			
 
				+program to its sub-parts.
			
 
				 \begin{center}
			
 
				 \begin{minipage}{0.4\textwidth}
			
 
				 \begin{lstlisting}
			
@@ -341,12 +356,12 @@ left corresponds to the AST on the right.
 
				 \end{equation}
			
 
				 \end{minipage}
			
 
				 \end{center}
			
 
				-We shall use the standard terminology for trees: each circle above is
			
 
				-called a \emph{node}. The arrows connect a node to its \emph{children}
			
 
				-(which are also nodes). The top-most node is the \emph{root}.  Every
			
 
				-node except for the root has a \emph{parent} (the node it is the child
			
 
				-of). If a node has no children, it is a \emph{leaf} node.  Otherwise
			
 
				-it is an \emph{internal} node.
			
 
				+We use the standard terminology for trees to describe ASTs: each
			
 
				+circle above is called a \emph{node}. The arrows connect a node to its
			
 
				+\emph{children} (which are also nodes). The top-most node is the
			
 
				+\emph{root}.  Every node except for the root has a \emph{parent} (the
			
 
				+node it is the child of). If a node has no children, it is a
			
 
				+\emph{leaf} node.  Otherwise it is an \emph{internal} node.
			
 
				 
			
 
				 %% Recall that an \emph{symbolic expression} (S-expression) is either
			
 
				 %% \begin{enumerate}
			
@@ -442,16 +457,15 @@ structure for each operation, as follows.
 
				 (struct Neg (value))
			
 
				 \end{lstlisting}
			
 
				 The reason we choose to use just one structure is that in many parts
			
 
				-of the compiler, the code for the different primitive operators is the
			
 
				+of the compiler, the code for the different primitive operators is the
			
 
				 same, so we might as well just write that code once, which is enabled
			
 
				 by using a single structure.
			
 
				 
			
 
				 When compiling a program such as \eqref{eq:arith-prog}, we need to
			
 
				 know that the operation associated with the root node is addition and
			
 
				-that it has two children: \texttt{read} and a negation. The AST data
			
 
				-structure directly supports these queries, as we shall see in
			
 
				-Section~\ref{sec:pattern-matching}, and hence is a good choice for use
			
 
				-in compilers.
			
 
				+we need to be able to access its two children. Racket provides pattern
			
 
				+matching over structures to support these kinds of queries, as we
			
 
				+shall see in Section~\ref{sec:pattern-matching}.
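
To make these queries concrete, the following is a minimal sketch that
builds the AST for program \eqref{eq:arith-prog} and inspects its root
with structure accessors. It assumes that \code{Int} has a single
field and that \code{Prim} has an operator field and a list of
arguments; the variable name \code{ast} is only illustrative.
\begin{lstlisting}
#lang racket
;; Assumed definitions: an integer node and a primitive-operation node.
(struct Int (value))
(struct Prim (op args))

;; The AST for the program (+ (read) (- 8)).
(define ast
  (Prim '+ (list (Prim 'read '()) (Prim '- (list (Int 8))))))

(Prim-op ast)             ; the operation at the root
(length (Prim-args ast))  ; the number of children of the root
\end{lstlisting}
The accessors answer both questions, but they become verbose for
nested nodes, which is one reason to prefer pattern matching.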
			
 
				 
			
 
				 In this book, we often write down the concrete syntax of a program
			
 
				 even when we really have in mind the AST because the concrete syntax
			
@@ -472,8 +486,8 @@ Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.  As an
 
				 example, we describe a small language, named $R_0$, that consists of
			
 
				 integers and arithmetic operations.
			
 
				 
			
 
				-The first grammar rule says that an instance of the \code{Int}
			
 
				-structure is an expression:
			
 
				+The first grammar rule for the abstract syntax of $R_0$ says that an
			
 
				+instance of the \code{Int} structure is an expression:
			
 
				 \begin{equation}
			
 
				 \Exp ::= \INT{\Int}  \label{eq:arith-int}
			
 
				 \end{equation}
			
@@ -570,19 +584,39 @@ where \code{body} is an expression. In later chapters, the \code{info}
 
				 part will be used to store auxiliary information, but for now it is
			
 
				 just the empty list.
			
 
				 
			
 
				-The \code{read-program} function provided in \code{utilities.rkt}
			
 
				-reads programs in from a file (the sequence of characters in the
			
 
				-concrete syntax of Racket) and parses them into the abstract syntax
			
 
				-tree. The concrete syntax does not include a \key{program} form; that
			
 
				-is added by the \code{read-program} function as it creates the
			
 
				-AST. See the description of \code{read-program} in
			
 
				+It is common to have many grammar rules with the same left-hand side
			
 
				+but different right-hand sides, such as the rules for $\Exp$ in the
			
 
				+grammar of $R_0$. As a shorthand, a vertical bar can be used to
			
 
				+combine several right-hand sides into a single rule.
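
As a small illustration, using only the rules for integers and
\key{read}, the two separate rules
\[
  \Exp ::= \INT{\Int} \qquad\qquad \Exp ::= \READ{}
\]
can be written as the single rule
\[
  \Exp ::= \INT{\Int} \mid \READ{}
\]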
			
 
				+
			
 
				+We collect all of the grammar rules for the abstract syntax of $R_0$
			
 
				+in Figure~\ref{fig:r0-syntax}. The concrete syntax for $R_0$ is
			
 
				+defined in Figure~\ref{fig:r0-concrete-syntax}.
			
 
				+
			
 
				+The \code{read-program} function provided in \code{utilities.rkt} of
			
 
				+the support materials reads a program in from a file (the sequence of
			
 
				+characters in the concrete syntax of Racket) and parses it into an
			
 
				+abstract syntax tree. See the description of \code{read-program} in
			
 
				 Appendix~\ref{appendix:utilities} for more details.
			
 
				 
			
 
				-It is common to have many rules with the same left-hand side, such as
			
 
				-$\Exp$ in the grammar for $R_0$, so there is a vertical bar notation
			
 
				-for gathering several rules, as shown in
			
 
				-Figure~\ref{fig:r0-syntax}. Each clause between a vertical bar is
			
 
				-called an {\em alternative}.
			
 
				+
			
 
				+\begin{figure}[tp]
			
 
				+\fbox{
			
 
				+\begin{minipage}{0.96\textwidth}
			
 
				+\[
			
 
				+\begin{array}{rcl}
			
 
				+\begin{array}{rcl}
			
 
				+  \Exp &::=& \Int \mid (\key{read}) \mid (\key{-}\;\Exp) \mid (\key{+} \; \Exp\;\Exp)
			
 
				+        \mid (\key{-}\;\Exp\;\Exp) \\
			
 
				+  R_0 &::=& \Exp
			
 
				+\end{array}
			
 
				+\end{array}
			
 
				+\]
			
 
				+\end{minipage}
			
 
				+}
			
 
				+\caption{The concrete syntax of $R_0$.}
			
 
				+\label{fig:r0-concrete-syntax}
			
 
				+\end{figure}
			
 
				 
			
 
				 \begin{figure}[tp]
			
 
				 \fbox{
			
@@ -596,20 +630,18 @@ R_0  &::=& \PROGRAM{\code{'()}}{\Exp}
 
				 \]
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{The abstract syntax of $R_0$, a language of integer arithmetic.}
			
 
				+\caption{The abstract syntax of $R_0$.}
			
 
				 \label{fig:r0-syntax}
			
 
				 \end{figure}
			
 
				 
			
 
				 
			
 
				-
			
 
				-
			
 
				 \section{Pattern Matching}
			
 
				 \label{sec:pattern-matching}
			
 
				 
			
 
				-As mentioned above, compilers often need to access the children of an
			
 
				-AST node. Racket provides the \texttt{match} form to access the parts
			
 
				-of a structure. Consider the following example and the output on the
			
 
				-right.
			
 
				+As mentioned in Section~\ref{sec:ast}, compilers often need to access
			
 
				+the parts of an AST node. Racket provides the \texttt{match} form to
			
 
				+access the parts of a structure. Consider the following example and
			
 
				+the output on the right.
			
 
				 \begin{center}
			
 
				 \begin{minipage}{0.5\textwidth}
			
 
				 \begin{lstlisting}
			
@@ -627,18 +659,21 @@ right.
 
				 \end{lstlisting}
			
 
				 \end{minipage}
			
 
				 \end{center}
			
 
				-The \texttt{match} form takes AST \eqref{eq:arith-prog} and binds its
			
 
				-parts to the three variables \texttt{op}, \texttt{child1}, and
			
 
				-\texttt{child2}. In general, a match clause consists of a
			
 
				-\emph{pattern} and a \emph{body}. The pattern is a quoted S-expression
			
 
				-that may also contain pattern-variables (each one preceded by a comma).
			
 
				+In the above example, the \texttt{match} form takes the AST
			
 
				+\eqref{eq:arith-prog} and binds its parts to the three pattern
			
 
				+variables \texttt{op}, \texttt{child1}, and \texttt{child2}. In
			
 
				+general, a match clause consists of a \emph{pattern} and a
			
 
				+\emph{body}. Patterns are recursively defined to be either a pattern
			
 
				+variable, a structure name followed by a pattern for each of the
			
 
				+structure's arguments, or an S-expression (symbols, lists, etc.).
			
 
				+(See Chapter 12 of The Racket
			
 
				+Guide\footnote{\url{https://docs.racket-lang.org/guide/match.html}}
			
 
				+and Chapter 9 of The Racket
			
 
				+Reference\footnote{\url{https://docs.racket-lang.org/reference/match.html}}
			
 
				+for a complete description of \code{match}.)
			
 
				 %
			
 
				-The pattern is not the same thing as a quasiquote expression used to
			
 
				-\emph{construct} ASTs, however, the similarity is intentional:
			
 
				-constructing and deconstructing ASTs uses similar syntax.
			
 
				-%
			
 
				-While the pattern uses a restricted syntax, the body of the match
			
 
				-clause may contain any Racket code whatsoever.
			
 
				+The body of a match clause may contain arbitrary Racket code.  The
			
 
				+pattern variables can be used in the scope of the body.
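
The following sketch shows the three kinds of patterns nested inside
one another. It assumes the \code{Int} and \code{Prim} structures
described in Section~\ref{sec:ast}; the helper
\code{negated-constant} is hypothetical and exists only for this
illustration. The underscore is Racket's wildcard pattern, which
matches anything without binding a variable.
\begin{lstlisting}
#lang racket
(struct Int (value))
(struct Prim (op args))

(define ast
  (Prim '+ (list (Prim 'read '()) (Prim '- (list (Int 8))))))

;; Hypothetical helper: return the constant that is negated in the
;; second child of an addition, or #f if the node has another shape.
(define (negated-constant e)
  (match e
    ;; structure patterns (Prim, Int), S-expression patterns ('+, '-,
    ;; list), and one pattern variable (n)
    [(Prim '+ (list _ (Prim '- (list (Int n))))) n]
    [_ #f]))

(negated-constant ast)      ; the pattern variable n is bound to 8
(negated-constant (Int 5))  ; falls through to the wildcard clause
\end{lstlisting}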
			
 
				 
			
 
				 A \code{match} form may contain several clauses, as in the following
			
 
				 function \code{leaf?} that recognizes when an $R_0$ node is
			
@@ -678,20 +713,19 @@ body of the first clause that matches is executed. The output of
 
				 \end{minipage}
			
 
				 \end{center}
			
 
				 
			
 
				-
			
 
				-When writing a \code{match}, we always refer to the grammar definition
			
 
				-for the language and identify which non-terminal we're expecting to
			
 
				-match against, then we make sure that 1) we have one clause for each
			
 
				-alternative of that non-terminal and 2) that the pattern in each
			
 
				-clause corresponds to the corresponding right-hand side of a grammar
			
 
				-rule. For the \code{match} in the \code{leaf?} function, we refer to
			
 
				-the grammar for $R_0$ in Figure~\ref{fig:r0-syntax}. The $\Exp$
			
 
				-non-terminal has 4 alternatives, so the \code{match} has 4 clauses.
			
 
				-The pattern in each clause corresponds to the right-hand side of a
			
 
				-grammar rule. For example, the pattern \code{(Prim '+ (list c1 c2))}
			
 
				-corresponds to the right-hand side $\ADD{\Exp}{\Exp}$. When translating
			
 
				-from grammars to patterns, replace non-terminals such as $\Exp$ with
			
 
				-pattern variables (e.g. \code{c1} and \code{c2}).
			
 
				+When writing a \code{match}, we refer to the grammar definition to
			
 
				+identify which non-terminal we are expecting to match against, then we
			
 
				+make sure that 1) we have one clause for each alternative of that
			
 
				+non-terminal and 2) that the pattern in each clause corresponds to the
			
 
				+corresponding right-hand side of a grammar rule. For the \code{match}
			
 
				+in the \code{leaf?} function, we refer to the grammar for $R_0$ in
			
 
				+Figure~\ref{fig:r0-syntax}. The $\Exp$ non-terminal has 4
			
 
				+alternatives, so the \code{match} has 4 clauses.  The pattern in each
			
 
				+clause corresponds to the right-hand side of a grammar rule. For
			
 
				+example, the pattern \code{(Prim '+ (list c1 c2))} corresponds to the
			
 
				+right-hand side $\ADD{\Exp}{\Exp}$. When translating from grammars to
			
 
				+patterns, replace non-terminals such as $\Exp$ with pattern variables
			
 
				+of your choice (e.g., \code{c1} and \code{c2}).
			
 
				 
			
 
				 
			
 
				 \section{Recursion}
			
@@ -701,21 +735,19 @@ Programs are inherently recursive. For example, an $R_0$ expression is
 
				 often made of smaller expressions. Thus, the natural way to process an
			
 
				 entire program is with a recursive function.  As a first example of
			
 
				 such a recursive function, we define \texttt{exp?} below, which takes
			
 
				-an arbitrary S-expression and determines whether or not it is an $R_0$
			
 
				-expression. As discussed in the previous section, each match clause
			
 
				-corresponds to one grammar rule.  The body of each clause makes a
			
 
				-recursive call for each child node. This kind of recursive function is
			
 
				-so common that it has a name: \emph{structural recursion}.  In
			
 
				-general, when a recursive function is defined using a sequence of
			
 
				-match clauses that correspond to a grammar, and the body of each
			
 
				-clause makes a recursive call on each child node, then we say the
			
 
				-function is defined by structural recursion\footnote{This principle of
			
 
				-  structuring code according to the data definition is advocated in
			
 
				-  the book \emph{How to Design Programs}
			
 
				+an arbitrary value and determines whether or not it is an $R_0$
			
 
				+expression.
			
 
				+%
			
 
				+When a recursive function is defined using a sequence of match clauses
			
 
				+that correspond to a grammar, and the body of each clause makes a
			
 
				+recursive call on each child node, then we say the function is defined
			
 
				+by \emph{structural recursion}\footnote{This principle of structuring
			
 
				+  code according to the data definition is advocated in the book
			
 
				+  \emph{How to Design Programs}
			
 
				   \url{http://www.ccs.neu.edu/home/matthias/HtDP2e/}.}. Below we also
			
 
				-define a second function, named \code{R0?}, that determines whether an
			
 
				-S-expression is an $R_0$ program.  In general we can expect to write
			
 
				-one recursive function to handle each non-terminal in the grammar.
			
 
				+define a second function, named \code{R0?}, that determines whether a
			
 
				+value is an $R_0$ program.  In general we can expect to write one
			
 
				+recursive function to handle each non-terminal in a grammar.
			
 
				 %
			
 
				 \begin{center}
			
 
				 \begin{minipage}{0.7\textwidth}
			
@@ -779,13 +811,13 @@ You may be tempted to merge the two functions into one, like this:
 
				 \end{minipage}
			
 
				 \end{center}
			
 
				 %
			
 
				-Sometimes such a trick will save a few lines of code, especially when it comes
			
 
				-to the {\tt program} wrapper.  Yet this style is generally \emph{not}
			
 
				-recommended because it can get you into trouble.
			
 
				+Sometimes such a trick will save a few lines of code, especially when
			
 
				+it comes to the \code{Program} wrapper.  Yet this style is generally
			
 
				+\emph{not} recommended because it can get you into trouble.
			
 
				 %
			
 
				-For instance, the above function is subtly wrong:
			
 
				-\lstinline{(R0? (Program '() (Program '() (Int 3))))} will return true, when it
			
 
				-should return false.
			
 
				+For example, the above function is subtly wrong:
			
 
				+\lstinline{(R0? (Program '() (Program '() (Int 3))))}
			
 
				+will return true, when it should return false.
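
The difference can be checked directly. The sketch below is a
reconstruction under assumptions: \code{merged-R0?} stands in for the
merged function (renamed here so that both versions fit in one
module), and the structure definitions are assumed to match those used
earlier in this chapter.
\begin{lstlisting}
#lang racket
(struct Int (value))
(struct Prim (op args))
(struct Program (info body))

;; Assumed shape of the merged function: one match handles both the
;; program wrapper and the expressions.
(define (merged-R0? v)
  (match v
    [(Int n) #t]
    [(Prim 'read '()) #t]
    [(Prim '- (list e)) (merged-R0? e)]
    [(Prim '+ (list e1 e2)) (and (merged-R0? e1) (merged-R0? e2))]
    [(Program '() e) (merged-R0? e)]
    [_ #f]))

;; One function per non-terminal.
(define (exp? v)
  (match v
    [(Int n) #t]
    [(Prim 'read '()) #t]
    [(Prim '- (list e)) (exp? e)]
    [(Prim '+ (list e1 e2)) (and (exp? e1) (exp? e2))]
    [_ #f]))

(define (R0? v)
  (match v
    [(Program '() e) (exp? e)]
    [_ #f]))

(define nested (Program '() (Program '() (Int 3))))
(merged-R0? nested)  ; accepts the nested wrapper
(R0? nested)         ; rejects it, because exp? has no Program clause
\end{lstlisting}
Because the merged function recurs through the same \code{match} for
both non-terminals, a \code{Program} node is accepted anywhere an
expression is expected.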
			
 
				 
			
 
				 %% NOTE FIXME - must check for consistency on this issue throughout.
			
 
				 
			
@@ -798,16 +830,16 @@ specification of the language. For example, the Scheme language is
 
				 defined in the report by \cite{SPERBER:2009aa}. The Racket language is
			
 
				 defined in its reference manual~\citep{plt-tr}. In this book we use an
			
 
				 interpreter to define the meaning of each language that we consider,
			
 
				-following Reynolds' advice in this
			
 
				-regard~\citep{reynolds72:_def_interp}. An interpreter that is
			
 
				-designated (by some people) as the definition of a language is called
			
 
				-a \emph{definitional interpreter}.  Here we warm up by creating a
			
 
				-definitional interpreter for the $R_0$ language, which serves as a
			
 
				-second example of structural recursion. The \texttt{interp-R0}
			
 
				-function is defined in Figure~\ref{fig:interp-R0}. The body of the
			
 
				-function is a match on the input program followed by a call to the
			
 
				-\lstinline{interp-exp} helper function, which in turn has one match
			
 
				-clause per grammar rule for $R_0$ expressions.
			
 
				+following Reynolds' advice~\citep{reynolds72:_def_interp}. An
			
 
				+interpreter that is designated (by some people) as the definition of a
			
 
				+language is called a \emph{definitional interpreter}.  We warm up by
			
 
				+creating a definitional interpreter for the $R_0$ language, which
			
 
				+serves as a second example of structural recursion. The
			
 
				+\texttt{interp-R0} function is defined in
			
 
				+Figure~\ref{fig:interp-R0}. The body of the function is a match on the
			
 
				+input program followed by a call to the \lstinline{interp-exp} helper
			
 
				+function, which in turn has one match clause per grammar rule for
			
 
				+$R_0$ expressions.
			
 
				 
			
 
				 \begin{figure}[tbp]
			
 
				 \begin{lstlisting}
			
@@ -858,8 +890,12 @@ As mentioned previously, the $R_0$ language does not support
 
				 arbitrarily-large integers, but only $63$-bit integers, so we
			
 
				 interpret the arithmetic operations of $R_0$ using fixnum arithmetic
			
 
				 in Racket.
			
 
				-Suppose $n = 999999999999999999$, which indeed fits in $63$-bits.
			
 
				-What happens when we run the following program in our interpreter?
			
 
				+Suppose
			
 
				+\[
			
 
				+  n = 999999999999999999
			
 
				+\]
			
 
				+which indeed fits in $63$ bits.  What happens when we run the
			
 
				+following program in our interpreter?
			
 
				 \begin{lstlisting}
			
 
				 (+ (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)) (+ (+ |$n$| |$n$|) (+ |$n$| |$n$|)))
			
 
				 \end{lstlisting}
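
As a quick check of the magnitudes involved, assume that a $63$-bit
integer is a signed two's-complement value, so that the largest
representable integer is $2^{62}-1$. Then
\[
\begin{array}{rcl}
  2^{62} - 1 &=& 4611686018427387903 \\
  2n &=& 1999999999999999998 \\
  4n &=& 3999999999999999996 \\
  8n &=& 7999999999999999992
\end{array}
\]
The inner sums $2n$ and $4n$ stay below the bound, but the outermost
addition yields $8n > 2^{62}-1$, which cannot be represented in $63$
bits.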
			
@@ -883,15 +919,15 @@ program \eqref{eq:arith-prog} performs a \key{read} and then subtracts
 
				 \begin{lstlisting}
			
 
				 (interp-R0 ast1.1)
			
 
				 \end{lstlisting}
			
 
				-and the input the integer \code{50} we get the answer to life, the
			
 
				+and if the input is \code{50}, then we get the answer to life, the
			
 
				 universe, and everything: \code{42}!\footnote{\emph{The Hitchhiker's
			
 
				     Guide to the Galaxy} by Douglas Adams.}
			
 
				 
			
 
				 We include the \key{read} operation in $R_0$ so a clever student
			
 
				 cannot implement a compiler for $R_0$ that simply runs the interpreter
			
 
				 during compilation to obtain the output and then generates the trivial
			
 
				-code to produce the output. (Yes, a clever student did this in a
			
 
				-previous version of the course.)
			
 
				+code to produce the output. (Yes, a clever student did this in the
			
 
				+first instance of this course.)
			
 
				 
			
 
				 The job of a compiler is to translate a program in one language into a
			
 
				 program in another language so that the output program behaves the
			
@@ -1025,17 +1061,35 @@ code.
 
				 \label{sec:s0}
			
 
				 
			
 
				 The $R_1$ language extends the $R_0$ language
			
 
				-(Figure~\ref{fig:r0-syntax}) with variable definitions.  The syntax of
			
 
				-the $R_1$ language is defined by the grammar in
			
 
				-Figure~\ref{fig:r1-syntax}.  The non-terminal \Var{} may be any Racket
			
 
				-identifier. As in $R_0$, \key{read} is a nullary operator, \key{-} is
			
 
				-a unary operator, and \key{+} is a binary operator.  Similar to $R_0$,
			
 
				-the $R_1$ language includes the \key{Program} struct to mark the top
			
 
				-of the program. The $\itm{info}$ field of the \key{Program} struct
			
 
				-contains an \emph{association list} (a list of key-value pairs) that
			
 
				-is used to communicate auxiliary data from one compiler pass the
			
 
				-next. Despite the simplicity of the $R_1$ language, it is rich enough
			
 
				-to exhibit several compilation techniques.
			
 
				+(Figures~\ref{fig:r0-concrete-syntax} and \ref{fig:r0-syntax}) with
			
 
				+variable definitions.  The syntax of the $R_1$ language is defined by
			
 
				+the grammar in Figure~\ref{fig:r1-syntax}.  The non-terminal \Var{}
			
 
				+may be any Racket identifier. As in $R_0$, \key{read} is a nullary
			
 
				+operator, \key{-} is a unary operator, and \key{+} is a binary
			
 
				+operator.  Similar to $R_0$, the $R_1$ language includes the
			
 
				+\key{Program} struct to mark the top of the program. The $\itm{info}$
			
 
				+field of the \key{Program} struct contains an \emph{association list}
			
 
				+(a list of key-value pairs) that is used to communicate auxiliary data
			
 
				+from one compiler pass to the next. Despite the simplicity of the $R_1$
			
 
				+language, it is rich enough to exhibit several compilation techniques.
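
For readers who have not used association lists before, here is a
minimal sketch of how one is built and queried in Racket. The keys
\code{locals}, \code{stack-space}, and \code{num-spills} are
hypothetical and chosen only for illustration; they are not
necessarily the keys used by the passes in this book.
\begin{lstlisting}
#lang racket
;; A hypothetical info field: a list of key-value pairs.
(define info (list (cons 'locals '(x y)) (cons 'stack-space 16)))

;; Look up a key; dict-ref accepts association lists.
(dict-ref info 'stack-space)

;; A later pass can extend the list without mutating the original.
(define info2 (dict-set info 'num-spills 2))
(dict-ref info2 'num-spills)
(dict-has-key? info 'num-spills)  ; the original list is unchanged
\end{lstlisting}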
			
 
				+
			
 
				+\begin{figure}[btp]
			
 
				+\centering
			
 
				+\fbox{
			
 
				+\begin{minipage}{0.96\textwidth}
			
 
				+\[
			
 
				+\begin{array}{rcl}
			
 
				+  \Exp &::=& \Int \mid (\key{read}) \mid (\key{-}\;\Exp) \mid (\key{+} \; \Exp\;\Exp)
			
 
				+        \mid (\key{-}\;\Exp\;\Exp) \\
			
 
				+     &\mid&  \Var \mid \key{(let}~\key{([}\Var ~\Exp \key{])}~ \Exp \key{)} \\
			
 
				+  R_1 &::=& \Exp
			
 
				+\end{array}
			
 
				+\]
			
 
				+\end{minipage}
			
 
				+}
			
 
				+\caption{The concrete syntax of $R_1$, a language of integers and variables.}
			
 
				+\label{fig:r1-concrete-syntax}
			
 
				+\end{figure}
			
 
				 
			
 
				 \begin{figure}[btp]
			
 
				 \centering
			
@@ -3426,30 +3480,31 @@ programs to make sure that your move biasing is working properly.
 
				   live range splitting~\citep{Cooper:1998ly}. \\ --Jeremy}
			
 
				 
			
 
				 
			
 
				+
			
 
				 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
			
 
				 \chapter{Booleans and Control Flow}
			
 
				 \label{ch:bool-types}
			
 
				 
			
 
				 The $R_0$ and $R_1$ languages only had a single kind of value, the
			
 
				-integers. In this Chapter we add a second kind of value, the Booleans,
			
 
				+integers. In this chapter we add a second kind of value, the Booleans,
			
 
				 to create the $R_2$ language. The Boolean values \emph{true} and
			
 
				 \emph{false} are written \key{\#t} and \key{\#f} respectively in
			
 
				-Racket.  We also introduce several operations that involve Booleans
			
 
				-(\key{and}, \key{not}, \key{eq?}, \key{<}, etc.) and the conditional
			
 
				-\key{if} expression. With the addition of \key{if} expressions,
			
 
				-programs can have non-trivial control flow which has an impact on
			
 
				-several parts of the compiler. Also, because we now have two kinds of
			
 
				-values, we need to worry about programs that apply an operation to the
			
 
				-wrong kind of value, such as \code{(not 1)}.
			
 
				+Racket.  The $R_2$ language includes several operations that involve
			
 
				+Booleans (\key{and}, \key{not}, \key{eq?}, \key{<}, etc.) and the
			
 
				+conditional \key{if} expression. With the addition of \key{if}
			
 
				+expressions, programs can have non-trivial control flow, which has an
			
 
				+impact on several parts of the compiler. Also, because we now have two
			
 
				+kinds of values, we need to worry about programs that apply an
			
 
				+operation to the wrong kind of value, such as \code{(not 1)}.
			
 
				 
			
 
				 There are two language design options for such situations.  One option
			
 
				 is to signal an error and the other is to provide a wider
			
 
				 interpretation of the operation. The Racket language uses a mixture of
			
 
				 these two options, depending on the operation and the kind of
			
 
				 value. For example, the result of \code{(not 1)} in Racket is
			
 
				-\code{\#f} because Racket treats non-zero integers like \code{\#t}. On
			
 
				-the other hand, \code{(car 1)} results in a run-time error in Racket
			
 
				-stating that \code{car} expects a pair.
			
 
				+\code{\#f} because Racket treats every value other than \code{\#f} as if it were
			
 
				+\code{\#t}. On the other hand, \code{(car 1)} results in a run-time
			
 
				+error in Racket stating that \code{car} expects a pair.
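
Both behaviors can be observed at the Racket prompt. The sketch below
uses only standard Racket forms; note that \code{0} is also treated as
true, because \code{\#f} is the only false value in Racket.
\begin{lstlisting}
#lang racket
(not 1)          ; #f
(not 0)          ; #f, since 0 is also treated as true
(if 0 'yes 'no)  ; 'yes

;; (car 1) raises a contract error; catch it and return its message.
(with-handlers ([exn:fail:contract? exn-message])
  (car 1))
\end{lstlisting}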
			
 
				 
			
 
				 The Typed Racket language makes design choices similar to those of Racket,
			
 
				 except much of the error detection happens at compile time instead of
			
@@ -3480,14 +3535,15 @@ conditional control flow.
 
				 \section{The $R_2$ Language}
			
 
				 \label{sec:r2-lang}
			
 
				 
			
 
				-The syntax of the $R_2$ language is defined in
			
 
				-Figure~\ref{fig:r2-syntax}. It includes all of $R_1$ (shown in gray),
			
 
				-the Boolean literals \code{\#t} and \code{\#f}, and the conditional
			
 
				-\code{if} expression. Also, we expand the operators to include
			
 
				-subtraction, \key{and}, \key{or} and \key{not}, the \key{eq?}
			
 
				-operations for comparing two integers or two Booleans, and the
			
 
				-\key{<}, \key{<=}, \key{>}, and \key{>=} operations for comparing
			
 
				-integers.
			
 
				+The concrete syntax of the $R_2$ language is defined in
			
 
				+Figure~\ref{fig:r2-concrete-syntax} and the abstract syntax is defined
			
 
				+in Figure~\ref{fig:r2-syntax}. The $R_2$ language includes all of
			
 
				+$R_1$ (shown in gray), the Boolean literals \code{\#t} and \code{\#f},
			
 
				+and the conditional \code{if} expression. Also, we expand the
			
 
				+operators to include subtraction, \key{and}, \key{or} and \key{not},
			
 
				+the \key{eq?} operation for comparing two integers or two Booleans,
			
 
				+and the \key{<}, \key{<=}, \key{>}, and \key{>=} operations for
			
 
				+comparing integers.
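
As a small illustration, the following program combines several of the
new forms. It is written in the concrete syntax and is only a sketch;
because it is also a valid Racket expression, it can be evaluated
directly in Racket.
\begin{lstlisting}
(let ([x 10])
  (if (and (<= 0 x) (not (eq? x 0)))
      (- 52 x)
      0))
\end{lstlisting}
Both operands of \key{and} are true for this choice of \code{x}, so
the program takes the first branch and produces \code{42}.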
			
 
				 
			
 
				 \begin{figure}[tp]
			
 
				 \centering
			
@@ -3502,12 +3558,36 @@ integers.
 
				       \mid (\key{and}\;\Exp\;\Exp) \mid (\key{or}\;\Exp\;\Exp)
			
 
				       \mid (\key{not}\;\Exp) \\
			
 
				       &\mid& (\itm{cmp}\;\Exp\;\Exp) \mid \IF{\Exp}{\Exp}{\Exp} \\
			
 
				-  R_2 &::=& (\key{program} \; \itm{info}\; \Exp)
			
 
				+  R_2 &::=& \Exp
			
 
				+\end{array}
			
 
				+\]
			
 
				+\end{minipage}
			
 
				+}
			
 
				+\caption{The concrete syntax of $R_2$, extending $R_1$
			
 
				+  (Figure~\ref{fig:r1-concrete-syntax}) with Booleans and conditionals.}
			
 
				+\label{fig:r2-concrete-syntax}
			
 
				+\end{figure}
			
 
				+
			
 
				+\begin{figure}[tp]
			
 
				+\centering
			
 
				+\fbox{
			
 
				+\begin{minipage}{0.96\textwidth}
			
 
				+\[
			
 
				+\begin{array}{lcl}
			
 
				+  \itm{cmp} &::= & \key{eq?} \mid \key{<} \mid \key{<=} \mid \key{>} \mid \key{>=} \\
			
 
				+\Exp &::=& \gray{\INT{\Int} \mid \READ{} \mid \NEG{\Exp}} \\
			
 
				+     &\mid& \gray{\ADD{\Exp}{\Exp}  
			
 
				+      \mid  \VAR{\Var} \mid \LET{\Var}{\Exp}{\Exp}} \\
			
 
				+     &\mid& \key{\#t} \mid \key{\#f} 
			
 
				+      \mid (\key{and}\;\Exp\;\Exp) \mid (\key{or}\;\Exp\;\Exp)
			
 
				+      \mid (\key{not}\;\Exp) \\
			
 
				+      &\mid& (\itm{cmp}\;\Exp\;\Exp) \mid \IF{\Exp}{\Exp}{\Exp} \\
			
 
				+  R_2 &::=& \PROGRAM{\key{'()}}{\Exp}
			
 
				 \end{array}
			
 
				 \]
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{The syntax of $R_2$, extending $R_1$
			
 
				+\caption{The abstract syntax of $R_2$, extending $R_1$
			
 
				   (Figure~\ref{fig:r1-syntax}) with Booleans and conditionals.}
			
 
				 \label{fig:r2-syntax}
			
 
				 \end{figure}