4 years ago · faddd74023
--- a/book.tex
+++ b/book.tex
@@ -483,7 +483,7 @@ When compiling a program such as \eqref{eq:arith-prog}, we need to
 
															 know that the operation associated with the root node is addition and
														
 
															 we need to be able to access its two children. Racket provides pattern
														
 
															 matching over structures to support these kinds of queries, as we
														
 
															-shall see in Section~\ref{sec:pattern-matching}.
														
 
															+see in Section~\ref{sec:pattern-matching}.
														
 
															 In this book, we often write down the concrete syntax of a program
														
 
															 even when we really have in mind the AST because the concrete syntax
														
@@ -502,7 +502,7 @@ programs), so one cannot simply describe a language by listing all of
 
															 the programs in the language. Instead we write down a set of rules, a
														
 
															 \emph{grammar}, for building programs. Grammars are often used to
														
 
															 define the concrete syntax of a language, but they can also be used to
														
 
															-describe the abstract syntax. We shall write our rules in a variant of
														
 
															+describe the abstract syntax. We write our rules in a variant of
														
 
															 Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
														
 
															 \index{Backus-Naur Form}\index{BNF}
														
 
															 As an example, we describe a small language, named $R_0$, that consists of
														
@@ -1072,7 +1072,7 @@ Appendix~\ref{appendix:utilities}.\\
 
															 This chapter is about compiling the subset of Racket that includes
														
 
															 integer arithmetic and local variable binding, which we name $R_1$, to
														
 
															-x86-64 assembly code~\citep{Intel:2015aa}.  Henceforth we shall refer
														
 
															+x86-64 assembly code~\citep{Intel:2015aa}.  Henceforth we refer
														
 
															 to x86-64 simply as x86.  The chapter begins with a description of the
														
 
															 $R_1$ language (Section~\ref{sec:s0}) followed by a description of x86
														
 
															 (Section~\ref{sec:x86}). The x86 assembly language is large, so we
														
@@ -1699,7 +1699,7 @@ the other hand, it may turn out to be impossible to make sure that all
 
															 such variables are assigned to registers, and then one must redo the
														
 
															 selection of instructions. Some compilers handle this problem by
														
 
															 iteratively repeating these two passes until a good solution is found.
														
 
															-We shall use a simpler approach in which \key{select-instructions}
														
 
															+We use a simpler approach in which \key{select-instructions}
														
 
															 comes first, followed by the \key{assign-homes}, then a third
														
 
															 pass named \key{patch-instructions} that uses a reserved register to
														
 
															 patch-up outstanding problems regarding instructions with too many
														
@@ -1830,31 +1830,6 @@ C_0 & ::= & \PROGRAM{\itm{info}}{\CFG{\key{(}\itm{label}\,\key{.}\,\Tail\key{)}\
 
															 \end{figure}
														
 
															-%% The \key{select-instructions} pass is optimistic in the sense that it
														
 
															-%% treats variables as if they were all mapped to registers. The
														
 
															-%% \key{select-instructions} pass generates a program that consists of
														
 
															-%% x86 instructions but that still uses variables, so it is an
														
 
															-%% intermediate language that is technically different than x86, which
														
 
															-%% explains the asterisks in the diagram above.
														
 
															-
														
 
															-%% In this Chapter we shall take the easy road to implementing
														
 
															-%% \key{assign-homes} and simply map all variables to stack locations.
														
 
															-%% The topic of Chapter~\ref{ch:register-allocation-r1} is implementing a
														
 
															-%% smarter approach in which we make a best-effort to map variables to
														
 
															-%% registers, resorting to the stack only when necessary.
														
 
															-
														
 
															-%% Once variables have been assigned to their homes, we can finalize the
														
 
															-%% instruction selection by dealing with an idiosyncrasy of x86
														
 
															-%% assembly. Many x86 instructions have two arguments but only one of the
														
 
															-%% arguments may be a memory reference (and the stack is a part of
														
 
															-%% memory).  Because some variables may get mapped to stack locations,
														
 
															-%% some of our generated instructions may violate this restriction.  The
														
 
															-%% purpose of the \key{patch-instructions} pass is to fix this problem by
														
 
															-%% replacing every violating instruction with a short sequence of
														
 
															-%% instructions that use the \key{rax} register. Once we have implemented
														
 
															-%% a good register allocator (Chapter~\ref{ch:register-allocation-r1}), the
														
 
															-%% need to patch instructions will be relatively rare.
														
 
															-
														
 
															 \subsection{The dialects of x86}
														
 
															 The x86$^{*}_0$ language, pronounced ``pseudo x86'', is the output of
														
@@ -2791,7 +2766,7 @@ conclusion:
 
															 \index{liveness analysis}
														
 
															 A variable or register is \emph{live} at a program point if its
														
 
															-current value is used at some later point in the program.  We shall
														
 
															+current value is used at some later point in the program.  We 
														
 
															 refer to variables and registers collectively as \emph{locations}.
														
 
															 %
														
 
															 Consider the following code fragment in which there are two writes to
														
@@ -3978,7 +3953,7 @@ reports a compile-time error because Typed Racket expects the type of
 
															 the argument to be of the form \code{(Listof T)} or \code{(Pairof T1 T2)}.
														
 
															 For the $R_2$ language we choose to be more like Typed Racket in that
														
 
															-we shall perform type checking during compilation. In
														
 
															+we perform type checking during compilation. In
														
 
															 Chapter~\ref{ch:type-dynamic} we study the alternative choice, that
														
 
															 is, how to compile a dynamically typed language like Racket.  The
														
 
															 $R_2$ language is a subset of Typed Racket but by no means includes
														
@@ -4583,7 +4558,7 @@ duplicating code. The solution is straightforward if we think at the
 
															 level of x86 assembly: we can label the code for each of the branches
														
 
															 and insert jumps in all the places that need to execute the
														
 
															 branches. Put another way, we need to move away from abstract syntax
														
 
															-\emph{trees} and instead use \emph{graphs}. In particular, we shall
														
 
															+\emph{trees} and instead use \emph{graphs}. In particular, we 
														
 
															 use a standard program representation called a \emph{control flow
														
 
															   graph} (CFG), due to Frances Elizabeth \citet{Allen:1970uq}.
														
 
															 \index{control-flow graph}
														
@@ -4724,7 +4699,7 @@ $B_3$ to refer to blocks for the purposes of our discussion, but they
 
															 should not be confused with the labels for the blocks that appear in
														
 
															 the generated code. We initially construct unlabeled blocks; we only
														
 
															 attach labels to blocks when we add them to the control-flow graph, as
														
 
															-we shall see in the next case.
														
 
															+we see in the next case.
														
 
															 Next consider the case for \key{if} in the \code{explicate-assign}
														
 
															 function. The context of the \key{if} is an assignment to some
														
@@ -5589,7 +5564,7 @@ the \code{void?} predicate that returns \code{\#t} when applied to
 
															 \end{figure}
														
 
															 Figure~\ref{fig:typecheck-R3} shows the type checker for $R_3$, which
														
 
															-deserves some explanation. As we shall see in Section~\ref{sec:GC}, we
														
 
															+deserves some explanation. As we see in Section~\ref{sec:GC}, we
														
 
															 need to know which variables contain pointers into the heap, that is,
														
 
															 which variables contain vectors. Also, when allocating a vector, we
														
 
															 need to know which elements of the vector are pointers. We can obtain
														
@@ -5752,7 +5727,7 @@ such algorithms take into account the possibility of cycles by marking
 
															 which vertices have already been visited, so as to ensure termination
														
 
															 of the algorithm. These search algorithms also use a data structure
														
 
															 such as a stack or queue as a to-do list to keep track of the vertices
														
 
															-that need to be visited. We shall use breadth-first search and a trick
														
 
															+that need to be visited. We use breadth-first search and a trick
														
 
															 due to \citet{Cheney:1970aa} for simultaneously representing the queue
														
 
															 and copying tuples into the ToSpace.
														
@@ -6302,7 +6277,7 @@ used to determine the pointer mask region of the tag.
 
															 The \code{collect} form is compiled to a call to the \code{collect}
														
 
															 function in the runtime. The arguments to \code{collect} are 1) the
														
 
															 top of the root stack and 2) the number of bytes that need to be
														
 
															-allocated.  We shall use another dedicated register, \code{r15}, to
														
 
															+allocated.  We use another dedicated register, \code{r15}, to
														
 
															 store the pointer to the top of the root stack. So \code{r15} is not
														
 
															 available for use by the register allocator.
														
 
															 \begin{lstlisting}
														
@@ -7962,7 +7937,7 @@ functions because they use different values for \code{x}. Applying
 
															 \end{figure}
														
 
															-The approach that we shall take for implementing lexically scoped
														
 
															+The approach that we take for implementing lexically scoped
														
 
															 functions is to compile them into top-level function definitions,
														
 
															 translating from $R_5$ into $R_4$.  However, the compiler will need to
														
 
															 provide special treatment for variable occurrences such as \code{x}
														
@@ -7997,7 +7972,7 @@ code, an arrangement called a \emph{flat closure} (which we shorten to
 
															 just ``closure'').  \index{closure}\index{flat closure} Fortunately,
														
 
															 we have all the ingredients to make closures, Chapter~\ref{ch:tuples}
														
 
															 gave us vectors and Chapter~\ref{ch:functions} gave us function
														
 
															-pointers. The function pointer shall reside at index $0$ and the
														
 
															+pointers. The function pointer resides at index $0$ and the
														
 
															 values for the free variables will fill in the rest of the vector.
														
 
															 Let us revisit the example in Figure~\ref{fig:lexical-scoping} to see
														
@@ -8196,7 +8171,7 @@ definitions is accomplished in the pass \code{convert-to-closures}
 
															 that comes after \code{reveal-functions} and before
														
 
															 \code{limit-functions}. 
														
 
															-As usual, we shall implement the pass as a recursive function over the
														
 
															+As usual, we implement the pass as a recursive function over the
														
 
															 AST. All of the action is in the clauses for \key{lambda} and
														
 
															 \key{Apply}. We transform a \key{lambda} expression into an expression
														
 
															 that creates a closure, that is, creates a vector whose first element
														
@@ -8551,11 +8526,11 @@ result of \code{(read)}.
 
															 The way around this problem is to include information about a value's
														
 
															 runtime type in the value itself, so that this information can be
														
 
															-inspected by operators such as \code{not}.  In particular, we shall
														
 
															+inspected by operators such as \code{not}.  In particular, we 
														
 
															 steal the 3 right-most bits from our 64-bit values to encode the
														
 
															-runtime type.  We shall use $001$ to identify integers, $100$ for
														
 
															+runtime type.  We use $001$ to identify integers, $100$ for
														
 
															 Booleans, $010$ for vectors, $011$ for procedures, and $101$ for the
														
 
															-void value. We shall refer to these 3 bits as the \emph{tag} and we
														
 
															+void value. We refer to these 3 bits as the \emph{tag} and we
														
 
															 define the following auxiliary function.
														
 
															 \begin{align*}
														
 
															 \itm{tagof}(\key{Integer}) &= 001 \\
														
@@ -8565,7 +8540,7 @@ define the following auxiliary function.
 
															 \itm{tagof}((\ldots \key{->} \ldots)) &= 011 \\
														
 
															 \itm{tagof}(\key{Void}) &= 101
														
 
															 \end{align*}
														
 
															-(We shall say more about the new \key{Vectorof} type shortly.)
														
 
															+(We say more about the new \key{Vectorof} type shortly.)
														
 
															 This stealing of 3 bits comes at some
														
 
															 price: our integers are reduced to ranging from $-2^{60}$ to
														
 
															 $2^{60}$. The stealing does not adversely affect vectors and
														
@@ -8600,7 +8575,7 @@ pointers to tuples. On the other hand, if one of the first three bits
 
															 is set, then we have a tagged value, and inspecting the tag can
														
 
															 differentiation between vectors ($010$) and the other kinds of values.
														
 
															-We shall implement our untyped language $R_7$ by compiling it to $R_6$
														
 
															+We implement our untyped language $R_7$ by compiling it to $R_6$
														
 
															 (Section~\ref{sec:compile-r7}), but first we describe the how to
														
 
															 extend our compiler to handle the new features of $R_6$
														
 
															 (Sections~\ref{sec:shrink-r6}, \ref{sec:select-r6}, and