Jeremy Siek 4 years ago
Parent
Commit
faddd74023
1 changed file with 19 additions and 44 deletions

book.tex

@@ -483,7 +483,7 @@ When compiling a program such as \eqref{eq:arith-prog}, we need to
 know that the operation associated with the root node is addition and
 we need to be able to access its two children. Racket provides pattern
 matching over structures to support these kinds of queries, as we
-shall see in Section~\ref{sec:pattern-matching}.
+see in Section~\ref{sec:pattern-matching}.
 
 In this book, we often write down the concrete syntax of a program
 even when we really have in mind the AST because the concrete syntax
@@ -502,7 +502,7 @@ programs), so one cannot simply describe a language by listing all of
 the programs in the language. Instead we write down a set of rules, a
 \emph{grammar}, for building programs. Grammars are often used to
 define the concrete syntax of a language, but they can also be used to
-describe the abstract syntax. We shall write our rules in a variant of
+describe the abstract syntax. We write our rules in a variant of
 Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
 \index{Backus-Naur Form}\index{BNF}
 As an example, we describe a small language, named $R_0$, that consists of
@@ -1072,7 +1072,7 @@ Appendix~\ref{appendix:utilities}.\\
 
 This chapter is about compiling the subset of Racket that includes
 integer arithmetic and local variable binding, which we name $R_1$, to
-x86-64 assembly code~\citep{Intel:2015aa}.  Henceforth we shall refer
+x86-64 assembly code~\citep{Intel:2015aa}.  Henceforth we refer
 to x86-64 simply as x86.  The chapter begins with a description of the
 $R_1$ language (Section~\ref{sec:s0}) followed by a description of x86
 (Section~\ref{sec:x86}). The x86 assembly language is large, so we
@@ -1699,7 +1699,7 @@ the other hand, it may turn out to be impossible to make sure that all
 such variables are assigned to registers, and then one must redo the
 selection of instructions. Some compilers handle this problem by
 iteratively repeating these two passes until a good solution is found.
-We shall use a simpler approach in which \key{select-instructions}
+We use a simpler approach in which \key{select-instructions}
 comes first, followed by the \key{assign-homes}, then a third
 pass named \key{patch-instructions} that uses a reserved register to
 patch-up outstanding problems regarding instructions with too many
@@ -1830,31 +1830,6 @@ C_0 & ::= & \PROGRAM{\itm{info}}{\CFG{\key{(}\itm{label}\,\key{.}\,\Tail\key{)}\
 \end{figure}
 
 
-%% The \key{select-instructions} pass is optimistic in the sense that it
-%% treats variables as if they were all mapped to registers. The
-%% \key{select-instructions} pass generates a program that consists of
-%% x86 instructions but that still uses variables, so it is an
-%% intermediate language that is technically different than x86, which
-%% explains the asterisks in the diagram above.
-
-%% In this Chapter we shall take the easy road to implementing
-%% \key{assign-homes} and simply map all variables to stack locations.
-%% The topic of Chapter~\ref{ch:register-allocation-r1} is implementing a
-%% smarter approach in which we make a best-effort to map variables to
-%% registers, resorting to the stack only when necessary.
-
-%% Once variables have been assigned to their homes, we can finalize the
-%% instruction selection by dealing with an idiosyncrasy of x86
-%% assembly. Many x86 instructions have two arguments but only one of the
-%% arguments may be a memory reference (and the stack is a part of
-%% memory).  Because some variables may get mapped to stack locations,
-%% some of our generated instructions may violate this restriction.  The
-%% purpose of the \key{patch-instructions} pass is to fix this problem by
-%% replacing every violating instruction with a short sequence of
-%% instructions that use the \key{rax} register. Once we have implemented
-%% a good register allocator (Chapter~\ref{ch:register-allocation-r1}), the
-%% need to patch instructions will be relatively rare.
-
 \subsection{The dialects of x86}
 
 The x86$^{*}_0$ language, pronounced ``pseudo x86'', is the output of
@@ -2791,7 +2766,7 @@ conclusion:
 \index{liveness analysis}
 
 A variable or register is \emph{live} at a program point if its
-current value is used at some later point in the program.  We shall
+current value is used at some later point in the program.  We 
 refer to variables and registers collectively as \emph{locations}.
 %
 Consider the following code fragment in which there are two writes to
@@ -3978,7 +3953,7 @@ reports a compile-time error because Typed Racket expects the type of
 the argument to be of the form \code{(Listof T)} or \code{(Pairof T1 T2)}.
 
 For the $R_2$ language we choose to be more like Typed Racket in that
-we shall perform type checking during compilation. In
+we perform type checking during compilation. In
 Chapter~\ref{ch:type-dynamic} we study the alternative choice, that
 is, how to compile a dynamically typed language like Racket.  The
 $R_2$ language is a subset of Typed Racket but by no means includes
@@ -4583,7 +4558,7 @@ duplicating code. The solution is straightforward if we think at the
 level of x86 assembly: we can label the code for each of the branches
 and insert jumps in all the places that need to execute the
 branches. Put another way, we need to move away from abstract syntax
-\emph{trees} and instead use \emph{graphs}. In particular, we shall
+\emph{trees} and instead use \emph{graphs}. In particular, we 
 use a standard program representation called a \emph{control flow
   graph} (CFG), due to Frances Elizabeth \citet{Allen:1970uq}.
 \index{control-flow graph}
@@ -4724,7 +4699,7 @@ $B_3$ to refer to blocks for the purposes of our discussion, but they
 should not be confused with the labels for the blocks that appear in
 the generated code. We initially construct unlabeled blocks; we only
 attach labels to blocks when we add them to the control-flow graph, as
-we shall see in the next case.
+we see in the next case.
 
 Next consider the case for \key{if} in the \code{explicate-assign}
 function. The context of the \key{if} is an assignment to some
@@ -5589,7 +5564,7 @@ the \code{void?} predicate that returns \code{\#t} when applied to
 \end{figure}
 
 Figure~\ref{fig:typecheck-R3} shows the type checker for $R_3$, which
-deserves some explanation. As we shall see in Section~\ref{sec:GC}, we
+deserves some explanation. As we see in Section~\ref{sec:GC}, we
 need to know which variables contain pointers into the heap, that is,
 which variables contain vectors. Also, when allocating a vector, we
 need to know which elements of the vector are pointers. We can obtain
@@ -5752,7 +5727,7 @@ such algorithms take into account the possibility of cycles by marking
 which vertices have already been visited, so as to ensure termination
 of the algorithm. These search algorithms also use a data structure
 such as a stack or queue as a to-do list to keep track of the vertices
-that need to be visited. We shall use breadth-first search and a trick
+that need to be visited. We use breadth-first search and a trick
 due to \citet{Cheney:1970aa} for simultaneously representing the queue
 and copying tuples into the ToSpace.
 
@@ -6302,7 +6277,7 @@ used to determine the pointer mask region of the tag.
 The \code{collect} form is compiled to a call to the \code{collect}
 function in the runtime. The arguments to \code{collect} are 1) the
 top of the root stack and 2) the number of bytes that need to be
-allocated.  We shall use another dedicated register, \code{r15}, to
+allocated.  We use another dedicated register, \code{r15}, to
 store the pointer to the top of the root stack. So \code{r15} is not
 available for use by the register allocator.
 \begin{lstlisting}
@@ -7962,7 +7937,7 @@ functions because they use different values for \code{x}. Applying
 \end{figure}
 
 
-The approach that we shall take for implementing lexically scoped
+The approach that we take for implementing lexically scoped
 functions is to compile them into top-level function definitions,
 translating from $R_5$ into $R_4$.  However, the compiler will need to
 provide special treatment for variable occurrences such as \code{x}
@@ -7997,7 +7972,7 @@ code, an arrangement called a \emph{flat closure} (which we shorten to
 just ``closure'').  \index{closure}\index{flat closure} Fortunately,
 we have all the ingredients to make closures, Chapter~\ref{ch:tuples}
 gave us vectors and Chapter~\ref{ch:functions} gave us function
-pointers. The function pointer shall reside at index $0$ and the
+pointers. The function pointer resides at index $0$ and the
 values for the free variables will fill in the rest of the vector.
 
 Let us revisit the example in Figure~\ref{fig:lexical-scoping} to see
@@ -8196,7 +8171,7 @@ definitions is accomplished in the pass \code{convert-to-closures}
 that comes after \code{reveal-functions} and before
 \code{limit-functions}. 
 
-As usual, we shall implement the pass as a recursive function over the
+As usual, we implement the pass as a recursive function over the
 AST. All of the action is in the clauses for \key{lambda} and
 \key{Apply}. We transform a \key{lambda} expression into an expression
 that creates a closure, that is, creates a vector whose first element
@@ -8551,11 +8526,11 @@ result of \code{(read)}.
 
 The way around this problem is to include information about a value's
 runtime type in the value itself, so that this information can be
-inspected by operators such as \code{not}.  In particular, we shall
+inspected by operators such as \code{not}.  In particular, we 
 steal the 3 right-most bits from our 64-bit values to encode the
-runtime type.  We shall use $001$ to identify integers, $100$ for
+runtime type.  We use $001$ to identify integers, $100$ for
 Booleans, $010$ for vectors, $011$ for procedures, and $101$ for the
-void value. We shall refer to these 3 bits as the \emph{tag} and we
+void value. We refer to these 3 bits as the \emph{tag} and we
 define the following auxiliary function.
 \begin{align*}
 \itm{tagof}(\key{Integer}) &= 001 \\
@@ -8565,7 +8540,7 @@ define the following auxiliary function.
 \itm{tagof}((\ldots \key{->} \ldots)) &= 011 \\
 \itm{tagof}(\key{Void}) &= 101
 \end{align*}
-(We shall say more about the new \key{Vectorof} type shortly.)
+(We say more about the new \key{Vectorof} type shortly.)
 This stealing of 3 bits comes at some
 price: our integers are reduced to ranging from $-2^{60}$ to
 $2^{60}$. The stealing does not adversely affect vectors and
@@ -8600,7 +8575,7 @@ pointers to tuples. On the other hand, if one of the first three bits
 is set, then we have a tagged value, and inspecting the tag can
 differentiation between vectors ($010$) and the other kinds of values.
 
-We shall implement our untyped language $R_7$ by compiling it to $R_6$
+We implement our untyped language $R_7$ by compiling it to $R_6$
 (Section~\ref{sec:compile-r7}), but first we describe the how to
 extend our compiler to handle the new features of $R_6$
 (Sections~\ref{sec:shrink-r6}, \ref{sec:select-r6}, and