3 years ago · 0ed8493f18
--- a/book.tex
+++ b/book.tex
@@ -10486,7 +10486,7 @@ registers or on the procedure call stack.
 
															 Sections~\ref{sec:expose-allocation} through \ref{sec:print-x86-gc}
														
 
															 discuss all the necessary changes and additions to the compiler
														
 
															-passes, including a new compiler pass named \code{expose-allocation}.
														
 
															+passes, including a new compiler pass named \code{expose\_allocation}.
														
 
															 \section{The \LangVec{} Language}
														
 
															 \label{sec:r3}
														
@@ -11094,7 +11094,7 @@ which we call a \emph{root stack}\index{subject}{root stack}
 
															 (a.k.a. ``shadow
														
 
															 stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
														
 
															 is, when a local variable needs to be spilled and is of type
														
 
															-\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
														
 
															+\racket{\code{Vector}}\python{\code{TupleType}}, then we put it on the
														
 
															 root stack instead of the normal procedure call stack. Furthermore, we
														
 
															 always spill tuple-typed variables if they are live during a call to
														
 
															 the collector, thereby ensuring that no pointers are in registers
														
@@ -11217,13 +11217,13 @@ succeed.
 
															 The introduction of garbage collection has a non-trivial impact on our
														
 
															 compiler passes. We introduce a new compiler pass named
														
 
															-\code{expose-allocation}. We make
														
 
															-significant changes to \code{select\_instructions},
														
 
															-\code{build\_interference}, \code{allocate\_registers}, and
														
 
															-\code{print\_x86} and make minor changes in several more passes.  The
														
 
															-following program will serve as our running example.  It creates two
														
 
															-tuples, one nested inside the other. Both tuples have length one. The
														
 
															-program accesses the element in the inner tuple tuple.
														
 
															+\code{expose\_allocation}. We make significant changes to
														
 
															+\code{select\_instructions}, \code{build\_interference},
														
 
															+\code{allocate\_registers}, and \code{prelude\_and\_conclusion} and
														
 
															+make minor changes in several more passes.  The following program will
														
 
															+serve as our running example.  It creates two tuples, one nested
														
 
															+inside the other. Both tuples have length one. The program accesses
														
 
															+the element in the inner tuple.
														
 
															 % tests/vectors_test_17.rkt
														
 
															 {\if\edition\racketEd
														
 
															 \begin{lstlisting}
														
@@ -11242,20 +11242,22 @@ print( ((42,),)[0][0] )
 
															 \label{sec:shrink-Lvec}
														
 
															 Recall that the \code{shrink} pass translates the primitive operators
														
 
															-into a smaller set of primitives. Because this pass comes after type
														
 
															-checking, but before the passes that require the type information in
														
 
															-the \code{HasType} AST nodes, the \code{shrink} pass must be modified
														
 
															-to wrap \code{HasType} around each AST node that it generates.
														
 
															+into a smaller set of primitives.
														
 
															+%
														
 
															+This pass comes after type checking and the type checker adds a
														
 
															+\code{HasType} AST node around each \code{vector} AST node, so you'll
														
 
															+need to add a case for \code{HasType} to the \code{shrink} pass.
														
 
															 \fi}
														
 
															 \section{Expose Allocation}
														
 
															 \label{sec:expose-allocation}
														
 
															-The pass \code{expose\_allocation} lowers tuple creation
														
 
															-into a conditional call to the collector followed by the
														
 
															-allocation.  We choose to place the \code{expose\_allocation} pass
														
 
															-before \code{remove\_complex\_operands} because the code generated by
														
 
															+The pass \code{expose\_allocation} lowers tuple creation into a
														
 
															+conditional call to the collector followed by allocating the
														
 
															+appropriate amount of memory and initializing it.  We choose to place
														
 
															+the \code{expose\_allocation} pass before
														
 
															+\code{remove\_complex\_operands} because the code generated by
														
 
															 \code{expose\_allocation} contains complex operands.
														
 
															 The output of \code{expose\_allocation} is a language \LangAlloc{}
														
@@ -11285,10 +11287,13 @@ of tuple creation.
 
															 \fi}
														
 
															-The \CCOLLECT{n} form runs the garbage collector, requesting $n$
														
 
															-bytes. During instruction selection, it will become a call to the
														
 
															-\code{collect} function in \code{runtime.c}.  The \CALLOCATE{n}{T}
														
 
															-form creates a tuple with space for $n$ elements, but they are not
														
 
															+The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
														
 
															+make sure that there are $n$ bytes ready to be allocated. During
														
 
															+instruction selection, the \CCOLLECT{$n$} form will become a call to
														
 
															+the \code{collect} function in \code{runtime.c}.
														
 
															+%
														
 
															+The \CALLOCATE{$n$}{$T$} form obtains memory for $n$ elements (and
														
 
															+space at the front for the 64-bit tag), but the elements are not
														
 
															 initialized.  \index{subject}{allocate} The $T$ parameter is the type
														
 
															 of the tuple:
														
 
															 %
														
@@ -11305,10 +11310,15 @@ as \code{free\_ptr}.
 
															 The following shows the transformation of tuple creation into 1) a
														
 
															 sequence of temporary variable bindings for the initializing
														
 
															 expressions, 2) a conditional call to \code{collect}, 3) a call to
														
 
															-\code{allocate}, and 4) the initialization of the vector. The
														
 
															-\itm{len} placeholder refers to the length of the vector and
														
 
															+\code{allocate}, and 4) the initialization of the tuple. The
														
 
															+\itm{len} placeholder refers to the length of the tuple and
														
 
															 \itm{bytes} is how many total bytes need to be allocated for the
														
 
															-vector, which is 8 for the tag plus \itm{len} times 8.
														
 
															+tuple, which is 8 for the tag plus \itm{len} times 8.
														
 
															+%
														
 
															+\python{The \itm{type} needed for the second argument of the
														
 
															+  \code{allocate} form can be obtained from the \code{has\_type} field
														
 
															+  of the tuple AST node, which is stored there by running the type
														
 
															+  checker for \LangVec{} immediately before this pass.}
														
 
															 %
														
 
															 {\if\edition\racketEd
														
 
															 \begin{lstlisting}
														
@@ -11345,10 +11355,10 @@ vector, which is 8 for the tag plus \itm{len} times 8.
 
															 \end{lstlisting}
														
 
															 \fi}
														
 
															 %
														
 
															-The placement of the initializing expressions $e_0,\ldots,e_{n-1}$
														
 
															-prior to the \code{allocate} is important, as those expressions may
														
 
															-trigger garbage collection and we cannot have an allocated but
														
 
															-uninitialized tuple on the heap during a collection.
														
 
															+\noindent The sequencing of the initializing expressions
														
 
															+$e_0,\ldots,e_{n-1}$ prior to the \code{allocate} is important, as
														
 
															+they may trigger garbage collection and we cannot have an allocated
														
 
															+but uninitialized tuple on the heap during a collection.
														
 
															 Figure~\ref{fig:expose-alloc-output} shows the output of the
														
 
															 \code{expose\_allocation} pass on our running example.
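Review note on the hunk above: a minimal Python sketch of the sequencing this paragraph describes, under stated assumptions. The helper names `make_fresh`, `tuple_bytes`, and `expose_tuple`, the string-based output, and the `fromspace_end` global in the space check are hypothetical and are not taken from the book's support code. Only the byte count (8 for the tag plus 8 per element) and the ordering (initializers first, then the conditional `collect`, then `allocate`, then the element writes) come from the text.

```python
import itertools


def make_fresh(start=1):
    """Return a generator of fresh temporary names (hypothetical helper)."""
    counter = itertools.count(start)
    return lambda: f"tmp.{next(counter)}"


def tuple_bytes(length):
    """Total bytes for a tuple: 8 for the tag plus 8 per element."""
    return 8 + 8 * length


def expose_tuple(init_exprs, type_str, fresh):
    """Lower a tuple creation into a list of pseudo-statements (as strings).

    The initializing expressions are bound to temporaries *before* the
    allocate, so that a collection triggered by one of them never sees an
    allocated but uninitialized tuple.
    """
    stmts = []
    temps = []
    # 1) bind each initializing expression to a temporary
    for e in init_exprs:
        t = fresh()
        temps.append(t)
        stmts.append(f"{t} = {e}")
    # 2) conditional call to the collector (fromspace_end is an assumed name)
    nbytes = tuple_bytes(len(init_exprs))
    stmts.append(
        f"if free_ptr + {nbytes} < fromspace_end: 0 else: collect({nbytes})"
    )
    # 3) allocate the tuple, 4) initialize its elements
    v = fresh()
    stmts.append(f"{v} = allocate({len(init_exprs)}, {type_str})")
    for i, t in enumerate(temps):
        stmts.append(f"{v}[{i}] = {t}")
    stmts.append(v)  # the value of the whole sequence is the new tuple
    return stmts
```

Calling `expose_tuple(["42"], "TupleType([int])", make_fresh(3))` yields statements of the same shape as the `tmp.4` lines in the Python listing changed later in this diff, including `collect(16)` for a one-element tuple.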
														
@@ -11395,7 +11405,7 @@ where $T_1$ is
 
															               0
														
 
															           else:
														
 
															               collect(16)
														
 
															-          tmp.2 = allocate(1, tuple[tuple[int]])
														
 
															+          tmp.2 = allocate(1, TupleType(TupleType([int])))
														
 
															           tmp.2[0] = tmp.1
														
 
															           tmp.2
														
 
															 \end{lstlisting}
														
@@ -11407,7 +11417,7 @@ and $T_2$ is
 
															               0
														
 
															           else:
														
 
															               collect(16)
														
 
															-          tmp.4 = allocate(1, tuple[int])
														
 
															+          tmp.4 = allocate(1, TupleType([int]))
														
 
															           tmp.4[0] = tmp.3
														
 
															           tmp.4
														
 
															 \end{lstlisting}
														
@@ -11429,9 +11439,9 @@ should be treated as complex operands.
 
															 %
														
 
															 {\if\edition\pythonEd
														
 
															 %
														
 
															-The expressions \code{Allocate}, \code{GlobalValue}, \code{Begin}, and
														
 
															-\code{Subscript} should be treated as complex operands.  The
														
 
															-sub-expressions of \code{Subscript} must be atomic.
														
 
															+The expressions \code{allocate}, \code{global\_value}, \code{begin},
														
 
															+and tuple access should be treated as complex operands.  The
														
 
															+sub-expressions of tuple access must be atomic.
														
 
															 %
														
 
															 \fi}
														
 
															 %% A new case for
														
@@ -11571,10 +11581,10 @@ expressions of \LangCVec{} include \key{allocate},
 
															 %
														
 
															 and \key{global\_value}.
														
 
															 %
														
 
															-\python{It also includes the \code{collect} statment and
														
 
															+\python{\LangCVec{} also includes the \code{collect} statement and
														
 
															 assignment to a tuple element.}
														
 
															 %
														
 
															-\racket{It also includes the new \code{collect} statement.}
														
 
															+\racket{\LangCVec{} also includes the new \code{collect} statement.}
														
 
															 %
														
 
															 The \code{explicate\_control} pass can treat these new forms much like
														
 
															 the other forms that we've already encountered.
														
@@ -11642,7 +11652,7 @@ register \code{r11} ensures that offset expression
 
															 removing \code{r11} from consideration by the register allocator.
														
 
															 Why not use \code{rax} instead of \code{r11}? Suppose we instead used
														
 
															-\code{rax}. Then the generated code for \code{vector-set!} would be
														
 
															+\code{rax}. Then the generated code for tuple assignment would be
														
 
															 \begin{lstlisting}
														
 
															 movq |$\itm{tup}'$|, %rax
														
 
															 movq |$\itm{rhs}'$|, |$8(n+1)$|(%rax)
														
@@ -11669,10 +11679,15 @@ being allocated, which is $8(\itm{len}+1)$ bytes because each element
 
															 is 8 bytes (64 bits) and we use 8 bytes for the tag.  We then
														
 
															 initialize the \itm{tag} and finally copy the address in \code{r11} to
														
 
															 the left-hand-side. Refer to Figure~\ref{fig:tuple-rep} to see how the
														
 
															-tag is organized. We recommend using the Racket operations
														
 
															+tag is organized.
														
 
															+%
														
 
															+\racket{We recommend using the Racket operations
														
 
															 \code{bitwise-ior} and \code{arithmetic-shift} to compute the tag
														
 
															-during compilation.  The type annotation in the \code{vector} form is
														
 
															-used to determine the pointer mask region of the tag.
														
 
															+during compilation.}
														
 
															+%
														
 
															+The type annotation in the \code{allocate} form is used to determine
														
 
															+the pointer mask region of the tag.
														
 
															+%
														
 
															 {\if\edition\racketEd
														
 
															 \begin{lstlisting}
														
 
															    |$\itm{lhs}$| = (allocate |$\itm{len}$| (Vector |$\itm{type} \ldots$|));
														
@@ -11685,7 +11700,7 @@ used to determine the pointer mask region of the tag.
 
															 \fi}
														
 
															 {\if\edition\pythonEd    
														
 
															 \begin{lstlisting}
														
 
															-   |$\itm{lhs}$| = allocate(|$\itm{len}$|, tuple[|$\itm{type}, \ldots$]|);
														
 
															+   |$\itm{lhs}$| = allocate(|$\itm{len}$|, TupleType([|$\itm{type}, \ldots$])|);
														
 
															    |$\Longrightarrow$|
														
 
															    movq free_ptr(%rip), %r11
														
 
															    addq |$8(\itm{len}+1)$|, free_ptr(%rip)
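Review note on the hunk above: the Racket edition is pointed at `bitwise-ior` and `arithmetic-shift` for computing the tag, and the Python edition can do the same with `|` and `<<`. The sketch below is only an illustration. The bit layout it uses (bit 0 set to mark a tuple that has not been copied, the length in bits 1 through 6, the pointer mask starting at bit 7) is an assumption about the figure labeled `fig:tuple-rep`, which is not part of this diff, and the `TupleType` dataclass and `make_tag` function are hypothetical stand-ins rather than the book's support code.

```python
from dataclasses import dataclass


@dataclass
class TupleType:
    """Hypothetical stand-in for the TupleType node named in this diff."""
    types: list


def make_tag(ty):
    """Compute a 64-bit tag for a tuple type under the assumed layout."""
    length = len(ty.types)
    pointer_mask = 0
    for i, elt in enumerate(ty.types):
        # An element is a pointer exactly when its type is itself a tuple.
        if isinstance(elt, TupleType):
            pointer_mask |= 1 << i
    # Assumed layout: mask from bit 7, length in bits 1..6, bit 0 set.
    return (pointer_mask << 7) | (length << 1) | 1
```

Under these assumptions the type annotation on `allocate` is all that is needed to compute the tag at compile time: the element count gives the length field and the positions of tuple-typed elements give the pointer mask.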
														
@@ -11755,8 +11770,7 @@ available for use by the register allocator.
 
															 The concrete and abstract syntax of the \LangXGlobal{} language is
														
 
															 defined in Figures~\ref{fig:x86-2-concrete} and \ref{fig:x86-2}.  It
														
 
															-differs from \LangXIf{} just in the addition of the form for global
														
 
															-variables.
														
 
															+differs from \LangXIf{} just in the addition of global variables.
														
 
															 %
														
 
															 Figure~\ref{fig:select-instr-output-gc} shows the output of the
														
 
															 \code{select\_instructions} pass on the running example.
														
@@ -11838,28 +11852,34 @@ block40:
 
															 As discussed earlier in this chapter, the garbage collector needs to
														
 
															 access all the pointers in the root set, that is, all variables that
														
 
															-are vectors. It will be the responsibility of the register allocator
														
 
															+are tuples. It will be the responsibility of the register allocator
														
 
															 to make sure that:
														
 
															 \begin{enumerate}
														
 
															-\item the root stack is used for spilling vector-typed variables, and
														
 
															-\item if a vector-typed variable is live during a call to the
														
 
															+\item the root stack is used for spilling tuple-typed variables, and
														
 
															+\item if a tuple-typed variable is live during a call to the
														
 
															   collector, it must be spilled to ensure it is visible to the
														
 
															   collector.
														
 
															 \end{enumerate}
														
 
															 The latter responsibility can be handled during construction of the
														
 
															 interference graph, by adding interference edges between the call-live
														
 
															-vector-typed variables and all the callee-saved registers. (They
														
 
															-already interfere with the caller-saved registers.)  The type
														
 
															-information for variables is in the \code{Program} form, so we
														
 
															-recommend adding another parameter to the \code{build\_interference}
														
 
															-function to communicate this alist.
														
 
															-
														
 
															-The spilling of vector-typed variables to the root stack can be
														
 
															-handled after graph coloring, when choosing how to assign the colors
														
 
															-(integers) to registers and stack locations. The \code{Program} output
														
 
															-of this pass changes to also record the number of spills to the root
														
 
															-stack.
														
 
															+tuple-typed variables and all the callee-saved registers. (They
														
 
															+already interfere with the caller-saved registers.)
														
 
															+%
														
 
															+\racket{The type information for variables is in the \code{Program}
														
 
															+  form, so we recommend adding another parameter to the
														
 
															+  \code{build\_interference} function to communicate this alist.}
														
 
															+%
														
 
															+\python{The type information for variables is generated by the type
														
 
															+  checker for \LangCVec{}, stored in a field named \code{var\_types} in
														
 
															+  the \code{CProgram} AST node. You'll need to propagate that
														
 
															+  information so that it is available in this pass.}
														
 
															+
														
 
															+The spilling of tuple-typed variables to the root stack can be handled
														
 
															+after graph coloring, when choosing how to assign the colors
														
 
															+(integers) to registers and stack locations. The
														
 
															+\racket{\code{Program}}\python{\code{CProgram}} output of this pass
														
 
															+changes to also record the number of spills to the root stack.
														
 
															 % build-interference
														
 
															 %
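Review note on the hunk above: a small Python sketch of the second responsibility, adding interference edges between call-live tuple-typed variables and the callee-saved registers so that those variables end up spilled. The adjacency-set graph, the `add_call_interference` function, and the `TupleType` dataclass are hypothetical and chosen for illustration. The `var_types` argument mirrors the field named in the diff, and the register set is the usual callee-saved set of the System V x86-64 calling convention.

```python
from dataclasses import dataclass


@dataclass
class TupleType:
    """Hypothetical stand-in for the TupleType node named in this diff."""
    types: list


# Callee-saved registers in the System V x86-64 calling convention.
CALLEE_SAVED = {"rsp", "rbp", "rbx", "r12", "r13", "r14", "r15"}


def add_call_interference(graph, live_after_call, var_types):
    """Add edges between call-live tuple-typed variables and callee-saved
    registers. `graph` maps each vertex name to its set of neighbors."""
    for v in live_after_call:
        if isinstance(var_types.get(v), TupleType):
            for r in CALLEE_SAVED:
                graph.setdefault(v, set()).add(r)
                graph.setdefault(r, set()).add(v)
    return graph
```

Because such variables already interfere with the caller-saved registers, these extra edges leave them no register to be assigned, so graph coloring has to give them a spill location, which the later assignment step can map to the root stack.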
														
@@ -11874,7 +11894,7 @@ stack.
 
															-\section{Generate Prelude and Conclusion}
														
 
															+\section{Prelude and Conclusion}
														
 
															 \label{sec:print-x86-gc}
														
 
															 \label{sec:prelude-conclusion-x86-gc}
														
 
															 \index{subject}{prelude}\index{subject}{conclusion}
														
--- a/defs.tex
+++ b/defs.tex
@@ -151,7 +151,7 @@
 
															 \newcommand{\COLLECT}[1]{\LP\key{Collect}~#1\RP}
														
 
															 \newcommand{\CCOLLECT}[1]{\LP\key{collect}~#1\RP}
														
 
															 \newcommand{\ALLOCATE}[2]{\LP\key{Allocate}~#1~#2\RP}
														
 
															-\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1\RP}
														
 
															+\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1~#2\RP}
														
 
															 \newcommand{\GLOBAL}[1]{\LP\key{Global}~#1\RP}
														
 
															 \newcommand{\CGLOBAL}[1]{#1\key{(\%rip)}}
														
 
															 \newcommand{\GLOBALVALUE}[1]{\LP\key{GlobalValue}~#1\RP}
														
@@ -197,11 +197,11 @@
 
															 \newcommand{\COR}[2]{#1~\key{or}~#2}
														
 
															 \newcommand{\INTTY}{{\key{int}}}
														
 
															 \newcommand{\BOOLTY}{{\key{bool}}}
														
 
															-\newcommand{\VECTY}[1]{{\key{Tuple}\LS #1 \RS}}
														
 
															+\newcommand{\VECTY}[1]{{\key{TupleType}\LP\LS #1 \RS\RP}}
														
 
															 \newcommand{\COLLECT}[1]{\key{Collect}\LP#1\RP}
														
 
															 \newcommand{\CCOLLECT}[1]{\key{collect}\LP#1\RP}
														
 
															 \newcommand{\ALLOCATE}[2]{\key{Allocate}\LP#1,#2\RP}
														
 
															-\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1\RP}
														
 
															+\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1,#2\RP}
														
 
															 \newcommand{\GLOBALVALUE}[1]{\key{GlobalValue}\LP#1\RP}
														
 
															 \newcommand{\CGLOBALVALUE}[1]{\key{global\_value}\LP#1\RP}
														
 
															 \newcommand{\GLOBAL}[1]{\key{Global}\LP#1\RP}