3 years ago · 0ed8493f18
--- a/book.tex
+++ b/book.tex
@@ -10486,7 +10486,7 @@ registers or on the procedure call stack.
 
				 
			
 
				 Sections~\ref{sec:expose-allocation} through \ref{sec:print-x86-gc}
			
 
				 discuss all the necessary changes and additions to the compiler
			
 
				-passes, including a new compiler pass named \code{expose-allocation}.
			
 
				+passes, including a new compiler pass named \code{expose\_allocation}.
			
 
				 
			
 
				 \section{The \LangVec{} Language}
			
 
				 \label{sec:r3}
			
@@ -11094,7 +11094,7 @@ which we call a \emph{root stack}\index{subject}{root stack}
 
				 (a.k.a. ``shadow
			
 
				 stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
			
 
				 is, when a local variable needs to be spilled and is of type
			
 
				-\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
			
 
				+\racket{\code{Vector}}\python{\code{TupleType}}, then we put it on the
			
 
				 root stack instead of the normal procedure call stack. Furthermore, we
			
 
				 always spill tuple-typed variables if they are live during a call to
			
 
				 the collector, thereby ensuring that no pointers are in registers
			
@@ -11217,13 +11217,13 @@ succeed.
 
				 
			
 
				 The introduction of garbage collection has a non-trivial impact on our
			
 
				 compiler passes. We introduce a new compiler pass named
			
 
				-\code{expose-allocation}. We make
			
 
				-significant changes to \code{select\_instructions},
			
 
				-\code{build\_interference}, \code{allocate\_registers}, and
			
 
				-\code{print\_x86} and make minor changes in several more passes.  The
			
 
				-following program will serve as our running example.  It creates two
			
 
				-tuples, one nested inside the other. Both tuples have length one. The
			
 
				-program accesses the element in the inner tuple tuple.
			
 
				+\code{expose\_allocation}. We make significant changes to
			
 
				+\code{select\_instructions}, \code{build\_interference},
			
 
				+\code{allocate\_registers}, and \code{prelude\_and\_conclusion} and
			
 
				+make minor changes in several more passes.  The following program will
			
 
				+serve as our running example.  It creates two tuples, one nested
			
 
				+inside the other. Both tuples have length one. The program accesses
			
 
				+the element in the inner tuple.
			
 
				 % tests/vectors_test_17.rkt
			
 
				 {\if\edition\racketEd
			
 
				 \begin{lstlisting}
			
@@ -11242,20 +11242,22 @@ print( ((42,),)[0][0] )
 
				 \label{sec:shrink-Lvec}
			
 
				 
			
 
				 Recall that the \code{shrink} pass translates the primitive operators
			
 
				-into a smaller set of primitives. Because this pass comes after type
			
 
				-checking, but before the passes that require the type information in
			
 
				-the \code{HasType} AST nodes, the \code{shrink} pass must be modified
			
 
				-to wrap \code{HasType} around each AST node that it generates.
			
 
				+into a smaller set of primitives.
			
 
				+%
			
 
				+This pass comes after type checking and the type checker adds a
			
 
				+\code{HasType} AST node around each \code{vector} AST node, so you'll
			
 
				+need to add a case for \code{HasType} to the \code{shrink} pass.
			
 
				 
			
 
				 \fi}
			
 
				 
			
 
				 \section{Expose Allocation}
			
 
				 \label{sec:expose-allocation}
			
 
				 
			
 
				-The pass \code{expose\_allocation} lowers tuple creation
			
 
				-into a conditional call to the collector followed by the
			
 
				-allocation.  We choose to place the \code{expose\_allocation} pass
			
 
				-before \code{remove\_complex\_operands} because the code generated by
			
 
				+The pass \code{expose\_allocation} lowers tuple creation into a
			
 
				+conditional call to the collector followed by allocating the
			
 
				+appropriate amount of memory and initializing it.  We choose to place
			
 
				+the \code{expose\_allocation} pass before
			
 
				+\code{remove\_complex\_operands} because the code generated by
			
 
				 \code{expose\_allocation} contains complex operands.
			
 
				 
			
 
				 The output of \code{expose\_allocation} is a language \LangAlloc{}
			
@@ -11285,10 +11287,13 @@ of tuple creation.
 
				 
			
 
				 \fi}
			
 
				 
			
 
				-The \CCOLLECT{n} form runs the garbage collector, requesting $n$
			
 
				-bytes. During instruction selection, it will become a call to the
			
 
				-\code{collect} function in \code{runtime.c}.  The \CALLOCATE{n}{T}
			
 
				-form creates a tuple with space for $n$ elements, but they are not
			
 
				+The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
			
 
				+make sure that there are $n$ bytes ready to be allocated. During
			
 
				+instruction selection, the \CCOLLECT{$n$} form will become a call to
			
 
				+the \code{collect} function in \code{runtime.c}.
			
 
				+%
			
 
				+The \CALLOCATE{$n$}{$T$} form obtains memory for $n$ elements (and
			
 
				+space at the front for the 64-bit tag), but the elements are not
			
 
				 initialized.  \index{subject}{allocate} The $T$ parameter is the type
			
 
				 of the tuple:
			
 
				 %
			
@@ -11305,10 +11310,15 @@ as \code{free\_ptr}.
 
				 The following shows the transformation of tuple creation into 1) a
			
 
				 sequence of temporary variable bindings for the initializing
			
 
				 expressions, 2) a conditional call to \code{collect}, 3) a call to
			
 
				-\code{allocate}, and 4) the initialization of the vector. The
			
 
				-\itm{len} placeholder refers to the length of the vector and
			
 
				+\code{allocate}, and 4) the initialization of the tuple. The
			
 
				+\itm{len} placeholder refers to the length of the tuple and
			
 
				 \itm{bytes} is how many total bytes need to be allocated for the
			
 
				-vector, which is 8 for the tag plus \itm{len} times 8.
			
 
				+tuple, which is 8 for the tag plus \itm{len} times 8.
			
 
				+%
			
 
				+\python{The \itm{type} needed for the second argument of the
			
 
				+  \code{allocate} form can be obtained from the \code{has\_type} field
			
 
				+  of the tuple AST node, which is stored there by running the type
			
 
				+  checker for \LangVec{} immediately before this pass.}
			
 
				 %
			
 
				 {\if\edition\racketEd
			
 
				 \begin{lstlisting}
			
@@ -11345,10 +11355,10 @@ vector, which is 8 for the tag plus \itm{len} times 8.
 
				 \end{lstlisting}
			
 
				 \fi}
			
 
				 %
			
 
				-The placement of the initializing expressions $e_0,\ldots,e_{n-1}$
			
 
				-prior to the \code{allocate} is important, as those expressions may
			
 
				-trigger garbage collection and we cannot have an allocated but
			
 
				-uninitialized tuple on the heap during a collection.
			
 
				+\noindent The sequencing of the initializing expressions
			
 
				+$e_0,\ldots,e_{n-1}$ prior to the \code{allocate} is important, as
			
 
				+they may trigger garbage collection and we cannot have an allocated
			
 
				+but uninitialized tuple on the heap during a collection.
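As a rough illustration of the four steps above, the following Python sketch builds the lowered statements for a single tuple creation as plain strings. It is not the book's implementation: the helper name \code{expose\_tuple}, the \code{init.}/\code{alloc.} temporary names, and the \code{fromspace\_end} global used in the collection test are assumptions made for this example.

```python
from itertools import count

def expose_tuple(inits, tuple_type, fresh):
    """Lower one tuple creation into a list of statement strings.

    inits      -- initializing expressions, given here as source strings
    tuple_type -- printed form of the tuple's type (second allocate argument)
    fresh      -- iterator of integers used to make unique temporary names
    """
    n = len(inits)
    nbytes = 8 + 8 * n  # 8 bytes for the tag plus 8 bytes per element

    # 1) Bind the initializers to temporaries BEFORE allocating: they may
    #    themselves trigger a collection.
    temps = [f"init.{next(fresh)}" for _ in inits]
    stmts = [f"{t} = {e}" for t, e in zip(temps, inits)]

    # 2) Conditional call to the collector (fromspace_end is an assumed name).
    stmts.append(
        f"if global_value(free_ptr) + {nbytes} < global_value(fromspace_end): "
        f"0 else: collect({nbytes})"
    )

    # 3) Allocate the uninitialized tuple.
    tup = f"alloc.{next(fresh)}"
    stmts.append(f"{tup} = allocate({n}, {tuple_type})")

    # 4) Initialize each element from its temporary.
    stmts += [f"{tup}[{i}] = {t}" for i, t in enumerate(temps)]
    return stmts, tup

stmts, result = expose_tuple(["42"], "TupleType([int])", count(1))
for s in stmts:
    print(s)
```

For a tuple of length one the sketch requests 16 bytes, which agrees with the \code{collect(16)} calls in the running example.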
			
 
				 
			
 
				 Figure~\ref{fig:expose-alloc-output} shows the output of the
			
 
				 \code{expose\_allocation} pass on our running example.
			
@@ -11395,7 +11405,7 @@ where $T_1$ is
 
				               0
			
 
				           else:
			
 
				               collect(16)
			
 
				-          tmp.2 = allocate(1, tuple[tuple[int]])
			
 
				+          tmp.2 = allocate(1, TupleType(TupleType([int])))
			
 
				           tmp.2[0] = tmp.1
			
 
				           tmp.2
			
 
				 \end{lstlisting}
			
@@ -11407,7 +11417,7 @@ and $T_2$ is
 
				               0
			
 
				           else:
			
 
				               collect(16)
			
 
				-          tmp.4 = allocate(1, tuple[int])
			
 
				+          tmp.4 = allocate(1, TupleType([int]))
			
 
				           tmp.4[0] = tmp.3
			
 
				           tmp.4
			
 
				 \end{lstlisting}
			
@@ -11429,9 +11439,9 @@ should be treated as complex operands.
 
				 %
			
 
				 {\if\edition\pythonEd
			
 
				 %
			
 
				-The expressions \code{Allocate}, \code{GlobalValue}, \code{Begin}, and
			
 
				-\code{Subscript} should be treated as complex operands.  The
			
 
				-sub-expressions of \code{Subscript} must be atomic.
			
 
				+The expressions \code{allocate}, \code{global\_value}, \code{begin},
			
 
				+and tuple access should be treated as complex operands.  The
			
 
				+sub-expressions of tuple access must be atomic.
			
 
				 %
			
 
				 \fi}
			
 
				 %% A new case for
			
@@ -11571,10 +11581,10 @@ expressions of \LangCVec{} include \key{allocate},
 
				 %
			
 
				 and \key{global\_value}.
			
 
				 %
			
 
				-\python{It also includes the \code{collect} statment and
			
 
				+\python{\LangCVec{} also includes the \code{collect} statement and
			
 
				 assignment to a tuple element.}
			
 
				 %
			
 
				-\racket{It also includes the new \code{collect} statement.}
			
 
				+\racket{\LangCVec{} also includes the new \code{collect} statement.}
			
 
				 %
			
 
				 The \code{explicate\_control} pass can treat these new forms much like
			
 
				 the other forms that we've already encountered.
			
@@ -11642,7 +11652,7 @@ register \code{r11} ensures that offset expression
 
				 removing \code{r11} from consideration by the register allocator.
			
 
				 
			
 
				 Why not use \code{rax} instead of \code{r11}? Suppose we instead used
			
 
				-\code{rax}. Then the generated code for \code{vector-set!} would be
			
 
				+\code{rax}. Then the generated code for tuple assignment would be
			
 
				 \begin{lstlisting}
			
 
				 movq |$\itm{tup}'$|, %rax
			
 
				 movq |$\itm{rhs}'$|, |$8(n+1)$|(%rax)
			
@@ -11669,10 +11679,15 @@ being allocated, which is $8(\itm{len}+1)$ bytes because each element
 
				 is 8 bytes (64 bits) and we use 8 bytes for the tag.  We then
			
 
				 initialize the \itm{tag} and finally copy the address in \code{r11} to
			
 
				 the left-hand-side. Refer to Figure~\ref{fig:tuple-rep} to see how the
			
 
				-tag is organized. We recommend using the Racket operations
			
 
				+tag is organized.
			
 
				+%
			
 
				+\racket{We recommend using the Racket operations
			
 
				 \code{bitwise-ior} and \code{arithmetic-shift} to compute the tag
			
 
				-during compilation.  The type annotation in the \code{vector} form is
			
 
				-used to determine the pointer mask region of the tag.
			
 
				+during compilation.}
			
 
				+%
			
 
				+The type annotation in the \code{allocate} form is used to determine
			
 
				+the pointer mask region of the tag.
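The tag computation can be sketched in Python as follows. The bit layout used here is an assumption, since Figure~\ref{fig:tuple-rep} is not reproduced in this excerpt: bit 0 is set to 1 for a tuple that has not been forwarded, bits 1 through 6 hold the length, and the pointer mask starts at bit 7 with one bit per element. The function name \code{tuple\_tag} and its boolean-list argument are hypothetical.

```python
def tuple_tag(elem_is_pointer):
    """Compute a tuple's 64-bit tag under the ASSUMED layout:
    bit 0      -- 1 (not a forwarding pointer)
    bits 1-6   -- length of the tuple
    bits 7...  -- pointer mask, bit i set when element i is a tuple
    """
    length = len(elem_is_pointer)
    assert length < 64, "the length must fit in the six length bits"
    mask = 0
    for i, is_ptr in enumerate(elem_is_pointer):
        if is_ptr:
            mask |= 1 << i
    return (mask << 7) | (length << 1) | 1

print(tuple_tag([False]))  # a tuple holding one integer
print(tuple_tag([True]))   # a tuple holding one tuple
```

In the Racket edition the same computation would use \code{bitwise-ior} and \code{arithmetic-shift} in place of \code{|} and \code{<<}.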
			
 
				+%
			
 
				 {\if\edition\racketEd
			
 
				 \begin{lstlisting}
			
 
				    |$\itm{lhs}$| = (allocate |$\itm{len}$| (Vector |$\itm{type} \ldots$|));
			
@@ -11685,7 +11700,7 @@ used to determine the pointer mask region of the tag.
 
				 \fi}
			
 
				 {\if\edition\pythonEd    
			
 
				 \begin{lstlisting}
			
 
				-   |$\itm{lhs}$| = allocate(|$\itm{len}$|, tuple[|$\itm{type}, \ldots$]|);
			
 
				+   |$\itm{lhs}$| = allocate(|$\itm{len}$|, TupleType([|$\itm{type}, \ldots$])|);
			
 
				    |$\Longrightarrow$|
			
 
				    movq free_ptr(%rip), %r11
			
 
				    addq |$8(\itm{len}+1)$|, free_ptr(%rip)
			
@@ -11755,8 +11770,7 @@ available for use by the register allocator.
 
				 
			
 
				 The concrete and abstract syntax of the \LangXGlobal{} language is
			
 
				 defined in Figures~\ref{fig:x86-2-concrete} and \ref{fig:x86-2}.  It
			
 
				-differs from \LangXIf{} just in the addition of the form for global
			
 
				-variables.
			
 
				+differs from \LangXIf{} just in the addition of global variables.
			
 
				 %
			
 
				 Figure~\ref{fig:select-instr-output-gc} shows the output of the
			
 
				 \code{select\_instructions} pass on the running example.
			
@@ -11838,28 +11852,34 @@ block40:
 
				 
			
 
				 As discussed earlier in this chapter, the garbage collector needs to
			
 
				 access all the pointers in the root set, that is, all variables that
			
 
				-are vectors. It will be the responsibility of the register allocator
			
 
				+are tuples. It will be the responsibility of the register allocator
			
 
				 to make sure that:
			
 
				 \begin{enumerate}
			
 
				-\item the root stack is used for spilling vector-typed variables, and
			
 
				-\item if a vector-typed variable is live during a call to the
			
 
				+\item the root stack is used for spilling tuple-typed variables, and
			
 
				+\item if a tuple-typed variable is live during a call to the
			
 
				   collector, it must be spilled to ensure it is visible to the
			
 
				   collector.
			
 
				 \end{enumerate}
			
 
				 
			
 
				 The latter responsibility can be handled during construction of the
			
 
				 interference graph, by adding interference edges between the call-live
			
 
				-vector-typed variables and all the callee-saved registers. (They
			
 
				-already interfere with the caller-saved registers.)  The type
			
 
				-information for variables is in the \code{Program} form, so we
			
 
				-recommend adding another parameter to the \code{build\_interference}
			
 
				-function to communicate this alist.
			
 
				-
			
 
				-The spilling of vector-typed variables to the root stack can be
			
 
				-handled after graph coloring, when choosing how to assign the colors
			
 
				-(integers) to registers and stack locations. The \code{Program} output
			
 
				-of this pass changes to also record the number of spills to the root
			
 
				-stack.
			
 
				+tuple-typed variables and all the callee-saved registers. (They
			
 
				+already interfere with the caller-saved registers.)
			
 
				+%
			
 
				+\racket{The type information for variables is in the \code{Program}
			
 
				+  form, so we recommend adding another parameter to the
			
 
				+  \code{build\_interference} function to communicate this alist.}
			
 
				+%
			
 
				+\python{The type information for variables is generated by the type
			
 
				+  checker for \LangCVec{}, stored in a field named \code{var\_types} in
			
 
				+  the \code{CProgram} AST node. You'll need to propagate that
			
 
				+  information so that it is available in this pass.}
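One way to add these interference edges is sketched below. The graph representation (a dictionary from names to sets of neighbors), the function name \code{add\_tuple\_call\_edges}, and the encoding of types as the strings \code{"tuple"} and \code{"int"} are all assumptions made for this example; the list of callee-saved registers follows the System V x86-64 convention.

```python
CALLEE_SAVED = ["rsp", "rbp", "rbx", "r12", "r13", "r14", "r15"]

def add_tuple_call_edges(graph, live_after_call, var_types):
    """For a call to the collector, make every tuple-typed variable that is
    live after the call interfere with all callee-saved registers.

    graph           -- dict mapping a name to the set of names it interferes with
    live_after_call -- set of variables live immediately after the call
    var_types       -- dict mapping each variable to "tuple" or another type
    """
    for var in live_after_call:
        if var_types.get(var) != "tuple":
            continue  # only tuple-typed variables must be forced to spill
        for reg in CALLEE_SAVED:
            graph.setdefault(var, set()).add(reg)
            graph.setdefault(reg, set()).add(var)

graph = {}
add_tuple_call_edges(graph, {"t", "x"}, {"t": "tuple", "x": "int"})
print(sorted(graph["t"]))
```

Because such a variable already interferes with the caller-saved registers, the added edges leave it no register to occupy, so the allocator has to spill it.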
			
 
				+
			
 
				+The spilling of tuple-typed variables to the root stack can be handled
			
 
				+after graph coloring, when choosing how to assign the colors
			
 
				+(integers) to registers and stack locations. The
			
 
				+\racket{\code{Program}}\python{\code{CProgram}} output of this pass
			
 
				+changes to also record the number of spills to the root stack.
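The choice between the two stacks can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's code: the function name \code{assign\_homes\_from\_colors}, the abstract location tuples, and the type strings are hypothetical, and the translation of a slot index into a concrete memory operand is left out.

```python
def assign_homes_from_colors(coloring, var_types, registers):
    """Map each colored variable to a register, a root stack slot, or a
    procedure call stack slot, and count the spills of each kind.

    coloring  -- dict from variable name to color (a nonnegative integer)
    var_types -- dict from variable name to "tuple" or another type
    registers -- registers available to the allocator, indexed by color
    """
    homes = {}
    root_spills = 0
    stack_spills = 0
    for var, color in sorted(coloring.items(), key=lambda kv: kv[1]):
        if color < len(registers):
            homes[var] = ("reg", registers[color])
        elif var_types[var] == "tuple":
            homes[var] = ("root", root_spills)    # visible to the collector
            root_spills += 1
        else:
            homes[var] = ("stack", stack_spills)  # ordinary spill
            stack_spills += 1
    return homes, root_spills, stack_spills

homes, roots, stacks = assign_homes_from_colors(
    {"x": 0, "t": 2, "y": 3},
    {"x": "int", "t": "tuple", "y": "int"},
    ["rcx", "rdx"],
)
print(homes, roots, stacks)
```

The returned root spill count is the number that the output program needs to record for later use when generating the prelude and conclusion.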
			
 
				 
			
 
				 % build-interference
			
 
				 %
			
@@ -11874,7 +11894,7 @@ stack.
 
				 
			
 
				 
			
 
				 
			
 
				-\section{Generate Prelude and Conclusion}
			
 
				+\section{Prelude and Conclusion}
			
 
				 \label{sec:print-x86-gc}
			
 
				 \label{sec:prelude-conclusion-x86-gc}
			
 
				 \index{subject}{prelude}\index{subject}{conclusion}
			
--- a/defs.tex
+++ b/defs.tex
@@ -151,7 +151,7 @@
 
				 \newcommand{\COLLECT}[1]{\LP\key{Collect}~#1\RP}
			
 
				 \newcommand{\CCOLLECT}[1]{\LP\key{collect}~#1\RP}
			
 
				 \newcommand{\ALLOCATE}[2]{\LP\key{Allocate}~#1~#2\RP}
			
 
				-\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1\RP}
			
 
				+\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1~#2\RP}
			
 
				 \newcommand{\GLOBAL}[1]{\LP\key{Global}~#1\RP}
			
 
				 \newcommand{\CGLOBAL}[1]{#1\key{(\%rip)}}
			
 
				 \newcommand{\GLOBALVALUE}[1]{\LP\key{GlobalValue}~#1\RP}
			
@@ -197,11 +197,11 @@
 
				 \newcommand{\COR}[2]{#1~\key{or}~#2}
			
 
				 \newcommand{\INTTY}{{\key{int}}}
			
 
				 \newcommand{\BOOLTY}{{\key{bool}}}
			
 
				-\newcommand{\VECTY}[1]{{\key{Tuple}\LS #1 \RS}}
			
 
				+\newcommand{\VECTY}[1]{{\key{TupleType}\LP\LS #1 \RS\RP}}
			
 
				 \newcommand{\COLLECT}[1]{\key{Collect}\LP#1\RP}
			
 
				 \newcommand{\CCOLLECT}[1]{\key{collect}\LP#1\RP}
			
 
				 \newcommand{\ALLOCATE}[2]{\key{Allocate}\LP#1,#2\RP}
			
 
				-\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1\RP}
			
 
				+\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1,#2\RP}
			
 
				 \newcommand{\GLOBALVALUE}[1]{\key{GlobalValue}\LP#1\RP}
			
 
				 \newcommand{\CGLOBALVALUE}[1]{\key{global\_value}\LP#1\RP}
			
 
				 \newcommand{\GLOBAL}[1]{\key{Global}\LP#1\RP}