|
@@ -10486,7 +10486,7 @@ registers or on the procedure call stack.
|
|
|
|
|
|
Sections~\ref{sec:expose-allocation} through \ref{sec:print-x86-gc}
|
|
|
discuss all the necessary changes and additions to the compiler
|
|
|
-passes, including a new compiler pass named \code{expose-allocation}.
|
|
|
+passes, including a new compiler pass named \code{expose\_allocation}.
|
|
|
|
|
|
\section{The \LangVec{} Language}
|
|
|
\label{sec:r3}
|
|
@@ -11094,7 +11094,7 @@ which we call a \emph{root stack}\index{subject}{root stack}
|
|
|
(a.k.a. ``shadow
|
|
|
stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
|
|
|
is, when a local variable needs to be spilled and is of type
|
|
|
-\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
|
|
|
+\racket{\code{Vector}}\python{\code{TupleType}}, then we put it on the
|
|
|
root stack instead of the normal procedure call stack. Furthermore, we
|
|
|
always spill tuple-typed variables if they are live during a call to
|
|
|
the collector, thereby ensuring that no pointers are in registers
|
|
@@ -11217,13 +11217,13 @@ succeed.
|
|
|
|
|
|
The introduction of garbage collection has a non-trivial impact on our
|
|
|
compiler passes. We introduce a new compiler pass named
|
|
|
-\code{expose-allocation}. We make
|
|
|
-significant changes to \code{select\_instructions},
|
|
|
-\code{build\_interference}, \code{allocate\_registers}, and
|
|
|
-\code{print\_x86} and make minor changes in several more passes. The
|
|
|
-following program will serve as our running example. It creates two
|
|
|
-tuples, one nested inside the other. Both tuples have length one. The
|
|
|
-program accesses the element in the inner tuple tuple.
|
|
|
+\code{expose\_allocation}. We make significant changes to
|
|
|
+\code{select\_instructions}, \code{build\_interference},
|
|
|
+\code{allocate\_registers}, and \code{prelude\_and\_conclusion} and
|
|
|
+make minor changes in several more passes. The following program will
|
|
|
+serve as our running example. It creates two tuples, one nested
|
|
|
+inside the other. Both tuples have length one. The program accesses
|
|
|
+the element in the inner tuple.
|
|
|
% tests/vectors_test_17.rkt
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
@@ -11242,20 +11242,22 @@ print( ((42,),)[0][0] )
|
|
|
\label{sec:shrink-Lvec}
|
|
|
|
|
|
Recall that the \code{shrink} pass translates the primitive operators
|
|
|
-into a smaller set of primitives. Because this pass comes after type
|
|
|
-checking, but before the passes that require the type information in
|
|
|
-the \code{HasType} AST nodes, the \code{shrink} pass must be modified
|
|
|
-to wrap \code{HasType} around each AST node that it generates.
|
|
|
+into a smaller set of primitives.
|
|
|
+%
|
|
|
+This pass comes after type checking, and the type checker adds a
|
|
|
+\code{HasType} AST node around each \code{vector} AST node, so you'll
|
|
|
+need to add a case for \code{HasType} to the \code{shrink} pass.
|
|
|
|
|
|
\fi}
|
|
|
|
|
|
\section{Expose Allocation}
|
|
|
\label{sec:expose-allocation}
|
|
|
|
|
|
-The pass \code{expose\_allocation} lowers tuple creation
|
|
|
-into a conditional call to the collector followed by the
|
|
|
-allocation. We choose to place the \code{expose\_allocation} pass
|
|
|
-before \code{remove\_complex\_operands} because the code generated by
|
|
|
+The pass \code{expose\_allocation} lowers tuple creation into a
|
|
|
+conditional call to the collector followed by allocating the
|
|
|
+appropriate amount of memory and initializing it. We choose to place
|
|
|
+the \code{expose\_allocation} pass before
|
|
|
+\code{remove\_complex\_operands} because the code generated by
|
|
|
\code{expose\_allocation} contains complex operands.
|
|
|
|
|
|
The output of \code{expose\_allocation} is a language \LangAlloc{}
|
|
@@ -11285,10 +11287,13 @@ of tuple creation.
|
|
|
|
|
|
\fi}
|
|
|
|
|
|
-The \CCOLLECT{n} form runs the garbage collector, requesting $n$
|
|
|
-bytes. During instruction selection, it will become a call to the
|
|
|
-\code{collect} function in \code{runtime.c}. The \CALLOCATE{n}{T}
|
|
|
-form creates a tuple with space for $n$ elements, but they are not
|
|
|
+The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
|
|
|
+make sure that there are $n$ bytes ready to be allocated. During
|
|
|
+instruction selection, the \CCOLLECT{$n$} form will become a call to
|
|
|
+the \code{collect} function in \code{runtime.c}.
|
|
|
+%
|
|
|
+The \CALLOCATE{$n$}{$T$} form obtains memory for $n$ elements (and
|
|
|
+space at the front for the 64-bit tag), but the elements are not
|
|
|
initialized. \index{subject}{allocate} The $T$ parameter is the type
|
|
|
of the tuple:
|
|
|
%
|
|
@@ -11305,10 +11310,15 @@ as \code{free\_ptr}.
|
|
|
The following shows the transformation of tuple creation into 1) a
|
|
|
sequence of temporary variable bindings for the initializing
|
|
|
expressions, 2) a conditional call to \code{collect}, 3) a call to
|
|
|
-\code{allocate}, and 4) the initialization of the vector. The
|
|
|
-\itm{len} placeholder refers to the length of the vector and
|
|
|
+\code{allocate}, and 4) the initialization of the tuple. The
|
|
|
+\itm{len} placeholder refers to the length of the tuple and
|
|
|
\itm{bytes} is how many total bytes need to be allocated for the
|
|
|
-vector, which is 8 for the tag plus \itm{len} times 8.
|
|
|
+tuple, which is 8 for the tag plus \itm{len} times 8.
|
|
|
+%
|
|
|
+\python{The \itm{type} needed for the second argument of the
|
|
|
+ \code{allocate} form can be obtained from the \code{has\_type} field
|
|
|
+ of the tuple AST node, which is stored there by running the type
|
|
|
+ checker for \LangVec{} immediately before this pass.}
|
|
|
%
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
@@ -11345,10 +11355,10 @@ vector, which is 8 for the tag plus \itm{len} times 8.
|
|
|
\end{lstlisting}
|
|
|
\fi}
|
|
|
%
|
|
|
-The placement of the initializing expressions $e_0,\ldots,e_{n-1}$
|
|
|
-prior to the \code{allocate} is important, as those expressions may
|
|
|
-trigger garbage collection and we cannot have an allocated but
|
|
|
-uninitialized tuple on the heap during a collection.
|
|
|
+\noindent The sequencing of the initializing expressions
|
|
|
+$e_0,\ldots,e_{n-1}$ prior to the \code{allocate} is important, as
|
|
|
+they may trigger garbage collection and we cannot have an allocated
|
|
|
+but uninitialized tuple on the heap during a collection.
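
The four steps of this lowering can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it builds strings rather than AST nodes, the helper names \code{expose\_tuple} and \code{make\_fresh} are hypothetical, and the text of the condition (including the name \code{fromspace\_end}) is assumed rather than taken from the definition of the pass.

```python
import itertools


def make_fresh(prefix="tmp."):
    """Return a function that yields tmp.1, tmp.2, ... (hypothetical helper)."""
    counter = itertools.count(1)
    return lambda: f"{prefix}{next(counter)}"


def expose_tuple(inits, tuple_type, fresh):
    """Lower one tuple creation into a list of pseudo-statements.

    inits      -- text of the initializing expressions e_0 .. e_{n-1}
    tuple_type -- text of the tuple's type, passed through to allocate
    fresh      -- function returning a new temporary name on each call
    """
    length = len(inits)
    nbytes = 8 + 8 * length  # 8 bytes for the tag plus 8 per element

    # 1) Bind the initializers first: they may trigger a collection,
    #    so they must run before the tuple is allocated.
    temps = [fresh() for _ in inits]
    stmts = [f"{t} = {e}" for t, e in zip(temps, inits)]

    # 2) Conditional call to the collector.
    #    ASSUMPTION: the exact form of the condition is illustrative.
    stmts.append(
        f"if free_ptr + {nbytes} < fromspace_end: 0 else: collect({nbytes})"
    )

    # 3) Allocate the (still uninitialized) tuple.
    tup = fresh()
    stmts.append(f"{tup} = allocate({length}, {tuple_type})")

    # 4) Initialize the elements, then produce the tuple as the result.
    stmts.extend(f"{tup}[{i}] = {t}" for i, t in enumerate(temps))
    stmts.append(tup)
    return stmts
```

For example, \code{expose\_tuple(["42"], "TupleType([int])", make\_fresh())} requests 16 bytes, which agrees with the \code{collect(16)} calls generated for the length-one tuples of the running example.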
|
|
|
|
|
|
Figure~\ref{fig:expose-alloc-output} shows the output of the
|
|
|
\code{expose\_allocation} pass on our running example.
|
|
@@ -11395,7 +11405,7 @@ where $T_1$ is
|
|
|
0
|
|
|
else:
|
|
|
collect(16)
|
|
|
- tmp.2 = allocate(1, tuple[tuple[int]])
|
|
|
+ tmp.2 = allocate(1, TupleType(TupleType([int])))
|
|
|
tmp.2[0] = tmp.1
|
|
|
tmp.2
|
|
|
\end{lstlisting}
|
|
@@ -11407,7 +11417,7 @@ and $T_2$ is
|
|
|
0
|
|
|
else:
|
|
|
collect(16)
|
|
|
- tmp.4 = allocate(1, tuple[int])
|
|
|
+ tmp.4 = allocate(1, TupleType([int]))
|
|
|
tmp.4[0] = tmp.3
|
|
|
tmp.4
|
|
|
\end{lstlisting}
|
|
@@ -11429,9 +11439,9 @@ should be treated as complex operands.
|
|
|
%
|
|
|
{\if\edition\pythonEd
|
|
|
%
|
|
|
-The expressions \code{Allocate}, \code{GlobalValue}, \code{Begin}, and
|
|
|
-\code{Subscript} should be treated as complex operands. The
|
|
|
-sub-expressions of \code{Subscript} must be atomic.
|
|
|
+The expressions \code{allocate}, \code{global\_value}, \code{begin},
|
|
|
+and tuple access should be treated as complex operands. The
|
|
|
+sub-expressions of tuple access must be atomic.
|
|
|
%
|
|
|
\fi}
|
|
|
%% A new case for
|
|
@@ -11571,10 +11581,10 @@ expressions of \LangCVec{} include \key{allocate},
|
|
|
%
|
|
|
and \key{global\_value}.
|
|
|
%
|
|
|
-\python{It also includes the \code{collect} statment and
|
|
|
+\python{\LangCVec{} also includes the \code{collect} statement and
|
|
|
assignment to a tuple element.}
|
|
|
%
|
|
|
-\racket{It also includes the new \code{collect} statement.}
|
|
|
+\racket{\LangCVec{} also includes the new \code{collect} statement.}
|
|
|
%
|
|
|
The \code{explicate\_control} pass can treat these new forms much like
|
|
|
the other forms that we've already encountered.
|
|
@@ -11642,7 +11652,7 @@ register \code{r11} ensures that offset expression
|
|
|
removing \code{r11} from consideration by the register allocator.
|
|
|
|
|
|
Why not use \code{rax} instead of \code{r11}? Suppose we instead used
|
|
|
-\code{rax}. Then the generated code for \code{vector-set!} would be
|
|
|
+\code{rax}. Then the generated code for tuple assignment would be
|
|
|
\begin{lstlisting}
|
|
|
movq |$\itm{tup}'$|, %rax
|
|
|
movq |$\itm{rhs}'$|, |$8(n+1)$|(%rax)
|
|
@@ -11669,10 +11679,15 @@ being allocated, which is $8(\itm{len}+1)$ bytes because each element
|
|
|
is 8 bytes (64 bits) and we use 8 bytes for the tag. We then
|
|
|
initialize the \itm{tag} and finally copy the address in \code{r11} to
|
|
|
the left-hand-side. Refer to Figure~\ref{fig:tuple-rep} to see how the
|
|
|
-tag is organized. We recommend using the Racket operations
|
|
|
+tag is organized.
|
|
|
+%
|
|
|
+\racket{We recommend using the Racket operations
|
|
|
\code{bitwise-ior} and \code{arithmetic-shift} to compute the tag
|
|
|
-during compilation. The type annotation in the \code{vector} form is
|
|
|
-used to determine the pointer mask region of the tag.
|
|
|
+during compilation.}
|
|
|
+%
|
|
|
+The type annotation in the \code{allocate} form is used to determine
|
|
|
+the pointer mask region of the tag.
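
As a hedged illustration of the tag computation, the following Python sketch uses \code{|} and \code{<<} in the role of bitwise-or and shift. The bit layout is an assumption that should be checked against Figure~\ref{fig:tuple-rep}: bit 0 is set to mark a tuple that has not been forwarded, the next six bits hold the length, and the pointer mask starts at bit 7. The function name \code{tuple\_tag} and the constants are hypothetical.

```python
# ASSUMED layout (verify against the tuple representation figure):
#   bit 0      : 1 means "not yet forwarded"
#   bits 1..6  : length of the tuple
#   bits 7..   : pointer mask, one bit per element
LENGTH_SHIFT = 1
LENGTH_BITS = 6
MASK_SHIFT = LENGTH_SHIFT + LENGTH_BITS


def tuple_tag(elem_is_pointer):
    """Compute the tag for a tuple.

    elem_is_pointer -- list of booleans, True where the element type
                       is itself a tuple (that is, a heap pointer).
    """
    length = len(elem_is_pointer)
    assert length < (1 << LENGTH_BITS), "length does not fit in the tag"

    # Set bit i of the mask when element i is a pointer.
    mask = 0
    for i, is_ptr in enumerate(elem_is_pointer):
        if is_ptr:
            mask |= 1 << i

    return 1 | (length << LENGTH_SHIFT) | (mask << MASK_SHIFT)
```

Under this assumed layout, a one-element tuple of an integer gets tag 3, and a one-element tuple whose element is another tuple gets tag 131.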
|
|
|
+%
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
|
|$\itm{lhs}$| = (allocate |$\itm{len}$| (Vector |$\itm{type} \ldots$|));
|
|
@@ -11685,7 +11700,7 @@ used to determine the pointer mask region of the tag.
|
|
|
\fi}
|
|
|
{\if\edition\pythonEd
|
|
|
\begin{lstlisting}
|
|
|
- |$\itm{lhs}$| = allocate(|$\itm{len}$|, tuple[|$\itm{type}, \ldots$]|);
|
|
|
+ |$\itm{lhs}$| = allocate(|$\itm{len}$|, TupleType([|$\itm{type}, \ldots$])|);
|
|
|
|$\Longrightarrow$|
|
|
|
movq free_ptr(%rip), %r11
|
|
|
addq |$8(\itm{len}+1)$|, free_ptr(%rip)
|
|
@@ -11755,8 +11770,7 @@ available for use by the register allocator.
|
|
|
|
|
|
The concrete and abstract syntax of the \LangXGlobal{} language is
|
|
|
defined in Figures~\ref{fig:x86-2-concrete} and \ref{fig:x86-2}. It
|
|
|
-differs from \LangXIf{} just in the addition of the form for global
|
|
|
-variables.
|
|
|
+differs from \LangXIf{} just in the addition of global variables.
|
|
|
%
|
|
|
Figure~\ref{fig:select-instr-output-gc} shows the output of the
|
|
|
\code{select\_instructions} pass on the running example.
|
|
@@ -11838,28 +11852,34 @@ block40:
|
|
|
|
|
|
As discussed earlier in this chapter, the garbage collector needs to
|
|
|
access all the pointers in the root set, that is, all variables that
|
|
|
-are vectors. It will be the responsibility of the register allocator
|
|
|
+are tuples. It will be the responsibility of the register allocator
|
|
|
to make sure that:
|
|
|
\begin{enumerate}
|
|
|
-\item the root stack is used for spilling vector-typed variables, and
|
|
|
-\item if a vector-typed variable is live during a call to the
|
|
|
+\item the root stack is used for spilling tuple-typed variables, and
|
|
|
+\item if a tuple-typed variable is live during a call to the
|
|
|
collector, it must be spilled to ensure it is visible to the
|
|
|
collector.
|
|
|
\end{enumerate}
|
|
|
|
|
|
The latter responsibility can be handled during construction of the
|
|
|
interference graph, by adding interference edges between the call-live
|
|
|
-vector-typed variables and all the callee-saved registers. (They
|
|
|
-already interfere with the caller-saved registers.) The type
|
|
|
-information for variables is in the \code{Program} form, so we
|
|
|
-recommend adding another parameter to the \code{build\_interference}
|
|
|
-function to communicate this alist.
|
|
|
-
|
|
|
-The spilling of vector-typed variables to the root stack can be
|
|
|
-handled after graph coloring, when choosing how to assign the colors
|
|
|
-(integers) to registers and stack locations. The \code{Program} output
|
|
|
-of this pass changes to also record the number of spills to the root
|
|
|
-stack.
|
|
|
+tuple-typed variables and all the callee-saved registers. (They
|
|
|
+already interfere with the caller-saved registers.)
|
|
|
+%
|
|
|
+\racket{The type information for variables is in the \code{Program}
|
|
|
+ form, so we recommend adding another parameter to the
|
|
|
+ \code{build\_interference} function to communicate this alist.}
|
|
|
+%
|
|
|
+\python{The type information for variables is generated by the type
|
|
|
+  checker for \LangCVec{} and stored in a field named \code{var\_types} in
|
|
|
+  the \code{CProgram} AST node. You'll need to propagate that
|
|
|
+ information so that it is available in this pass.}
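
The extra interference edges can be sketched as follows. This is a minimal sketch under assumptions, not the book's implementation: the graph is a plain dictionary of adjacency sets, types are represented as strings, and the names \code{add\_collect\_interference} and \code{is\_tuple\_type} are hypothetical. The register list is the set of callee-saved registers of the System V x86-64 calling convention.

```python
# Callee-saved registers in the System V x86-64 calling convention.
CALLEE_SAVED = ("rsp", "rbp", "rbx", "r12", "r13", "r14", "r15")


def is_tuple_type(ty):
    """ASSUMPTION: types are strings such as 'int' or 'TupleType([int])'."""
    return ty.startswith("TupleType")


def add_collect_interference(graph, call_live, var_types):
    """Add edges between call-live tuple-typed variables and every
    callee-saved register, which forces those variables to be spilled.

    graph     -- dict mapping a vertex to the set of its neighbors
    call_live -- variables live across the call to the collector
    var_types -- dict mapping each variable to its type
    """
    for var in call_live:
        if not is_tuple_type(var_types[var]):
            continue
        for reg in CALLEE_SAVED:
            # The graph is undirected, so record the edge both ways.
            graph.setdefault(var, set()).add(reg)
            graph.setdefault(reg, set()).add(var)
```

Because such a variable already interferes with the caller-saved registers, no register remains available to it after these edges are added.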
|
|
|
+
|
|
|
+The spilling of tuple-typed variables to the root stack can be handled
|
|
|
+after graph coloring, when choosing how to assign the colors
|
|
|
+(integers) to registers and stack locations. The
|
|
|
+\racket{\code{Program}}\python{\code{CProgram}} output of this pass
|
|
|
+changes to also record the number of spills to the root stack.
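
One way to make this choice after coloring is sketched below. It is an illustration under assumptions: colors below the number of available registers map to registers, types are strings, locations are abstract pairs rather than x86 operands, and the names \code{assign\_homes} and \code{is\_tuple} are hypothetical.

```python
def is_tuple(ty):
    """ASSUMPTION: types are strings such as 'int' or 'TupleType([int])'."""
    return ty.startswith("TupleType")


def assign_homes(coloring, var_types, registers):
    """Map each colored variable to a register or to a spill slot.

    coloring  -- dict from variable name to color (non-negative integer)
    var_types -- dict from variable name to its type
    registers -- registers available to the allocator, indexed by color

    Returns (homes, number of stack spills, number of root stack spills).
    """
    homes = {}
    stack_slots = {}  # color -> slot index on the procedure call stack
    root_slots = {}   # color -> slot index on the root stack

    for var, color in sorted(coloring.items()):
        if color < len(registers):
            homes[var] = ("reg", registers[color])
        elif is_tuple(var_types[var]):
            # Tuple-typed spills go to the root stack so that the
            # collector can find them.
            slot = root_slots.setdefault(color, len(root_slots))
            homes[var] = ("root_stack", slot)
        else:
            slot = stack_slots.setdefault(color, len(stack_slots))
            homes[var] = ("stack", slot)

    return homes, len(stack_slots), len(root_slots)
```

The two counts returned alongside the mapping correspond to the information that the output of the pass must record, including the number of spills to the root stack.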
|
|
|
|
|
|
% build-interference
|
|
|
%
|
|
@@ -11874,7 +11894,7 @@ stack.
|
|
|
|
|
|
|
|
|
|
|
|
-\section{Generate Prelude and Conclusion}
|
|
|
+\section{Prelude and Conclusion}
|
|
|
\label{sec:print-x86-gc}
|
|
|
\label{sec:prelude-conclusion-x86-gc}
|
|
|
\index{subject}{prelude}\index{subject}{conclusion}
|