|
@@ -11085,7 +11085,7 @@ which we call a \emph{root stack}\index{subject}{root stack}
|
|
|
(a.k.a. ``shadow
|
|
|
stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
|
|
|
is, when a local variable needs to be spilled and is of type
|
|
|
-\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
|
|
|
+\racket{\code{Vector}}\python{\code{TupleType}}, then we put it on the
|
|
|
root stack instead of the normal procedure call stack. Furthermore, we
|
|
|
always spill tuple-typed variables if they are live during a call to
|
|
|
the collector, thereby ensuring that no pointers are in registers
|
|
@@ -11208,13 +11208,13 @@ succeed.
|
|
|
|
|
|
The introduction of garbage collection has a non-trivial impact on our
|
|
|
compiler passes. We introduce a new compiler pass named
|
|
|
-\code{expose-allocation}. We make
|
|
|
-significant changes to \code{select\_instructions},
|
|
|
-\code{build\_interference}, \code{allocate\_registers}, and
|
|
|
-\code{print\_x86} and make minor changes in several more passes. The
|
|
|
-following program will serve as our running example. It creates two
|
|
|
-tuples, one nested inside the other. Both tuples have length one. The
|
|
|
-program accesses the element in the inner tuple tuple.
|
|
|
+\code{expose\_allocation}. We make significant changes to
|
|
|
+\code{select\_instructions}, \code{build\_interference},
|
|
|
+\code{allocate\_registers}, and \code{prelude\_and\_conclusion} and
|
|
|
+make minor changes in several more passes. The following program will
|
|
|
+serve as our running example. It creates two tuples, one nested
|
|
|
+inside the other. Both tuples have length one. The program accesses
|
|
|
+the element in the inner tuple.
|
|
|
% tests/vectors_test_17.rkt
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
@@ -11243,10 +11243,11 @@ to wrap \code{HasType} around each AST node that it generates.
|
|
|
\section{Expose Allocation}
|
|
|
\label{sec:expose-allocation}
|
|
|
|
|
|
-The pass \code{expose\_allocation} lowers tuple creation
|
|
|
-into a conditional call to the collector followed by the
|
|
|
-allocation. We choose to place the \code{expose\_allocation} pass
|
|
|
-before \code{remove\_complex\_operands} because the code generated by
|
|
|
+The pass \code{expose\_allocation} lowers tuple creation into a
|
|
|
+conditional call to the collector followed by allocating the
|
|
|
+appropriate amount of memory and initializing it. We choose to place
|
|
|
+the \code{expose\_allocation} pass before
|
|
|
+\code{remove\_complex\_operands} because the code generated by
|
|
|
\code{expose\_allocation} contains complex operands.
|
|
|
|
|
|
The output of \code{expose\_allocation} is a language \LangAlloc{}
|
|
@@ -11276,10 +11277,13 @@ of tuple creation.
|
|
|
|
|
|
\fi}
|
|
|
|
|
|
-The \CCOLLECT{$n$} form runs the garbage collector, requesting $n$
|
|
|
-bytes. During instruction selection, it will become a call to the
|
|
|
-\code{collect} function in \code{runtime.c}. The \CALLOCATE{$n$}{$T$}
|
|
|
-form creates a tuple with space for $n$ elements, but they are not
|
|
|
+The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
|
|
|
+make sure that there are $n$ bytes ready to be allocated. During
|
|
|
+instruction selection, the \CCOLLECT{$n$} form will become a call to
|
|
|
+the \code{collect} function in \code{runtime.c}.
|
|
|
+%
|
|
|
+The \CALLOCATE{$n$}{$T$} form obtains memory for $n$ elements (and
|
|
|
+space at the front for the 64-bit tag), but the elements are not
|
|
|
initialized. \index{subject}{allocate} The $T$ parameter is the type
|
|
|
of the tuple:
|
|
|
%
|
|
@@ -11296,14 +11300,14 @@ as \code{free\_ptr}.
|
|
|
The following shows the transformation of tuple creation into 1) a
|
|
|
sequence of temporary variable bindings for the initializing
|
|
|
expressions, 2) a conditional call to \code{collect}, 3) a call to
|
|
|
-\code{allocate}, and 4) the initialization of the vector. The
|
|
|
-\itm{len} placeholder refers to the length of the vector and
|
|
|
+\code{allocate}, and 4) the initialization of the tuple. The
|
|
|
+\itm{len} placeholder refers to the length of the tuple and
|
|
|
\itm{bytes} is how many total bytes need to be allocated for the
|
|
|
-vector, which is 8 for the tag plus \itm{len} times 8.
|
|
|
+tuple, which is 8 for the tag plus \itm{len} times 8.
|
|
|
%
|
|
|
\python{The \itm{type} needed for the second argument of the
|
|
|
\code{allocate} form can be obtained from the \code{has\_type} field
|
|
|
- of the tuple expression, which is stored there by running the type
|
|
|
+ of the tuple AST node, which is stored there by running the type
|
|
|
checker for \LangVec{} immediately before this pass.}
|
|
|
%
|
|
|
{\if\edition\racketEd
|
|
@@ -11341,10 +11345,10 @@ vector, which is 8 for the tag plus \itm{len} times 8.
|
|
|
\end{lstlisting}
|
|
|
\fi}
|
|
|
%
|
|
|
-\noindent The placement of the initializing expressions
|
|
|
+\noindent The sequencing of the initializing expressions
|
|
|
$e_0,\ldots,e_{n-1}$ prior to the \code{allocate} is important, as
|
|
|
-those expressions may trigger garbage collection and we cannot have an
|
|
|
-allocated but uninitialized tuple on the heap during a collection.
|
|
|
+they may trigger garbage collection and we cannot have an allocated
|
|
|
+but uninitialized tuple on the heap during a collection.
|
|
|
|
|
|
Figure~\ref{fig:expose-alloc-output} shows the output of the
|
|
|
\code{expose\_allocation} pass on our running example.
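
As a rough illustration of the four steps described above, the following Python sketch computes \itm{bytes} and lists the lowered steps in the required order. It is only a sketch under assumptions: the helper names \code{tuple\_bytes} and \code{expose\_tuple}, the temporaries \code{x0}, \code{x1}, \ldots{} and \code{v}, and the exact form of the test against \code{fromspace\_end} are hypothetical and are not taken from the support code. The steps are rendered as strings rather than AST nodes to keep the example self-contained.

```python
# Illustrative sketch only: the helper names and the string rendering are
# hypothetical, not the book's support code.

def tuple_bytes(length):
    """Total bytes for a tuple: 8 for the tag plus 8 per element."""
    return 8 + 8 * length


def expose_tuple(inits, type_name):
    """Return the lowered steps for a tuple creation, in order, as strings.

    inits     -- source text of the initializing expressions e_0 .. e_{n-1}
    type_name -- the tuple's type, passed through to allocate
    """
    n = len(inits)
    size = tuple_bytes(n)
    steps = []
    # 1) Bind the initializers to temporaries first. They may trigger a
    #    collection, so they must run before the tuple is allocated.
    for i, e in enumerate(inits):
        steps.append(f"x{i} = {e}")
    # 2) Conditional call to the collector. The form of this test is an
    #    assumption made for the sketch.
    steps.append(
        f"if free_ptr + {size} < fromspace_end: pass else: collect({size})"
    )
    # 3) Obtain uninitialized memory for n elements plus the tag.
    steps.append(f"v = allocate({n}, {type_name})")
    # 4) Initialize each element from its temporary.
    for i in range(n):
        steps.append(f"v[{i}] = x{i}")
    return steps


if __name__ == "__main__":
    for step in expose_tuple(["42"], "T"):
        print(step)
```

For a tuple of length one, \code{tuple\_bytes} yields 16: 8 bytes for the tag and 8 for the single element.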
|
|
@@ -11425,9 +11429,9 @@ should be treated as complex operands.
|
|
|
%
|
|
|
{\if\edition\pythonEd
|
|
|
%
|
|
|
-The expressions \code{Allocate}, \code{GlobalValue}, \code{Begin}, and
|
|
|
-\code{Subscript} should be treated as complex operands. The
|
|
|
-sub-expressions of \code{Subscript} must be atomic.
|
|
|
+The expressions \code{allocate}, \code{global\_value}, \code{begin},
|
|
|
+and tuple access should be treated as complex operands. The
|
|
|
+sub-expressions of tuple access must be atomic.
|
|
|
%
|
|
|
\fi}
|
|
|
%% A new case for
|
|
@@ -11567,10 +11571,10 @@ expressions of \LangCVec{} include \key{allocate},
|
|
|
%
|
|
|
and \key{global\_value}.
|
|
|
%
|
|
|
-\python{It also includes the \code{collect} statment and
|
|
|
+\python{\LangCVec{} also includes the \code{collect} statement and
|
|
|
assignment to a tuple element.}
|
|
|
%
|
|
|
-\racket{It also includes the new \code{collect} statement.}
|
|
|
+\racket{\LangCVec{} also includes the new \code{collect} statement.}
|
|
|
%
|
|
|
The \code{explicate\_control} pass can treat these new forms much like
|
|
|
the other forms that we've already encountered.
|
|
@@ -11638,7 +11642,7 @@ register \code{r11} ensures that offset expression
|
|
|
removing \code{r11} from consideration by the register allocator.
|
|
|
|
|
|
Why not use \code{rax} instead of \code{r11}? Suppose we instead used
|
|
|
-\code{rax}. Then the generated code for \code{vector-set!} would be
|
|
|
+\code{rax}. Then the generated code for tuple assignment would be
|
|
|
\begin{lstlisting}
|
|
|
movq |$\itm{tup}'$|, %rax
|
|
|
movq |$\itm{rhs}'$|, |$8(n+1)$|(%rax)
|
|
@@ -11756,8 +11760,7 @@ available for use by the register allocator.
|
|
|
|
|
|
The concrete and abstract syntax of the \LangXGlobal{} language is
|
|
|
defined in Figures~\ref{fig:x86-2-concrete} and \ref{fig:x86-2}. It
|
|
|
-differs from \LangXIf{} just in the addition of the form for global
|
|
|
-variables.
|
|
|
+differs from \LangXIf{} just in the addition of global variables.
|
|
|
%
|
|
|
Figure~\ref{fig:select-instr-output-gc} shows the output of the
|
|
|
\code{select\_instructions} pass on the running example.
|
|
@@ -11862,11 +11865,11 @@ already interfere with the caller-saved registers.)
|
|
|
the \code{CProgram} AST node. You'll need to propagate that
|
|
|
information so that it is available in this pass.}
|
|
|
|
|
|
-The spilling of tuple-typed variables to the root stack can be
|
|
|
-handled after graph coloring, when choosing how to assign the colors
|
|
|
-(integers) to registers and stack locations. The \code{Program} output
|
|
|
-of this pass changes to also record the number of spills to the root
|
|
|
-stack.
|
|
|
+The spilling of tuple-typed variables to the root stack can be handled
|
|
|
+after graph coloring, when choosing how to assign the colors
|
|
|
+(integers) to registers and stack locations. The
|
|
|
+\racket{\code{Program}}\python{\code{CProgram}} output of this pass
|
|
|
+changes to also record the number of spills to the root stack.
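
One way to picture this post-coloring step is the following Python sketch. It is a minimal illustration under assumptions, not the reference implementation: the function name \code{assign\_homes}, the \code{("reg", ...)}, \code{("stack", ...)}, and \code{("root", ...)} tags, and the slot numbering are all hypothetical. Colors below the number of available registers map to registers; higher colors map to a slot on the root stack if the variable is tuple-typed and to a slot on the procedure call stack otherwise. The two spill counts are returned so that they can be recorded in the output program.

```python
# Illustrative sketch only: names and the location encoding are hypothetical.

def assign_homes(colors, tuple_vars, registers):
    """Map each variable to a location given its color.

    colors     -- dict from variable name to color (a non-negative integer)
    tuple_vars -- set of variable names whose type is a tuple type
    registers  -- list of registers available to the allocator
    Returns (homes, number of stack spills, number of root stack spills).
    """
    homes = {}
    stack_slots = {}  # color -> slot index on the procedure call stack
    root_slots = {}   # color -> slot index on the root stack
    for var, color in colors.items():
        if color < len(registers):
            homes[var] = ("reg", registers[color])
        elif var in tuple_vars:
            # Tuple-typed spills go to the root stack so that the
            # collector can find them.
            slot = root_slots.setdefault(color, len(root_slots))
            homes[var] = ("root", slot)
        else:
            slot = stack_slots.setdefault(color, len(stack_slots))
            homes[var] = ("stack", slot)
    return homes, len(stack_slots), len(root_slots)


if __name__ == "__main__":
    homes, n_stack, n_root = assign_homes(
        {"a": 0, "t": 1, "u": 2, "n": 2}, {"t", "u"}, ["rcx"]
    )
    print(homes, n_stack, n_root)
```

Keeping separate slot tables for the two stacks means that a tuple-typed variable and an integer variable that happen to share a color still end up in different locations.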
|
|
|
|
|
|
% build-interference
|
|
|
%
|