Merge branch 'master' into pjt

Peter Thiemann 3 years ago
parent
commit
0ed8493f18
2 changed files with 80 additions and 60 deletions
  1. 77 57
      book.tex
  2. 3 3
      defs.tex

+ 77 - 57
book.tex

@@ -10486,7 +10486,7 @@ registers or on the procedure call stack.
 
 Sections~\ref{sec:expose-allocation} through \ref{sec:print-x86-gc}
 discuss all the necessary changes and additions to the compiler
-passes, including a new compiler pass named \code{expose-allocation}.
+passes, including a new compiler pass named \code{expose\_allocation}.
 
 \section{The \LangVec{} Language}
 \label{sec:r3}
@@ -11094,7 +11094,7 @@ which we call a \emph{root stack}\index{subject}{root stack}
 (a.k.a. ``shadow
 stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
 is, when a local variable needs to be spilled and is of type
-\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
+\racket{\code{Vector}}\python{\code{TupleType}}, then we put it on the
 root stack instead of the normal procedure call stack. Furthermore, we
 always spill tuple-typed variables if they are live during a call to
 the collector, thereby ensuring that no pointers are in registers
@@ -11217,13 +11217,13 @@ succeed.
 
 The introduction of garbage collection has a non-trivial impact on our
 compiler passes. We introduce a new compiler pass named
-\code{expose-allocation}. We make
-significant changes to \code{select\_instructions},
-\code{build\_interference}, \code{allocate\_registers}, and
-\code{print\_x86} and make minor changes in several more passes.  The
-following program will serve as our running example.  It creates two
-tuples, one nested inside the other. Both tuples have length one. The
-program accesses the element in the inner tuple tuple.
+\code{expose\_allocation}. We make significant changes to
+\code{select\_instructions}, \code{build\_interference},
+\code{allocate\_registers}, and \code{prelude\_and\_conclusion} and
+make minor changes in several more passes.  The following program will
+serve as our running example.  It creates two tuples, one nested
+inside the other. Both tuples have length one. The program accesses
+the element in the inner tuple.
 % tests/vectors_test_17.rkt
 {\if\edition\racketEd
 \begin{lstlisting}
@@ -11242,20 +11242,22 @@ print( ((42,),)[0][0] )
 \label{sec:shrink-Lvec}
 
 Recall that the \code{shrink} pass translates the primitive operators
-into a smaller set of primitives. Because this pass comes after type
-checking, but before the passes that require the type information in
-the \code{HasType} AST nodes, the \code{shrink} pass must be modified
-to wrap \code{HasType} around each AST node that it generates.
+into a smaller set of primitives.
+%
+This pass comes after type checking and the type checker adds a
+\code{HasType} AST node around each \code{vector} AST node, so you'll
+need to add a case for \code{HasType} to the \code{shrink} pass.
 
 \fi}
 
 \section{Expose Allocation}
 \label{sec:expose-allocation}
 
-The pass \code{expose\_allocation} lowers tuple creation
-into a conditional call to the collector followed by the
-allocation.  We choose to place the \code{expose\_allocation} pass
-before \code{remove\_complex\_operands} because the code generated by
+The pass \code{expose\_allocation} lowers tuple creation into a
+conditional call to the collector followed by allocating the
+appropriate amount of memory and initializing it.  We choose to place
+the \code{expose\_allocation} pass before
+\code{remove\_complex\_operands} because the code generated by
 \code{expose\_allocation} contains complex operands.
 
 The output of \code{expose\_allocation} is a language \LangAlloc{}
@@ -11285,10 +11287,13 @@ of tuple creation.
 
 \fi}
 
-The \CCOLLECT{n} form runs the garbage collector, requesting $n$
-bytes. During instruction selection, it will become a call to the
-\code{collect} function in \code{runtime.c}.  The \CALLOCATE{n}{T}
-form creates a tuple with space for $n$ elements, but they are not
+The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
+make sure that there are $n$ bytes ready to be allocated. During
+instruction selection, the \CCOLLECT{$n$} form will become a call to
+the \code{collect} function in \code{runtime.c}.
+%
+The \CALLOCATE{$n$}{$T$} form obtains memory for $n$ elements (and
+space at the front for the 64 bit tag), but the elements are not
 initialized.  \index{subject}{allocate} The $T$ parameter is the type
 of the tuple:
 %
@@ -11305,10 +11310,15 @@ as \code{free\_ptr}.
 The following shows the transformation of tuple creation into 1) a
 sequence of temporary variables bindings for the initializing
 expressions, 2) a conditional call to \code{collect}, 3) a call to
-\code{allocate}, and 4) the initialization of the vector. The
-\itm{len} placeholder refers to the length of the vector and
+\code{allocate}, and 4) the initialization of the tuple. The
+\itm{len} placeholder refers to the length of the tuple and
 \itm{bytes} is how many total bytes need to be allocated for the
-vector, which is 8 for the tag plus \itm{len} times 8.
+tuple, which is 8 for the tag plus \itm{len} times 8.
+%
+\python{The \itm{type} needed for the second argument of the
+  \code{allocate} form can be obtained from the \code{has\_type} field
+  of the tuple AST node, which is stored there by running the type
+  checker for \LangVec{} immediately before this pass.}
 %
 {\if\edition\racketEd
 \begin{lstlisting}
@@ -11345,10 +11355,10 @@ vector, which is 8 for the tag plus \itm{len} times 8.
 \end{lstlisting}
 \fi}
 %
-The placement of the initializing expressions $e_0,\ldots,e_{n-1}$
-prior to the \code{allocate} is important, as those expressions may
-trigger garbage collection and we cannot have an allocated but
-uninitialized tuple on the heap during a collection.
+\noindent The sequencing of the initializing expressions
+$e_0,\ldots,e_{n-1}$ prior to the \code{allocate} is important, as
+they may trigger garbage collection and we cannot have an allocated
+but uninitialized tuple on the heap during a collection.
 
 Figure~\ref{fig:expose-alloc-output} shows the output of the
 \code{expose\_allocation} pass on our running example.
@@ -11395,7 +11405,7 @@ where $T_1$ is
               0
           else:
               collect(16)
-          tmp.2 = allocate(1, tuple[tuple[int]])
+          tmp.2 = allocate(1, TupleType(TupleType([int])))
           tmp.2[0] = tmp.1
           tmp.2
 \end{lstlisting}
@@ -11407,7 +11417,7 @@ and $T_2$ is
               0
           else:
               collect(16)
-          tmp.4 = allocate(1, tuple[int])
+          tmp.4 = allocate(1, TupleType([int]))
           tmp.4[0] = tmp.3
           tmp.4
 \end{lstlisting}
@@ -11429,9 +11439,9 @@ should be treated as complex operands.
 %
 {\if\edition\pythonEd
 %
-The expressions \code{Allocate}, \code{GlobalValue}, \code{Begin}, and
-\code{Subscript} should be treated as complex operands.  The
-sub-expressions of \code{Subscript} must be atomic.
+The expressions \code{allocate}, \code{global\_value}, \code{begin},
+and tuple access should be treated as complex operands.  The
+sub-expressions of tuple access must be atomic.
 %
 \fi}
 %% A new case for
@@ -11571,10 +11581,10 @@ expressions of \LangCVec{} include \key{allocate},
 %
 and \key{global\_value}.
 %
-\python{It also includes the \code{collect} statment and
+\python{\LangCVec{} also includes the \code{collect} statement and
 assignment to a tuple element.}
 %
-\racket{It also includes the new \code{collect} statement.}
+\racket{\LangCVec{} also includes the new \code{collect} statement.}
 %
 The \code{explicate\_control} pass can treat these new forms much like
 the other forms that we've already encountered.
@@ -11642,7 +11652,7 @@ register \code{r11} ensures that offset expression
 removing \code{r11} from consideration by the register allocator.
 
 Why not use \code{rax} instead of \code{r11}? Suppose we instead used
-\code{rax}. Then the generated code for \code{vector-set!} would be
+\code{rax}. Then the generated code for tuple assignment would be
 \begin{lstlisting}
 movq |$\itm{tup}'$|, %rax
 movq |$\itm{rhs}'$|, |$8(n+1)$|(%rax)
@@ -11669,10 +11679,15 @@ being allocated, which is $8(\itm{len}+1)$ bytes because each element
 is 8 bytes (64 bits) and we use 8 bytes for the tag.  We then
 initialize the \itm{tag} and finally copy the address in \code{r11} to
 the left-hand-side. Refer to Figure~\ref{fig:tuple-rep} to see how the
-tag is organized. We recommend using the Racket operations
+tag is organized.
+%
+\racket{We recommend using the Racket operations
 \code{bitwise-ior} and \code{arithmetic-shift} to compute the tag
-during compilation.  The type annotation in the \code{vector} form is
-used to determine the pointer mask region of the tag.
+during compilation.}
+%
+The type annotation in the \code{allocate} form is used to determine
+the pointer mask region of the tag.
+%
 {\if\edition\racketEd
 \begin{lstlisting}
    |$\itm{lhs}$| = (allocate |$\itm{len}$| (Vector |$\itm{type} \ldots$|));
@@ -11685,7 +11700,7 @@ used to determine the pointer mask region of the tag.
 \fi}
 {\if\edition\pythonEd    
 \begin{lstlisting}
-   |$\itm{lhs}$| = allocate(|$\itm{len}$|, tuple[|$\itm{type}, \ldots$]|);
+   |$\itm{lhs}$| = allocate(|$\itm{len}$|, TupleType([|$\itm{type}, \ldots$])|);
    |$\Longrightarrow$|
    movq free_ptr(%rip), %r11
    addq |$8(\itm{len}+1)$|, free_ptr(%rip)
@@ -11755,8 +11770,7 @@ available for use by the register allocator.
 
 The concrete and abstract syntax of the \LangXGlobal{} language is
 defined in Figures~\ref{fig:x86-2-concrete} and \ref{fig:x86-2}.  It
-differs from \LangXIf{} just in the addition of the form for global
-variables.
+differs from \LangXIf{} just in the addition of global variables.
 %
 Figure~\ref{fig:select-instr-output-gc} shows the output of the
 \code{select\_instructions} pass on the running example.
@@ -11838,28 +11852,34 @@ block40:
 
 As discussed earlier in this chapter, the garbage collector needs to
 access all the pointers in the root set, that is, all variables that
-are vectors. It will be the responsibility of the register allocator
+are tuples. It will be the responsibility of the register allocator
 to make sure that:
 \begin{enumerate}
-\item the root stack is used for spilling vector-typed variables, and
-\item if a vector-typed variable is live during a call to the
+\item the root stack is used for spilling tuple-typed variables, and
+\item if a tuple-typed variable is live during a call to the
   collector, it must be spilled to ensure it is visible to the
   collector.
 \end{enumerate}
 
 The latter responsibility can be handled during construction of the
 interference graph, by adding interference edges between the call-live
-vector-typed variables and all the callee-saved registers. (They
-already interfere with the caller-saved registers.)  The type
-information for variables is in the \code{Program} form, so we
-recommend adding another parameter to the \code{build\_interference}
-function to communicate this alist.
-
-The spilling of vector-typed variables to the root stack can be
-handled after graph coloring, when choosing how to assign the colors
-(integers) to registers and stack locations. The \code{Program} output
-of this pass changes to also record the number of spills to the root
-stack.
+tuple-typed variables and all the callee-saved registers. (They
+already interfere with the caller-saved registers.)
+%
+\racket{The type information for variables is in the \code{Program}
+  form, so we recommend adding another parameter to the
+  \code{build\_interference} function to communicate this alist.}
+%
+\python{The type information for variables is generated by the type
+  checker for \LangCVec{}, stored in a field named \code{var\_types} in
+  the \code{CProgram} AST node. You'll need to propagate that
+  information so that it is available in this pass.}
+
+The spilling of tuple-typed variables to the root stack can be handled
+after graph coloring, when choosing how to assign the colors
+(integers) to registers and stack locations. The
+\racket{\code{Program}}\python{\code{CProgram}} output of this pass
+changes to also record the number of spills to the root stack.
 
 % build-interference
 %
@@ -11874,7 +11894,7 @@ stack.
 
 
 
-\section{Generate Prelude and Conclusion}
+\section{Prelude and Conclusion}
 \label{sec:print-x86-gc}
 \label{sec:prelude-conclusion-x86-gc}
 \index{subject}{prelude}\index{subject}{conclusion}

+ 3 - 3
defs.tex

@@ -151,7 +151,7 @@
 \newcommand{\COLLECT}[1]{\LP\key{Collect}~#1\RP}
 \newcommand{\CCOLLECT}[1]{\LP\key{collect}~#1\RP}
 \newcommand{\ALLOCATE}[2]{\LP\key{Allocate}~#1~#2\RP}
-\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1\RP}
+\newcommand{\CALLOCATE}[2]{\LP\key{allocate}~#1~#2\RP}
 \newcommand{\GLOBAL}[1]{\LP\key{Global}~#1\RP}
 \newcommand{\CGLOBAL}[1]{#1\key{(\%rip)}}
 \newcommand{\GLOBALVALUE}[1]{\LP\key{GlobalValue}~#1\RP}
@@ -197,11 +197,11 @@
 \newcommand{\COR}[2]{#1~\key{or}~#2}
 \newcommand{\INTTY}{{\key{int}}}
 \newcommand{\BOOLTY}{{\key{bool}}}
-\newcommand{\VECTY}[1]{{\key{Tuple}\LS #1 \RS}}
+\newcommand{\VECTY}[1]{{\key{TupleType}\LP\LS #1 \RS\RP}}
 \newcommand{\COLLECT}[1]{\key{Collect}\LP#1\RP}
 \newcommand{\CCOLLECT}[1]{\key{collect}\LP#1\RP}
 \newcommand{\ALLOCATE}[2]{\key{Allocate}\LP#1,#2\RP}
-\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1\RP}
+\newcommand{\CALLOCATE}[2]{\key{allocate}\LP#1,#2\RP}
 \newcommand{\GLOBALVALUE}[1]{\key{GlobalValue}\LP#1\RP}
 \newcommand{\CGLOBALVALUE}[1]{\key{global\_value}\LP#1\RP}
 \newcommand{\GLOBAL}[1]{\key{Global}\LP#1\RP}
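
The book.tex hunks above state that a tuple needs 8 bytes for the tag plus 8 bytes per element, and that the type annotation in the `allocate` form determines the pointer mask region of the tag. The following is a minimal sketch of that arithmetic in Python, not the book's implementation. The function names `tuple_bytes` and `make_tag` are hypothetical, and the tag layout (bit 0 as a forwarding flag, a 6-bit length in bits 1 to 6, the pointer mask starting at bit 7, at most 50 elements) is an assumption, because the figure the text refers to is not part of this diff.

```python
from typing import Sequence

# ASSUMED tag layout (not shown in this diff):
#   bit 0      forwarding flag, set to 1 for a freshly allocated tuple
#   bits 1..6  tuple length
#   bits 7..   pointer mask, one bit per element (1 = element is a tuple)
LENGTH_SHIFT = 1
MASK_SHIFT = 7
MAX_ELEMENTS = 50  # assumed limit so the mask fits in a 64-bit tag


def tuple_bytes(length: int) -> int:
    """Bytes to request: 8 for the tag plus 8 per element."""
    return 8 * (length + 1)


def make_tag(elem_is_pointer: Sequence[bool]) -> int:
    """Build the 64-bit tag from a per-element 'is a tuple' flag list."""
    length = len(elem_is_pointer)
    if length > MAX_ELEMENTS:
        raise ValueError("tuple too long for the assumed tag layout")
    mask = 0
    for i, is_ptr in enumerate(elem_is_pointer):
        if is_ptr:
            mask |= 1 << i
    return (mask << MASK_SHIFT) | (length << LENGTH_SHIFT) | 1
```

For the running example, `tuple_bytes(1)` matches the `collect(16)` calls in the diff. Under the assumed layout, the inner tuple of type `TupleType([int])` would get `make_tag([False])` and the outer tuple would get `make_tag([True])`.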