
added challenge exercise for generational collection

Jeremy Siek, 4 years ago
Parent
Commit
f138092524
1 changed file with 78 additions and 7 deletions

+ 78 - 7
book.tex

@@ -5364,13 +5364,13 @@ collectors are that allocation is fast (just a comparison and pointer
 increment), there is no fragmentation, cyclic garbage is collected,
 and the time complexity of collection only depends on the amount of
 live data, and not on the amount of garbage~\citep{Wilson:1992fk}. The
-main disadvantage of a two-space copying collector is that it uses a
-lot of space, though that problem is ameliorated in generational
-collectors.  Racket and Scheme programs tend to allocate many small
-objects and generate a lot of garbage, so copying and generational
-collectors are a good fit.  Garbage collection is an active research
-topic, especially concurrent garbage
-collection~\citep{Tene:2011kx}. Researchers are continuously
+main disadvantages of a two-space copying collector are that it uses a
+lot of space and takes a long time to perform the copy, though these
+problems are ameliorated in generational collectors.  Racket and
+Scheme programs tend to allocate many small objects and generate a lot
+of garbage, so copying and generational collectors are a good fit.
+Garbage collection is an active research topic, especially concurrent
+garbage collection~\citep{Tene:2011kx}. Researchers are continuously
 developing new techniques and revisiting old
 trade-offs~\citep{Blackburn:2004aa,Jones:2011aa,Shahriyar:2013aa,Cutler:2015aa,Shidal:2015aa,Osterlund:2016aa,Jacek:2019aa,Gamari:2020aa}. Researchers
 meet every year at the International Symposium on Memory Management to
@@ -6302,7 +6302,78 @@ mark. The folowing example uses \code{set-point-x!} to change the
 
 \section{Challenge: Generational Collection}
 
+The copying collector described in Section~\ref{sec:GC} can incur
+significant runtime overhead because the call to \code{collect} takes
+time proportional to the amount of live data. One way to reduce this
+overhead is to reduce how much data is inspected in each call to
+\code{collect}. In particular, researchers have observed that recently
+allocated data is more likely to become garbage than data that has
+survived one or more previous calls to \code{collect}. This insight
+motivated the creation of \emph{generational garbage collectors} that
+1) segregate data according to its age into two or more generations,
+2) allocate less space for younger generations, so collecting them is
+faster, and more space for the older generations, and 3) perform
+collection on the younger generations more frequently than on older
+generations~\citep{Wilson:1992fk}.
+
+For this challenge assignment, the goal is to adapt the copying
+collector implemented in \code{runtime.c} to use two generations, one
+for young data and one for old data. Each generation consists of a
+FromSpace and a ToSpace. The following is a sketch of how to adapt the
+\code{collect} function to use the two generations.
 
+\begin{enumerate}
+\item Copy the young generation's FromSpace to its ToSpace and then
+  switch the roles of the ToSpace and FromSpace.
+\item If there is enough space for the requested number of bytes in
+  the young FromSpace, then return from \code{collect}.
+\item If there is not enough space in the young FromSpace for the
+  requested bytes, then move the data from the young generation to the
+  old one with the following steps:
+  \begin{enumerate}
+  \item If there is enough room in the old FromSpace, copy the young
+    FromSpace to the old FromSpace and then return.
+  \item If there is not enough room in the old FromSpace, then collect
+    the old generation by copying the old FromSpace to the old ToSpace
+    and swapping the roles of the old FromSpace and ToSpace.
+  \item If there is enough room now, copy the young FromSpace to the
+    old FromSpace and return. Otherwise, allocate a larger FromSpace
+    and ToSpace for the old generation.  Copy the young FromSpace and
+    the old FromSpace into the larger FromSpace for the old
+    generation and then return.
+  \end{enumerate}
+\end{enumerate}
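
The steps above can be sketched in C as follows. This is a minimal model under stated assumptions, not the code in \code{runtime.c}: it tracks only byte counts, and the names \code{generation}, \code{copy\_and\_swap}, \code{has\_room}, \code{promote}, and the \code{outcome} values are hypothetical. Because the sketch does not trace real objects, the caller supplies the number of live bytes that would survive each copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: sizes only. A real runtime would keep
   begin/end/free pointers for each FromSpace and ToSpace instead. */
typedef struct {
  size_t capacity; /* bytes in each of the FromSpace and the ToSpace */
  size_t used;     /* bytes occupied in the current FromSpace */
} generation;

typedef enum {
  YOUNG_ONLY, PROMOTED, PROMOTED_AFTER_OLD_GC, PROMOTED_AFTER_GROW
} outcome;

/* Stand-in for a Cheney copy within one generation followed by the
   swap of FromSpace and ToSpace: only the live bytes remain. */
static void copy_and_swap(generation *g, size_t live) {
  assert(live <= g->used);
  g->used = live;
}

static bool has_room(const generation *g, size_t bytes) {
  return g->capacity - g->used >= bytes;
}

/* Stand-in for copying the young FromSpace into the old FromSpace. */
static void promote(generation *young, generation *old) {
  old->used += young->used;
  young->used = 0;
}

outcome collect(generation *young, generation *old, size_t bytes_requested,
                size_t young_live, size_t old_live) {
  /* Step 1: collect the young generation. */
  copy_and_swap(young, young_live);
  /* Step 2: done if the request now fits in the young FromSpace. */
  if (has_room(young, bytes_requested)) return YOUNG_ONLY;
  /* Step 3(a): move the survivors if the old FromSpace has room. */
  if (has_room(old, young->used)) { promote(young, old); return PROMOTED; }
  /* Step 3(b): otherwise collect the old generation first. */
  copy_and_swap(old, old_live);
  /* Step 3(c): retry, growing the old generation if still too small. */
  if (has_room(old, young->used)) {
    promote(young, old);
    return PROMOTED_AFTER_OLD_GC;
  }
  size_t needed = old->used + young->used;
  size_t capacity = old->capacity > 0 ? old->capacity : 1;
  while (capacity < needed) capacity *= 2; /* doubling is an assumption */
  old->capacity = capacity;
  promote(young, old);
  return PROMOTED_AFTER_GROW;
}
```

The doubling policy for the larger old generation is one possible choice; the outline only requires that the new spaces be large enough to hold both the young and the old data.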
+
+We recommend that you generalize the \code{cheney} function so that it
+can be used for all the copies mentioned above: between the young
+FromSpace and ToSpace, between the old FromSpace and ToSpace, and
+between the young FromSpace and old FromSpace. This can be
+accomplished by adding parameters to \code{cheney} that replace its
+use of the global variables \code{fromspace\_begin},
+\code{fromspace\_end}, \code{tospace\_begin}, and \code{tospace\_end}.
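
One way such a generalization might look is sketched below. It is an illustration under simplifying assumptions rather than the \code{cheney} function in \code{runtime.c}: the \code{space} struct, the \code{forward} helper, the explicit root array, and the \code{to\_free} parameter are all hypothetical, and the object layout is reduced to a header word holding \code{(field\_count << 1) | 1} followed by fields that are each either zero or a pointer to another object. A header whose low bit is zero is treated as a forwarding pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement for the fromspace/tospace globals. */
typedef struct {
  int64_t *begin;
  int64_t *end;
} space;

static int in_space(const space *s, const int64_t *p) {
  uintptr_t a = (uintptr_t)p;
  return a >= (uintptr_t)s->begin && a < (uintptr_t)s->end;
}

/* Copy obj to *free_ptr unless it lies outside `from` or was already
   copied. Returns the address the caller should now point at. */
static int64_t *forward(int64_t *obj, const space *from, const space *to,
                        int64_t **free_ptr) {
  if (!in_space(from, obj)) return obj;     /* e.g. null or old data */
  int64_t header = obj[0];
  if ((header & 1) == 0)                    /* forwarding pointer */
    return (int64_t *)(intptr_t)header;
  int64_t nfields = header >> 1;
  int64_t *copy = *free_ptr;
  assert(copy + nfields + 1 <= to->end);    /* destination must have room */
  for (int64_t i = 0; i <= nfields; i++) copy[i] = obj[i];
  *free_ptr += nfields + 1;
  obj[0] = (int64_t)(intptr_t)copy;         /* leave a forwarding pointer */
  return copy;
}

/* Copy everything reachable from the roots out of `from` into `to`,
   starting at to_free. Returns the new free pointer of `to`. */
int64_t *cheney(int64_t **roots, size_t nroots, const space *from,
                const space *to, int64_t *to_free) {
  int64_t *scan = to_free;
  int64_t *free_ptr = to_free;
  for (size_t i = 0; i < nroots; i++)
    roots[i] = forward(roots[i], from, to, &free_ptr);
  while (scan < free_ptr) {                 /* breadth-first scan */
    int64_t nfields = scan[0] >> 1;
    for (int64_t i = 1; i <= nfields; i++)
      scan[i] = (int64_t)(intptr_t)forward((int64_t *)(intptr_t)scan[i],
                                           from, to, &free_ptr);
    scan += nfields + 1;
  }
  return free_ptr;
}
```

Passing \code{to\_free} separately from the bounds of the destination lets the same function append young survivors after the data already in the old FromSpace. Pointers that fall outside \code{from} are left untouched, which is what keeps a young collection from traversing the old generation.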
+
+Note that the collection of the young generation does not traverse the
+old generation. This introduces a potential problem: there may be
+young data that is only reachable through pointers in the old
+generation. If these pointers are not taken into account, the
+collector could throw away young data that is live!  One solution,
+called \emph{pointer recording}, is to maintain a set of all the
+pointers from the old generation into the young generation and
+consider this set as part of the root set.  To maintain this set, the
+compiler must insert extra instructions around every
+\code{vector-set!}. If the vector being modified is in the old
+generation, and if the value being written is a pointer into the
+young generation, then that pointer must be added to the set. Also, if
+the value being overwritten was a pointer into the young generation,
+then that pointer should be removed from the set.
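
The bookkeeping for pointer recording could be placed in a runtime helper that the generated code calls in place of a raw store. The sketch below is one possible design, not an interface defined by \code{runtime.c}: the names \code{recorded\_vector\_set}, \code{recorded\_slots}, \code{recorded\_count}, and \code{MAX\_RECORDED} are hypothetical. It records the address of the old-generation slot rather than the pointer value itself, on the assumption that the collector will need to update that slot after moving the young object. It also assumes the compiler calls the helper only for stores of pointer-typed values and that fields start one word after the tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECORDED 1024 /* hypothetical fixed capacity */

typedef struct {
  int64_t *begin;
  int64_t *end;
} space;

/* Slots in old-generation vectors that currently hold young pointers. */
static int64_t *recorded_slots[MAX_RECORDED];
static size_t recorded_count = 0;

static bool in_space(const space *s, int64_t word) {
  uintptr_t a = (uintptr_t)word;
  return a >= (uintptr_t)s->begin && a < (uintptr_t)s->end;
}

static void remove_slot(int64_t *slot) {
  for (size_t i = 0; i < recorded_count; i++) {
    if (recorded_slots[i] == slot) {
      /* Unordered removal: move the last entry into the hole. */
      recorded_slots[i] = recorded_slots[--recorded_count];
      return;
    }
  }
}

static void add_slot(int64_t *slot) {
  assert(recorded_count < MAX_RECORDED);
  recorded_slots[recorded_count++] = slot;
}

/* Store `value` into field `index` of `vec`, updating the recorded set
   when `vec` lives in the old generation. */
void recorded_vector_set(int64_t *vec, size_t index, int64_t value,
                         const space *young, const space *old) {
  int64_t *slot = &vec[index + 1]; /* vec[0] is assumed to be the tag */
  if (in_space(old, (int64_t)(intptr_t)vec)) {
    if (in_space(young, *slot)) remove_slot(slot); /* overwritten pointer */
    if (in_space(young, value)) add_slot(slot);    /* new young pointer */
  }
  *slot = value;
}
```

The linear search in \code{remove\_slot} keeps the sketch short; a hash set or a card table would scale better if many old-to-young pointers are recorded.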
+
+\begin{exercise}\normalfont
+  Adapt the \code{collect} function in \code{runtime.c} to implement
+  generational garbage collection, as outlined in this section.
+  Update the code generation for \code{vector-set!} to implement
+  pointer recording. Make sure that your new compiler and runtime
+  pass your test suite.
+\end{exercise}
 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%