pass on Chapter 9 Arrays challenge

Andrew Tolmach 4 years ago
parent
commit
6f83eb69ce
2 files changed with 94 additions and 11 deletions
  1. 92 10
      book.tex
  2. 2 1
      defs.tex

+ 92 - 10
book.tex

@@ -10070,7 +10070,7 @@ instead of just once for the whole program.
 
 \ocaml{The frame size and rootframe size information produced by this pass
   now form the per-function information, rather than the per-program information.
-  The is also the right pass to replace the dummy entry and exit blocks by
+  This is also the right pass to replace the dummy entry and exit blocks by
   the real ones generated by the \code{func\_entry\_exit} function in \code{X86Fun.ml}
   The details of what these produce is described under ``Print x86'' below.}
 
@@ -13340,6 +13340,19 @@ the length is determined at run-time and all the elements have the same
 type (they are homogeneous). We use the term ``array'' for this later
 kind of sequence.
 
+\begin{ocamlx}
+  For this challenge, you will implement \LangArray{}, a variant of \LangFun{}
+  supporting arrays. Unlike previous chapters, all that will be provided to you
+  are the concrete input grammar for \LangArray{} and
+  an improved version of \code{runtime.c} that supports garbage collection of
+  arrays (according to a particular specified memory layout convention).
+  It is up to you to produce appropriately modified versions of \code{RFun.ml}, \code{CFun.ml},
+  and \code{X86Fun.ml}, as well as the passes that glue them together, which you should
+  put in a file \code{Chapter9Arrays.ml}. You have complete freedom to add
+  features as needed to the intermediate languages and the set of X86 instructions
+  you use.
+\end{ocamlx} 
+
 The Racket language does not distinguish between tuples and arrays,
 they are both represented by vectors. However, Typed Racket
 distinguishes between tuples and arrays: the \code{Vector} type is for
@@ -13358,6 +13371,26 @@ useful in many examples involving arrays such as computing the
 inner-product of two arrays (Figure~\ref{fig:inner-product}).
 
 
+\begin{ocamlx}
+  For our version of \LangArray{} we will not overload existing operators.
+  Instead, we will use the following concrete syntax:
+\begin{itemize}
+  \item An array is created with \code{(array $e_1$ $e_2$)} where $e_1$ evaluates
+    to an integer representing the length of the array, and $e_2$ evaluates to the
+    initial value to be used for each array element.  
+  \item The length of an array is returned by \code{(len $e$)} where $e$ evaluates to an array.
+  \item Reading from the array is denoted by \code{(@ $e_1$ $e_2$)} where $e_1$ evaluates to an
+    array and $e_2$ evaluates to an integer index into the array (counting from 0).
+  \item Writing to the array is denoted by \code{(@:= $e_1$ $e_2$ $e_3$)}, where
+    $e_1$ evaluates to an array, $e_2$ to an integer index, and $e_3$ to the new value
+    to store.
+\end{itemize}
+As suggested, we will also add integer multiplication, written \code{(* $e_1$ $e_2$)}. 
+You may also find it useful to extend the AST for \LangArray{} with other constructs
+that are generated by various passes; as in previous chapters, any such extensions should not be
+parseable, but may appear in debug output.
+\end{ocamlx}
+
 \begin{figure}[tp]
 \centering
 \fbox{
@@ -13390,7 +13423,8 @@ inner-product of two arrays (Figure~\ref{fig:inner-product}).
 \]
 \end{minipage}
 }
-\caption{The concrete syntax of \LangArray{}, extending \LangLoop{} (Figure~\ref{fig:Rwhile-concrete-syntax}).}
+\caption{The concrete syntax of \LangArray{}, extending \LangLoop{} (Figure~\ref{fig:Rwhile-concrete-syntax}).\ocaml{See the text for
+  the OCaml version of this.}}
 \label{fig:Rvecof-concrete-syntax}
 \end{figure}
 
@@ -13416,16 +13450,34 @@ inner-product of two arrays (Figure~\ref{fig:inner-product}).
     (+ (inner-product A B 2)
        30)))
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(define innerproduct (A : (array int)) (B : (array int)) (n : int) : int 
+ (let i 0
+ (let prod 0 
+ (seq
+  (while (< i n)
+   (seq
+    (:= prod (+ prod ~\texttt{(*}~ (@ A i) (@ B i))))
+    (:=  i (+ i 1))))
+  prod))))  
+
+(let A (array 2 2)
+(let B (array 2 3)
+ (+ (innerproduct A B 2) 30)))
+\end{lstlisting}  
 \caption{Example program that computes the inner-product.}
 \label{fig:inner-product}
 \end{figure}
 
-
 The type checker for \LangArray{} is define in
 Figure~\ref{fig:type-check-Rvecof}. The result type of
-\code{make-vector} is \code{(Vectorof T)} where \code{T} is the type
+\code{make-vector} is \code{(Vectorof T)} \ocaml{\code{(array T)}} where \code{T} is the type
 of the intializing expression.  The length expression is required to
-have type \code{Integer}. The type checking of the operators
+have type \code{Integer}. \ocaml{The rest of this paragraph is only vaguely relevant:
+  we are introducing
+  new operators rather than overloading existing ones, and we are not
+  building on the language supporting \code{Any} types.}
+The type checking of the operators
 \code{vector-length}, \code{vector-ref}, and \code{vector-set!}  is
 updated to handle the situation where the vector has type
 \code{Vectorof}. In these cases we translate the operators to their
@@ -13526,6 +13578,8 @@ integers.
 \subsection{Data Representation}
 \label{sec:array-rep}
 
+\ocaml{You need to follow these guidelines precisely in order to use
+  the garbage collection code in \code{runtime.c}.}
 Just like tuples, we store arrays on the heap which means that the
 garbage collector will need to inspect arrays. An immediate thought is
 to use the same representation for arrays that we use for tuples.
@@ -13558,14 +13612,27 @@ an array:
 Recall that in Chapter~\ref{ch:Rdyn}, we use a $3$-bit tag to
 differentiate the kinds of values that have been injected into the
 \code{Any} type. We use the bit pattern \code{110} (or $6$ in decimal)
-to indicate that the value is an array.
+to indicate that the value is an array. \ocaml{We do not have \code{Any}
+  types.}
 
 In the following subsections we provide hints regarding how to update
-the passes to handle arrays.
+the passes to handle arrays. \ocaml{Some of these are not too relevant
+  to our version, though.}
 
 
 \subsection{Reveal Casts}
 
+\ocaml{Don't try to follow this section too closely; there are too many
+  different assumptions at play.  But you will indeed need to introduce code
+  to check at runtime that the indices of \code{@} and \code{@:=} operations
+  are within bounds, and also that the length parameter to \code{array}
+  creation is non-negative. If any of these conditions are violated, the
+  generated code should call a (newly provided) function in {\tt runtime.c} called
+  \code{fatal\_exit}; this will cause the program to terminate immediately
+  with return code 255.  By the way, there is a very cute way to check that
+  an index is less than a fixed bound \emph{and} non-negative using a single
+  \emph{unsigned} comparison instruction.}
+  
 The array-access operators \code{vectorof-ref} and
 \code{vectorof-set!} are similar to the \code{any-vector-ref} and
 \code{any-vector-set!} operators of Chapter~\ref{ch:Rdyn} in
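
Note on the hunk above: the added text mentions checking an index against both bounds with a single unsigned comparison, and a later hunk mentions scaling by a power of two with a shift. A minimal C sketch of both ideas follows. The function names, the 8-byte element size, and the header_bytes parameter are assumptions made for illustration; they are not taken from runtime.c, whose actual layout must be followed instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true iff 0 <= index < length, using one unsigned comparison.
   Assumes length is non-negative. A negative index, reinterpreted as
   unsigned, becomes a value of at least 2^63, which is larger than any
   non-negative int64_t length, so the comparison rejects it. */
bool index_in_bounds(int64_t index, int64_t length) {
    return (uint64_t)index < (uint64_t)length;
}

/* Byte offset of element `index`, assuming 8-byte elements stored after
   `header_bytes` bytes of metadata (hypothetical layout parameter).
   Shifting left by 3 is the same as multiplying by 8. The cast avoids
   shifting a signed value; callers are expected to have bounds-checked
   the index first. */
int64_t element_offset(int64_t index, int64_t header_bytes) {
    return header_bytes + (int64_t)((uint64_t)index << 3);
}
```

In generated x86 code the same check would typically be one cmp followed by an unsigned conditional jump to the failure path, which is where the call to fatal_exit described in the hunk would go.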
@@ -13598,6 +13665,11 @@ front of the tuple or array.
 
 \subsection{Expose Allocation}
 
+\ocaml{The advice here is fairly relevant. As an alternative to
+  defining a new \code{AllocateArray} AST form, you may wish to
+  modify the existing \code{Alloc} form so that it can be
+  used for both tuples and arrays.}
+
 This pass should translate the \code{make-vector} operator into
 lower-level operations. In particular, the new AST node
 $\LP\key{AllocateArray}~\Exp~\Type\RP$ allocates an array of the
@@ -13635,14 +13707,22 @@ The instructions generated for \code{vectorof-ref} differ from those
 for \code{vector-ref} (Section~\ref{sec:select-instructions-gc}) in
 that the index is not a constant so the offset must be computed at
 runtime, similar to the instructions generated for
-\code{any-vector-of-ref} (Section~\ref{sec:select-Rany}).  The same is
+\code{any-vector-of-ref} (Section~\ref{sec:select-Rany}). \ocaml{You
+  might want to look at that section for inspiration; the relevant
+  code is actually for \code{any-vector-ref}. But by the way, remember that
+  multiplication by a constant power of two can be more cheaply done using a shift instruction.}
+The same is
 true for \code{vectorof-set!}.  Also, the \code{vectorof-set!} may
 appear in an assignment and as a stand-alone statement, so make sure
-to handle both situations in this pass.
+to handle both situations in this pass. \ocaml{This last statement
+  won't be true if you give \code{@:=} void type, as you presumably do!}
 
 Finally, the instructions for \code{any-vectorof-length} should be
 similar to those for \code{vectorof-length}, except that one must
-first project the array by writing zeroes into the $3$-bit tag
+first project the array by writing zeroes into the $3$-bit tag. \ocaml{Not relevant for us.}
+
+\ocaml{For multiplication, use the X86 {\tt imul} instruction. Note that it has a peculiarity:
+  its destination must be a register.}
 
 \begin{exercise}\normalfont
 
@@ -13652,6 +13732,8 @@ programs, including the one in Figure~\ref{fig:inner-product} and also
 a program that multiplies two matrices. Note that matrices are
 2-dimensional arrays, but those can be encoded into 1-dimensional
 arrays by laying out each row in the array, one after the next.
+\ocaml{Alternatively, they can be encoded as arrays of arrays.
+  You should try writing both versions of matrix multiply!}
   
 \end{exercise}
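
Note on the exercise above: the row-by-row encoding of a matrix in a 1-dimensional array can be sketched in C as follows, as a reference for what a \LangArray{} test program would compute. The function name matmul_flat and the restriction to square matrices are assumptions made to keep the sketch short.

```c
#include <stdint.h>

/* Multiplies two n-by-n matrices stored row by row in flat arrays.
   Element (i, j) of a matrix lives at flat index i * n + j. */
void matmul_flat(const int64_t *a, const int64_t *b, int64_t *c, int64_t n) {
    for (int64_t i = 0; i < n; i++) {
        for (int64_t j = 0; j < n; j++) {
            int64_t sum = 0;
            for (int64_t k = 0; k < n; k++) {
                /* row i of a times column j of b */
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

The array-of-arrays encoding suggested in the added text replaces the index arithmetic i * n + j with two successive array reads, at the cost of one extra heap object per row.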
 

+ 2 - 1
defs.tex

@@ -30,7 +30,8 @@
 \newcommand{\LangLoopAlloc}{\ensuremath{R^{\ttm{Alloc}}_{\ttm{While}}}} %R'8
 \newcommand{\LangCLoop}{\ensuremath{C_{\circlearrowleft}}} %C7
 \newcommand{\LangLoopANF}{\ensuremath{R^{\mathsf{ANF}}_{\ttm{While}}}} %R8
-\newcommand{\LangArray}{\ensuremath{R^{\ttm{Vecof}}_{\ttm{While}}}} %R^s3
+%\newcommand{\LangArray}{\ensuremath{R^{\ttm{Vecof}}_{\ttm{While}}}} %R^s3
+\newcommand{\LangArray}{\ensuremath{R_{\ttm{Array}}}} %R^s3
 \newcommand{\LangGrad}{\ensuremath{R_{\ttm{?}}}} %R9
 \newcommand{\LangCast}{\ensuremath{R_{\ttm{cast}}}} %R9'
 \newcommand{\LangProxy}{\ensuremath{R_{\ttm{proxy}}}} %R8''