Jeremy Siek, 4 years ago
parent
commit
5f297b87b2
1 changed file with 328 additions and 233 deletions
+ 328 - 233
book.tex

@@ -929,14 +929,22 @@ It produces an error:
 fx+: result is not a fixnum
 \end{lstlisting}
 We establish the convention that if running the definitional
-interpreter on a program produces an error, then the meaning of that
-program is \emph{unspecified}. That means a compiler for the language
-is under no obligations regarding that program; it may or may not
-produce an executable, and if it does, that executable can do
-anything. This convention applies to the languages defined in this
-book, as a way to simplify the student's task of implementing them,
-but this convention is not applicable to all programming languages.
-\index{unspecified behavior}
+interpreter on a program produces an error other than
+\code{trapped-error}, then the meaning of that program is
+\emph{unspecified}\index{unspecified behavior}. That means a compiler
+for the language is under no obligations regarding that program; it
+may or may not produce an executable, and if it does, that executable
+can do anything.  On the other hand, if the error is a
+\code{trapped-error}, then the compiled program is also required to
+report that an error occurred. To signal an error, exit with a return
+code of \code{255}.  The interpreters in chapters
+\ref{ch:type-dynamic} and \ref{ch:gradual-typing} use
+\code{trapped-error}.
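
The convention above can be read as a rule for a test driver. The following Python sketch is not from the book; \code{TrappedError}, \code{expected_outcome}, and the result tuples are hypothetical names chosen for illustration. It classifies one interpreter run: normal termination fixes the expected value, a \code{trapped-error} requires exit code 255, and any other error leaves the behavior unspecified.

```python
class TrappedError(Exception):
    """Stands in for the interpreter's trapped-error (hypothetical name)."""


def expected_outcome(run_interpreter):
    """Classify what a compiled program must do, given the interpreter's behavior.

    run_interpreter is a zero-argument callable that models running the
    definitional interpreter on one program.
    """
    try:
        value = run_interpreter()
    except TrappedError:
        # The compiled program must report the error by exiting with 255.
        return ("exit-code", 255)
    except Exception:
        # Any other interpreter error: the compiler has no obligations.
        return ("unspecified", None)
    # Normal termination: the interpreter's result defines the meaning.
    return ("value", value)
```

A driver built this way would skip programs classified as unspecified rather than count them as failures.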
+
+%% This convention applies to the languages defined in this
+%% book, as a way to simplify the student's task of implementing them,
+%% but this convention is not applicable to all programming languages.
+%% 
 
 Moving on to the last feature of the $R_0$ language, the \key{read}
 operation prompts the user of the program for an integer.  Recall that
@@ -9024,9 +9032,9 @@ work of \citet{Keep:2012ab}.
 \index{dynamic typing}
 
 In this chapter we discuss the compilation of $R_7$, a dynamically
-typed language and a subset of the Racket language. This is in
-contrast to the previous chapters, which have studied the compilation
-of Typed Racket. In dynamically typed languages such as $R_7$, a given
+typed language that is a subset of Racket. This is in contrast to the
+previous chapters, which have studied the compilation of Typed
+Racket. In dynamically typed languages such as $R_7$, a given
 expression may produce a value of a different type each time it is
 executed. Consider the following example with a conditional \code{if}
 expression that may return a Boolean or an integer depending on the
@@ -9120,81 +9128,166 @@ defined in Figures~\ref{fig:r7-concrete-syntax} and
 %
 There is no type checker for $R_7$ because it is not a statically
 typed language (it's dynamically typed!).
-%
+
 The definitional interpreter for $R_7$ is presented in
-Figure~\ref{fig:interp-R7}.
+Figure~\ref{fig:interp-R7} and its auxiliary functions are defined in
+Figure~\ref{fig:interp-R7-aux}. Consider the match clause for
+\code{(Int n)}.  Instead of simply returning the integer \code{n} (as
+in the interpreter for $R_1$ in Figure~\ref{fig:interp-R1}), the
+interpreter for $R_7$ creates a \emph{tagged value}\index{tagged
+  value} that combines an underlying value with a tag that identifies
+what kind of value it is. We define the following struct
+to represent tagged values.
+\begin{lstlisting}
+(struct Tagged (value tag) #:transparent)
+\end{lstlisting}
+The tags are \code{Integer}, \code{Boolean}, \code{Void},
+\code{Vector}, and \code{Procedure}. Tags are closely related to types
+but don't always capture all the information that a type does. For
+example, a vector of type \code{(Vector Any Any)} is tagged with
+\code{Vector} and a procedure of type \code{(Any Any -> Any)}
+is tagged with \code{Procedure}.
+
+Next consider the match clause for \code{vector-ref}.  The
+\code{check-tag} auxiliary function (Figure~\ref{fig:interp-R7-aux})
+is used to ensure that the first argument is a vector and the second
+is an integer. If they are not, a \code{trapped-error} is raised.
+Recall from Section~\ref{sec:interp-R0} that when a definitional
+interpreter raises a \code{trapped-error}, the compiled code must
+also signal an error by exiting with return code \code{255}.  A
+\code{trapped-error} is also raised if the index is not less than
+the length of the vector.
+
 
 \begin{figure}[tbp]
 \begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
-(define (interp-R7-exp env)
-  (lambda (ast)
-    (define recur (interp-R7-exp env))
-    (match ast
-      [(Var x) (lookup x env)]
-      [(Int n) `(tagged ,n Integer)]
-      [(Bool b) `(tagged ,b Boolean)]
-      [(Prim 'read '()) `(tagged ,(read-fixnum) Integer)]
-      [(Lambda xs rt body)
-       `(tagged (lambda ,xs ,body ,env) (,@(for/list ([x xs]) 'Any) -> Any))]
-      [(Prim 'vector es)
-       `(tagged ,(apply vector (for/list ([e es]) (recur e)))
-                (Vector ,@(for/list ([e es]) 'Any)))]
-      [(Prim 'vector-set! (list e1 n e2))
-       (define vec (value-of-any (recur e1)))
-       (define i (value-of-any (recur n)))
-       (vector-set! vec i (recur e2))
-       `(tagged ,(void) Void)]
-      [(Prim 'vector-ref (list e1 n))
-       (define vec (value-of-any (recur e1)))
-       (define i (value-of-any (recur n)))
-       (vector-ref vec i)]
-      [(Let x e body)
-       (define v (recur e))
-       ((interp-R7-exp (cons (cons x v) env)) body)]
-      [(Prim 'and (list e1 e2))
-       (recur (If e1 e2 (Bool #f)))]
-      [(Prim 'or (list e1 e2))
-       (define v1 (recur e1))
-       (match (value-of-any v1) [#f (recur e2)] [else v1])]
-      [(Prim 'eq? (list l r))
-       `(tagged ,(equal? (recur l) (recur r)) Boolean)]
-      [(If q t f)
-       (match (value-of-any (recur q)) [#f (recur f)] [else (recur t)])]
-      [(Prim op es)
-       (tag-value
-        (apply (interp-op op) (for/list ([e es]) (value-of-any (recur e)))))]
-      [(Apply f es)
-       (define new-args (map recur es))
-       (let ([f-val (value-of-any (recur f))])
-         (match f-val 
-           [`(function (,xs ...) ,body ,lam-env)
-            (define new-env (append (map cons xs new-args) lam-env))
-            ((interp-R7-exp new-env) body)]
-           [else (error "interp-R7-exp, expected function, not" f-val)]))]
-      )))
+(define ((interp-R7-exp env) ast)
+  (define recur (interp-R7-exp env))
+  (match ast
+    [(Var x) (lookup x env)]
+    [(Int n) (Tagged n 'Integer)]
+    [(Bool b) (Tagged b 'Boolean)]
+    [(Lambda xs rt body)
+     (Tagged `(function ,xs ,body ,env) 'Procedure)]
+    [(Prim 'vector es)
+     (Tagged (apply vector (for/list ([e es]) (recur e))) 'Vector)]
+    [(Prim 'vector-ref (list e1 e2))
+     (define vec (recur e1)) (define i (recur e2))
+     (check-tag vec 'Vector ast) (check-tag i 'Integer ast)
+     (unless (< (Tagged-value i) (vector-length (Tagged-value vec)))
+       (error 'trapped-error "index ~a too big\nin ~v" (Tagged-value i) ast))
+     (vector-ref (Tagged-value vec) (Tagged-value i))]
+    [(Prim 'vector-set! (list e1 e2 e3))
+     (define vec (recur e1)) (define i (recur e2)) (define arg (recur e3))
+     (check-tag vec 'Vector ast) (check-tag i 'Integer ast)
+     (unless (< (Tagged-value i) (vector-length (Tagged-value vec)))
+       (error 'trapped-error "index ~a too big\nin ~v" (Tagged-value i) ast))
+     (vector-set! (Tagged-value vec) (Tagged-value i) arg)
+     (Tagged (void) 'Void)]
+    [(Let x e body) ((interp-R7-exp (cons (cons x (recur e)) env)) body)]
+    [(Prim 'and (list e1 e2)) (recur (If e1 e2 (Bool #f)))]
+    [(Prim 'or (list e1 e2))
+     (define v1 (recur e1))
+     (match (Tagged-value v1) [#f (recur e2)] [else v1])]
+    [(Prim 'eq? (list l r)) (Tagged (equal? (recur l) (recur r)) 'Boolean)]
+    [(Prim op (list e1))
+     #:when (set-member? type-predicates op)
+     (tag-value ((interp-op op) (Tagged-value (recur e1))))]
+    [(Prim op es)
+     (define args (map recur es))
+     (define tags (for/list ([arg args]) (Tagged-tag arg)))
+     (unless (for/or ([expected-tags (op-tags op)])
+               (equal? expected-tags tags))
+       (error 'trapped-error "illegal argument tags ~a\nin ~v" tags ast))
+     (tag-value
+      (apply (interp-op op) (for/list ([a args]) (Tagged-value a))))]
+    [(If q t f)
+     (match (Tagged-value (recur q)) [#f (recur f)] [else (recur t)])]
+    [(Apply f es)
+     (define new-f (recur f)) (define args (map recur es))
+     (check-tag new-f 'Procedure ast) (define f-val (Tagged-value new-f))
+     (match f-val 
+       [`(function ,xs ,body ,lam-env)
+        (unless (eq? (length xs) (length args))
+         (error 'trapped-error "~a != ~a\nin ~v" (length args) (length xs) ast))
+        (define new-env (append (map cons xs args) lam-env))
+        ((interp-R7-exp new-env) body)]
+       [else (error "interp-R7-exp, expected function, not" f-val)])]))
 \end{lstlisting}
 \caption{Interpreter for the $R_7$ language.}
 \label{fig:interp-R7}
 \end{figure}
 
+\begin{figure}[tbp]
+\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
+(define (interp-op op)
+  (match op
+    ['+ fx+]
+    ['- fx-]
+    ['read read-fixnum]
+    ['not (lambda (v) (match v [#t #f] [#f #t]))]
+    ['< (lambda (v1 v2)
+	  (cond [(and (fixnum? v1) (fixnum? v2)) (< v1 v2)]))]
+    ['<= (lambda (v1 v2)
+	   (cond [(and (fixnum? v1) (fixnum? v2)) (<= v1 v2)]))]
+    ['> (lambda (v1 v2)
+	  (cond [(and (fixnum? v1) (fixnum? v2)) (> v1 v2)]))]
+    ['>= (lambda (v1 v2)
+	   (cond [(and (fixnum? v1) (fixnum? v2)) (>= v1 v2)]))]
+    ['boolean? boolean?]
+    ['integer? fixnum?]
+    ['void? void?]
+    ['vector? vector?]
+    ['vector-length vector-length]
+    ['procedure? (match-lambda
+                   [`(function ,xs ,body ,env) #t] [else #f])]
+    [else (error 'interp-op "unknown operator" op)]))
+
+(define (op-tags op)
+  (match op
+    ['+ '((Integer Integer))]
+    ['- '((Integer Integer) (Integer))]
+    ['read '(())]
+    ['not '((Boolean))]
+    ['< '((Integer Integer))]
+    ['<= '((Integer Integer))]
+    ['> '((Integer Integer))]
+    ['>= '((Integer Integer))]
+    ['vector-length '((Vector))]))
+
+(define type-predicates
+  (set 'boolean? 'integer? 'vector? 'procedure? 'void?))
+
+(define (tag-value v)
+  (cond [(boolean? v) (Tagged v 'Boolean)]
+        [(fixnum? v) (Tagged v 'Integer)]
+        [(procedure? v) (Tagged v 'Procedure)]
+        [(vector? v) (Tagged v 'Vector)]
+        [(void? v) (Tagged v 'Void)]
+        [else (error 'tag-value "unidentified value ~a" v)]))
+
+(define (check-tag val expected ast)
+  (define tag (Tagged-tag val))
+  (unless (eq? tag expected)
+    (error 'trapped-error "expected ~a, not ~a\nin ~v" expected tag ast)))
+\end{lstlisting}
+\caption{Auxiliary functions for the $R_7$ interpreter.}
+\label{fig:interp-R7-aux}
+\end{figure}
+
+\clearpage
 
-Let us consider how we might compile $R_7$ to x86, thinking about the
-first example above. Our bit-level representation of the Boolean
-\code{\#f} is zero and similarly for the integer \code{0}.  However,
-\code{(not \#f)} should produce \code{\#t} whereas \code{(not 0)}
-should produce \code{\#f}. Furthermore, the behavior of \code{not}, in
-general, cannot be determined at compile time, but depends on the
-runtime type of its input, as in the example above that depends on the
-result of \code{(read)}.
-
-The way around this problem is to include information about a value's
-runtime type in the value itself, so that this information can be
-inspected by operators such as \code{not}.  In particular, we 
-steal the 3 right-most bits from our 64-bit values to encode the
-runtime type.  We use $001$ to identify integers, $100$ for
-Booleans, $010$ for vectors, $011$ for procedures, and $101$ for the
-void value. We refer to these 3 bits as the \emph{tag} and we
-define the following auxiliary function.
+\section{Representation of Tagged Values}
+
+The interpreter for $R_7$ introduced a new kind of value, a tagged
+value. To compile $R_7$ to x86 we must decide how to represent tagged
+values at the bit level. Because almost every operation in $R_7$
+involves manipulating tagged values, the representation must be
+efficient. Recall that all of our values are 64 bits.  We shall steal
+the 3 right-most bits to encode the tag.  We use $001$ to identify
+integers, $100$ for Booleans, $010$ for vectors, $011$ for procedures,
+and $101$ for the void value. We define the following auxiliary
+function for mapping types to tag codes.
 \begin{align*}
 \itm{tagof}(\key{Integer}) &= 001 \\
 \itm{tagof}(\key{Boolean}) &= 100 \\
@@ -9202,78 +9295,24 @@ define the following auxiliary function.
 \itm{tagof}((\ldots \key{->} \ldots)) &= 011 \\
 \itm{tagof}(\key{Void}) &= 101
 \end{align*}
-This stealing of 3 bits comes at some
-price: our integers are reduced to ranging from $-2^{60}$ to
-$2^{60}$. The stealing does not adversely affect vectors and
-procedures because those values are addresses, and our addresses are
-8-byte aligned so the rightmost 3 bits are unused, they are always
-$000$. Thus, we do not lose information by overwriting the rightmost 3
-bits with the tag and we can simply zero-out the tag to recover the
-original address.
-
-In some sense, these tagged values are a new kind of value.  Indeed,
-we can extend our \emph{typed} language with tagged values by adding a
-new type to classify them, called \key{Any}, and with operations for
-creating and using tagged values, yielding the $R_6$ language that we
-define in Section~\ref{sec:r6-lang}. The $R_6$ language provides the
-fundamental support for polymorphism and runtime types that we need to
-support dynamic typing.
-
-There is an interesting interaction between tagged values and garbage
-collection.  A variable of type \code{Any} might refer to a vector and
-therefore it might be a root that needs to be inspected and copied
-during garbage collection. Thus, we need to treat variables of type
-\code{Any} in a similar way to variables of type \code{Vector} for
-purposes of register allocation, which we discuss in
-Section~\ref{sec:register-allocation-r6}. One concern is that, if a
-variable of type \code{Any} is spilled, it must be spilled to the root
-stack.  But this means that the garbage collector needs to be able to
-differentiate between (1) plain old pointers to tuples, (2) a tagged
-value that points to a tuple, and (3) a tagged value that is not a
-tuple. We enable this differentiation by choosing not to use the tag
-$000$ in $\itm{tagof}$. Instead, that bit pattern is reserved for
-identifying plain old pointers to tuples. That way, if one of the
-first three bits is set, then we have a tagged value and inspecting
-the tag can differentiation between vectors ($010$) and the other
-kinds of values.
-
-We implement our untyped language $R_7$ by compiling it to $R_6$
-(Section~\ref{sec:compile-r7}), but first we describe the $R_6$
-language.
-
-\section{The $R_6$ Language: Typed Racket $+$ \key{Any}}
+This stealing of 3 bits comes at some price: our integers are reduced
+to ranging from $-2^{60}$ to $2^{60}-1$. The stealing does not adversely
+affect vectors and procedures because those values are addresses, and
+our addresses are 8-byte aligned, so the rightmost 3 bits are unused;
+they are always $000$. Thus, we do not lose information by overwriting
+the rightmost 3 bits with the tag and we can simply zero-out the tag
+to recover the original address.
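
The tagging scheme can be sketched in Python, with Python integers standing in for 64-bit words. This is an illustration under stated assumptions, not the book's implementation: the names \code{make_any}, \code{tag_of}, and \code{value_of} are hypothetical, and the sketch assumes that non-pointer payloads are shifted left by 3 bits to make room for the tag, which this section does not spell out.

```python
TAG_BITS = 3
TAG_MASK = 0b111

# Tag codes from the tagof function in the text.
TAGOF = {"Integer": 0b001, "Boolean": 0b100, "Vector": 0b010,
         "Procedure": 0b011, "Void": 0b101}


def make_any(value, kind):
    """Attach the tag for kind to a value (modeled as a Python int)."""
    if kind in ("Vector", "Procedure"):
        # Addresses are 8-byte aligned, so the low 3 bits are free.
        assert value & TAG_MASK == 0, "address must be 8-byte aligned"
        return value | TAGOF[kind]
    if kind == "Integer":
        assert -2**60 <= value < 2**60, "integer does not fit in 61 bits"
    # Assumption: non-pointer payloads are shifted left to make room.
    return (value << TAG_BITS) | TAGOF[kind]


def tag_of(tagged):
    """Extract the 3-bit tag."""
    return tagged & TAG_MASK


def value_of(tagged):
    """Recover the payload from a tagged value."""
    if tag_of(tagged) in (TAGOF["Vector"], TAGOF["Procedure"]):
        return tagged & ~TAG_MASK   # zero out the tag to get the address
    return tagged >> TAG_BITS       # arithmetic shift keeps the sign
```

For instance, the integer 5 becomes the word 41 (binary 101001), and an address such as 0x1000 tagged as a vector becomes 0x1002.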
+
+To make tagged values into first-class entities, we can give them a
+type, called \code{Any}, and define operations such as \code{Inject}
+and \code{Project} for creating and using them, yielding the $R_6$
+intermediate language. We describe how to compile $R_7$ to $R_6$ in
+Section~\ref{sec:compile-r7} but first we describe the $R_6$ language
+in greater detail.
+
+\section{The $R_6$ Language}
 \label{sec:r6-lang}
 
-\begin{figure}[tp]
-\centering
-\fbox{
-\begin{minipage}{0.97\textwidth}\small
-\[
-\begin{array}{lcl}
-  \Type &::=& \gray{\key{Integer} \mid \key{Boolean}
-     \mid \LP\key{Vector}\;\Type\ldots\RP \mid \key{Void}} \\
-    &\mid& \gray{\LP\Type\ldots \; \key{->}\; \Type\RP} \mid \key{Any} \\
-\FType &::=& \key{Integer} \mid \key{Boolean} \mid \key{Void} 
-      \mid \LP\key{Vector}\; \key{Any}\ldots\RP \\
-     &\mid& \LP\key{Any}\ldots \; \key{->}\; \key{Any}\RP\\
-\Exp &::=& \ldots \CINJECT{\Exp}{\FType}\RP \mid \CPROJECT{\Exp}{\FType}\\
-  &\mid& \LP\key{any-vector-length}\;\Exp\RP
-   \mid \LP\key{any-vector-ref}\;\Exp\;\Exp\RP \\
-  &\mid& \LP\key{any-vector-set!}\;\Exp\;\Exp\;\Exp\RP\\
-  &\mid& \LP\key{boolean?}\;\Exp\RP \mid \LP\key{integer?}\;\Exp\RP
-   \mid \LP\key{void?}\;\Exp\RP \\
-  &\mid& \LP\key{vector?}\;\Exp\RP \mid \LP\key{procedure?}\;\Exp\RP \\
-  \Def &::=& \gray{ \CDEF{\Var}{\LS\Var \key{:} \Type\RS\ldots}{\Type}{\Exp} } \\
-  R_6 &::=& \gray{\Def\ldots \; \Exp}
-\end{array}
-\]
-\end{minipage}
-}
-\caption{The concrete syntax of $R_6$, extending $R_5$ (Figure~\ref{fig:r5-syntax})
-  with \key{Any}.}
-\label{fig:r6-concrete-syntax}
-\end{figure}
-
 \begin{figure}[tp]
 \centering
 \fbox{
@@ -9299,18 +9338,17 @@ language.
 \end{figure}
 
 
-The concrete and abstract syntax of $R_6$ is defined in
-Figures~\ref{fig:r6-concrete-syntax} and \ref{fig:r6-syntax}.  The
-$\LP\key{inject}\; e\; T\RP$ form converts the value produced by
-expression $e$ of type $T$ into a tagged value.  The
-$\LP\key{project}\;e\;T\RP$ form converts the tagged value produced by
-expression $e$ into a value of type $T$ or else halts the program if
-the type tag is not equivalent to $T$.
+The abstract syntax of $R_6$ is defined in Figure~\ref{fig:r6-syntax}.
+(The concrete syntax of $R_6$ is in the Appendix,
+Figure~\ref{fig:r6-concrete-syntax}.)  The $\INJECT{e}{T}$ form
+converts the value produced by expression $e$ of type $T$ into a
+tagged value.  The $\PROJECT{e}{T}$ form converts the tagged value
+produced by expression $e$ into a value of type $T$ or else halts the
+program if the type tag is not equivalent to $T$.
 %
-Note that in both \key{inject} and \key{project}, the type $T$ is
-restricted to the flat types $\FType$, which simplifies the
-implementation and corresponds with what is needed for compiling
-untyped Racket.
+Note that in both \code{Inject} and \code{Project}, the type $T$ is
+restricted to a flat type $\FType$, which simplifies the
+implementation and corresponds with what is needed for compiling $R_7$.
 
 The \code{any-vector} operators adapt the vector operations so that
 they can be applied to a value of type \code{Any}.  They also
@@ -9318,9 +9356,9 @@ generalize the vector operations in that the index is not restricted
 to be a literal integer in the grammar but is allowed to be any
 expression.
 
-The type predicates such as $\LP\key{boolean?}\,e\RP$ expect the
-expression $e$ to produce a tagged value; they return \key{\#t} if the
-tag corresponds to the predicate and they return \key{\#f} otherwise.
+The type predicates such as \key{boolean?} expect their argument to
+produce a tagged value; they return \key{\#t} if the tag corresponds
+to the predicate and they return \key{\#f} otherwise.
 
 The type checker for $R_6$ is shown in
 Figures~\ref{fig:type-check-R6-part-1} and
@@ -9484,6 +9522,12 @@ auxiliary function \code{apply-project} is in Figure~\ref{fig:apply-project}.
          (match-lambda
            [`(tagged ,v^ ,tg)  tg]
            [v  (error 'interp-op "expected tagged value, not ~a" v)])]
+        ['any-vector-ref (lambda (v i)
+                           (match v [`(tagged ,v^ ,tg) (vector-ref v^ i)]))]
+        ['any-vector-set! (lambda (v i a)
+                            (match v [`(tagged ,v^ ,tg) (vector-set! v^ i a)]))]
+        ['any-vector-length (lambda (v)
+                            (match v [`(tagged ,v^ ,tg) (vector-length v^)]))]
         [else (super interp-op op)]))
 
     (define/override ((interp-exp env) e)
@@ -9508,32 +9552,34 @@ auxiliary function \code{apply-project} is in Figure~\ref{fig:apply-project}.
 
 \begin{figure}[tbp]
 \begin{lstlisting}
-(define (apply-project v ty2)
+(define/public (apply-inject v tg) (Tagged v tg))
+
+(define/public (apply-project v ty2)
   (define tag2 (any-tag ty2))
   (match v
-    [`(tagged ,v1 ,tag1)
-     (cond [(eq? tag1 tag2)
-            (match ty2
-              [`(Vector ,ts ...)
-               (cond [(eq? (vector-length v1) (length ts)) v1]
-                     [else
-                      (error 'apply-project
-                             "length ~a does not match vector type length ~a"
-                             (vector-length v1) (length ts))])]
-              [`(,ts ... -> ,rt)
-               (match v1
-                 [`(function ,xs ,body ,env)
-                  (cond [(eq? (length xs) (length ts)) v1]
-                        [else
-                         (error 'apply-project
-                                "arity ~a does not match type arity ~a"
-                                (length xs) (length ts))])]
-                 [else (error 'apply-project "expected a function, not ~a" v1)])]
-              [else v1])]
-           [else (error 'apply-project "tag mismatch ~a != ~a" tag1 tag2)])]
+    [(Tagged v1 tag1)
+     (cond
+       [(eq? tag1 tag2)
+        (match ty2
+          [`(Vector ,ts ...)
+           (define l1 ((interp-op 'vector-length) v1))
+           (cond
+             [(eq? l1 (length ts)) v1]
+             [else (error 'apply-project "vector length mismatch, ~a != ~a"
+                          l1 (length ts))])]
+          [`(,ts ... -> ,rt)
+           (match v1
+             [`(function ,xs ,body ,env)
+              (cond [(eq? (length xs) (length ts)) v1]
+                    [else
+                     (error 'apply-project "arity mismatch ~a != ~a"
+                            (length xs) (length ts))])]
+             [else (error 'apply-project "expected function not ~a" v1)])]
+          [else v1])]
+       [else (error 'apply-project "tag mismatch ~a != ~a" tag1 tag2)])]
     [else (error 'apply-project "expected tagged value, not ~a" v)]))
 \end{lstlisting}
-  \caption{Auxiliary function to apply a projection.}
+  \caption{Auxiliary functions for injection and projection.}
   \label{fig:apply-project}
 \end{figure}
 
@@ -9784,9 +9830,9 @@ of an integer, as in $C_2$ (Figure~\ref{fig:c2-syntax}).
 \[
 \begin{array}{lcl}
 \Exp &::= & \ldots
-   \mid \BINOP{\key{'vector-ref}}{\Atm}{\Atm}  \\
-   &\mid& (\key{Prim}~\key{'vector-set!}\,(\key{list}\,\Atm\,\Atm\,\Atm))
-   \mid \VALUEOF{\Exp}{\FType} \\
+   \mid \BINOP{\key{'any-vector-ref}}{\Atm}{\Atm}  \\
+   &\mid& (\key{Prim}~\key{'any-vector-set!}\,(\key{list}\,\Atm\,\Atm\,\Atm))\\
+   &\mid& \VALUEOF{\Exp}{\FType} \\
 \Stmt &::=& \gray{ \ASSIGN{\VAR{\Var}}{\Exp} 
   \mid \LP\key{Collect} \,\itm{int}\RP }\\
 \Tail &::= & \gray{ \RETURN{\Exp} \mid \SEQ{\Stmt}{\Tail} 
@@ -9809,45 +9855,9 @@ C_4 & ::= & \gray{ \PROGRAMDEFS{\itm{info}}{\LP\Def\ldots\RP} }
 \section{Select Instructions}
 \label{sec:select-r6}
 
-\paragraph{Vector-ref}
-
-Recall that instruction selection for \code{vector-ref} in
-Section~\ref{sec:select-instructions-gc} depends on knowing the index $n$
-at compile time:
-\begin{lstlisting}
-(Assign |$\itm{lhs}$| (Prim 'vector-ref (list |$a_1$| (Int |$n$|))))
-|$\Longrightarrow$|
-movq |$a_1'$|, %r11
-movq |$\itm{offset}$|(%r11), |$\itm{lhs'}$|
-\end{lstlisting}
-where $\itm{offset} = 8(n+1)$.
-%
-In $R_6$ the index may be an arbitrary atom so instead of
-computing the offset at compile time, instructions need to be
-generated to compute the offset at runtime as follows. Note the use of
-the new instruction \code{imulq}.
-\begin{center}
-\begin{minipage}{0.96\textwidth}
-\begin{lstlisting}
-(Assign |$\itm{lhs}$| (Prim 'vector-ref (list |$a_1$| |$a_2$|)))
-|$\Longrightarrow$|
-movq |$a_2'$|, %r11
-addq $1, %r11
-imulq $8, %r11
-addq |$a_1'$|, %r11
-movq 0(%r11) |$\itm{lhs'}$|
-\end{lstlisting}
-\end{minipage}
-\end{center}
-
-\paragraph{Vector-set!}
-
-The above issue also applies to \code{vector-set!}.  The index may be
-an arbitrary expression so one must generate instructions to compute
-the offset at runtime.
-
-
-%% The same idea applies to `vector-set!`.
+In the \code{select-instructions} pass we translate the primitive
+operations on the \code{Any} type to x86 instructions that involve
+manipulating the 3 tag bits of the tagged value.
 
 \paragraph{Make-any}
 
@@ -9920,25 +9930,78 @@ andq |$e'$|, |\itm{lhs'}|
 %% devise a sequence of instructions to implement the type predicates
 %% \key{boolean?}, \key{integer?}, \key{vector?}, and \key{procedure?}.
 
+\paragraph{Any-vector-length}
+
+\begin{lstlisting}
+(Assign |$\itm{lhs}$| (Prim 'any-vector-length (list |$a_1$|)))
+|$\Longrightarrow$|
+movq |$\neg 111$|, %r11
+andq |$a_1'$|,  %r11
+movq 0(%r11), %r11
+andq $126, %r11
+sarq $1, %r11
+movq %r11, |$\itm{lhs'}$|
+\end{lstlisting}
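
The instruction sequence above can be modeled in Python to check the bit manipulation. This is a sketch under assumptions: memory is modeled as a dictionary from addresses to 64-bit words, the name \code{any_vector_length} is reused only for illustration, and the position of the length field (bits 1 through 6 of the first word) is inferred from the mask 126 and the shift by 1.

```python
def any_vector_length(tagged, memory):
    """Model of the any-vector-length instruction sequence.

    memory is a dict from 8-byte-aligned addresses to 64-bit words.
    """
    addr = tagged & ~0b111        # movq/andq: strip the 3 tag bits
    header = memory[addr]         # movq 0(%r11), %r11: load the first word
    return (header & 126) >> 1    # andq $126, sarq $1: keep bits 1..6
```

Because the mask keeps six bits, this encoding can describe lengths from 0 to 63.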
+
+\paragraph{Any-vector-ref}
+
+The index may be an arbitrary atom so instead of computing the offset
+at compile time, instructions need to be generated to compute the
+offset at runtime as follows. Note the use of the new instruction
+\code{imulq}.
+\begin{center}
+\begin{minipage}{0.96\textwidth}
+\begin{lstlisting}
+(Assign |$\itm{lhs}$| (Prim 'any-vector-ref (list |$a_1$| |$a_2$|)))
+|$\Longrightarrow$|
+movq |$\neg 111$|, %r11
+andq |$a_1'$|, %r11
+movq |$a_2'$|, %rax
+addq $1, %rax
+imulq $8, %rax
+addq %rax, %r11
+movq 0(%r11), |$\itm{lhs'}$|
+\end{lstlisting}
+\end{minipage}
+\end{center}
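
As a worked instance of the sequence above (the concrete numbers are illustrative assumptions, not taken from the text), suppose $a_1'$ holds the tagged value $\mathtt{0x1002}$, that is, the address $\mathtt{0x1000}$ with tag $010$, and $a_2'$ holds the index $2$:
\begin{align*}
\%\mathtt{r11} &= \mathtt{0x1002} \mathbin{\&} \neg 111 = \mathtt{0x1000} \\
\%\mathtt{rax} &= 8 \cdot (2 + 1) = 24 = \mathtt{0x18} \\
\%\mathtt{r11} &= \mathtt{0x1000} + \mathtt{0x18} = \mathtt{0x1018}
\end{align*}
The final \code{movq} then loads the word at $\mathtt{0x1018}$, skipping the 8-byte word at offset $0$ and the elements $0$ and $1$ at offsets $8$ and $16$.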
+
+\paragraph{Any-vector-set!}
+
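+The code generation for \code{any-vector-set!} is similar to that for
+\code{any-vector-ref}: remove the tag from the vector, compute the
+offset at runtime, and then store the new value at the resulting
+address instead of loading from it.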
+
 \section{Register Allocation for $R_6$}
 \label{sec:register-allocation-r6}
 \index{register allocation}
 
-At the beginning of this chapter we discussed how a variable of type
-\code{Any} might refer to a vector. Thus, the register allocator for
-$R_6$ needs to treat variable of type \code{Any} in the same way that
-it treats variables of type \code{Vector} for purposes of garbage
-collection. In particular,
+There is an interesting interaction between tagged values and garbage
+collection that has an impact on register allocation.  A variable of
+type \code{Any} might refer to a vector and therefore it might be a
+root that needs to be inspected and copied during garbage
+collection. Thus, we need to treat variables of type \code{Any} in a
+similar way to variables of type \code{Vector} for purposes of
+register allocation.  In particular,
 \begin{itemize}
 \item If a variable of type \code{Any} is live during a function call,
-  then it must be spilled. One way to accomplish this is to augment
-  \code{build-interference} to mark all variables that are live after
-  a \code{callq} as interfering with all the registers.
+  then it must be spilled. This can be accomplished by changing
+  \code{build-interference} to mark all variables of type \code{Any}
+  that are live after a \code{callq} as interfering with all the
+  registers.
 
 \item If a variable of type \code{Any} is spilled, it must be spilled
   to the root stack instead of the normal procedure call stack.
 \end{itemize}
 
+Another concern regarding the root stack is that the garbage collector
+needs to differentiate between (1) plain old pointers to tuples, (2) a
+tagged value that points to a tuple, and (3) a tagged value that is
+not a tuple. We enable this differentiation by choosing not to use the
+tag $000$ in the $\itm{tagof}$ function. Instead, that bit pattern is
+reserved for identifying plain old pointers to tuples. That way, if
+any of the rightmost three bits is set, then we have a tagged value and
+inspecting the tag can differentiate between vectors ($010$) and the
+other kinds of values.
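
The three-way distinction can be sketched in Python as a check on the low 3 bits of a word taken from the root stack. The function name \code{classify\_root} and the returned labels are hypothetical; the sketch follows the text in treating only the vector tag $010$ as a tagged value that points to a tuple.

```python
def classify_root(word):
    """Classify a 64-bit word found on the root stack by its low 3 bits."""
    tag = word & 0b111
    if tag == 0b000:
        # Bit pattern reserved for untagged pointers.
        return "plain pointer to tuple"
    if tag == 0b010:
        # Vector tag: clear the tag bits to obtain the tuple's address.
        return "tagged value pointing to tuple"
    # Integer, Boolean, Void, and the remaining tags.
    return "tagged value that is not a tuple"
```

A collector using this check would trace the first two kinds of word and leave the third untouched.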
+
 \begin{exercise}\normalfont
 Expand your compiler to handle $R_6$ as discussed in the last few
 sections.  Create 5 new programs that use the \code{Any} type and the
@@ -9948,7 +10011,6 @@ previously created test programs.
 \end{exercise}
 
 
-
 \begin{exercise}\normalfont
 Expand your compiler to handle $R_7$ as outlined in this chapter.
 Create tests for $R_7$ by adapting ten of your previous test programs
@@ -11877,6 +11939,39 @@ registers.
 
 \section{Concrete Syntax for Intermediate Languages}
 
+The concrete syntax of $R_6$ is defined in
+Figure~\ref{fig:r6-concrete-syntax}.
+
+\begin{figure}[tp]
+\centering
+\fbox{
+\begin{minipage}{0.97\textwidth}\small
+\[
+\begin{array}{lcl}
+  \Type &::=& \gray{\key{Integer} \mid \key{Boolean}
+     \mid \LP\key{Vector}\;\Type\ldots\RP \mid \key{Void}} \\
+    &\mid& \gray{\LP\Type\ldots \; \key{->}\; \Type\RP} \mid \key{Any} \\
+\FType &::=& \key{Integer} \mid \key{Boolean} \mid \key{Void} 
+      \mid \LP\key{Vector}\; \key{Any}\ldots\RP \\
+     &\mid& \LP\key{Any}\ldots \; \key{->}\; \key{Any}\RP\\
+\Exp &::=& \ldots \CINJECT{\Exp}{\FType}\RP \mid \CPROJECT{\Exp}{\FType}\\
+  &\mid& \LP\key{any-vector-length}\;\Exp\RP
+   \mid \LP\key{any-vector-ref}\;\Exp\;\Exp\RP \\
+  &\mid& \LP\key{any-vector-set!}\;\Exp\;\Exp\;\Exp\RP\\
+  &\mid& \LP\key{boolean?}\;\Exp\RP \mid \LP\key{integer?}\;\Exp\RP
+   \mid \LP\key{void?}\;\Exp\RP \\
+  &\mid& \LP\key{vector?}\;\Exp\RP \mid \LP\key{procedure?}\;\Exp\RP \\
+  \Def &::=& \gray{ \CDEF{\Var}{\LS\Var \key{:} \Type\RS\ldots}{\Type}{\Exp} } \\
+  R_6 &::=& \gray{\Def\ldots \; \Exp}
+\end{array}
+\]
+\end{minipage}
+}
+\caption{The concrete syntax of $R_6$, extending $R_5$ (Figure~\ref{fig:r5-syntax})
+  with \key{Any}.}
+\label{fig:r6-concrete-syntax}
+\end{figure}
+
 The concrete syntax for $C_0$, $C_1$, $C_2$ and $C_3$ is
 defined in Figures~\ref{fig:c0-concrete-syntax},
 \ref{fig:c1-concrete-syntax}, \ref{fig:c2-concrete-syntax},