Jeremy Siek 3 years ago
parent
commit
fbabdc0c34
2 changed files with 286 additions and 145 deletions
  1. 284 145
      book.tex
  2. 2 0
      defs.tex

+ 284 - 145
book.tex

@@ -220,7 +220,7 @@ concepts and algorithms used in compilers.
   programming languages with the addition of loops\racket{ and mutable
   variables}. This elicits the need for \emph{dataflow
     analysis} in the register allocator.
-\item Chapter~\ref{ch:Rvec} adds heap-allocated tuples, motivating
+\item Chapter~\ref{ch:Lvec} adds heap-allocated tuples, motivating
   \emph{garbage collection}.
 \item Chapter~\ref{ch:Rfun} adds functions that are first-class values
   but lack lexical scoping, similar to the C programming
@@ -277,7 +277,7 @@ emphasis on statically-typed imperative languages could include
 Chapter~\ref{ch:Rpoly} but skip Chapter~\ref{ch:Rdyn}. For compiler
 courses at universities on the quarter system, with 10 weeks, we
 recommend completing up through Chapter~\ref{ch:Rfun}.  (If pressed
-for time, one can skip Chapter~\ref{ch:Rvec} but still include
+for time, one can skip Chapter~\ref{ch:Lvec} but still include
 Chapter~\ref{ch:Rfun} by limiting the number of parameters allowed in
 functions.)  Figure~\ref{fig:chapter-dependences} depicts the
 dependencies between chapters.
@@ -295,7 +295,7 @@ University of Vermont.
   \node (C2) at (4,1.5) {\small Ch.~\ref{ch:Lvar} Variables};
   \node (C3) at (8,1.5) {\small Ch.~\ref{ch:register-allocation-Lvar} Registers};
   \node (C4) at (0,0) {\small Ch.~\ref{ch:Lif} Conditionals};
-  \node (C5) at (4,0) {\small Ch.~\ref{ch:Rvec} Tuples};
+  \node (C5) at (4,0) {\small Ch.~\ref{ch:Lvec} Tuples};
   \node (C6) at (8,0) {\small Ch.~\ref{ch:Rfun} Functions};
   \node (C9) at (0,-1.5) {\small Ch.~\ref{ch:Lwhile} Loops};
   \node (C8) at (4,-1.5) {\small Ch.~\ref{ch:Rdyn} Dynamic};
@@ -321,7 +321,7 @@ University of Vermont.
   \node (C2) at (4,1.5) {\small Ch.~\ref{ch:Lvar} Variables};
   \node (C3) at (8,1.5) {\small Ch.~\ref{ch:register-allocation-Lvar} Registers};
   \node (C4) at (0,0) {\small Ch.~\ref{ch:Lif} Conditionals};
-  \node (C5) at (4,0) {\small Ch.~\ref{ch:Rvec} Tuples};
+  \node (C5) at (4,0) {\small Ch.~\ref{ch:Lvec} Tuples};
   \node (C6) at (8,0) {\small Ch.~\ref{ch:Rfun} Functions};
   \node (C9) at (0,-1.5) {\small Ch.~\ref{ch:Lwhile} Loops};
   \node (C8) at (4,-1.5) {\small Ch.~\ref{ch:Rdyn} Dynamic};
@@ -4776,7 +4776,7 @@ particular, we assign $-1$ to \code{rax} and $-2$ to \code{rsp}.
 %% One might wonder why we include registers at all in the liveness
 %% analysis and interference graph. For example, we never allocate a
 %% variable to \code{rax} and \code{rsp}, so it would be harmless to
-%% leave them out.  As we see in Chapter~\ref{ch:Rvec}, when we begin
+%% leave them out.  As we see in Chapter~\ref{ch:Lvec}, when we begin
 %% to use register for passing arguments to functions, it will be
 %% necessary for those registers to appear in the interference graph
 %% because those registers will also be assigned to variables, and we
@@ -9281,7 +9281,7 @@ such as the case-of-case transformation of \citet{PeytonJones:1998}.
 
 % TODO: multi-graph
 
-\if\edition\racketEd
+{\if\edition\racketEd
 %
 In this chapter we study two features that are the hallmarks of
 imperative programming languages: loops and assignments to local
@@ -9311,22 +9311,31 @@ value of the right-hand-side.
 %
 The primary purpose of both the \code{while} loop and \code{set!} is
 to cause side effects, so they do not have a meaningful result
-value. Instead their result is the \code{void} value.  The expression
-\code{(void)} is an explicit way to create the \code{void} value and
-it has type \code{Void}.\footnote{Racket's \code{Void} type
-  corresponds to what is called the \code{Unit} type in the
-  programming languages literature. Racket's \code{Void} type is
-  inhabited by a single value \code{void} which corresponds to
-  \code{unit} or \code{()} in the
-  literature~\citep{Pierce:2002hj}.}. With the addition of
-side-effecting features such as \code{while} loop and \code{set!}, it
-is helpful to also include in a language feature for sequencing side
-effects: the \code{begin} expression. It consists of one or more
-subexpressions that are evaluated left-to-right.
+value. Instead their result is the \code{\#<void>} value.  The
+expression \code{(void)} is an explicit way to create the
+\code{\#<void>} value and it has type \code{Void}.  The
+\code{\#<void>} value can be passed around just like other values
+inside an \LangLoop{} program and a \code{\#<void>} value can be
+compared for equality with another \code{\#<void>} value. However,
+there are no other operations specific to the \code{\#<void>}
+value in \LangLoop{}. In contrast, Racket defines the \code{void?}
+predicate that returns \code{\#t} when applied to \code{\#<void>} and
+\code{\#f} otherwise.
 %
-\fi
+\footnote{Racket's \code{Void} type corresponds to what is called the
+  \code{Unit} type in the programming languages literature. Racket's
+  \code{Void} type is inhabited by a single value \code{\#<void>}
+  which corresponds to \code{unit} or \code{()} in the
+  literature~\citep{Pierce:2002hj}.}
+%
+With the addition of side-effecting features such as the \code{while}
+loop and \code{set!}, it is helpful to also include a language feature
+for sequencing side effects: the \code{begin} expression. It consists
+of one or more subexpressions that are evaluated left-to-right.
+%
+\fi}
 
-\if\edition\pythonEd
+{\if\edition\pythonEd
 %
 In this chapter we study loops, one of the hallmarks of imperative
 programming languages. The following example demonstrates the
@@ -9344,7 +9353,7 @@ The \code{while} loop consists of a condition expression and a body (a
 sequence of statements). The body is evaluated repeatedly so long as
 the condition remains true.
 %
-\fi
+\fi}
 
 \section{The \LangLoop{} Language}
 
@@ -9511,7 +9520,6 @@ function, where we repeatedly interpret the \code{body} so long as the
 {\if\edition\pythonEd
 \begin{lstlisting}
 class InterpLwhile(InterpLif):
-
   def interp_stmts(self, ss, env):
     if len(ss) == 0:
       return
@@ -10242,16 +10250,19 @@ for the compilation of \LangLoop{}.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapter{Tuples and Garbage Collection}
-\label{ch:Rvec}
+\label{ch:Lvec}
 \index{subject}{tuple}
 \index{subject}{vector}
+\index{subject}{allocate}
+\index{subject}{heap allocate}
 
 %% \margincomment{\scriptsize To do: Flesh out this chapter, e.g., make sure
 %%   all the IR grammars are spelled out! \\ --Jeremy}
 %% \margincomment{\scriptsize Be more explicit about how to deal with
 %%   the root stack. \\ --Jeremy}
 
-In this chapter we study the implementation of tuples\racket{, called vectors in Racket}.
+In this chapter we study the implementation of
+tuples\racket{, called vectors in Racket}.
 %
 This language feature is the first of ours to use the computer's
 \emph{heap}\index{subject}{heap} because the lifetime of a tuple is
@@ -10263,7 +10274,7 @@ needed, which is why we also study \emph{garbage collection}
 
 Section~\ref{sec:r3} introduces the \LangVec{} language including its
 interpreter and type checker. The \LangVec{} language extends the \LangLoop{}
-language of Chapter~\ref{ch:Lwhile} with tuple.
+language of Chapter~\ref{ch:Lwhile} with tuples.
 
 Section~\ref{sec:GC} describes a garbage collection algorithm based on
 copying live objects back and forth between two halves of the
@@ -10278,25 +10289,30 @@ passes, including a new compiler pass named \code{expose-allocation}.
 \section{The \LangVec{} Language}
 \label{sec:r3}
 
-Figure~\ref{fig:Rvec-concrete-syntax} defines the concrete syntax for
-\LangVec{} and Figure~\ref{fig:Rvec-syntax} defines the abstract syntax.
+Figure~\ref{fig:Lvec-concrete-syntax} defines the concrete syntax for
+\LangVec{} and Figure~\ref{fig:Lvec-syntax} defines the abstract syntax.
 %
 \racket{The \LangVec{} language includes the forms: \code{vector} for
   creating a tuple, \code{vector-ref} for reading an element of a
-  tuple, and \code{vector-set!} for writing to an element of a tuple.}
-%
-\python{The \LangVec{} language includes tuple creation via a
-  comma-separated list of expressions and it supports accessing an
-  element of a tuple with the square bracket notation, i.e.,
-  \code{t[n]} returns the nth element of the tuple \code{t}.}
-%
-The program in Figure~\ref{fig:vector-eg} shows the usage of
-tuples. We create a 3-tuple \code{t} and a 1-tuple that is stored at
+  tuple, \code{vector-set!} for writing to an element of a tuple, and
+  \code{vector-length} for obtaining the number of elements of a
+  tuple.}
+%
+\python{The \LangVec{} language adds 1) tuple creation via a
+  comma-separated list of expressions, 2) accessing an element of a
+  tuple with the square bracket notation, i.e., \code{t[n]} returns
+  the nth element of the tuple \code{t}, 3) the \code{is} comparison
+  operator, and 4) obtaining the number of elements (the length) of a
+  tuple.}
+%
+The program in Figure~\ref{fig:vector-eg} shows an example use of
+tuples. It creates a 3-tuple \code{t} and a 1-tuple that is stored at
 index $2$ of the 3-tuple, demonstrating that tuples are first-class
-values.  The element at index $1$ of \code{t} is \racket{\code{\#t}}\python{True}, so the
-``then'' branch of the \key{if} is taken.  The element at index $0$ of
-\code{t} is \code{40}, to which we add \code{2}, the element at index
-$0$ of the 1-tuple. So the result of the program is \code{42}.
+values.  The element at index $1$ of \code{t} is
+\racket{\code{\#t}}\python{True}, so the ``then'' branch of the
+\key{if} is taken.  The element at index $0$ of \code{t} is \code{40},
+to which we add \code{2}, the element at index $0$ of the 1-tuple. So
+the result of the program is \code{42}.
 
 \begin{figure}[tbp]
 \centering
@@ -10330,11 +10346,11 @@ $0$ of the 1-tuple. So the result of the program is \code{42}.
 {\if\edition\pythonEd
 \[
 \begin{array}{rcl}
-  \itm{binop} &::= & \key{+} \MID \key{-} \MID \key{and} \MID \key{or} \MID \key{==} \MID \key{!=} \MID \key{<} \MID \key{<=} \MID \key{>} \MID \key{>=} \\
+  \itm{binop} &::= & \key{+} \MID \key{-} \MID \key{and} \MID \key{or} \MID \key{==} \MID \key{!=} \MID \key{<} \MID \key{<=} \MID \key{>} \MID \key{>=} \MID \key{is} \\
   \itm{uniop} &::= & \key{-} \MID \key{not} \\
   \Exp &::=& \Int \MID \key{input\_int}\LP\RP \MID \CUNIOP{\itm{uniop}}{\Exp} \MID \CBINOP{\itm{binop}}{\Exp}{\Exp} \MID \Var{} \\
   &\MID&  \TRUE \MID \FALSE \MID \CIF{\Exp}{\Exp}{\Exp} \\
-  &\MID& \Exp \key{,} \ldots \MID \CGET{\Exp}{\Exp} \\
+  &\MID& \Exp \key{,} \ldots \key{,} \Exp \MID \CGET{\Exp}{\Exp} \MID \CLEN{\Exp} \\
   \Stmt &::=& \key{print}\LP \Exp \RP \MID \Exp \MID \CASSIGN{\Var}{\Exp}
          \MID \key{if}~ \Exp \key{:}~ \Stmt^{+} ~\key{else:}~ \Stmt^{+}\\
         &\MID& \key{while}~ \Exp \key{:}~ \Stmt^{+}\\
@@ -10346,7 +10362,7 @@ $0$ of the 1-tuple. So the result of the program is \code{42}.
 }
 \caption{The concrete syntax of \LangVec{}, extending \LangLoop{}
   (Figure~\ref{fig:Lwhile-concrete-syntax}).}
-\label{fig:Rvec-concrete-syntax}
+\label{fig:Lvec-concrete-syntax}
 \end{figure}
 
 \begin{figure}[tbp]
@@ -10393,7 +10409,7 @@ print( t[0] + t[2][0] if t[1] else 44 )
 \begin{array}{lcl}
 \itm{binop} &::=& \code{Add()} \MID \code{Sub()} \\
 \itm{boolop} &::=& \code{And()} \MID \code{Or()} \\
-\itm{cmp} &::= & \code{Eq()} \MID \code{NotEq()} \MID \code{Lt()} \MID \code{LtE()} \MID \code{Gt()} \MID \code{GtE()} \\
+\itm{cmp} &::= & \code{Eq()} \MID \code{NotEq()} \MID \code{Lt()} \MID \code{LtE()} \MID \code{Gt()} \MID \code{GtE()} \MID \code{Is()} \\
 \itm{uniop} &::=& \code{USub()} \MID \code{Not()} \\
 \itm{bool} &::=& \code{True} \MID \code{False} \\
 \Exp &::=& \INT{\Int} \MID \READ{} \MID \VAR{\Var} \\
@@ -10402,7 +10418,8 @@ print( t[0] + t[2][0] if t[1] else 44 )
      &\MID& \CMP{\Exp}{\itm{cmp}}{\Exp} 
      \MID \BOOLOP{\itm{boolop}}{\Exp}{\Exp}\\
      &\MID& \BOOL{\itm{bool}} \MID \IF{\Exp}{\Exp}{\Exp} \\
-     &\MID& \TUPLE{\Exp^{+}} \MID \GET{\Exp}{\Exp}\\
+     &\MID& \TUPLE{\Exp^{+}} \MID \GET{\Exp}{\Exp} \\
+     &\MID& \LEN{\Exp}\\
 \Stmt{} &::=& \PRINT{\Exp} \MID \EXPR{\Exp} \\
      &\MID& \ASSIGN{\VAR{\Var}}{\Exp} \MID \IFSTMT{\Exp}{\Stmt^{+}}{\Stmt^{+}}\\
      &\MID& \WHILESTMT{\Exp}{\Stmt^{+}}\\
@@ -10413,22 +10430,47 @@ print( t[0] + t[2][0] if t[1] else 44 )
 \end{minipage}
 }
 \caption{The abstract syntax of \LangVec{}.}
-\label{fig:Rvec-syntax}
+\label{fig:Lvec-syntax}
 \end{figure}
 
-\python{UNDER CONSTRUCTION}
+Tuples raise several interesting new issues.  First, variable binding
+performs a shallow copy when dealing with tuples, which means that
+different variables can refer to the same tuple, that is, two
+variables can be \emph{aliases}\index{subject}{alias} for the same
+entity. Consider the following example in which both \code{t1} and
+\code{t2} refer to the same tuple value, whereas \code{t3} refers to a
+different tuple value with equal elements. The result of the
+program is \code{42}.
+
 
-\index{subject}{allocate}
-\index{subject}{heap allocate}
-Tuples are our first encounter with heap-allocated data, which raises
-several interesting issues. First, variable binding performs a
-shallow-copy when dealing with tuples, which means that different
-variables can refer to the same tuple, that is, different variables
-can be \emph{aliases} for the same entity. Consider the following
-example in which both \code{t1} and \code{t2} refer to the same tuple.
-Thus, the mutation through \code{t2} is visible when referencing the
-tuple from \code{t1}, so the result of this program is \code{42}.
-\index{subject}{alias}\index{subject}{mutation}
+\begin{center}
+\begin{minipage}{0.96\textwidth}
+{\if\edition\racketEd        
+\begin{lstlisting}
+(let ([t1 (vector 3 7)])
+  (let ([t2 t1])
+    (let ([t3 (vector 3 7)])
+       (if (and (eq? t1 t2) (not (eq? t1 t3)))
+          42
+          0))))
+\end{lstlisting}
+\fi}
+{\if\edition\pythonEd
+\begin{lstlisting}
+t1 = 3, 7
+t2 = t1
+t3 = 3, 7
+print( 42 if (t1 is t2) and not (t1 is t3) else 0)
+\end{lstlisting}
+\fi}
+\end{minipage}
+\end{center}
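A minimal Python sketch of the aliasing point in the example above, assuming CPython 3. The tuples are built at run time with `tuple([...])`, as the interpreter added in this commit does, because CPython may share identical constant tuple literals within one code unit, in which case the literal form of `t1 is t3` could come out true. The helper name `make_pair` is hypothetical and not part of the book's code.

```python
def make_pair():
    # Build the tuple at run time so that each call allocates a fresh object.
    return tuple([3, 7])

t1 = make_pair()
t2 = t1            # alias: the binding copies the reference, not the tuple
t3 = make_pair()   # equal elements, but a different object

# `is` compares identity, `==` compares the elements.
result = 42 if (t1 is t2) and not (t1 is t3) else 0
print(result)
```

Here `t1 == t3` still holds, which is why the example needs `is` rather than `==` to observe aliasing.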
+
+{\if\edition\racketEd        
+Whether two variables are aliased or not affects what happens
+when the underlying tuple is mutated\index{subject}{mutation}.
+Consider the following example in which \code{t1} and \code{t2}
+again refer to the same tuple value.
 \begin{center}
 \begin{minipage}{0.96\textwidth}
 \begin{lstlisting}
@@ -10439,14 +10481,20 @@ tuple from \code{t1}, so the result of this program is \code{42}.
 \end{lstlisting}
 \end{minipage}
 \end{center}
+The mutation through \code{t2} is visible when referencing the tuple
+from \code{t1}, so the result of this program is \code{42}.
+\fi}
 
-The next issue concerns the lifetime of tuples. Of course, they are
-created by the \code{vector} form, but when does their lifetime end?
-Notice that \LangVec{} does not include an operation for deleting
-tuples. Furthermore, the lifetime of a tuple is not tied to any notion
-of static scoping. For example, the following program returns
-\code{42} even though the variable \code{w} goes out of scope prior to
-the \code{vector-ref} that reads from the vector it was bound to.
+The next issue concerns the lifetime of tuples. When does their
+lifetime end?  Notice that \LangVec{} does not include an operation
+for deleting tuples. Furthermore, the lifetime of a tuple is not tied
+to any notion of static scoping.
+%
+{\if\edition\racketEd        
+%
+For example, the following program returns \code{42} even though the
+variable \code{w} goes out of scope prior to the \code{vector-ref}
+that reads from the vector it was bound to.
 \begin{center}
 \begin{minipage}{0.96\textwidth}
 \begin{lstlisting}
@@ -10458,31 +10506,51 @@ the \code{vector-ref} that reads from the vector it was bound to.
 \end{lstlisting}
 \end{minipage}
 \end{center}
-
+\fi}
+%
+{\if\edition\pythonEd
+%
+For example, the following program returns \code{42} even though the
+variable \code{x} goes out of scope when the function returns, prior
+to reading the tuple element at index zero. (We study the compilation
+of functions in Chapter~\ref{ch:Rfun}.)
+%  
+\begin{center}
+\begin{minipage}{0.96\textwidth}
+\begin{lstlisting}
+def f():
+    x = 42, 43
+    return x
+t = f()
+print( t[0] )
+\end{lstlisting}
+\end{minipage}
+\end{center}
+\fi}
+%
 From the perspective of programmer-observable behavior, tuples live
 forever. Of course, if they really lived forever, then many programs
-would run out of memory.\footnote{The \LangVec{} language does not have
-  looping or recursive functions, so it is nigh impossible to write a
-  program in \LangVec{} that will run out of memory. However, we add
-  recursive functions in the next Chapter!} A Racket implementation
-must therefore perform automatic garbage collection.
-
-Figure~\ref{fig:interp-Rvec} shows the definitional interpreter for the
-\LangVec{} language. We define the \code{vector}, \code{vector-length},
-\code{vector-ref}, and \code{vector-set!} operations for \LangVec{} in
-terms of the corresponding operations in Racket. One subtle point is
-that the \code{vector-set!}  operation returns the \code{\#<void>}
-value. The \code{\#<void>} value can be passed around just like other
-values inside an \LangVec{} program and a \code{\#<void>} value can be
-compared for equality with another \code{\#<void>} value. However,
-there are no other operations specific to the the \code{\#<void>}
-value in \LangVec{}. In contrast, Racket defines the \code{void?} predicate
-that returns \code{\#t} when applied to \code{\#<void>} and \code{\#f}
-otherwise.
+would run out of memory. The language's runtime system must therefore
+perform automatic garbage collection.
+
+Figure~\ref{fig:interp-Lvec} shows the definitional interpreter for the
+\LangVec{} language.
+%
+\racket{We define the \code{vector}, \code{vector-ref},
+  \code{vector-set!}, and \code{vector-length} operations for
+  \LangVec{} in terms of the corresponding operations in Racket.  One
+  subtle point is that the \code{vector-set!}  operation returns the
+  \code{\#<void>} value.}
+%
+\python{We define tuple creation, element access, and the \code{len}
+  operator for \LangVec{} in terms of the corresponding operations in
+  Python.}
+
 
 \begin{figure}[tbp]
+{\if\edition\racketEd
 \begin{lstlisting}
-(define interp-Rvec_class
+(define interp-Lvec_class
   (class interp-Lif_class
     (super-new)
 
@@ -10510,31 +10578,60 @@ otherwise.
         ))
     ))
 
-(define (interp-Rvec p)
-  (send (new interp-Rvec_class) interp-program p))
+(define (interp-Lvec p)
+  (send (new interp-Lvec_class) interp-program p))
+\end{lstlisting}
+\fi}
+%
+{\if\edition\pythonEd
+\begin{lstlisting}
+class InterpLtup(InterpLwhile):
+  def interp_cmp(self, cmp):
+    match cmp:
+      case Is():
+        return lambda x, y: x is y
+      case _:
+        return super().interp_cmp(cmp)      
+  def interp_exp(self, e, env):
+    match e:
+      case Tuple(es, Load()):
+        return tuple([self.interp_exp(e, env) for e in es])
+      case Subscript(tup, index, Load()):
+        t = self.interp_exp(tup, env)
+        n = self.interp_exp(index, env)
+        return t[n]
+      case _:
+        return super().interp_exp(e, env)
 \end{lstlisting}
+\fi}
 \caption{Interpreter for the \LangVec{} language.}
-\label{fig:interp-Rvec}
+\label{fig:interp-Lvec}
 \end{figure}
 
-Figure~\ref{fig:type-check-Rvec} shows the type checker for \LangVec{}, which
-deserves some explanation. When allocating a vector, we need to know
-which elements of the vector are pointers (i.e. are also vectors). We
-can obtain this information during type checking. The type checker in
-Figure~\ref{fig:type-check-Rvec} not only computes the type of an
-expression, it also wraps every \key{vector} creation with the form
-$(\key{HasType}~e~T)$, where $T$ is the vector's type.
+Figure~\ref{fig:type-check-Lvec} shows the type checker for
+\LangVec{}, which deserves some explanation. When allocating a tuple,
+we need to know which elements of the tuple are pointers (i.e., are
+also tuples) for garbage collection purposes. We can obtain this
+information during type checking. The type checker in
+Figure~\ref{fig:type-check-Lvec} not only computes the type of an
+expression, it also
 %
+\racket{wraps every tuple creation with the form $(\key{HasType}~e~T)$,
+  where $T$ is the vector's type.
 To create the s-expression for the \code{Vector} type in
-Figure~\ref{fig:type-check-Rvec}, we use the
+Figure~\ref{fig:type-check-Lvec}, we use the
 \href{https://docs.racket-lang.org/reference/quasiquote.html}{unquote-splicing
   operator} \code{,@} to insert the list \code{t*} without its usual
-start and end parentheses.  \index{subject}{unquote-slicing}
+start and end parentheses. \index{subject}{unquote-splicing}}
+%
+\python{records the type of each tuple expression in a new field
+  named \code{has\_type}.}
 
 
 \begin{figure}[tp]
+{\if\edition\racketEd
 \begin{lstlisting}[basicstyle=\ttfamily\scriptsize]
-(define type-check-Rvec_class
+(define type-check-Lvec_class
   (class type-check-Lif_class
     (super-new)
     (inherit check-type-equal?)
@@ -10589,11 +10686,39 @@ start and end parentheses.  \index{subject}{unquote-slicing}
           )))
     ))
 
-(define (type-check-Rvec p)
-  (send (new type-check-Rvec_class) type-check-program p))
+(define (type-check-Lvec p)
+  (send (new type-check-Lvec_class) type-check-program p))
 \end{lstlisting}
+\fi}
+{\if\edition\pythonEd
+\begin{lstlisting}
+class TypeCheckLtup(TypeCheckLwhile):
+  def type_check_exp(self, e, env):
+    match e:
+      case Compare(left, [cmp], [right]) if isinstance(cmp, Is):
+        l = self.type_check_exp(left, env)
+        r = self.type_check_exp(right, env)
+        check_type_equal(l, r, e)
+        return bool
+      case Tuple(es, Load()):
+        ts = [self.type_check_exp(e, env) for e in es]
+        e.has_type = tuple(ts)
+        return e.has_type
+      case Subscript(tup, Constant(index), Load()):
+        tup_ty = self.type_check_exp(tup, env)
+        index_ty = self.type_check_exp(Constant(index), env)
+        check_type_equal(index_ty, int, index)
+        match tup_ty:
+          case tuple(ts):
+            return ts[index]
+          case _:
+            raise Exception('error: expected a tuple, not ' + repr(tup_ty))
+      case _:
+        return super().type_check_exp(e, env)
+\end{lstlisting}
+\fi}
 \caption{Type checker for the \LangVec{} language.}
-\label{fig:type-check-Rvec}
+\label{fig:type-check-Lvec}
 \end{figure}
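To illustrate why the type checker records tuple types, the following sketch computes which element positions of a tuple hold pointers. It assumes that tuple types are represented as Python tuples of element types, following `e.has_type = tuple(ts)` in the figure above. The function name `pointer_positions` is hypothetical, and how the collector consumes this information is not shown here.

```python
def pointer_positions(tup_ty):
    """For a tuple type, report which element positions hold pointers.

    Assumes tuple types are Python tuples of element types,
    for example (int, bool, (int,)).
    """
    # An element is a pointer exactly when its type is itself a tuple type.
    return [isinstance(t, tuple) for t in tup_ty]

# Type of a 3-tuple whose last element is a 1-tuple.
ty = (int, bool, (int,))
print(pointer_positions(ty))
```

For the 3-tuple of Figure~\ref{fig:vector-eg}, only the element at index 2 is a pointer.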
 
 
@@ -10766,12 +10891,13 @@ within a 2-week time budget, we recommend a combination of options 1
 and 2, using separate strategies for the stack and the heap.
 
 Regarding the stack, we recommend using a separate stack for pointers,
-which we call a \emph{root stack}\index{subject}{root stack} (a.k.a. ``shadow
+which we call a \emph{root stack}\index{subject}{root stack}
+(a.k.a. ``shadow
 stack'')~\citep{Siebert:2001aa,Henderson:2002aa,Baker:2009aa}. That
 is, when a local variable needs to be spilled and is of type
-\code{(Vector $\Type_1 \ldots \Type_n$)}, then we put it on the root
-stack instead of the normal procedure call stack. Furthermore, we
-always spill vector-typed variables if they are live during a call to
+\racket{\code{Vector}}\python{\code{tuple}}, then we put it on the
+root stack instead of the normal procedure call stack. Furthermore, we
+always spill tuple-typed variables if they are live during a call to
 the collector, thereby ensuring that no pointers are in registers
 during a collection. Figure~\ref{fig:shadow-stack} reproduces the
 example from Figure~\ref{fig:copying-collector} and contrasts it with
@@ -10898,15 +11024,23 @@ significant changes to \code{select\_instructions},
 \code{print\_x86} and make minor changes in several more passes.  The
 following program will serve as our running example.  It creates two
 tuples, one nested inside the other. Both tuples have length one. The
-program accesses the element in the inner tuple tuple via two vector
-references.
+program accesses the element in the inner tuple.
 % tests/s2_17.rkt
+{\if\edition\racketEd
 \begin{lstlisting}
 (vector-ref (vector-ref (vector (vector 42)) 0) 0)
 \end{lstlisting}
+\fi}
+{\if\edition\pythonEd
+\begin{lstlisting}
+((42,),)[0][0]
+\end{lstlisting}
+\fi}
+
 
+{\if\edition\racketEd
 \section{Shrink}
-\label{sec:shrink-Rvec}
+\label{sec:shrink-Lvec}
 
 Recall that the \code{shrink} pass translates the primitives operators
 into a smaller set of primitives. Because this pass comes after type
@@ -10914,10 +11048,13 @@ checking, but before the passes that require the type information in
 the \code{HasType} AST nodes, the \code{shrink} pass must be modified
 to wrap \code{HasType} around each AST node that it generates.
 
+\fi}
 
 \section{Expose Allocation}
 \label{sec:expose-allocation}
 
+\python{UNDER CONSTRUCTION}
+
 The pass \code{expose-allocation} lowers the \code{vector} creation
 form into a conditional call to the collector followed by the
 allocation.  We choose to place the \code{expose-allocation} pass
@@ -11020,7 +11157,7 @@ Figure~\ref{fig:expose-alloc-output} shows the output of the
 
 
 \section{Remove Complex Operands}
-\label{sec:remove-complex-opera-Rvec}
+\label{sec:remove-complex-opera-Lvec}
 
 The new forms \code{collect}, \code{allocate}, and \code{global-value}
 should all be treated as complex operands.
@@ -11028,7 +11165,7 @@ should all be treated as complex operands.
 %% \code{HasType} is needed and the case for \code{Prim} needs to be
 %% handled carefully to prevent the \code{Prim} node from being separated
 %% from its enclosing \code{HasType}.
-Figure~\ref{fig:Rvec-anf-syntax}
+Figure~\ref{fig:Lvec-anf-syntax}
 shows the grammar for the output language \LangVecANF{} of this
 pass, which is \LangVec{} in administrative normal form.
 
@@ -11055,7 +11192,7 @@ pass, which is \LangVec{} in administrative normal form.
 \end{minipage}
 }
 \caption{\LangVecANF{} is \LangVec{} in administrative normal form (ANF).}
-\label{fig:Rvec-anf-syntax}
+\label{fig:Lvec-anf-syntax}
 \end{figure}
 
 
@@ -11474,11 +11611,11 @@ conclusion:
 
 \begin{figure}[p]
 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
-\node (Rvec) at (0,2)  {\large \LangVec{}};
-\node (Rvec-2) at (3,2)  {\large \LangVec{}};
-\node (Rvec-3) at (6,2)  {\large \LangVec{}};
-\node (Rvec-4) at (9,2)  {\large \LangVec{}};
-\node (Rvec-5) at (12,2)  {\large \LangAlloc{}};
+\node (Lvec) at (0,2)  {\large \LangVec{}};
+\node (Lvec-2) at (3,2)  {\large \LangVec{}};
+\node (Lvec-3) at (6,2)  {\large \LangVec{}};
+\node (Lvec-4) at (9,2)  {\large \LangVec{}};
+\node (Lvec-5) at (12,2)  {\large \LangAlloc{}};
 \node (C2-4) at (3,0)  {\large \LangCVec{}};
 
 \node (x86-2) at (3,-2)  {\large \LangXGlobalVar{}};
@@ -11489,12 +11626,12 @@ conclusion:
 \node (x86-5) at (9,-4) {\large \LangXGlobal{}};
 
 
-%\path[->,bend left=15] (Rvec) edge [above] node {\ttfamily\footnotesize type-check} (Rvec-2);
-\path[->,bend left=15] (Rvec) edge [above] node {\ttfamily\footnotesize shrink} (Rvec-2);
-\path[->,bend left=15] (Rvec-2) edge [above] node {\ttfamily\footnotesize uniquify} (Rvec-3);
-\path[->,bend left=15] (Rvec-3) edge [above] node {\ttfamily\footnotesize expose\_alloc.} (Rvec-4);
-\path[->,bend left=15] (Rvec-4) edge [above] node {\ttfamily\footnotesize remove\_complex.} (Rvec-5);
-\path[->,bend left=20] (Rvec-5) edge [left] node {\ttfamily\footnotesize explicate\_control} (C2-4);
+%\path[->,bend left=15] (Lvec) edge [above] node {\ttfamily\footnotesize type-check} (Lvec-2);
+\path[->,bend left=15] (Lvec) edge [above] node {\ttfamily\footnotesize shrink} (Lvec-2);
+\path[->,bend left=15] (Lvec-2) edge [above] node {\ttfamily\footnotesize uniquify} (Lvec-3);
+\path[->,bend left=15] (Lvec-3) edge [above] node {\ttfamily\footnotesize expose\_alloc.} (Lvec-4);
+\path[->,bend left=15] (Lvec-4) edge [above] node {\ttfamily\footnotesize remove\_complex.} (Lvec-5);
+\path[->,bend left=20] (Lvec-5) edge [left] node {\ttfamily\footnotesize explicate\_control} (C2-4);
 \path[->,bend left=15] (C2-4) edge [right] node {\ttfamily\footnotesize select\_instr.} (x86-2);
 \path[->,bend right=15] (x86-2) edge [left] node {\ttfamily\footnotesize uncover\_live} (x86-2-1);
 \path[->,bend right=15] (x86-2-1) edge [below] node {\ttfamily\footnotesize build\_inter.} (x86-2-2);
@@ -11503,10 +11640,10 @@ conclusion:
 \path[->,bend left=15] (x86-4) edge [right] node {\ttfamily\footnotesize print\_x86} (x86-5);
 \end{tikzpicture}
 \caption{Diagram of the passes for \LangVec{}, a language with tuples.}
-\label{fig:Rvec-passes}
+\label{fig:Lvec-passes}
 \end{figure}
 
-Figure~\ref{fig:Rvec-passes} gives an overview of all the passes needed
+Figure~\ref{fig:Lvec-passes} gives an overview of all the passes needed
 for the compilation of \LangVec{}.
 
 \section{Challenge: Simple Structures}
@@ -11552,7 +11689,7 @@ this case the definition of a \code{point} type.
 \end{minipage}
 }
 \caption{The concrete syntax of \LangStruct{}, extending \LangVec{}
-  (Figure~\ref{fig:Rvec-concrete-syntax}).}
+  (Figure~\ref{fig:Lvec-concrete-syntax}).}
 \label{fig:r3s-concrete-syntax}
 \end{figure}
 
@@ -11597,7 +11734,7 @@ mark. The following example uses \code{set-point-x!} to change the
 \section{Challenge: Arrays}
 \label{sec:arrays}
 
-In Chapter~\ref{ch:Rvec} we studied tuples, that is, sequences of
+In Chapter~\ref{ch:Lvec} we studied tuples, that is, sequences of
 elements whose length is determined at compile-time and where each
 element of a tuple may have a different type (they are
 heterogeous). This challenge is also about sequences, but this time
@@ -11610,7 +11747,7 @@ they are both represented by vectors. However, Typed Racket
 distinguishes between tuples and arrays: the \code{Vector} type is for
 tuples and the \code{Vectorof} type is for arrays.
 %
-Figure~\ref{fig:Rvecof-concrete-syntax} defines the concrete syntax
+Figure~\ref{fig:Lvecof-concrete-syntax} defines the concrete syntax
 for \LangArray{}, extending \LangLoop{} with the \code{Vectorof} type
 and the \code{make-vector} primitive operator for creating an array,
 whose arguments are the length of the array and an initial value for
@@ -11656,7 +11793,7 @@ inner-product of two arrays (Figure~\ref{fig:inner-product}).
 \end{minipage}
 }
 \caption{The concrete syntax of \LangArray{}, extending \LangLoop{} (Figure~\ref{fig:Lwhile-concrete-syntax}).}
-\label{fig:Rvecof-concrete-syntax}
+\label{fig:Lvecof-concrete-syntax}
 \end{figure}
 
 
@@ -11687,7 +11824,7 @@ inner-product of two arrays (Figure~\ref{fig:inner-product}).
 
 
 The type checker for \LangArray{} is define in
-Figure~\ref{fig:type-check-Rvecof}. The result type of
+Figure~\ref{fig:type-check-Lvecof}. The result type of
 \code{make-vector} is \code{(Vectorof T)} where \code{T} is the type
 of the intializing expression.  The length expression is required to
 have type \code{Integer}. The type checking of the operators
@@ -11704,7 +11841,7 @@ predicate.
 
 \begin{figure}[tbp]
 \begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
-(define type-check-Rvecof_class
+(define type-check-Lvecof_class
   (class type-check-Rwhile_class
     (super-new)
     (inherit check-type-equal?)
@@ -11753,38 +11890,38 @@ predicate.
           [else ((super type-check-exp env) e)])))
     ))
 
-(define (type-check-Rvecof p)
-  (send (new type-check-Rvecof_class) type-check-program p))
+(define (type-check-Lvecof p)
+  (send (new type-check-Lvecof_class) type-check-program p))
 \end{lstlisting}
 \caption{Type checker for the \LangArray{} language.}
-\label{fig:type-check-Rvecof}
+\label{fig:type-check-Lvecof}
 \end{figure}
 
 The interpreter for \LangArray{} is defined in
-Figure~\ref{fig:interp-Rvecof}.  The \code{make-vector} operator is
+Figure~\ref{fig:interp-Lvecof}.  The \code{make-vector} operator is
 implemented with Racket's \code{make-vector} function and
 multiplication is \code{fx*}, multiplication for \code{fixnum}
 integers.
 
 \begin{figure}[tbp]
 \begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
-(define interp-Rvecof_class
+(define interp-Lvecof_class
   (class interp-Rwhile_class
     (super-new)
 
     (define/override (interp-op op)
-      (verbose "Rvecof/interp-op" op)
+      (verbose "Lvecof/interp-op" op)
       (match op
         ['make-vector make-vector]
         ['* fx*]
         [else (super interp-op op)]))
     ))
 
-(define (interp-Rvecof p)
-  (send (new interp-Rvecof_class) interp-program p))
+(define (interp-Lvecof p)
+  (send (new interp-Lvecof_class) interp-program p))
 \end{lstlisting}
 \caption{Interpreter for \LangArray{}.}
-\label{fig:interp-Rvecof}
+\label{fig:interp-Lvecof}
 \end{figure}
 
 
@@ -12075,7 +12212,7 @@ inside each other.
 \]
 \end{minipage}
 }
-\caption{The concrete syntax of \LangFun{}, extending \LangVec{} (Figure~\ref{fig:Rvec-concrete-syntax}).}
+\caption{The concrete syntax of \LangFun{}, extending \LangVec{} (Figure~\ref{fig:Lvec-concrete-syntax}).}
 \label{fig:Rfun-concrete-syntax}
 \end{figure}
 
@@ -12098,7 +12235,7 @@ inside each other.
 \]
 \end{minipage}
 }
-\caption{The abstract syntax of \LangFun{}, extending \LangVec{} (Figure~\ref{fig:Rvec-syntax}).}
+\caption{The abstract syntax of \LangFun{}, extending \LangVec{} (Figure~\ref{fig:Lvec-syntax}).}
 \label{fig:Rfun-syntax}
 \end{figure}
 
@@ -12143,7 +12280,7 @@ update the \code{lambda} values to use the top-level environment.
 \begin{figure}[tp]
 \begin{lstlisting}
 (define interp-Rfun_class
-  (class interp-Rvec_class
+  (class interp-Lvec_class
     (super-new)
 
     (define/override ((interp-exp env) e)
@@ -12185,6 +12322,8 @@ update the \code{lambda} values to use the top-level environment.
 (define (interp-Rfun p)
   (send (new interp-Rfun_class) interp-program p))
 \end{lstlisting}
+
+
 \caption{Interpreter for the \LangFun{} language.}
 \label{fig:interp-Rfun}
 \end{figure}
@@ -12197,7 +12336,7 @@ The type checker for \LangFun{} is in Figure~\ref{fig:type-check-Rfun}.
 \begin{figure}[tp]
 \begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
 (define type-check-Rfun_class
-  (class type-check-Rvec_class
+  (class type-check-Lvec_class
     (super-new)
     (inherit check-type-equal?)
 
@@ -12963,7 +13102,7 @@ except the \code{retq} is replaced with \code{jmp *$\itm{arg}$}.
 Regarding function definitions, you will need to generate a prelude
 and conclusion for each one. This code is similar to the prelude and
 conclusion that you generated for the \code{main} function in
-Chapter~\ref{ch:Rvec}. To review, the prelude of every function
+Chapter~\ref{ch:Lvec}. To review, the prelude of every function
 should carry out the following steps.
 \begin{enumerate}
 \item Start with \code{.global} and \code{.align} directives followed
@@ -13251,7 +13390,7 @@ applied. An efficient solution to the problem, due to
 free variables together with the function pointer for the lambda's
 code, an arrangement called a \emph{flat closure} (which we shorten to
 just ``closure'').  \index{subject}{closure}\index{subject}{flat closure} Fortunately,
-we have all the ingredients to make closures, Chapter~\ref{ch:Rvec}
+we have all the ingredients to make closures, Chapter~\ref{ch:Lvec}
 gave us vectors and Chapter~\ref{ch:Rfun} gave us function
 pointers. The function pointer resides at index $0$ and the
 values for the free variables will fill in the rest of the vector.

+ 2 - 0
defs.tex

@@ -168,6 +168,8 @@
 \newcommand{\CEQ}[2]{#1~\code{==}~#2}
 \newcommand{\CGET}[2]{#1 \LS #2 \RS}
 \newcommand{\GET}[2]{\key{Subscript}\LP #1 \code{,} #2 \code{,} \code{Load()} \RP}
+\newcommand{\CLEN}[1]{\code{len}\LP #1 \RP}
+\newcommand{\LEN}[1]{\code{Call}\LP \code{Name('len')} \code{,} \LS #1 \RS \RP}
 \newcommand{\PUT}[2]{\key{Subscript}\LP #1 \code{,} #2 \code{,} \code{Store()} \RP}
 \newcommand{\TUPLE}[1]{\key{Tuple}\LP #1 \code{,} \code{Load()} \RP}
 \newcommand{\BOOLOP}[3]{\key{BoolOp}\LP #1 \code{,} \LS #2 \code{,} #3 \RS \RP}