chapter2 wip

Andrew Tolmach 4 years ago
parent
commit
03ab5610f7
1 changed file with 362 additions and 53 deletions
  1. 362 53
      book.tex

+ 362 - 53
book.tex

@@ -103,11 +103,10 @@ showstringspaces=false
  language=[Objective]Caml,
  basicstyle=\ttfamily\small\color{blue},
  columns=flexible,
-  escapechar={},
+  escapechar=~,
  showstringspaces=false
 }
 
-
 \newtheorem{theorem}{Theorem}
 \newtheorem{lemma}[theorem]{Lemma}
 \newtheorem{corollary}[theorem]{Corollary}
@@ -887,13 +886,15 @@ Appendix~\ref{appendix:utilities} for more details.
    | SNum of Int64.t
            (* 64-bit integers *)
    | SSym of string
-            (* non-digit character sequence delimited by white space *)
+            (* character sequence starting with non-digit,
+               delimited by white space *)
    | SString of string
            (* arbitrary character sequence delimited by double quotes *)
  \end{lstlisting}
  The generic S-expression parser handles (nestable) comments delimited by
-  curly braces (\code{\{} and \code{\}}).  Symbols can contain any
-  non-digit, non-whitespace characters except parentheses, curly braces, and
+  curly braces (\code{\{} and \code{\}}).  Symbols must start with a non-digit
+  character and can contain any
+  non-whitespace characters except parentheses, curly braces, and
  the back tick (\code{\`}); this last exclusion is handy when we want to
  generate internal names during compilation and be sure they don't clash
  with a user-defined symbol.
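As a reading aid, the symbol rule stated in this hunk can be written as a small predicate. This is only a sketch: `is_symbol` and its helpers are hypothetical names, not part of the support code, and the actual parser may treat further characters (for example the double quote) specially.

```ocaml
(* Hypothetical helpers; they encode only the rule as stated in the text. *)
let is_digit c = '0' <= c && c <= '9'

let is_whitespace c = c = ' ' || c = '\t' || c = '\n' || c = '\r'

(* Characters that may never appear inside a symbol:
   whitespace, parentheses, curly braces, and the back tick. *)
let is_excluded c = is_whitespace c || String.contains "(){}`" c

(* A symbol is non-empty, starts with a non-digit, and contains
   no excluded character. *)
let is_symbol (s : string) : bool =
  let n = String.length s in
  let rec ok i = i >= n || (not (is_excluded s.[i]) && ok (i + 1)) in
  n > 0 && not (is_digit s.[0]) && ok 0
```

Under this reading, `x.1` is a legal symbol while `` x`1 `` and `42x` are not.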
@@ -1641,19 +1642,30 @@ We hope to give enough hints that the well-prepared reader, together
 with a few friends, can implement a compiler from \LangVar{} to x86 in
 a couple weeks.  To give the reader a feeling for the scale of this
 first compiler, the instructor solution for the \LangVar{} compiler is
-approximately 500 lines of code.
+approximately 500 lines of code. \ocaml{For the OCaml-based course,
+  several pieces of the compiler will be provided for you, leaving enough
+  work for a week-long assignment. The instructor solution for
+  the tasks left to you is under 200 lines of code.
+  However, in return for not writing so much code,
+  you will need to \emph{read} more existing code.}
 
 \section{The \LangVar{} Language}
 \label{sec:s0}
 \index{variable}
 
 The \LangVar{} language extends the \LangInt{} language with variable
-definitions.  The concrete syntax of the \LangVar{} language is defined by
+definitions. The concrete syntax of the \LangVar{} language is defined by
 the grammar in Figure~\ref{fig:r1-concrete-syntax} and the abstract
-syntax is defined in Figure~\ref{fig:r1-syntax}.  The non-terminal
-\Var{} may be any Racket identifier. As in \LangInt{}, \key{read} is a
+syntax is defined in Figure~\ref{fig:r1-syntax}.  \ocaml{For the OCaml
+  version, we don't feel the need to match the syntax of Racket exactly,
+  so we can simplify the concrete syntax of \key{let} bindings.}   The non-terminal
+\Var{} may be any Racket identifier. \ocaml{For OCaml, it can be any S-expression symbol.}
+As in \LangInt{}, \key{read} is a
 nullary operator, \key{-} is a unary operator, and \key{+} is a binary
-operator.  Similar to \LangInt{}, the abstract syntax of \LangVar{} includes the
+operator.  \ocaml{We also add \key{-} as a binary subtraction operator in
+  the concrete syntax, but not in the abstract syntax: 
+  we will ``de-sugar'' subtraction into a combination
+  of addition and negation.} Similar to \LangInt{}, the abstract syntax of \LangVar{} includes the
 \key{Program} struct to mark the top of the program.
 %% The $\itm{info}$
 %% field of the \key{Program} structure contains an \emph{association
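To illustrate the de-sugaring of binary subtraction mentioned in this hunk, here is a minimal sketch. The types repeat the abstract syntax from the figure added later in this diff; `mk_sub` is a hypothetical helper name, and the real parser in the support code may be organized differently.

```ocaml
type primop = Read | Neg | Add
type var = string
type exp =
  | Int of int64
  | Prim of primop * exp list
  | Var of var
  | Let of var * exp * exp

(* Hypothetical helper: the AST for the concrete form (- e1 e2).
   Subtraction has no primop of its own, so e1 - e2 is built
   as e1 + (- e2). *)
let mk_sub (e1 : exp) (e2 : exp) : exp =
  Prim (Add, [e1; Prim (Neg, [e2])])

(* The concrete expression (- 50 x) *)
let example = mk_sub (Int 50L) (Var "x")
```

Because the abstract syntax never sees binary subtraction, later passes only need to handle `Read`, `Neg`, and `Add`.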
@@ -1675,7 +1687,20 @@ exhibit several compilation techniques.
 \]
 \end{minipage}
 }
-\caption{The concrete syntax of \LangVar{}.}
+\begin{ocamlx}
+\fbox{
+\begin{minipage}{0.96\textwidth}
+\[
+\begin{array}{rcl}
+  \Exp &::=& \Int \mid \CREAD{} \mid \CNEG{\Exp} \mid \CADD{\Exp}{\Exp} \mid \CSUB{\Exp}{\Exp}\\
+       &\mid& \Var \mid \code{(let $\Var$ $\Exp$ $\Exp$)}\\
+  \LangVar{} &::=& \Exp
+\end{array}
+\]
+\end{minipage}
+}
+\end{ocamlx}
+\caption{The concrete syntax of \LangVar{} \ocaml{in OCaml}.}
 \label{fig:r1-concrete-syntax}
 \end{figure}
 
@@ -1693,6 +1718,19 @@ exhibit several compilation techniques.
 \]
 \end{minipage}
 }
+\begin{lstlisting}[style=ocaml,frame=single]
+type primop = 
+   Read
+ | Neg
+ | Add
+type var = string
+type exp = 
+   Int of int64  
+ | Prim of primop * exp list
+ | Var of var
+ | Let of var * exp * exp
+type 'info program = Program of 'info * exp
+\end{lstlisting}
 \caption{The abstract syntax of \LangVar{}.}
 \label{fig:r1-syntax}
 \end{figure}
@@ -1705,11 +1743,17 @@ Figure~\ref{fig:r1-syntax}.  The concrete syntax for \key{let} is
 \begin{lstlisting}
 (let ([|$\itm{var}$| |$\itm{exp}$|]) |$\itm{exp}$|)
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(let ~$\itm{var}$~ ~$\itm{exp}$~ ~$\itm{exp}$~)
+\end{lstlisting}
 For example, the following program initializes \code{x} to $32$ and then
 evaluates the body \code{(+ 10 x)}, producing $42$.
 \begin{lstlisting}
 (let ([x (+ 12 20)]) (+ 10 x))
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(let x (+ 12 20) (+ 10 x))
+\end{lstlisting}
 When there are multiple \key{let}'s for the same variable, the closest
 enclosing \key{let} is used. That is, variable definitions overshadow
 prior definitions. Consider the following program with two \key{let}'s
@@ -1717,6 +1761,9 @@ that define variables named \code{x}. Can you figure out the result?
 \begin{lstlisting}
 (let ([x 32]) (+ (let ([x 10]) x) x))
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(let x 32 (+ (let x 10 x) x))
+\end{lstlisting}
 For the purposes of depicting which variable uses correspond to which
 definitions, the following shows the \code{x}'s annotated with
 subscripts to distinguish them. Double check that your answer for the
@@ -1725,6 +1772,9 @@ program.
 \begin{lstlisting}
 (let ([x|$_1$| 32]) (+ (let ([x|$_2$| 10]) x|$_2$|) x|$_1$|))
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(let x~$_1$~ 32 (+ (let x~$_2$~ 10 x~$_2$~) x~$_1$~))
+\end{lstlisting}
 The initializing expression is always evaluated before the body of the
 \key{let}, so in the following, the \key{read} for \code{x} is
 performed before the \key{read} for \code{y}. Given the input
@@ -1732,10 +1782,23 @@ $52$ then $10$, the following produces $42$ (not $-42$).
 \begin{lstlisting}
 (let ([x (read)]) (let ([y (read)]) (+ x (- y))))
 \end{lstlisting}
+\begin{lstlisting}[style=ocaml]
+(let x (read) (let y (read) (+ x (- y))))
+\end{lstlisting}
 
 \subsection{Extensible Interpreters via Method Overriding}
 \label{sec:extensible-interp}
 
+\begin{ocamlx}
+  We are not going to bother with making our OCaml interpreters
+  extensible, although there are several mechanisms in OCaml that
+  we could use to achieve this. The languages involved here just
+  don't seem big enough to warrant the added complexity.
+  We will, however, break out the definition and interpretation of
+  primops into a separate module, so that this can be easily shared among
+  different languages.
+\end{ocamlx}
+
 To prepare for discussing the interpreter for \LangVar{}, we need to
 explain why we choose to implement the interpreter using
 object-oriented programming, that is, as a collection of methods
@@ -1885,10 +1948,16 @@ extensible way.
 \end{wrapfigure}
 
 Having justified the use of classes and methods to implement
-interpreters, we turn to the definitional interpreter for \LangVar{}
-in Figure~\ref{fig:interp-Rvar}. It is similar to the interpreter for
+interpreters \ocaml{(or not)}, we turn to the definitional interpreter for \LangVar{}
+in Figure~\ref{fig:interp-Rvar} \ocaml{(Figure~\ref{fig:interp-Rvar-ocaml})}.
+It is similar to the interpreter for
 \LangInt{} but adds two new \key{match} cases for variables and
-\key{let}.  For \key{let} we need a way to communicate the value bound
+\key{let}. \ocaml{Also, the code for performing primops has been split out
+  into a separate function. We rely on the fact that
+  \code{List.map} processes list elements from left to right to
+  enforce the intended order of evaluation of primop subexpressions.}
+
+For \key{let} we need a way to communicate the value bound
 to a variable to all the uses of the variable. To accomplish this, we
 maintain a mapping from variables to values. Throughout the compiler
 we often need to map variables to information about them. We refer to
@@ -1899,7 +1968,7 @@ these mappings as
 %
 For simplicity, we use an association list (alist) to represent the
 environment. The sidebar to the right gives a brief introduction to
-alists and the \code{racket/dict} package.  The \code{interp-exp}
+alists and the \code{racket/dict} package. The \code{interp-exp}
 function takes the current environment, \code{env}, as an extra
 parameter.  When the interpreter encounters a variable, it finds the
 corresponding value using the \code{dict-ref} function.  When the
@@ -1908,6 +1977,51 @@ expression, extends the environment with the result value bound to the
 variable, using \code{dict-set}, then evaluates the body of the
 \key{Let}.
 
+\begin{ocamlx}
+  In OCaml, we thread environments in the same way, but
+  it is convenient to represent environments using
+  the \code{Map} library module, which provides efficient
+  mappings from keys to values (using balanced binary trees,
+  although that is an implementation detail we don't need to
+  know about). \code{Map.Make} is an example of a module that
+  is \emph{parameterized} by another module (here, one supplying the
+  key type and its comparison function); such a parameterized module
+  is called a \emph{functor}.  Here we \emph{apply} the functor
+  \code{Map.Make}, thereby defining a module \code{Env} that provides operations
+  specialized to \code{string} keys (suitable for variables).
+  The type of environments is written \code{'a Env.t}; it is
+  parametric in the type \code{'a} of values stored in the map.
+  Here we will be using \LangVar{}
+  values, i.e. \code{int64}s, so the type is \code{int64 Env.t}.  
+  \code{Env.empty} represents an empty environment.
+  \code{Env.find $x$ $env$} returns the value associated with
+  variable $x$ in $env$ (throwing an exception if $x$ is not found). 
+  \code{Env.add $x$ $v$ $env$} produces a new environment
+  that is the same as $env$ except that variable $x$ is associated with
+  value $v$. Note that these operations are \emph{pure}; that is, they
+  do not mutate any environment.
+\end{ocamlx}
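The `Env` operations described above can be tried out in isolation. The sketch below applies `Map.Make` directly to the standard `String` module, which is equivalent to the `StringKey` module used in the interpreter figure; the bindings `env0`, `env1`, and `env2` are illustrative names only.

```ocaml
module Env = Map.Make (String)

(* An empty environment mapping variables to int64 values *)
let env0 : int64 Env.t = Env.empty

(* Env.add returns a new map; env0 itself is left unchanged *)
let env1 = Env.add "x" 32L env0

(* Re-binding x shadows the old binding in env2 only *)
let env2 = Env.add "x" 10L env1

let () =
  assert (Env.is_empty env0);
  assert (Env.find "x" env1 = 32L);
  assert (Env.find "x" env2 = 10L);
  (* Env.find raises Not_found for a missing key;
     Env.find_opt returns an option instead *)
  assert (Env.find_opt "y" env2 = None)
```

Since every operation is pure, the interpreter can pass an extended environment to the body of a `let` without affecting the environment seen by sibling expressions.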
+
+\begin{ocamlx}
+  The OCaml code for \LangVar{} ASTs, concrete parsing and printing (for debug purposes),
+  and interpretation is in file \texttt{RVar.ml}, which also imports
+  from file \texttt{Primops.ml}.  These files also contain code for
+  static checking of \LangVar{} programs. The checker makes sure that
+  (i) every use of a variable is in the scope of a corresponding \code{let} binding;
+  and (ii) each primop is applied to the correct number of arguments.
+
+  Note that if a source program fails the checker for reason (i), this is a static user error
+  that should be reported as such. (Violations of (ii) in user programs
+  should be caught by the parser; parse errors are always reported as user errors.)
+  Your compiler should stop trying to process a file as soon as it reports a static user
+  error! (That's what the provided test driver will do.)
+
+  However, if a program initially passes
+  the checker but is subsequently transformed by the compiler and then
+  fails a re-check, this indicates that the problem is the compiler's fault.
+  In this case, the compiler itself should halt with a suitable error message.
+  The checker has a boolean flag to distinguish these cases.
+\end{ocamlx}
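The two static checks described above might look roughly as follows. This is a sketch under stated assumptions, not the checker shipped in `RVar.ml`: the names `check_exp`, `arity`, and `Check_error` are hypothetical, and the boolean flag that distinguishes user errors from compiler errors is omitted.

```ocaml
type primop = Read | Neg | Add
type var = string
type exp =
  | Int of int64
  | Prim of primop * exp list
  | Var of var
  | Let of var * exp * exp

module StringSet = Set.Make (String)

(* Hypothetical exception for reporting a failed check *)
exception Check_error of string

(* Number of arguments each primop expects *)
let arity = function Read -> 0 | Neg -> 1 | Add -> 2

(* bound is the set of variables in scope at this point *)
let rec check_exp (bound : StringSet.t) (e : exp) : unit =
  match e with
  | Int _ -> ()
  | Var x ->
      (* check (i): every use must be inside a binding let *)
      if not (StringSet.mem x bound) then
        raise (Check_error ("unbound variable " ^ x))
  | Prim (op, args) ->
      (* check (ii): correct number of arguments *)
      if List.length args <> arity op then
        raise (Check_error "primop arity mismatch");
      List.iter (check_exp bound) args
  | Let (x, e1, e2) ->
      (* x is in scope in the body e2 but not in its initializer e1 *)
      check_exp bound e1;
      check_exp (StringSet.add x bound) e2
```

For instance, `check_exp StringSet.empty (Let ("x", Int 1L, Var "x"))` returns normally, while checking `Var "y"` in the empty scope raises `Check_error`.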
+
 \begin{figure}[tp]
 \begin{lstlisting}
 (define interp-Rvar-class
@@ -1940,6 +2054,31 @@ variable, using \code{dict-set}, then evaluates the body of the
 \caption{Interpreter for the \LangVar{} language.}
 \label{fig:interp-Rvar}
 \end{figure}
+\begin{figure}[tp]
+\begin{lstlisting}[style=ocaml]
+type value = int64
+  
+let interp_primop (op:primop) (args: value list) : value = 
+  match op,args with
+    Read,[] -> read_int()
+  | Neg,[v] -> Int64.neg v
+  | Add,[v1;v2] -> Int64.add v1 v2
+  | _,_ -> assert false (* arity mismatch *)
+
+module StringKey = struct type t = string let compare = String.compare end
+module Env = Map.Make(StringKey)
+
+let rec interp_exp (env:value Env.t) = function
+    Int n -> n
+  | Prim(op,args) -> interp_primop op (List.map (interp_exp env) args)
+  | Var x -> Env.find x env
+  | Let (x,e1,e2) -> interp_exp (Env.add x (interp_exp env e1) env) e2
+
+let interp_program (Program(_,e)) = interp_exp Env.empty e
+\end{lstlisting}
+\caption{\ocaml{OCaml interpreter for the \LangVar{} language.}}
+\label{fig:interp-Rvar-ocaml}
+\end{figure}
 
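To check the earlier claims about shadowing and evaluation order against the OCaml interpreter in the figure, the following sketch re-creates it in a self-contained form. One deliberate difference, which is an assumption of this sketch and not how the support code works: `read` takes its values from a list passed to the hypothetical `interp` function instead of standard input, so the results are deterministic.

```ocaml
type primop = Read | Neg | Add
type exp =
  | Int of int64
  | Prim of primop * exp list
  | Var of string
  | Let of string * exp * exp

module Env = Map.Make (String)

(* Assumption: inputs are supplied as a list rather than read from stdin *)
let interp (inputs : int64 list) (e : exp) : int64 =
  let pending = ref inputs in
  let read () =
    match !pending with
    | v :: rest -> pending := rest; v
    | [] -> failwith "no more input" in
  let interp_primop op args =
    match op, args with
    | Read, [] -> read ()
    | Neg, [v] -> Int64.neg v
    | Add, [v1; v2] -> Int64.add v1 v2
    | _ -> failwith "arity mismatch" in
  let rec go env = function
    | Int n -> n
    | Prim (op, args) -> interp_primop op (List.map (go env) args)
    | Var x -> Env.find x env
    | Let (x, e1, e2) -> go (Env.add x (go env e1) env) e2 in
  go Env.empty e

(* (let x 32 (+ (let x 10 x) x)) *)
let shadowing =
  Let ("x", Int 32L, Prim (Add, [Let ("x", Int 10L, Var "x"); Var "x"]))

(* (let x (read) (let y (read) (+ x (- y)))) *)
let two_reads =
  Let ("x", Prim (Read, []),
       Let ("y", Prim (Read, []),
            Prim (Add, [Var "x"; Prim (Neg, [Var "y"])])))

let () =
  assert (interp [] shadowing = 42L);
  assert (interp [52L; 10L] two_reads = 42L)
```

The inner `x` evaluates to 10 and the outer one to 32, and the first input value is bound to `x`, matching the discussion of the concrete examples.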
 The goal for this chapter is to implement a compiler that translates
 any program $P_1$ written in the \LangVar{} language into an x86 assembly
@@ -2002,7 +2141,8 @@ integer constant (called \emph{immediate value}\index{immediate
 \Arg &::=&  \key{\$}\Int \mid \key{\%}\Reg \mid \Int\key{(}\key{\%}\Reg\key{)}\\
 \Instr &::=& \key{addq} \; \Arg\key{,} \Arg \mid
       \key{subq} \; \Arg\key{,} \Arg \mid
-      \key{negq} \; \Arg \mid \key{movq} \; \Arg\key{,} \Arg \mid \\
+      \key{negq} \; \Arg \mid \\
+  &&  \key{movq} \; \Arg\key{,} \Arg \mid \ocaml{\key{movabsq} \; \Arg\key{,} \Arg \mid} \\
   &&  \key{callq} \; \mathit{label} \mid
       \key{pushq}\;\Arg \mid \key{popq}\;\Arg \mid \key{retq} \mid \key{jmp}\,\itm{label} \\
   && \itm{label}\key{:}\; \Instr \\
@@ -2062,8 +2202,10 @@ returning the integer in \key{rax} to the operating system. The
 operating system interprets this integer as the program's exit
 code. By convention, an exit code of 0 indicates that a program
 completed successfully, and all other exit codes indicate various
-errors. Nevertheless, in this book we return the result of the program
-as the exit code.
+errors. \ocaml{Also, exit codes are unsigned bytes, so they cannot accurately represent
+arbitrary \code{int64}s.} Nevertheless, in this book we return the result of the program
+as the exit code. \ocaml{(Incidentally, if you run a program at the Unix shell
+  prompt, you can retrieve its exit code by typing \texttt{echo \$?} as the very next command.)}
 
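Because exit codes are unsigned bytes, the value reported by the shell is only the low 8 bits of the program's result. The sketch below computes the exit code a test driver might expect for a given `int64` result; `exit_code` is a hypothetical helper, not a function from the support code.

```ocaml
(* Hypothetical helper: the exit code observed for a program whose
   result is v, i.e. v reduced to its low 8 bits. *)
let exit_code (v : int64) : int =
  Int64.to_int (Int64.logand v 0xFFL)

let () =
  assert (exit_code 42L = 42);
  assert (exit_code 256L = 0);     (* wraps around *)
  assert (exit_code 300L = 44);    (* 300 - 256 *)
  assert (exit_code (-1L) = 255)   (* negative results appear as large codes *)
```

A test that compares interpreter output with the compiled program's exit code therefore has to apply the same reduction to the interpreter's answer.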
 \begin{figure}[tbp]
 \begin{lstlisting}
@@ -2081,7 +2223,8 @@ The x86 assembly language varies in a couple ways depending on what
 operating system it is assembled in. The code examples shown here are
 correct on Linux and most Unix-like platforms, but when assembled on
 Mac OS X, labels like \key{main} must be prefixed with an underscore,
-as in \key{\_main}.
+as in \key{\_main}. \ocaml{A utility function \code{get\_ostype}
+is provided in the \texttt{utils.ml} module of the support materials.}
 
 We exhibit the use of memory for storing intermediate results in the
 next example.  Figure~\ref{fig:p1-x86} lists an x86 program that is
@@ -2201,12 +2344,23 @@ organization becomes apparent in Chapter~\ref{ch:Rif} when we
 introduce conditional branching. The \code{Block} structure includes
 an $\itm{info}$ field that is not needed for this chapter, but becomes
 useful in Chapter~\ref{ch:register-allocation-Rvar}.  For now, the
-$\itm{info}$ field should contain an empty list. Also, regarding the
+$\itm{info}$ field should contain an empty list. \ocaml{The \code{'binfo}
+  type parameter should be instantiated with \code{unit}.}
+Also, regarding the
 abstract syntax for \code{callq}, the \code{Callq} struct includes an
 integer for representing the arity of the function, i.e., the number
 of arguments, which is helpful to know during register allocation
 (Chapter~\ref{ch:register-allocation-Rvar}).
 
+\begin{ocamlx}
+  The OCaml code for \LangXInt{} AST, printing, and checking is
+  in file \texttt{X86Int.ml}. Printing is used to produce \texttt{.s} files that
+  can be input to the system assembler; it can also be useful for debugging.
+  File \texttt{utils.ml} contains functions for invoking the assembler and linker and
+  running the resulting executables from inside OCaml; these are invoked
+  from the test drivers also defined in that file.
+\end{ocamlx}    
+
 \begin{figure}[tp]
 \fbox{
 \begin{minipage}{0.98\textwidth}
@@ -2218,8 +2372,9 @@ of arguments, which is helpful to know during register allocation
    \mid \DEREF{\Reg}{\Int} \\
 \Instr &::=& \BININSTR{\code{addq}}{\Arg}{\Arg} 
        \mid \BININSTR{\code{subq}}{\Arg}{\Arg} \\
+       &\mid& \UNIINSTR{\code{negq}}{\Arg}\\
        &\mid& \BININSTR{\code{movq}}{\Arg}{\Arg}
-       \mid \UNIINSTR{\code{negq}}{\Arg}\\
+       \ocaml{\mid \BININSTR{\code{movabsq}}{\Arg}{\Arg}} \\
        &\mid& \CALLQ{\itm{label}}{\itm{int}} \mid \RETQ{} 
        \mid \PUSHQ{\Arg} \mid \POPQ{\Arg} \mid \JMP{\itm{label}} \\
 \Block &::= & \BLOCK{\itm{info}}{\LP\Instr\ldots\RP} \\
@@ -2228,10 +2383,34 @@ of arguments, which is helpful to know during register allocation
 \]
 \end{minipage}
 }
-\caption{The abstract syntax of \LangXInt{} assembly.}
+\begin{lstlisting}[style=ocaml,frame=single]
+type reg =
+    RSP | RBP | RAX | RBX | RCX | RDX | RSI | RDI
+  | R8  | R9  | R10 | R11 | R12 | R13 | R14 | R15
+
+type label = string
+
+type arg =
+    Imm of int64  (* in most cases must actually be an int32 *)
+  | Reg of reg
+  | Deref of reg*int32
+  | Var of string (* a pseudo-argument for ~$\LangXVar{}$~ *)
+
+type instr =
+    Addq of arg*arg | Subq of arg*arg | Negq of arg 
+  | Movq of arg*arg | Movabsq of arg*arg | Callq of label*int 
+  | Retq | Pushq of arg | Popq of arg | Jmp of label
+
+type 'binfo block = Block of 'binfo * instr list
+
+type ('pinfo,'binfo) program =
+    Program of 'pinfo * (label * 'binfo block) list 
+\end{lstlisting}
+\caption{The abstract syntax of \LangXInt{} \ocaml{and \LangXVar{}} assembly.}
 \label{fig:x86-int-ast}
 \end{figure}
 
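The comment on `Imm` in the figure above notes that most immediates must fit in 32 bits, which is why `Movabsq` is added. A minimal sketch of how a pass might choose between the two move instructions follows. The types are cut down from the figure, and `fits_int32` and `load_const` are hypothetical helper names; the sketch also assumes the destination is a register, since `movabsq` with an immediate source requires one.

```ocaml
(* Abbreviated versions of the types in the figure *)
type reg = RAX | RBX
type arg = Imm of int64 | Reg of reg | Deref of reg * int32 | Var of string
type instr = Movq of arg * arg | Movabsq of arg * arg

(* Does n fit in a signed 32-bit immediate? *)
let fits_int32 (n : int64) : bool =
  Int64.compare n (Int64.of_int32 Int32.min_int) >= 0
  && Int64.compare n (Int64.of_int32 Int32.max_int) <= 0

(* Hypothetical helper: load constant n into dst (assumed to be a
   register), using movabsq only when a full 64-bit immediate is needed. *)
let load_const (n : int64) (dst : arg) : instr =
  if fits_int32 n then Movq (Imm n, dst) else Movabsq (Imm n, dst)
```

For example, `load_const 42L (Reg RAX)` yields a `Movq`, while `load_const 4294967296L (Reg RAX)` yields a `Movabsq`.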
+
 \section{Planning the trip to x86 via the \LangCVar{} language}
 \label{sec:plan-s0-x86}
 
@@ -2246,7 +2425,8 @@ and x86 assembly? Here are some of the most important ones:
   arithmetic operations take two arguments and produce a new value.
   An x86 instruction may have at most one memory-accessing argument.
   Furthermore, some instructions place special restrictions on their
-  arguments.
+  arguments. \ocaml{For example, immediate operands are usually restricted
+    to fit in 32 bits (except for the \code{movabsq} instruction).}
 
 \item[(b)] An argument of an \LangVar{} operator can be a deeply-nested
   expression, whereas x86 instructions restrict their arguments to be
@@ -2327,7 +2507,7 @@ become local variables whose scope is the entire program, which would
 confuse variables with the same name.
 %
 We place \key{remove-complex-opera*} before \key{explicate-control}
-because the later removes the \key{let} form, but it is convenient to
+because the latter removes the \key{let} form, but it is convenient to
 use \key{let} in the output of \key{remove-complex-opera*}.
 %
 The ordering of \key{uniquify} with respect to
@@ -2407,7 +2587,10 @@ language~\citep{Kernighan:1988nx} in that it has separate syntactic
 categories for expressions and statements, so we name it \LangCVar{}.  The
 abstract syntax for \LangCVar{} is defined in Figure~\ref{fig:c0-syntax}.
 (The concrete syntax for \LangCVar{} is in the Appendix,
-Figure~\ref{fig:c0-concrete-syntax}.)
+Figure~\ref{fig:c0-concrete-syntax}. \ocaml{This appendix is not quite accurate
+  for the OCaml version, but the details of the concrete syntax of
+  an IR like this don't matter much, since it will normally be used
+  only to dump out information when debugging; it won't be parsed.})
 %
 The \LangCVar{} language supports the same operators as \LangVar{} but
 the arguments of operators are restricted to atomic
@@ -2420,19 +2603,23 @@ assignment statements which can be executed in sequence using the
 expression that is the last one to execute within a function.
 
 A \LangCVar{} program consists of a control-flow graph represented as
-an alist mapping labels to tails. This is more general than necessary
+an alist mapping labels to tails \ocaml{(that is, a list of \code{(label*tail)} pairs)}.
+This is more general than necessary
 for the present chapter, as we do not yet introduce \key{goto} for
 jumping to labels, but it saves us from having to change the syntax in
 Chapter~\ref{ch:Rif}.  For now there will be just one label,
-\key{start}, and the whole program is its tail.
+\key{start}, and the whole program \ocaml{body} is its tail.
 %
 The $\itm{info}$ field of the \key{CProgram} form, after the
 \key{explicate-control} pass, contains a mapping from the symbol
 \key{locals} to a list of variables, that is, a list of all the
-variables used in the program. At the start of the program, these
+variables used in the program. \ocaml{It is represented as a \code{unit Env.t},
+a kind of degenerate map that effectively acts like a set.}
+At the start of the program, these
 variables are uninitialized; they become initialized on their first
 assignment.
 
+
 \begin{figure}[tbp]
 \fbox{
 \begin{minipage}{0.96\textwidth}
@@ -2448,12 +2635,38 @@ assignment.
 \]
 \end{minipage}
 }
+\begin{lstlisting}[style=ocaml,frame=single]
+type var = string
+
+type label = string
+
+type atm = 
+    Int of int64
+  | Var of var
+
+type exp =
+    Atom of atm
+  | Prim of primop * atm list
+
+type stmt =
+    Assign of var * exp
+
+type tail =
+    Return of exp
+  | Seq of stmt*tail
+
+type 'pinfo program = Program of 'pinfo * (label*tail) list
+\end{lstlisting}
 \caption{The abstract syntax of the \LangCVar{} intermediate language.}
 \label{fig:c0-syntax}
 \end{figure}
 
 The definitional interpreter for \LangCVar{} is in the support code,
 in the file \code{interp-Cvar.rkt}.
+\begin{ocamlx}
+  The OCaml code for \LangCVar{} AST, checking, printing (for debug purposes),
+  and interpretation is in file \texttt{CVar.ml}. 
+\end{ocamlx}
 
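To make the \LangCVar{} representation concrete, here is a small sketch that builds a one-block program by hand and computes its locals as a `unit Env.t`. The types repeat the figure above; `assigned_vars` is a hypothetical helper, and the tail shown is a hand-written translation of `(let x (+ 12 20) (+ 10 x))`, not output from the actual passes.

```ocaml
type primop = Read | Neg | Add
type var = string
type label = string
type atm = Int of int64 | Var of var
type exp = Atom of atm | Prim of primop * atm list
type stmt = Assign of var * exp
type tail = Return of exp | Seq of stmt * tail
type 'pinfo program = Program of 'pinfo * (label * tail) list

module Env = Map.Make (String)

(* Hypothetical helper: collect every assigned variable as a key of a
   unit-valued map, which then behaves like a set of names. *)
let rec assigned_vars (t : tail) (acc : unit Env.t) : unit Env.t =
  match t with
  | Return _ -> acc
  | Seq (Assign (x, _), rest) -> assigned_vars rest (Env.add x () acc)

(* x = (+ 12 20); return (+ 10 x); *)
let start_tail =
  Seq (Assign ("x", Prim (Add, [Int 12L; Int 20L])),
       Return (Prim (Add, [Int 10L; Var "x"])))

let locals = assigned_vars start_tail Env.empty

(* The whole program: a single block labeled "start" *)
let prog : unit Env.t program = Program (locals, [("start", start_tail)])
```

Here `Env.mem "x" locals` holds and `Env.cardinal locals` is 1, since `x` is the only variable assigned in the block.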
 \subsection{The \LangXVar{} dialect}
 
@@ -2461,7 +2674,23 @@ The \LangXVar{} language is the output of the pass
 \key{select-instructions}. It extends \LangXInt{} with an unbounded
 number of program-scope variables and removes the restrictions
 regarding instruction arguments.
-
+\begin{ocamlx}
+For simplicity, we treat \LangXInt{}  and \LangXVar{} as the same
+  language, defined in \texttt{X86Int.ml}. In particular, we allow \code{Var}
+  as one of the possible forms for an instruction argument (\code{arg}).
+  We provide two different check routines.
+  \begin{itemize}
+    \item \code{CheckLabels.check\_program}
+      just checks that all label
+      declarations are unique and that all jump targets are defined; this
+      is suitable for checking the code produced from the \key{select-instructions}
+      pass, which will use \code{Var} arguments freely.
+    \item 
+      \code{CheckArgs.check\_program} checks that all arguments are legal for the
+      actual X86-64 machine (in particular, that they are not \code{Var} arguments);
+      this is suitable for checking the output of the \key{patch-instr} pass.
+  \end{itemize}
+\end{ocamlx}
 
 \section{Uniquify Variables}
 \label{sec:uniquify-Rvar}
@@ -2488,6 +2717,24 @@ $\Rightarrow$
 \end{minipage}
 \end{tabular} \\
 %
+\begin{tabular}{lll}
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}[style=ocaml]
+(let x 32
+  (+ (let x 10 x) x))
+\end{lstlisting}
+\end{minipage}
+&
+\ocaml{$\Rightarrow$}
+&
+\begin{minipage}{0.4\textwidth}
+\begin{lstlisting}[style=ocaml]
+(let x.1 32
+  (+ (let x.2 10 x.2) x.1))
+\end{lstlisting}
+\end{minipage}
+\end{tabular} \\
+%
 The following is another example translation, this time of a program
 with a \key{let} nested inside the initializing expression of another
 \key{let}.\\
@@ -2510,20 +2757,21 @@ $\Rightarrow$
 \end{lstlisting}
 \end{lstlisting}
 \end{minipage}
 \end{minipage}
 \end{tabular}
 \end{tabular}
-
+\ocaml{You can transliterate examples like this for yourself by now...}
 We recommend implementing \code{uniquify} by creating a structurally
 We recommend implementing \code{uniquify} by creating a structurally
 recursive function named \code{uniquify-exp} that mostly just copies
 recursive function named \code{uniquify-exp} that mostly just copies
 an expression. However, when encountering a \key{let}, it should
 an expression. However, when encountering a \key{let}, it should
 generate a unique name for the variable and associate the old name
 generate a unique name for the variable and associate the old name
-with the new name in an alist.\footnote{The Racket function
-  \code{gensym} is handy for generating unique variable names.} The
-\code{uniquify-exp} function needs to access this alist when it gets
+with the new name in an alist \ocaml{(OCaml: \key{Env})}.\footnote{The Racket function
+\code{gensym} is handy for generating unique variable names. \ocaml{There is a similar
+function defined in \texttt{utils.ml}.}} The
+\code{uniquify-exp} function needs to access this alist \ocaml{(\key{Env})} when it gets
 to a variable reference, so we add a parameter to \code{uniquify-exp}
 to a variable reference, so we add a parameter to \code{uniquify-exp}
-for the alist.
+for the alist \ocaml{(\key{Env})}.
 
 
 The skeleton of the \code{uniquify-exp} function is shown in
 The skeleton of the \code{uniquify-exp} function is shown in
 Figure~\ref{fig:uniquify-Rvar}.  The function is curried so that it is
 Figure~\ref{fig:uniquify-Rvar}.  The function is curried so that it is
-convenient to partially apply it to an alist and then apply it to
+convenient to partially apply it to an alist \ocaml{(\key{Env})} and then apply it to
 different expressions, as in the last case for primitive operations in
 different expressions, as in the last case for primitive operations in
 Figure~\ref{fig:uniquify-Rvar}.  The
 Figure~\ref{fig:uniquify-Rvar}.  The
 %
 %
@@ -2531,6 +2779,19 @@ Figure~\ref{fig:uniquify-Rvar}.  The
 %
 %
 form of Racket is useful for transforming each element of a list to
 form of Racket is useful for transforming each element of a list to
 produce a new list.\index{for/list}
 produce a new list.\index{for/list}
+\ocaml{The \code{List.map} function is similar.}
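\begin{ocamlx}
As a rough illustration of the recipe above, here is a minimal sketch of a
curried \code{uniquify\_exp} in OCaml. The \code{exp} type, the \code{gensym}
helper, and the use of \code{Map.Make(String)} for \key{Env} are assumptions
made for this example; they are not the definitions in \code{RVar},
\texttt{utils.ml}, or the provided \code{Uniquify} module.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical, simplified AST (an assumption). *)
type exp =
  | Int of int64
  | Var of string
  | Prim of string * exp list
  | Let of string * exp * exp

module Env = Map.Make (String)

(* Stand-in fresh-name generator: x becomes x.1, x.2, ... *)
let counter = ref 0
let gensym (base : string) : string =
  incr counter;
  Printf.sprintf "%s.%d" base !counter

(* Curried, so (uniquify_exp env) can be mapped over argument lists. *)
let rec uniquify_exp (env : string Env.t) (e : exp) : exp =
  match e with
  | Int _ -> e
  | Var x -> Var (Env.find x env)      (* raises Not_found if x is unbound *)
  | Prim (op, args) -> Prim (op, List.map (uniquify_exp env) args)
  | Let (x, rhs, body) ->
    let rhs' = uniquify_exp env rhs in (* rhs still sees the old binding *)
    let x' = gensym x in
    Let (x', rhs', uniquify_exp (Env.add x x' env) body)
\end{lstlisting}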
+
+\ocaml{In addition to writing the \code{uniquify} transformation, it is worthwhile 
+  to write a \emph{checker} to make sure that the result obeys any invariants we
+  expect to hold.  (Sometimes these invariants are baked into the abstract syntax
+  of the target, but that's not the case here.) Our checker should re-traverse the
+  result AST and make sure that no identifier is bound more than once.  It should also
+  re-run the \LangVar{} checker defined in module \code{RVar} to make sure that
+  all variable uses are in the scope of a binding (something we might easily have
+  messed up) and that we have not accidentally introduced a primop arity error (much
+  less likely, but still possible).
+}
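\begin{ocamlx}
The uniqueness part of such a checker might look like the sketch below. It
assumes the same hypothetical \code{exp} type as before and reports the first
identifier that is bound twice; the scope and arity checks are left to the
existing \LangVar{} checker and are not reproduced here.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical, simplified AST (an assumption). *)
type exp =
  | Int of int64
  | Var of string
  | Prim of string * exp list
  | Let of string * exp * exp

module S = Set.Make (String)

(* Ok () if no identifier is let-bound more than once anywhere in e. *)
let check_unique (e : exp) : (unit, string) result =
  (* go threads the set of names bound so far through the traversal *)
  let rec go seen = function
    | Int _ | Var _ -> Ok seen
    | Prim (_, args) ->
      List.fold_left
        (fun acc a -> match acc with Error _ -> acc | Ok s -> go s a)
        (Ok seen) args
    | Let (x, rhs, body) ->
      if S.mem x seen then Error ("bound more than once: " ^ x)
      else
        (match go (S.add x seen) rhs with
         | Error _ as err -> err
         | Ok s -> go s body)
  in
  match go S.empty e with
  | Ok _ -> Ok ()
  | Error msg -> Error msg
\end{lstlisting}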
+
 
 
 \begin{exercise}
 \begin{exercise}
 \normalfont % I don't like the italics for exercises. -Jeremy
 \normalfont % I don't like the italics for exercises. -Jeremy
@@ -2538,7 +2799,8 @@ produce a new list.\index{for/list}
 Complete the \code{uniquify} pass by filling in the blanks in
 Complete the \code{uniquify} pass by filling in the blanks in
 Figure~\ref{fig:uniquify-Rvar}, that is, implement the cases for
 Figure~\ref{fig:uniquify-Rvar}, that is, implement the cases for
 variables and for the \key{let} form in the file \code{compiler.rkt}
 variables and for the \key{let} form in the file \code{compiler.rkt}
-in the support code.
+in the support code. \ocaml{This exercise is done for you, in the
+  \code{Uniquify} module of file \code{Chapter2.ml}.}
 \end{exercise}
 \end{exercise}
 
 
 \begin{figure}[tbp]
 \begin{figure}[tbp]
@@ -2569,12 +2831,14 @@ parts of the \key{uniquify} pass, that is, the programs should include
 The five programs should be placed in the subdirectory named
 The five programs should be placed in the subdirectory named
 \key{tests} and the file names should start with \code{var\_test\_}
 \key{tests} and the file names should start with \code{var\_test\_}
 followed by a unique integer and end with the file extension
 followed by a unique integer and end with the file extension
-\key{.rkt}.
+\key{.rkt}. \ocaml{OCaml: use extension \key{.r}.}
 %
 %
-The \key{run-tests.rkt} script in the support code checks whether the
+The \key{run-tests.rkt} script in the support code \ocaml{(\key{test\_files}
+  function in \code{Chapter2.ml})} checks whether the
 output programs produce the same result as the input programs.  The
 output programs produce the same result as the input programs.  The
 script uses the \key{interp-tests} function
 script uses the \key{interp-tests} function
-(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test
+(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} \ocaml{(\key{test\_files}
+  function from \code{utils.ml})} to test
 your \key{uniquify} pass on the example programs.  The \code{passes}
 your \key{uniquify} pass on the example programs.  The \code{passes}
 parameter of \key{interp-tests} is a list that should have one entry
 parameter of \key{interp-tests} is a list that should have one entry
 for each pass in your compiler.  For now, define \code{passes} to
 for each pass in your compiler.  For now, define \code{passes} to
@@ -2585,7 +2849,7 @@ contain just one entry for \code{uniquify} as follows.
 \end{lstlisting}
 \end{lstlisting}
 Run the \key{run-tests.rkt} script in the support code to check
 Run the \key{run-tests.rkt} script in the support code to check
 whether the output programs produce the same result as the input
 whether the output programs produce the same result as the input
-programs.
+programs. \ocaml{XXXXXXX}  
 \end{exercise}
 \end{exercise}
 
 
 
 
@@ -2619,7 +2883,11 @@ $\Rightarrow$
 \end{minipage}
 \end{minipage}
 \end{tabular}
 \end{tabular}
 
 
-
+\begin{ocamlx}
+We suggest generating temporary names that begin with a back-tick (\verb'`')
+since these are illegal as S-expression symbols, and so cannot conflict with existing
+user-defined names.
+\end{ocamlx}
 \begin{figure}[tp]
 \begin{figure}[tp]
 \centering
 \centering
 \fbox{
 \fbox{
@@ -2628,13 +2896,13 @@ $\Rightarrow$
 \begin{array}{rcl}
 \begin{array}{rcl}
 \Atm &::=& \INT{\Int} \mid \VAR{\Var} \\
 \Atm &::=& \INT{\Int} \mid \VAR{\Var} \\
 \Exp &::=& \Atm \mid \READ{} \\
 \Exp &::=& \Atm \mid \READ{} \\
-     &\mid& \NEG{\Atm} \mid \ADD{\Atm}{\Atm}  \\
+     &\mid& \NEG{\Atm} \mid \ADD{\Atm}{\Atm}  \\
      &\mid&  \LET{\Var}{\Exp}{\Exp} \\
      &\mid&  \LET{\Var}{\Exp}{\Exp} \\
 R^{\dagger}_1  &::=& \PROGRAM{\code{'()}}{\Exp}
 R^{\dagger}_1  &::=& \PROGRAM{\code{'()}}{\Exp}
 \end{array}
 \end{array}
 \]
 \]
 \end{minipage}
 \end{minipage}
-}
+}
 \caption{\LangVarANF{} is \LangVar{} in administrative normal form (ANF).}
 \caption{\LangVarANF{} is \LangVar{} in administrative normal form (ANF).}
 \label{fig:r1-anf-syntax}
 \label{fig:r1-anf-syntax}
 \end{figure}
 \end{figure}
@@ -2647,6 +2915,11 @@ and variables are atomic. In the literature, restricting arguments to
 be atomic expressions is called \emph{administrative normal form}, or
 be atomic expressions is called \emph{administrative normal form}, or
 ANF for short~\citep{Danvy:1991fk,Flanagan:1993cg}.
 ANF for short~\citep{Danvy:1991fk,Flanagan:1993cg}.
 \index{administrative normal form} \index{ANF}
 \index{administrative normal form} \index{ANF}
+\ocaml{Actually, ANF
+  as defined in~\citep{Flanagan:1993cg}
+  refers to a more restricted form in which the defining expressions of
+  \code{let}s cannot themselves contain \code{let}s. This essentially
+  corresponds to the \LangCVar{} language.}
 
 
 We recommend implementing this pass with two mutually recursive
 We recommend implementing this pass with two mutually recursive
 functions, \code{rco-atom} and \code{rco-exp}. The idea is to apply
 functions, \code{rco-atom} and \code{rco-exp}. The idea is to apply
@@ -2654,7 +2927,7 @@ functions, \code{rco-atom} and \code{rco-exp}. The idea is to apply
 apply \code{rco-exp} to subexpressions that do not.  Both functions
 apply \code{rco-exp} to subexpressions that do not.  Both functions
 take an \LangVar{} expression as input.  The \code{rco-exp} function
 take an \LangVar{} expression as input.  The \code{rco-exp} function
 returns an expression.  The \code{rco-atom} function returns two
 returns an expression.  The \code{rco-atom} function returns two
-things: an atomic expression and alist mapping temporary variables to
+things: an atomic expression and an alist \ocaml{(i.e., a list of pairs)} mapping temporary variables to
 complex subexpressions. You can return multiple things from a function
 complex subexpressions. You can return multiple things from a function
 using Racket's \key{values} form and you can receive multiple things
 using Racket's \key{values} form and you can receive multiple things
 from a function call using the \key{define-values} form. If you are
 from a function call using the \key{define-values} form. If you are
@@ -2664,7 +2937,9 @@ Also, the
   form is useful for applying a function to each element of a list, in
   form is useful for applying a function to each element of a list, in
   the case where the function returns multiple values.
   the case where the function returns multiple values.
   \index{for/lists}
   \index{for/lists}
-
+  \ocaml{OCaml: You can return multiple things from a function using a tuple
+    and binding the return value to a tuple pattern. Again, the \code{List.map}
+    function is handy.}
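\begin{ocamlx}
To make the tuple-returning style concrete, here is a minimal sketch of the
two mutually recursive functions. The \code{exp} type, \code{fresh\_tmp}, and
\code{wrap\_lets} are hypothetical names introduced for this example only.
Temporaries start with a back tick so that they cannot clash with source
symbols.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical, simplified AST (an assumption). *)
type exp =
  | Int of int64
  | Var of string
  | Read
  | Prim of string * exp list
  | Let of string * exp * exp

(* Stand-in temporary generator; names begin with a back tick. *)
let counter = ref 0
let fresh_tmp () : string =
  incr counter;
  Printf.sprintf "`tmp.%d" !counter

(* Wrap body in the given bindings, first binding outermost. *)
let wrap_lets (bindings : (string * exp) list) (body : exp) : exp =
  List.fold_right (fun (x, rhs) acc -> Let (x, rhs, acc)) bindings body

(* Returns an atom together with the bindings that atom depends on. *)
let rec rco_atom (e : exp) : exp * (string * exp) list =
  match e with
  | Int _ | Var _ -> (e, [])
  | _ ->
    let tmp = fresh_tmp () in
    (Var tmp, [ (tmp, rco_exp e) ])

(* Returns an expression whose primitive arguments are all atomic. *)
and rco_exp (e : exp) : exp =
  match e with
  | Int _ | Var _ | Read -> e
  | Let (x, rhs, body) -> Let (x, rco_exp rhs, rco_exp body)
  | Prim (op, args) ->
    let results = List.map rco_atom args in
    let atoms = List.map fst results in
    let bindings = List.concat (List.map snd results) in
    wrap_lets bindings (Prim (op, atoms))
\end{lstlisting}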
 Returning to the example program \code{(+ 52 (- 10))}, the
 Returning to the example program \code{(+ 52 (- 10))}, the
 subexpression \code{(- 10)} should be processed using the
 subexpression \code{(- 10)} should be processed using the
 \code{rco-atom} function because it is an argument of the \code{+} and
 \code{rco-atom} function because it is an argument of the \code{+} and
@@ -2723,10 +2998,15 @@ produce the following output with unnecessary temporary variables.\\
 \end{lstlisting}
 \end{lstlisting}
 \end{minipage}
 \end{minipage}
 
 
+
 \begin{exercise}\normalfont
 \begin{exercise}\normalfont
 %
 %
 Implement the \code{remove-complex-opera*} function in
 Implement the \code{remove-complex-opera*} function in
-\code{compiler.rkt}.
+\code{compiler.rkt}. \ocaml{Fill in the \code{RemoveComplexOperations} submodule in \code{Chapter2.ml}.
+  Be sure to include a checker that re-traverses the target AST to make sure that
+  all primop arguments are indeed now atomic, and that we haven't broken any of the
+  other invariants we expect to hold of \LangInt{} programs at this point.
+}
 %
 %
 Create three new \LangInt{} programs that exercise the interesting
 Create three new \LangInt{} programs that exercise the interesting
 code in the \code{remove-complex-opera*} pass (Following the same file
 code in the \code{remove-complex-opera*} pass (Following the same file
@@ -2744,6 +3024,7 @@ intermeidate programs, place the following before the call to
 \begin{lstlisting}
 \begin{lstlisting}
 (debug-level 1)  
 (debug-level 1)  
 \end{lstlisting}
 \end{lstlisting}
+\ocaml{XXXXX}
 \end{exercise}
 \end{exercise}
 
 
 
 
@@ -2792,7 +3073,7 @@ start:
 \end{lstlisting}
 \end{lstlisting}
 \end{minipage}
 \end{minipage}
 \end{tabular}
 \end{tabular}
-
+%
 \begin{figure}[tbp]
 \begin{figure}[tbp]
 \begin{lstlisting}
 \begin{lstlisting}
 (define (explicate-tail e)
 (define (explicate-tail e)
@@ -2853,11 +3134,22 @@ output. The reader might be tempted to instead organize
 statements. We warn against that alternative because the
 statements. We warn against that alternative because the
 accumulator-passing style is key to how we generate high-quality code
 accumulator-passing style is key to how we generate high-quality code
 for conditional expressions in Chapter~\ref{ch:Rif}.
 for conditional expressions in Chapter~\ref{ch:Rif}.
+\begin{ocamlx}
+  Don't take this advice too seriously. Organize things in the cleanest way you
+  can find; it will always be possible to adjust your approach in later chapters.
+\end{ocamlx}
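\begin{ocamlx}
For reference, one possible OCaml arrangement of the accumulator-passing
style is sketched below. The source and target types are hypothetical
simplifications and are not the definitions in \code{RVar} or \code{CVar};
the continuation argument \code{cont} plays the role of the accumulator.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical source AST (an assumption). *)
type exp =
  | Int of int64
  | Var of string
  | Prim of string * exp list
  | Let of string * exp * exp

(* Hypothetical target: a sequence of assignments ending in a return. *)
type stmt = Assign of string * exp   (* right-hand side is never a Let *)
type tail =
  | Return of exp
  | Seq of stmt * tail

let rec explicate_tail (e : exp) : tail =
  match e with
  | Let (x, rhs, body) -> explicate_assign rhs x (explicate_tail body)
  | _ -> Return e

(* Emits code that assigns rhs to x and then continues with cont. *)
and explicate_assign (rhs : exp) (x : string) (cont : tail) : tail =
  match rhs with
  | Let (y, rhs', body) ->
    explicate_assign rhs' y (explicate_assign body x cont)
  | _ -> Seq (Assign (x, rhs), cont)
\end{lstlisting}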
 
 
 \begin{exercise}\normalfont
 \begin{exercise}\normalfont
 %
 %
 Implement the \code{explicate-control} function in
 Implement the \code{explicate-control} function in
-\code{compiler.rkt}.  Create three new \LangInt{} programs that
+\code{compiler.rkt}.  \ocaml{Fill in the \code{ExplicateControl} submodule
+  of \code{Chapter2.ml} by implementing the \code{do\_program} function.
+  The checking field of this pass should invoke \code{CVar.check\_program},
+  which checks that the target code is properly bound (and also fills in
+  some information about the set of bound variables in the \code{'pinfo}
+  field of the program that will be useful in a later pass).}
+%
+Create three new \LangInt{} programs that
 exercise the code in \code{explicate-control}.
 exercise the code in \code{explicate-control}.
 %
 %
 In the \code{run-tests.rkt} script, add the following entry to the
 In the \code{run-tests.rkt} script, add the following entry to the
@@ -2865,6 +3157,7 @@ list of \code{passes} and then run the script to test your compiler.
 \begin{lstlisting}
 \begin{lstlisting}
 (list "explicate control" explicate-control interp-Cvar type-check-Cvar)  
 (list "explicate control" explicate-control interp-Cvar type-check-Cvar)  
 \end{lstlisting}
 \end{lstlisting}
+\ocaml{XXXXX}
 \end{exercise}
 \end{exercise}
 
 
 \section{Select Instructions}
 \section{Select Instructions}
@@ -2875,8 +3168,9 @@ In the \code{select-instructions} pass we begin the work of
 translating from \LangCVar{} to \LangXVar{}. The target language of
 translating from \LangCVar{} to \LangXVar{}. The target language of
 this pass is a variant of x86 that still uses variables, so we add an
 this pass is a variant of x86 that still uses variables, so we add an
 AST node of the form $\VAR{\itm{var}}$ to the \Arg{} non-terminal of
 AST node of the form $\VAR{\itm{var}}$ to the \Arg{} non-terminal of
-the \LangXInt{} abstract syntax (Figure~\ref{fig:x86-int-ast}).  We
-recommend implementing the \code{select-instructions} with
+the \LangXInt{} abstract syntax (Figure~\ref{fig:x86-int-ast}). \ocaml{Recall that
+  we use the same module to define \LangXInt{} and \LangXVar{}.}
+We recommend implementing the \code{select-instructions} with
 three auxiliary functions, one for each of the non-terminals of
 three auxiliary functions, one for each of the non-terminals of
 \LangCVar{}: $\Atm$, $\Stmt$, and $\Tail$.
 \LangCVar{}: $\Atm$, $\Stmt$, and $\Tail$.
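\begin{ocamlx}
The shape of these auxiliary functions can be sketched in OCaml as follows.
All types here are hypothetical simplifications rather than the definitions
in \code{CVar} and \code{X86Int}, and the names \code{read\_int} and
\code{conclusion} are assumed for the runtime input function and the exit
label. The sketch adds a helper, \code{select\_exp}, that is shared by the
statement and tail cases.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical C-like source language (an assumption). *)
type atm = AInt of int64 | AVar of string
type cexp = Atom of atm | Read | Neg of atm | Add of atm * atm
type stmt = Assign of string * cexp
type tail = Return of cexp | Seq of stmt * tail

(* Hypothetical x86 with variables (an assumption). *)
type arg = Imm of int64 | Reg of string | Var of string
type instr =
  | Movq of arg * arg
  | Addq of arg * arg
  | Negq of arg
  | Callq of string
  | Jmp of string

let select_atm = function AInt n -> Imm n | AVar x -> Var x

(* Emit instructions that leave the value of e in dst. *)
let select_exp (e : cexp) (dst : arg) : instr list =
  match e with
  | Atom a -> [ Movq (select_atm a, dst) ]
  | Read -> [ Callq "read_int"; Movq (Reg "rax", dst) ]
  | Neg a -> [ Movq (select_atm a, dst); Negq dst ]
  | Add (a1, a2) ->
    let s1 = select_atm a1 and s2 = select_atm a2 in
    if s2 = dst then [ Addq (s1, dst) ]   (* x = a1 + x: add in place *)
    else [ Movq (s1, dst); Addq (s2, dst) ]

let select_stmt (Assign (x, e)) : instr list = select_exp e (Var x)

let rec select_tail = function
  | Return e -> select_exp e (Reg "rax") @ [ Jmp "conclusion" ]
  | Seq (s, t) -> select_stmt s @ select_tail t
\end{lstlisting}
\begin{ocamlx}
The special case in \code{Add} matters: without it, an assignment such as
\code{x = 10 + x} would overwrite \code{x} before reading it.
\end{ocamlx}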
 
 
@@ -2975,6 +3269,7 @@ list of \code{passes} and then run the script to test your compiler.
 \begin{lstlisting}
 \begin{lstlisting}
 (list "instruction selection" select-instructions interp-pseudo-x86-0)
 (list "instruction selection" select-instructions interp-pseudo-x86-0)
 \end{lstlisting}
 \end{lstlisting}
+\ocaml{XXXXXX}
 \end{exercise}
 \end{exercise}
 
 
 
 
@@ -3037,6 +3332,7 @@ with stack locations.  As an aside, the \code{locals-types} entry is
 computed by \code{type-check-Cvar} in the support code, which installs
 computed by \code{type-check-Cvar} in the support code, which installs
 it in the $\itm{info}$ field of the \code{CProgram} node, which should
 it in the $\itm{info}$ field of the \code{CProgram} node, which should
 be propagated to the \code{X86Program} node.
 be propagated to the \code{X86Program} node.
+\ocaml{XXXXX}
 
 
 In the process of assigning variables to stack locations, it is
 In the process of assigning variables to stack locations, it is
 convenient for you to compute and store the size of the frame (in
 convenient for you to compute and store the size of the frame (in
@@ -3057,6 +3353,7 @@ list of \code{passes} and then run the script to test your compiler.
 \begin{lstlisting}
 \begin{lstlisting}
 (list "assign homes" assign-homes interp-x86-0)
 (list "assign homes" assign-homes interp-x86-0)
 \end{lstlisting}
 \end{lstlisting}
+\ocaml{XXXX}
 \end{exercise}
 \end{exercise}
 
 
 
 
@@ -3066,7 +3363,10 @@ list of \code{passes} and then run the script to test your compiler.
 The \code{patch-instructions} pass compiles from \LangXVar{} to
 The \code{patch-instructions} pass compiles from \LangXVar{} to
 \LangXInt{} by making sure that each instruction adheres to the
 \LangXInt{} by making sure that each instruction adheres to the
 restriction that at most one argument of an instruction may be a
 restriction that at most one argument of an instruction may be a
-memory reference.
+memory reference. \ocaml{It also ensures that no immediate operand
+  to an ordinary instruction exceeds 32 bits, by introducing \code{movabsq}
+  instructions as needed. \code{movabsq} is the sole instruction that
+  allows a 64-bit immediate source operand; its destination must be a register.}
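\begin{ocamlx}
Both patching rules can be sketched as below. The types and the function
\code{patch} are hypothetical and much smaller than the provided
\code{PatchInstructions} submodule; the sketch assumes that \key{rax} is free
to use as a scratch register at this point.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical, simplified x86 AST (an assumption). *)
type arg = Imm of int64 | Reg of string | Deref of string * int
type instr =
  | Movq of arg * arg
  | Addq of arg * arg
  | Movabsq of int64 * string   (* 64-bit immediate into a register *)

(* True when n fits in a signed 32-bit immediate. *)
let fits_32 (n : int64) : bool =
  Int64.compare n (-2147483648L) >= 0 && Int64.compare n 2147483647L <= 0

let is_mem = function Deref _ -> true | _ -> false

(* Rewrite one instruction into an equivalent legal sequence. *)
let patch (i : instr) : instr list =
  let fix make src dst =
    match src with
    | Imm n when not (fits_32 n) ->
      (* oversized immediate: load it with movabsq first *)
      [ Movabsq (n, "rax"); make (Reg "rax") dst ]
    | _ when is_mem src && is_mem dst ->
      (* two memory operands: go through rax *)
      [ Movq (src, Reg "rax"); make (Reg "rax") dst ]
    | _ -> [ make src dst ]
  in
  match i with
  | Movq (s, d) -> fix (fun s d -> Movq (s, d)) s d
  | Addq (s, d) -> fix (fun s d -> Addq (s, d)) s d
  | Movabsq _ -> [ i ]
\end{lstlisting}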
 
 
 We return to the following example.
 We return to the following example.
 % var_test_20.rkt
 % var_test_20.rkt
@@ -3098,7 +3398,9 @@ from \key{rax} to the destination location, as follows.
 
 
 \begin{exercise}
 \begin{exercise}
 \normalfont Implement the \key{patch-instructions} pass in
 \normalfont Implement the \key{patch-instructions} pass in
-\code{compiler.rkt}. Create three new example programs that are
+\code{compiler.rkt}. \ocaml{This task has been done for you, in the \code{PatchInstructions} submodule
+of \code{Chapter2}.}
+Create three new example programs that are
 designed to exercise all of the interesting cases in this pass.
 designed to exercise all of the interesting cases in this pass.
 %
 %
 In the \code{run-tests.rkt} script, add the following entry to the
 In the \code{run-tests.rkt} script, add the following entry to the
@@ -3116,7 +3418,8 @@ The last step of the compiler from \LangVar{} to x86 is to convert the
 \LangXInt{} AST (defined in Figure~\ref{fig:x86-int-ast}) to the
 \LangXInt{} AST (defined in Figure~\ref{fig:x86-int-ast}) to the
 string representation (defined in
 string representation (defined in
 Figure~\ref{fig:x86-int-concrete}). The Racket \key{format} and
 Figure~\ref{fig:x86-int-concrete}). The Racket \key{format} and
-\key{string-append} functions are useful in this regard. The main work
+\key{string-append} functions are useful in this regard. \ocaml{The \code{Printf}
+  library is useful here.} The main work
 that this step needs to perform is to create the \key{main} function
 that this step needs to perform is to create the \key{main} function
 and the standard instructions for its prelude and conclusion, as shown
 and the standard instructions for its prelude and conclusion, as shown
 in Figure~\ref{fig:p1-x86} of Section~\ref{sec:x86}. You will need to
 in Figure~\ref{fig:p1-x86} of Section~\ref{sec:x86}. You will need to
@@ -3128,10 +3431,14 @@ When running on Mac OS X, you compiler should prefix an underscore to
 labels like \key{main}. The Racket call \code{(system-type 'os)} is
 labels like \key{main}. The Racket call \code{(system-type 'os)} is
 useful for determining which operating system the compiler is running
 useful for determining which operating system the compiler is running
 on. It returns \code{'macosx}, \code{'unix}, or \code{'windows}.
 on. It returns \code{'macosx}, \code{'unix}, or \code{'windows}.
+\ocaml{There is a similar utility function \code{get\_ostype}
+provided in the \texttt{utils.ml} module.}
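\begin{ocamlx}
A small sketch of how \code{Printf} can be used for this step is shown below.
The \code{arg} type and all function names are hypothetical and are not the
printing code in \code{X86Int}. Because the signature of \code{get\_ostype}
is not shown here, the sketch takes a boolean \code{is\_macos} instead of
calling it.
\end{ocamlx}
\begin{lstlisting}[style=ocaml]
(* Hypothetical, simplified argument type (an assumption). *)
type arg = Imm of int64 | Reg of string | Deref of string * int

(* Prefix an underscore to labels when targeting Mac OS X. *)
let label_name (is_macos : bool) (l : string) : string =
  if is_macos then "_" ^ l else l

(* AT&T syntax: $ before immediates, % before register names. *)
let string_of_arg = function
  | Imm n -> Printf.sprintf "$%Ld" n
  | Reg r -> Printf.sprintf "%%%s" r
  | Deref (r, off) -> Printf.sprintf "%d(%%%s)" off r

let string_of_movq (src : arg) (dst : arg) : string =
  Printf.sprintf "\tmovq\t%s, %s" (string_of_arg src) (string_of_arg dst)
\end{lstlisting}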
 
 
 \begin{exercise}\normalfont
 \begin{exercise}\normalfont
 %
 %
 Implement the \key{print-x86} pass in \code{compiler.rkt}.
 Implement the \key{print-x86} pass in \code{compiler.rkt}.
+\ocaml{This task has been done for you; the relevant printing
+  code is in module \code{X86Int}.}
 %
 %
 In the \code{run-tests.rkt} script, add the following entry to the
 In the \code{run-tests.rkt} script, add the following entry to the
 list of \code{passes} and then run the script to test your compiler.
 list of \code{passes} and then run the script to test your compiler.
@@ -3144,6 +3451,8 @@ Uncomment the call to the \key{compiler-tests} function
 compiler by executing the generated x86 code. Compile the provided
 compiler by executing the generated x86 code. Compile the provided
 \key{runtime.c} file to \key{runtime.o} using \key{gcc}. Run the
 \key{runtime.c} file to \key{runtime.o} using \key{gcc}. Run the
 script to test your compiler.
 script to test your compiler.
+\ocaml{XXXXX}
+
 \end{exercise}
 \end{exercise}