3 years ago · 37ad5fd61f
--- a/book.bib
+++ b/book.bib
@@ -1,3 +1,14 @@
 
															+
														
 
															+@book{Aho:2006wb,
														
 
															+	address = {USA},
														
 
															+	author = {Aho, Alfred V. and Lam, Monica S. and Sethi, Ravi and Ullman, Jeffrey D.},
														
 
															+	date-added = {2021-10-22 09:38:52 -0400},
														
 
															+	date-modified = {2021-10-22 09:38:59 -0400},
														
 
															+	isbn = {0321486811},
														
 
															+	publisher = {Addison-Wesley Longman Publishing Co., Inc.},
														
 
															+	title = {Compilers: Principles, Techniques, and Tools (2nd Edition)},
														
 
															+	year = {2006}}
														
 
															+
														
 
															 @article{Logothetis:1981,
														
 
															 author = {Logothetis, George and Mishra, Prateek},
														
 
															 title = {Compiling short-circuit boolean expressions in one pass},
														
--- a/book.tex
+++ b/book.tex
@@ -454,7 +454,7 @@ that efficiently supports the operations that the compiler needs to
 
															 perform.\index{subject}{concrete syntax}\index{subject}{abstract syntax}\index{subject}{abstract
														
 
															   syntax tree}\index{subject}{AST}\index{subject}{program}\index{subject}{parse} The translation
														
 
															 from concrete syntax to abstract syntax is a process called
														
 
															-\emph{parsing}~\citep{Aho:1986qf}. We do not cover the theory and
														
 
															+\emph{parsing}~\citep{Aho:2006wb}. We do not cover the theory and
														
 
															 implementation of parsing in this book.
														
 
															 %
														
 
															 \racket{A parser is provided in the support code for translating from
														
@@ -2565,12 +2565,12 @@ to bridge those differences. What are the differences between \LangVar{}
 
															 and x86 assembly? Here are some of the most important ones:
														
 
															 \begin{enumerate}
														
 
															-\item x86 arithmetic instructions typically have two arguments
														
 
															-  and update the second argument in place. In contrast, \LangVar{}
														
 
															+\item x86 arithmetic instructions typically have two arguments and
														
 
															+  update the second argument in place. In contrast, \LangVar{}
														
 
															   arithmetic operations take two arguments and produce a new value.
														
 
															   An x86 instruction may have at most one memory-accessing argument.
														
 
															-  Furthermore, some instructions place special restrictions on their
														
 
															-  arguments.
														
 
															+  Furthermore, some x86 instructions place special restrictions on
														
 
															+  their arguments.
														
 
															 \item An argument of an \LangVar{} operator can be a deeply-nested
														
 
															   expression, whereas x86 instructions restrict their arguments to be
														
@@ -2592,13 +2592,17 @@ and x86 assembly? Here are some of the most important ones:
 
															 \fi}  
														
 
															 \end{enumerate}
														
 
															-We ease the challenge of compiling from \LangVar{} to x86 by breaking down
														
 
															-the problem into several steps, dealing with the above differences one
														
 
															-at a time.  Each of these steps is called a \emph{pass} of the
														
 
															-compiler.\index{subject}{pass}\index{subject}{compiler pass}
														
 
															+We ease the challenge of compiling from \LangVar{} to x86 by breaking
														
 
															+down the problem into several steps, dealing with the above
														
 
															+differences one at a time. Each of these steps is called a \emph{pass}
														
 
															+of the compiler.\index{subject}{pass}\index{subject}{compiler pass}
														
 
															 %
														
 
															-This terminology comes from the way each step passes over the AST of
														
 
															-the program.
														
 
															+This terminology comes from the way each step passes over, that is,
														
 
															+traverses, the AST of the program.
														
 
															+%
														
 
															+Furthermore, we follow the nanopass approach, which means we strive
														
 
															+for each pass to accomplish one clear objective (not two or three at
														
 
															+the same time).
														
 
															 %
														
 
															 We begin by sketching how we might implement each pass, and give them
														
 
															 names.  We then figure out an ordering of the passes and the
														
@@ -2611,6 +2615,8 @@ our own design.  Finally, to implement each pass we write one
 
															 recursive function per non-terminal in the grammar of the input
														
 
															 language of the pass.  \index{subject}{intermediate language}
														
 
															+Our compiler for \LangVar{} consists of the following passes.
														
 
															+%
														
 
															 \begin{description}
														
 
															 {\if\edition\racketEd
														
 
															 \item[\key{uniquify}] deals with the shadowing of variables by
														
@@ -2627,7 +2633,7 @@ language of the pass.  \index{subject}{intermediate language}
 
															 {\if\edition\racketEd
														
 
															 \item[\key{explicate\_control}] makes the execution order of the
														
 
															-  program explicit. It convert the abstract syntax tree representation
														
 
															+  program explicit. It converts the abstract syntax tree representation
														
 
															   into a control-flow graph in which each node contains a sequence of
														
 
															   statements and the edges between nodes say which nodes contain jumps
														
 
															   to other nodes.
														
@@ -2638,9 +2644,18 @@ language of the pass.  \index{subject}{intermediate language}
 
															   \LangVar{} operation to a short sequence of instructions that
														
 
															   accomplishes the same task.
														
 
															-\item[\key{assign\_homes}] replaces the variables in \LangVar{} with
														
 
															-  registers or stack locations in x86.
														
 
															+\item[\key{assign\_homes}] replaces variables with registers or stack
														
 
															+  locations.
														
 
															 \end{description}
														
 
															+%
														
 
															+{\if\edition\racketEd
														
 
															+%
														
 
															+Our treatment of \code{remove\_complex\_operands} and
														
 
															+\code{explicate\_control} as separate passes is an example of the
														
 
															+nanopass approach. The traditional approach is to combine them into a
														
 
															+single step~\citep{Aho:2006wb}.
														
 
															+%  
														
 
															+\fi}
														
 
															 The next question is: in what order should we apply these passes? This
														
 
															 question can be challenging because it is difficult to know ahead of
														
@@ -2680,10 +2695,9 @@ be forced to assign both arguments to memory locations.
 
															 %
														
 
															 A sophisticated approach is to iteratively repeat the two passes until
														
 
															 a solution is found. However, to reduce implementation complexity we
														
 
															-recommend a simpler approach in which \key{select\_instructions} comes
														
 
															-first, followed by the \key{assign\_homes}, then a third pass named
														
 
															-\key{patch\_instructions} that uses a reserved register to fix
														
 
															-outstanding problems.
														
 
															+recommend placing \key{select\_instructions} first, followed by
														
 
															+\key{assign\_homes}, then a third pass named \key{patch\_instructions}
														
 
															+that uses a reserved register to fix outstanding problems.
														
 
															 \begin{figure}[tbp]
														
 
															 {\if\edition\racketEd  
														
@@ -2767,6 +2781,10 @@ of each of the compiler passes in Figure~\ref{fig:Lvar-passes}.
 
															 The output of \code{explicate\_control} is similar to the $C$
														
 
															 language~\citep{Kernighan:1988nx} in that it has separate syntactic
														
 
															 categories for expressions and statements, so we name it \LangCVar{}.
														
 
															+This style of intermediate language is also known as
														
 
															+\emph{three-address code}, to emphasize that the typical form of a
														
 
															+statement, \CASSIGN{\key{x}}{\CADD{\key{y}}{\key{z}}}, involves three
														
 
															+addresses~\citep{Aho:2006wb}.
														
 
															 The concrete syntax for \LangCVar{} is defined in
														
 
															 Figure~\ref{fig:c0-concrete-syntax} and the abstract syntax for
														
@@ -3038,8 +3056,7 @@ Figure~\ref{fig:Lvar-anf-syntax} presents the grammar for the output
 
															 of this pass, the language \LangVarANF{}. The only difference is that
														
 
															 operator arguments are restricted to be atomic expressions that are
														
 
															 defined by the \Atm{} non-terminal. In particular, integer constants
														
 
															-and variables are atomic. This restriction brings us closer to what is
														
 
															-known as a \emph{three-address code}~\citep{Aho:1986qf} language.
														
 
															+and variables are atomic.
														
 
															 The atomic expressions are pure (they do not cause side-effects or
														
 
															 depend on them) whereas complex expressions may have side effects,
														
@@ -3053,7 +3070,8 @@ between atomic expressions and complex expressions can change and
 
															 often does. The reason that these changes are behaviour preserving is
														
 
															 that the atomic expressions are pure.
														
 
															-Another well-known form is the \emph{administrative normal form}
														
 
															+Another well-known form for intermediate languages is the
														
 
															+\emph{administrative normal form}
														
 
															 (ANF)~\citep{Danvy:1991fk,Flanagan:1993cg}.
														
 
															 \index{subject}{administrative normal form} \index{subject}{ANF}
														
 
															 %
														
@@ -7487,7 +7505,7 @@ R^{\mathsf{ANF}}_{\mathsf{if}}  &::=& \PROGRAM{\code{()}}{\Exp}
 
															 \fi}
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{\LangIfANF{} is \LangIf{} in administrative normal form (ANF).}
														
 
															+\caption{\LangIfANF{} is \LangIf{} in monadic normal form.}
														
 
															 \label{fig:Lif-anf-syntax}
														
 
															 \end{figure}
														
@@ -9283,12 +9301,14 @@ blocks on several test programs.
 
															 \section{Further Reading}
														
 
															 \label{sec:cond-further-reading}
														
 
															-The algorithm for the \code{explicate\_control} pass comes from the
														
 
															-course notes of \citet{Dybvig:2010aa} and it has several similarities
														
 
															-to an algorithm of \citet{Danvy:2003fk}. The treatment of conditionals
														
 
															-in the \code{explicate\_control} pass is similar to the case-of-case
														
 
															-transformation of \citet{PeytonJones:1998} and to short-cut boolean
														
 
															-evaluation~\citep{Logothetis:1981,Aho:1986qf,Clarke:1989,Danvy:2003fk}.
														
 
															+The algorithm for the \code{explicate\_control} pass is based on the
														
 
															+\code{expose-basic-blocks} pass in the course notes of
														
 
															+\citet{Dybvig:2010aa}. It has several similarities to the algorithms
														
 
															+of \citet{Danvy:2003fk} and \citet{Appel:2003fk}. The treatment of
														
 
															+conditionals in the \code{explicate\_control} pass is similar to the
														
 
															+case-of-case transformation of \citet{PeytonJones:1998} and to
														
 
															+short-cut boolean
														
 
															+evaluation~\citep{Logothetis:1981,Aho:2006wb,Clarke:1989,Danvy:2003fk}.
														
 
															 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
														
 
															 \chapter{Loops and Dataflow Analysis}
														
@@ -10168,7 +10188,7 @@ Figure~\ref{fig:Rwhile-anf-syntax} defines the output language
 
															 \fi}
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{\LangLoopANF{} is \LangLoop{} in administrative normal form (ANF).}
														
 
															+\caption{\LangLoopANF{} is \LangLoop{} in monadic normal form.}
														
 
															 \label{fig:Rwhile-anf-syntax}
														
 
															 \end{figure}
														
@@ -11333,7 +11353,7 @@ should all be treated as complex operands.
 
															 %% from its enclosing \code{HasType}.
														
 
															 Figure~\ref{fig:Lvec-anf-syntax}
														
 
															 shows the grammar for the output language \LangVecANF{} of this
														
 
															-pass, which is \LangVec{} in administrative normal form.
														
 
															+pass, which is \LangVec{} in monadic normal form.
														
 
															 \begin{figure}[tp]
														
 
															 \centering
														
@@ -11357,7 +11377,7 @@ pass, which is \LangVec{} in administrative normal form.
 
															 \]
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{\LangVecANF{} is \LangVec{} in administrative normal form (ANF).}
														
 
															+\caption{\LangVecANF{} is \LangVec{} in monadic normal form.}
														
 
															 \label{fig:Lvec-anf-syntax}
														
 
															 \end{figure}
														
@@ -12954,7 +12974,7 @@ R^{\dagger}_4  &::=& \gray{ \PROGRAMDEFS{\code{'()}}{\Def} }
 
															 \]
														
 
															 \end{minipage}
														
 
															 }
														
 
															-\caption{\LangFunANF{} is \LangFun{} in administrative normal form (ANF).}
														
 
															+\caption{\LangFunANF{} is \LangFunRefAlloc{} in monadic normal form.}
														
 
															 \label{fig:Rfun-anf-syntax}
														
 
															 \end{figure}
														
@@ -14229,15 +14249,15 @@ extract the $5$-bits starting at position $58$ from the tag.
 
															 \begin{figure}[p]
														
 
															 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
														
 
															-\node (Rfun) at (0,2)  {\large \LangFun{}};
														
 
															-\node (Rfun-2) at (3,2)  {\large \LangFun{}};
														
 
															-\node (Rfun-3) at (6,2)  {\large \LangFun{}};
														
 
															-\node (F1-0) at (9,2)  {\large \LangFunRef{}};
														
 
															-\node (F1-1) at (12,0)  {\large \LangFunRef{}};
														
 
															+\node (Rfun) at (0,2)  {\large \LangLam{}};
														
 
															+\node (Rfun-2) at (3,2)  {\large \LangLam{}};
														
 
															+\node (Rfun-3) at (6,2)  {\large \LangLam{}};
														
 
															+\node (F1-0) at (9,2)  {\large \LangLamFunRef{}};
														
 
															+\node (F1-1) at (12,0)  {\large \LangLamFunRef{}};
														
 
															 \node (F1-2) at (9,0)  {\large \LangFunRef{}};
														
 
															-\node (F1-3) at (6,0)  {\large $F_1$};
														
 
															-\node (F1-4) at (3,0)  {\large $F_1$};
														
 
															-\node (F1-5) at (0,0)  {\large $F^{\RCO}_1$};
														
 
															+\node (F1-3) at (6,0)  {\large \LangFunRef{}};
														
 
															+\node (F1-4) at (3,0)  {\large \LangFunRefAlloc{}};
														
 
															+\node (F1-5) at (0,0)  {\large \LangFunANF{}};
														
 
															 \node (C3-2) at (3,-2)  {\large \LangCFun{}};
														
 
															 \node (x86-2) at (3,-4)  {\large \LangXIndCallVar{}};
														
--- a/defs.tex
+++ b/defs.tex
@@ -37,11 +37,12 @@
 
															 \newcommand{\LangFunM}{\Lang_{\mathsf{Fun}}} %R4
														
 
															 \newcommand{\LangCFun}{$\CLang_{\mathsf{Fun}}$} %C3
														
 
															 \newcommand{\LangCFunM}{\CLang_{\mathsf{Fun}}} %C3
														
 
															-\newcommand{\LangFunANF}{\ensuremath{\Lang^{\RCO}_{\mathsf{Fun}}}} %R4
														
 
															+\newcommand{\LangFunANF}{\ensuremath{\Lang^{\RCO}_{\mathsf{FunRef}}}} %R4
														
 
															 \newcommand{\LangFunRef}{$\Lang_{\mathsf{FunRef}}$} %F1
														
 
															 \newcommand{\LangFunRefM}{\Lang_{\mathsf{FunRef}}} %F1
														
 
															 \newcommand{\LangFunRefAlloc}{\ensuremath{\Lang^{\mathsf{Alloc}}_{\mathsf{FunRef}}}} %R'4
														
 
															 \newcommand{\LangLam}{$\Lang_\lambda$} %R5
														
 
															+\newcommand{\LangLamFunRef}{$\Lang_\lambda^{\mathsf{FunRef}}$} 
														
 
															 \newcommand{\LangLamM}{\ensuremath{\Lang_\lambda}} %R5
														
 
															 \newcommand{\LangCLam}{$\CLang_{\mathsf{Clos}}$} %C4
														
 
															 \newcommand{\LangCLamM}{\CLang_{\mathsf{Clos}}} %C4