3 years ago · 37ad5fd61f
--- a/book.bib
+++ b/book.bib
@@ -1,3 +1,14 @@
 
				+
			
 
				+@book{Aho:2006wb,
			
 
				+	address = {USA},
			
 
				+	author = {Aho, Alfred V. and Lam, Monica S. and Sethi, Ravi and Ullman, Jeffrey D.},
			
 
				+	date-added = {2021-10-22 09:38:52 -0400},
			
 
				+	date-modified = {2021-10-22 09:38:59 -0400},
			
 
				+	isbn = {0321486811},
			
 
				+	publisher = {Addison-Wesley Longman Publishing Co., Inc.},
			
 
				+	title = {Compilers: Principles, Techniques, and Tools (2nd Edition)},
			
 
				+	year = {2006}}
			
 
				+
			
 
				 @article{Logothetis:1981,
			
 
				 author = {Logothetis, George and Mishra, Prateek},
			
 
				 title = {Compiling short-circuit boolean expressions in one pass},
			
--- a/book.tex
+++ b/book.tex
@@ -454,7 +454,7 @@ that efficiently supports the operations that the compiler needs to
 
				 perform.\index{subject}{concrete syntax}\index{subject}{abstract syntax}\index{subject}{abstract
			
 
				   syntax tree}\index{subject}{AST}\index{subject}{program}\index{subject}{parse} The translation
			
 
				 from concrete syntax to abstract syntax is a process called
			
 
				-\emph{parsing}~\citep{Aho:1986qf}. We do not cover the theory and
			
 
				+\emph{parsing}~\citep{Aho:2006wb}. We do not cover the theory and
			
 
				 implementation of parsing in this book.
			
 
				 %
			
 
				 \racket{A parser is provided in the support code for translating from
			
@@ -2565,12 +2565,12 @@ to bridge those differences. What are the differences between \LangVar{}
 
				 and x86 assembly? Here are some of the most important ones:
			
 
				 
			
 
				 \begin{enumerate}
			
 
				-\item x86 arithmetic instructions typically have two arguments
			
 
				-  and update the second argument in place. In contrast, \LangVar{}
			
 
				+\item x86 arithmetic instructions typically have two arguments and
			
 
				+  update the second argument in place. In contrast, \LangVar{}
			
 
				   arithmetic operations take two arguments and produce a new value.
			
 
				   An x86 instruction may have at most one memory-accessing argument.
			
 
				-  Furthermore, some instructions place special restrictions on their
			
 
				-  arguments.
			
 
				+  Furthermore, some x86 instructions place special restrictions on
			
 
				+  their arguments.
			
 
				 
			
 
				 \item An argument of an \LangVar{} operator can be a deeply-nested
			
 
				   expression, whereas x86 instructions restrict their arguments to be
			
@@ -2592,13 +2592,17 @@ and x86 assembly? Here are some of the most important ones:
 
				 \fi}  
			
 
				 \end{enumerate}
			
 
				 
			
 
				-We ease the challenge of compiling from \LangVar{} to x86 by breaking down
			
 
				-the problem into several steps, dealing with the above differences one
			
 
				-at a time.  Each of these steps is called a \emph{pass} of the
			
 
				-compiler.\index{subject}{pass}\index{subject}{compiler pass}
			
 
				+We ease the challenge of compiling from \LangVar{} to x86 by breaking
			
 
				+down the problem into several steps, dealing with the above
			
 
				+differences one at a time. Each of these steps is called a \emph{pass}
			
 
				+of the compiler.\index{subject}{pass}\index{subject}{compiler pass}
			
 
				 %
			
 
				-This terminology comes from the way each step passes over the AST of
			
 
				-the program.
			
 
				+This terminology comes from the way each step passes over, that is,
			
 
				+traverses the AST of the program.
			
 
				+%
			
 
				+Furthermore, we follow the nanopass approach, which means we strive
			
 
				+for each pass to accomplish one clear objective (not two or three at
			
 
				+the same time).
			
 
				 %
			
 
				 We begin by sketching how we might implement each pass, and give them
			
 
				 names.  We then figure out an ordering of the passes and the
			
@@ -2611,6 +2615,8 @@ our own design.  Finally, to implement each pass we write one
 
				 recursive function per non-terminal in the grammar of the input
			
 
				 language of the pass.  \index{subject}{intermediate language}
			
 
				 
			
 
				+Our compiler for \LangVar{} consists of the following passes.
			
 
				+%
			
 
				 \begin{description}
			
 
				 {\if\edition\racketEd
			
 
				 \item[\key{uniquify}] deals with the shadowing of variables by
			
@@ -2627,7 +2633,7 @@ language of the pass.  \index{subject}{intermediate language}
 
				   
			
 
				 {\if\edition\racketEd
			
 
				 \item[\key{explicate\_control}] makes the execution order of the
			
 
				-  program explicit. It convert the abstract syntax tree representation
			
 
				+  program explicit. It converts the abstract syntax tree representation
			
 
				   into a control-flow graph in which each node contains a sequence of
			
 
				   statements and the edges between nodes say which nodes contain jumps
			
 
				   to other nodes.
			
@@ -2638,9 +2644,18 @@ language of the pass.  \index{subject}{intermediate language}
 
				   \LangVar{} operation to a short sequence of instructions that
			
 
				   accomplishes the same task.
			
 
				 
			
 
				-\item[\key{assign\_homes}] replaces the variables in \LangVar{} with
			
 
				-  registers or stack locations in x86.
			
 
				+\item[\key{assign\_homes}] replaces variables with registers or stack
			
 
				+  locations.
			
 
				 \end{description}
			
 
				+%
			
 
				+{\if\edition\racketEd
			
 
				+%
			
 
				+Our treatment of \code{remove\_complex\_operands} and
			
 
				+\code{explicate\_control} as separate passes is an example of the
			
 
				+nanopass approach. The traditional approach is to combine them into a
			
 
				+single step~\citep{Aho:2006wb}.
			
 
				+%  
			
 
				+\fi}
			
 
				 
			
 
				 The next question is: in what order should we apply these passes? This
			
 
				 question can be challenging because it is difficult to know ahead of
			
@@ -2680,10 +2695,9 @@ be forced to assign both arguments to memory locations.
 
				 %
			
 
				 A sophisticated approach is to iteratively repeat the two passes until
			
 
				 a solution is found. However, to reduce implementation complexity we
			
 
				-recommend a simpler approach in which \key{select\_instructions} comes
			
 
				-first, followed by the \key{assign\_homes}, then a third pass named
			
 
				-\key{patch\_instructions} that uses a reserved register to fix
			
 
				-outstanding problems.
			
 
				+recommend placing \key{select\_instructions} first, followed by
			
 
				+\key{assign\_homes}, then a third pass named \key{patch\_instructions}
			
 
				+that uses a reserved register to fix outstanding problems.
			
 
				 
			
 
				 \begin{figure}[tbp]
			
 
				 {\if\edition\racketEd  
			
@@ -2767,6 +2781,10 @@ of each of the compiler passes in Figure~\ref{fig:Lvar-passes}.
 
				 The output of \code{explicate\_control} is similar to the $C$
			
 
				 language~\citep{Kernighan:1988nx} in that it has separate syntactic
			
 
				 categories for expressions and statements, so we name it \LangCVar{}.
			
 
				+This style of intermediate language is also known as
			
 
				+\emph{three-address code}, to emphasize that the typical form of a
			
 
				+statement, \CASSIGN{\key{x}}{\CADD{\key{y}}{\key{z}}}, involves three
			
 
				+addresses~\citep{Aho:2006wb}.
			
 
				 
			
 
				 The concrete syntax for \LangCVar{} is defined in
			
 
				 Figure~\ref{fig:c0-concrete-syntax} and the abstract syntax for
			
@@ -3038,8 +3056,7 @@ Figure~\ref{fig:Lvar-anf-syntax} presents the grammar for the output
 
				 of this pass, the language \LangVarANF{}. The only difference is that
			
 
				 operator arguments are restricted to be atomic expressions that are
			
 
				 defined by the \Atm{} non-terminal. In particular, integer constants
			
 
				-and variables are atomic. This restriction brings us closer to what is
			
 
				-known as a \emph{three-address code}~\citep{Aho:1986qf} language.
			
 
				+and variables are atomic.
			
 
				 
			
 
				 The atomic expressions are pure (they do not cause side-effects or
			
 
				 depend on them) whereas complex expressions may have side effects,
			
@@ -3053,7 +3070,8 @@ between atomic expressions and complex expressions can change and
 
				 often does. The reason that these changes are behaviour preserving is
			
 
				 that the atomic expressions are pure.
			
 
				 
			
 
				-Another well-known form is the \emph{administrative normal form}
			
 
				+Another well-known form for intermediate languages is the
			
 
				+\emph{administrative normal form}
			
 
				 (ANF)~\citep{Danvy:1991fk,Flanagan:1993cg}.
			
 
				 \index{subject}{administrative normal form} \index{subject}{ANF}
			
 
				 %
			
@@ -7487,7 +7505,7 @@ R^{\mathsf{ANF}}_{\mathsf{if}}  &::=& \PROGRAM{\code{()}}{\Exp}
 
				 \fi}
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{\LangIfANF{} is \LangIf{} in administrative normal form (ANF).}
			
 
				+\caption{\LangIfANF{} is \LangIf{} in monadic normal form.}
			
 
				 \label{fig:Lif-anf-syntax}
			
 
				 \end{figure}
			
 
				 
			
@@ -9283,12 +9301,14 @@ blocks on several test programs.
 
				 \section{Further Reading}
			
 
				 \label{sec:cond-further-reading}
			
 
				 
			
 
				-The algorithm for the \code{explicate\_control} pass comes from the
			
 
				-course notes of \citet{Dybvig:2010aa} and it has several similarities
			
 
				-to an algorithm of \citet{Danvy:2003fk}. The treatment of conditionals
			
 
				-in the \code{explicate\_control} pass is similar to the case-of-case
			
 
				-transformation of \citet{PeytonJones:1998} and to short-cut boolean
			
 
				-evaluation~\citep{Logothetis:1981,Aho:1986qf,Clarke:1989,Danvy:2003fk}.
			
 
				+The algorithm for the \code{explicate\_control} pass is based on the
			
 
				+\code{expose-basic-blocks} pass in the course notes of
			
 
				+\citet{Dybvig:2010aa}. It has several similarities to the algorithms
			
 
				+of \citet{Danvy:2003fk} and \citet{Appel:2003fk}. The treatment of
			
 
				+conditionals in the \code{explicate\_control} pass is similar to the
			
 
				+case-of-case transformation of \citet{PeytonJones:1998} and to
			
 
				+short-cut boolean
			
 
				+evaluation~\citep{Logothetis:1981,Aho:2006wb,Clarke:1989,Danvy:2003fk}.
			
 
				 
			
 
				 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
			
 
				 \chapter{Loops and Dataflow Analysis}
			
@@ -10168,7 +10188,7 @@ Figure~\ref{fig:Rwhile-anf-syntax} defines the output language
 
				 \fi}
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{\LangLoopANF{} is \LangLoop{} in administrative normal form (ANF).}
			
 
				+\caption{\LangLoopANF{} is \LangLoop{} in monadic normal form.}
			
 
				 \label{fig:Rwhile-anf-syntax}
			
 
				 \end{figure}
			
 
				 
			
@@ -11333,7 +11353,7 @@ should all be treated as complex operands.
 
				 %% from its enclosing \code{HasType}.
			
 
				 Figure~\ref{fig:Lvec-anf-syntax}
			
 
				 shows the grammar for the output language \LangVecANF{} of this
			
 
				-pass, which is \LangVec{} in administrative normal form.
			
 
				+pass, which is \LangVec{} in monadic normal form.
			
 
				 
			
 
				 \begin{figure}[tp]
			
 
				 \centering
			
@@ -11357,7 +11377,7 @@ pass, which is \LangVec{} in administrative normal form.
 
				 \]
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{\LangVecANF{} is \LangVec{} in administrative normal form (ANF).}
			
 
				+\caption{\LangVecANF{} is \LangVec{} in monadic normal form.}
			
 
				 \label{fig:Lvec-anf-syntax}
			
 
				 \end{figure}
			
 
				 
			
@@ -12954,7 +12974,7 @@ R^{\dagger}_4  &::=& \gray{ \PROGRAMDEFS{\code{'()}}{\Def} }
 
				 \]
			
 
				 \end{minipage}
			
 
				 }
			
 
				-\caption{\LangFunANF{} is \LangFun{} in administrative normal form (ANF).}
			
 
				+\caption{\LangFunANF{} is \LangFunRefAlloc{} in monadic normal form.}
			
 
				 \label{fig:Rfun-anf-syntax}
			
 
				 \end{figure}
			
 
				 
			
@@ -14229,15 +14249,15 @@ extract the $5$-bits starting at position $58$ from the tag.
 
				 
			
 
				 \begin{figure}[p]
			
 
				 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\node (Rfun) at (0,2)  {\large \LangFun{}};
			
 
				-\node (Rfun-2) at (3,2)  {\large \LangFun{}};
			
 
				-\node (Rfun-3) at (6,2)  {\large \LangFun{}};
			
 
				-\node (F1-0) at (9,2)  {\large \LangFunRef{}};
			
 
				-\node (F1-1) at (12,0)  {\large \LangFunRef{}};
			
 
				+\node (Rfun) at (0,2)  {\large \LangLam{}};
			
 
				+\node (Rfun-2) at (3,2)  {\large \LangLam{}};
			
 
				+\node (Rfun-3) at (6,2)  {\large \LangLam{}};
			
 
				+\node (F1-0) at (9,2)  {\large \LangLamFunRef{}};
			
 
				+\node (F1-1) at (12,0)  {\large \LangLamFunRef{}};
			
 
				 \node (F1-2) at (9,0)  {\large \LangFunRef{}};
			
 
				-\node (F1-3) at (6,0)  {\large $F_1$};
			
 
				-\node (F1-4) at (3,0)  {\large $F_1$};
			
 
				-\node (F1-5) at (0,0)  {\large $F^{\RCO}_1$};
			
 
				+\node (F1-3) at (6,0)  {\large \LangFunRef{}};
			
 
				+\node (F1-4) at (3,0)  {\large \LangFunRefAlloc{}};
			
 
				+\node (F1-5) at (0,0)  {\large \LangFunANF{}};
			
 
				 \node (C3-2) at (3,-2)  {\large \LangCFun{}};
			
 
				 
			
 
				 \node (x86-2) at (3,-4)  {\large \LangXIndCallVar{}};
			
--- a/defs.tex
+++ b/defs.tex
@@ -37,11 +37,12 @@
 
				 \newcommand{\LangFunM}{\Lang_{\mathsf{Fun}}} %R4
			
 
				 \newcommand{\LangCFun}{$\CLang_{\mathsf{Fun}}$} %C3
			
 
				 \newcommand{\LangCFunM}{\CLang_{\mathsf{Fun}}} %C3
			
 
				-\newcommand{\LangFunANF}{\ensuremath{\Lang^{\RCO}_{\mathsf{Fun}}}} %R4
			
 
				+\newcommand{\LangFunANF}{\ensuremath{\Lang^{\RCO}_{\mathsf{FunRef}}}} %R4
			
 
				 \newcommand{\LangFunRef}{$\Lang_{\mathsf{FunRef}}$} %F1
			
 
				 \newcommand{\LangFunRefM}{\Lang_{\mathsf{FunRef}}} %F1
			
 
				 \newcommand{\LangFunRefAlloc}{\ensuremath{\Lang^{\mathsf{Alloc}}_{\mathsf{FunRef}}}} %R'4
			
 
				 \newcommand{\LangLam}{$\Lang_\lambda$} %R5
			
 
				+\newcommand{\LangLamFunRef}{$\Lang_\lambda^{\mathsf{FunRef}}$} 
			
 
				 \newcommand{\LangLamM}{\ensuremath{\Lang_\lambda}} %R5
			
 
				 \newcommand{\LangCLam}{$\CLang_{\mathsf{Clos}}$} %C4
			
 
				 \newcommand{\LangCLamM}{\CLang_{\mathsf{Clos}}} %C4
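
Note on the book.tex hunk at @@ -2767,6 +2781,10 @@: the added text describes three-address code as a form in which a typical statement, such as x = y + z, involves three addresses. The following is a minimal Python sketch of that idea, not the book's implementation. The names `Add`, `to_three_address`, and the `tmpN` temporaries are hypothetical and chosen only for illustration. The sketch flattens a nested addition so that every emitted statement has atomic operands (constants or variables) only.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


@dataclass(frozen=True)
class Add:
    """Hypothetical AST node for an addition with two sub-expressions."""
    left: "Expr"
    right: "Expr"


# An expression is an integer constant, a variable name, or an addition.
Expr = Union[int, str, Add]


def to_three_address(target: str, expr: Expr) -> list[str]:
    """Flatten `target = expr` into statements whose operands are atomic."""
    fresh = count()          # source of unique temporary numbers
    stmts: list[str] = []    # emitted statements, in execution order

    def atomize(e: Expr) -> str:
        # Constants and variables are already atomic.
        if isinstance(e, (int, str)):
            return str(e)
        # A nested addition is computed into its own temporary first.
        tmp = f"tmp{next(fresh)}"
        emit(tmp, e)
        return tmp

    def emit(dst: str, e: Expr) -> None:
        if isinstance(e, Add):
            left = atomize(e.left)
            right = atomize(e.right)
            # Three addresses: dst, left, right.
            stmts.append(f"{dst} = {left} + {right}")
        else:
            stmts.append(f"{dst} = {e}")

    emit(target, expr)
    return stmts


if __name__ == "__main__":
    # x = (a + 1) + (b + c)
    for line in to_three_address("x", Add(Add("a", 1), Add("b", "c"))):
        print(line)
```

Each nested operand is bound to a temporary before the statement that uses it, so the evaluation order becomes explicit in the statement list.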