|
@@ -1,7 +1,9 @@
|
|
\documentclass[7x10]{TimesAPriori_MIT}%%7x10
|
|
\documentclass[7x10]{TimesAPriori_MIT}%%7x10
|
|
|
|
|
|
% TODO:
|
|
% TODO:
|
|
-% move binary subtraction from Lif to Lint
|
|
|
|
|
|
+%
|
|
|
|
+% * TOC: Booleans and Conditions, move to page 2
|
|
|
|
+
|
|
|
|
|
|
\usepackage[utf8]{inputenc}
|
|
\usepackage[utf8]{inputenc}
|
|
%% \usepackage{setspace}
|
|
%% \usepackage{setspace}
|
|
@@ -148,9 +150,17 @@ contributions of these otherwise uncredited readers.
|
|
This book was set in Times LT Std Roman by the author. Printed and
|
|
This book was set in Times LT Std Roman by the author. Printed and
|
|
bound in the United States of America.
|
|
bound in the United States of America.
|
|
|
|
|
|
-Library of Congress Cataloging-in-Publication Data is available.
|
|
|
|
-
|
|
|
|
-ISBN:
|
|
|
|
|
|
+Library of Congress Cataloging-in-Publication Data\\
|
|
|
|
+\ \\
|
|
|
|
+Names: Siek, Jeremy, author. \\
|
|
|
|
+Title: Essentials of compilation : an incremental approach in Racket / Jeremy G. Siek. \\
|
|
|
|
+Description: Cambridge, Massachusetts : The MIT Press, [2023] | Includes bibliographical references and index. \\
|
|
|
|
+Identifiers: LCCN 2022015399 (print) | LCCN 2022015400 (ebook) | ISBN 9780262047760 (hardcover) | ISBN 9780262373272 (epub) | ISBN 9780262373289 (pdf) \\
|
|
|
|
+Subjects: LCSH: Racket (Computer program language) | Compilers (Computer programs) \\
|
|
|
|
+Classification: LCC QA76.73.R33 S54 2023 (print) | LCC QA76.73.R33 (ebook) | DDC 005.13/3--dc23/eng/20220705 \\
|
|
|
|
+LC record available at https://lccn.loc.gov/2022015399\\
|
|
|
|
+LC ebook record available at https://lccn.loc.gov/2022015400\\
|
|
|
|
+\ \\
|
|
|
|
|
|
10 9 8 7 6 5 4 3 2 1
|
|
10 9 8 7 6 5 4 3 2 1
|
|
|
|
|
|
@@ -204,10 +214,10 @@ ISBN:
|
|
\chapter*{Preface}
|
|
\chapter*{Preface}
|
|
\addcontentsline{toc}{fmbm}{Preface}
|
|
\addcontentsline{toc}{fmbm}{Preface}
|
|
|
|
|
|
-There is a magical moment when a programmer presses the run button
|
|
|
|
-and the software begins to execute. Somehow a program written in a
|
|
|
|
-high-level language is running on a computer that is capable only of
|
|
|
|
-shuffling bits. Here we reveal the wizardry that makes that moment
|
|
|
|
|
|
+There is a magical moment when a programmer presses the \emph{run}
|
|
|
|
+button and the software begins to execute. Somehow a program written
|
|
|
|
+in a high-level language is running on a computer that is capable only
|
|
|
|
+of shuffling bits. Here we reveal the wizardry that makes that moment
|
|
possible. Beginning with the groundbreaking work of Backus and
|
|
possible. Beginning with the groundbreaking work of Backus and
|
|
colleagues in the 1950s, computer scientists developed techniques for
|
|
colleagues in the 1950s, computer scientists developed techniques for
|
|
constructing programs called \emph{compilers} that automatically
|
|
constructing programs called \emph{compilers} that automatically
|
|
@@ -247,7 +257,7 @@ Our choice of language features is designed to elicit fundamental
|
|
concepts and algorithms used in compilers.
|
|
concepts and algorithms used in compilers.
|
|
\begin{itemize}
|
|
\begin{itemize}
|
|
\item We begin with integer arithmetic and local variables in
|
|
\item We begin with integer arithmetic and local variables in
|
|
- Chapters~\ref{ch:trees-recur} and \ref{ch:Lvar}, where we introduce
|
|
|
|
|
|
+ chapters~\ref{ch:trees-recur} and \ref{ch:Lvar}, where we introduce
|
|
the fundamental tools of compiler construction: \emph{abstract
|
|
the fundamental tools of compiler construction: \emph{abstract
|
|
syntax trees} and \emph{recursive functions}.
|
|
syntax trees} and \emph{recursive functions}.
|
|
{\if\edition\pythonEd\pythonColor
|
|
{\if\edition\pythonEd\pythonColor
|
|
@@ -287,7 +297,7 @@ concepts and algorithms used in compilers.
|
|
%% \emph{classes}.
|
|
%% \emph{classes}.
|
|
%% \fi}
|
|
%% \fi}
|
|
\item Chapter~\ref{ch:Lgrad} uses the \code{Any} type introduced in
|
|
\item Chapter~\ref{ch:Lgrad} uses the \code{Any} type introduced in
|
|
- Chapter~\ref{ch:Ldyn} to implement a \emph{gradually typed language}
|
|
|
|
|
|
+ chapter~\ref{ch:Ldyn} to implement a \emph{gradually typed language}
|
|
  in which different regions of a program may be statically or dynamically
|
|
  in which different regions of a program may be statically or dynamically
|
|
typed. The reader implements runtime support for \emph{proxies} that
|
|
typed. The reader implements runtime support for \emph{proxies} that
|
|
allow values to safely move between regions.
|
|
allow values to safely move between regions.
|
|
@@ -295,6 +305,7 @@ concepts and algorithms used in compilers.
|
|
leveraging the \code{Any} type and type casts developed in chapters
|
|
leveraging the \code{Any} type and type casts developed in chapters
|
|
\ref{ch:Ldyn} and \ref{ch:Lgrad}.
|
|
\ref{ch:Ldyn} and \ref{ch:Lgrad}.
|
|
\end{itemize}
|
|
\end{itemize}
|
|
|
|
+
|
|
There are many language features that we do not include. Our choices
|
|
There are many language features that we do not include. Our choices
|
|
balance the incidental complexity of a feature versus the fundamental
|
|
balance the incidental complexity of a feature versus the fundamental
|
|
concepts that it exposes. For example, we include tuples and not
|
|
concepts that it exposes. For example, we include tuples and not
|
|
@@ -302,7 +313,7 @@ records because although they both elicit the study of heap allocation and
|
|
garbage collection, records come with more incidental complexity.
|
|
garbage collection, records come with more incidental complexity.
|
|
|
|
|
|
Since 2009, drafts of this book have served as the textbook for
|
|
Since 2009, drafts of this book have served as the textbook for
|
|
-sixteen week compiler courses for upper-level undergraduates and
|
|
|
|
|
|
+sixteen-week compiler courses for upper-level undergraduates and
|
|
first-year graduate students at the University of Colorado and Indiana
|
|
first-year graduate students at the University of Colorado and Indiana
|
|
University.
|
|
University.
|
|
%
|
|
%
|
|
@@ -323,7 +334,9 @@ The last two weeks of the course involve a final project in which
|
|
students design and implement a compiler extension of their choosing.
|
|
students design and implement a compiler extension of their choosing.
|
|
The last few chapters can be used in support of these projects. Many
|
|
The last few chapters can be used in support of these projects. Many
|
|
chapters include a challenge problem that we assign to the graduate
|
|
chapters include a challenge problem that we assign to the graduate
|
|
-students. For compiler courses at universities on the quarter system
|
|
|
|
|
|
+students.
|
|
|
|
+
|
|
|
|
+For compiler courses at universities on the quarter system
|
|
(about ten weeks in length), we recommend completing the course
|
|
(about ten weeks in length), we recommend completing the course
|
|
through chapter~\ref{ch:Lvec} or chapter~\ref{ch:Lfun} and providing
|
|
through chapter~\ref{ch:Lvec} or chapter~\ref{ch:Lfun} and providing
|
|
some scaffolding code to the students for each compiler pass.
|
|
some scaffolding code to the students for each compiler pass.
|
|
@@ -409,19 +422,17 @@ Lowell, and the University of Vermont.
|
|
\label{fig:chapter-dependences}
|
|
\label{fig:chapter-dependences}
|
|
\end{figure}
|
|
\end{figure}
|
|
|
|
|
|
-\racket{
|
|
|
|
-We use the \href{https://racket-lang.org/}{Racket} language both for
|
|
|
|
|
|
+\racket{We use the \href{https://racket-lang.org/}{Racket} language both for
|
|
the implementation of the compiler and for the input language, so the
|
|
the implementation of the compiler and for the input language, so the
|
|
reader should be proficient with Racket or Scheme. There are many
|
|
reader should be proficient with Racket or Scheme. There are many
|
|
excellent resources for learning Scheme and
|
|
excellent resources for learning Scheme and
|
|
-Racket~\citep{Dybvig:1987aa,Abelson:1996uq,Friedman:1996aa,Felleisen:2001aa,Felleisen:2013aa,Flatt:2014aa}.
|
|
|
|
-}
|
|
|
|
-\python{
|
|
|
|
- This edition of the book uses \href{https://www.python.org/}{Python}
|
|
|
|
- both for the implementation of the compiler and for the input language, so the
|
|
|
|
|
|
+Racket~\citep{Dybvig:1987aa,Abelson:1996uq,Friedman:1996aa,Felleisen:2001aa,Felleisen:2013aa,Flatt:2014aa}.}
|
|
|
|
+%
|
|
|
|
+\python{This edition of the book uses \href{https://www.python.org/}{Python}
|
|
|
|
+both for the implementation of the compiler and for the input language, so the
|
|
reader should be proficient with Python. There are many
|
|
reader should be proficient with Python. There are many
|
|
-excellent resources for learning Python~\citep{Lutz:2013vp,Barry:2016vj,Sweigart:2019vn,Matthes:2019vs}.
|
|
|
|
-}
|
|
|
|
|
|
+excellent resources for learning Python~\citep{Lutz:2013vp,Barry:2016vj,Sweigart:2019vn,Matthes:2019vs}.}%
|
|
|
|
+%
|
|
The support code for this book is in the GitHub repository at
|
|
The support code for this book is in the GitHub repository at
|
|
the following location:
|
|
the following location:
|
|
\begin{center}\small\texttt
|
|
\begin{center}\small\texttt
|
|
@@ -472,21 +483,21 @@ understand the rationale for the compiler design. Ghuloum proposed the
|
|
incremental approach~\citep{Ghuloum:2006bh} on which this book is
|
|
incremental approach~\citep{Ghuloum:2006bh} on which this book is
|
|
based.
|
|
based.
|
|
|
|
|
|
-We thank the many students who served as teaching assistants for the
|
|
|
|
-compiler course at IU, including Carl Factora, Ryan Scott, Cameron
|
|
|
|
-Swords, and Chris Wailes. We thank Andre Kuhlenschmidt for work on the
|
|
|
|
|
|
+I thank the many students who served as teaching assistants for the
|
|
|
|
+compiler course at IU including Carl Factora, Ryan Scott, Cameron
|
|
|
|
+Swords, and Chris Wailes. I thank Andre Kuhlenschmidt for work on the
|
|
garbage collector and x86 interpreter, Michael Vollmer for work on
|
|
garbage collector and x86 interpreter, Michael Vollmer for work on
|
|
efficient tail calls, and Michael Vitousek for help with the first
|
|
efficient tail calls, and Michael Vitousek for help with the first
|
|
offering of the incremental compiler course at IU.
|
|
offering of the incremental compiler course at IU.
|
|
|
|
|
|
-We thank professors Bor-Yuh Chang, John Clements, Jay McCarthy, Joseph
|
|
|
|
|
|
+I thank professors Bor-Yuh Chang, John Clements, Jay McCarthy, Joseph
|
|
Near, Ryan Newton, Nate Nystrom, Peter Thiemann, Andrew Tolmach, and
|
|
Near, Ryan Newton, Nate Nystrom, Peter Thiemann, Andrew Tolmach, and
|
|
Michael Wollowski for teaching courses based on drafts of this book
|
|
Michael Wollowski for teaching courses based on drafts of this book
|
|
-and for their feedback. We thank the National Science Foundation for
|
|
|
|
|
|
+and for their feedback. I thank the National Science Foundation for
|
|
the grants that helped to support this work: Grant Numbers 1518844,
|
|
the grants that helped to support this work: Grant Numbers 1518844,
|
|
1763922, and 1814460.
|
|
1763922, and 1814460.
|
|
|
|
|
|
-We thank Ronald Garcia for helping Jeremy survive Dybvig's compiler
|
|
|
|
|
|
+I thank Ronald Garcia for helping me survive Dybvig's compiler
|
|
course in the early 2000s and especially for finding the bug that
|
|
course in the early 2000s and especially for finding the bug that
|
|
sent our garbage collector on a wild goose chase!
|
|
sent our garbage collector on a wild goose chase!
|
|
|
|
|
|
@@ -511,7 +522,7 @@ that efficiently supports the operations that the compiler needs to
|
|
perform.\index{subject}{concrete syntax}\index{subject}{abstract
|
|
perform.\index{subject}{concrete syntax}\index{subject}{abstract
|
|
syntax}\index{subject}{abstract syntax
|
|
syntax}\index{subject}{abstract syntax
|
|
tree}\index{subject}{AST}\index{subject}{program}\index{subject}{parse}
|
|
tree}\index{subject}{AST}\index{subject}{program}\index{subject}{parse}
|
|
-The process of translating from concrete syntax to abstract syntax is
|
|
|
|
|
|
+The process of translating concrete syntax to abstract syntax is
|
|
called \emph{parsing}\python{\ and is studied in
|
|
called \emph{parsing}\python{\ and is studied in
|
|
chapter~\ref{ch:parsing}}.
|
|
chapter~\ref{ch:parsing}}.
|
|
\racket{This book does not cover the theory and implementation of parsing.
|
|
\racket{This book does not cover the theory and implementation of parsing.
|
|
@@ -533,7 +544,7 @@ feature to represent ASTs (section~\ref{sec:ast}).}
|
|
%
|
|
%
|
|
\python{We use Python classes and objects to represent ASTs, especially the
|
|
\python{We use Python classes and objects to represent ASTs, especially the
|
|
classes defined in the standard \code{ast} module for the Python
|
|
classes defined in the standard \code{ast} module for the Python
|
|
- source language.}
|
|
|
|
|
|
+ source language.}%
|
|
%
|
|
%
|
|
We use grammars to define the abstract syntax of programming languages
|
|
We use grammars to define the abstract syntax of programming languages
|
|
(section~\ref{sec:grammar}) and pattern matching to inspect individual
|
|
(section~\ref{sec:grammar}) and pattern matching to inspect individual
|
|
@@ -585,6 +596,7 @@ input_int() + -8
|
|
\end{equation}
|
|
\end{equation}
|
|
\end{minipage}
|
|
\end{minipage}
|
|
\end{center}
|
|
\end{center}
|
|
|
|
+
|
|
We use the standard terminology for trees to describe ASTs: each
|
|
We use the standard terminology for trees to describe ASTs: each
|
|
rectangle above is called a \emph{node}. The arrows connect a node to its
|
|
rectangle above is called a \emph{node}. The arrows connect a node to its
|
|
\emph{children}, which are also nodes. The top-most node is the
|
|
\emph{children}, which are also nodes. The top-most node is the
|
|
@@ -1919,7 +1931,7 @@ result?
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
For the purposes of depicting which variable occurrences correspond to
|
|
For the purposes of depicting which variable occurrences correspond to
|
|
which definitions, the following shows the \code{x}'s annotated with
|
|
which definitions, the following shows the \code{x}'s annotated with
|
|
-subscripts to distinguish them. Double check that your answer for the
|
|
|
|
|
|
+subscripts to distinguish them. Double-check that your answer for the
|
|
previous program is the same as your answer for this annotated version
|
|
previous program is the same as your answer for this annotated version
|
|
of the program.
|
|
of the program.
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
@@ -2117,7 +2129,7 @@ print(-y)
|
|
\fi}
|
|
\fi}
|
|
\noindent We can invoke the \code{interp\_exp} method for \LangVar{}
|
|
\noindent We can invoke the \code{interp\_exp} method for \LangVar{}
|
|
\racket{on this expression,}
|
|
\racket{on this expression,}
|
|
-\python{on the \code{-y} expression,}
|
|
|
|
|
|
+\python{on the \code{-y} expression,}%
|
|
%
|
|
%
|
|
which we call \code{e0}, by creating an object of the \LangVar{} class
|
|
which we call \code{e0}, by creating an object of the \LangVar{} class
|
|
and calling the \code{interp\_exp} method
|
|
and calling the \code{interp\_exp} method
|
|
@@ -2464,7 +2476,7 @@ We discuss procedure calls in more detail further in this chapter and
|
|
in chapter~\ref{ch:Lfun}.
|
|
in chapter~\ref{ch:Lfun}.
|
|
%
|
|
%
|
|
The last letter \key{q} indicates that these instructions operate on
|
|
The last letter \key{q} indicates that these instructions operate on
|
|
-quadwords which are 64-bit values.
|
|
|
|
|
|
+quadwords, which are 64-bit values.
|
|
%
|
|
%
|
|
\racket{The instruction $\key{jmp}\,\itm{label}$ updates the program
|
|
\racket{The instruction $\key{jmp}\,\itm{label}$ updates the program
|
|
counter to the address of the instruction immediately after the
|
|
counter to the address of the instruction immediately after the
|
|
@@ -2795,7 +2807,7 @@ input/output language for each pass. The very first pass has
|
|
its output language. In between these two passes, we can choose
|
|
its output language. In between these two passes, we can choose
|
|
whichever language is most convenient for expressing the output of
|
|
whichever language is most convenient for expressing the output of
|
|
each pass, whether that be \LangVar{}, \LangXInt{}, or a new
|
|
each pass, whether that be \LangVar{}, \LangXInt{}, or a new
|
|
-\emph{intermediate languages} of our own design. Finally, to
|
|
|
|
|
|
+\emph{intermediate language} of our own design. Finally, to
|
|
implement each pass we write one recursive function per nonterminal in
|
|
implement each pass we write one recursive function per nonterminal in
|
|
the grammar of the input language of the pass.
|
|
the grammar of the input language of the pass.
|
|
\index{subject}{intermediate language}
|
|
\index{subject}{intermediate language}
|
|
@@ -2836,9 +2848,9 @@ Our compiler for \LangVar{} consists of the following passes:
|
|
%
|
|
%
|
|
Our treatment of \code{remove\_complex\_operands} and
|
|
Our treatment of \code{remove\_complex\_operands} and
|
|
\code{explicate\_control} as separate passes is an example of the
|
|
\code{explicate\_control} as separate passes is an example of the
|
|
-nanopass approach\footnote{For analogous decompositions of the
|
|
|
|
|
|
+nanopass approach.\footnote{For analogous decompositions of the
|
|
translation into continuation passing style, see the work of
|
|
translation into continuation passing style, see the work of
|
|
- \citet{Lawall:1993} and \citet{Hatcliff:1994ea}.}. The traditional
|
|
|
|
|
|
+ \citet{Lawall:1993} and \citet{Hatcliff:1994ea}.} The traditional
|
|
approach is to combine them into a single step~\citep{Aho:2006wb}.
|
|
approach is to combine them into a single step~\citep{Aho:2006wb}.
|
|
%
|
|
%
|
|
\fi}
|
|
\fi}
|
|
@@ -2902,7 +2914,7 @@ that uses a reserved register to fix outstanding problems.
|
|
|
|
|
|
\path[->,bend left=15] (Lvar) edge [above] node {\ttfamily\footnotesize uniquify} (Lvar-2);
|
|
\path[->,bend left=15] (Lvar) edge [above] node {\ttfamily\footnotesize uniquify} (Lvar-2);
|
|
\path[->,bend left=15] (Lvar-2) edge [above] node {\ttfamily\footnotesize remove\_complex\_operands} (Lvar-3);
|
|
\path[->,bend left=15] (Lvar-2) edge [above] node {\ttfamily\footnotesize remove\_complex\_operands} (Lvar-3);
|
|
-\path[->,bend left=15] (Lvar-3) edge [right] node {\ttfamily\footnotesize explicate\_control} (Cvar-2);
|
|
|
|
|
|
+\path[->,bend left=15] (Lvar-3) edge [right] node {\ttfamily\footnotesize\ \ explicate\_control} (Cvar-2);
|
|
\path[->,bend right=15] (Cvar-2) edge [right] node {\ttfamily\footnotesize select\_instructions} (x86-2);
|
|
\path[->,bend right=15] (Cvar-2) edge [right] node {\ttfamily\footnotesize select\_instructions} (x86-2);
|
|
\path[->,bend right=15] (x86-2) edge [below] node {\ttfamily\footnotesize assign\_homes} (x86-3);
|
|
\path[->,bend right=15] (x86-2) edge [below] node {\ttfamily\footnotesize assign\_homes} (x86-3);
|
|
\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch\_instructions} (x86-4);
|
|
\path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch\_instructions} (x86-4);
|
|
@@ -3061,10 +3073,11 @@ in the file \code{interp-Cvar.rkt}.
|
|
\section{Uniquify Variables}
|
|
\section{Uniquify Variables}
|
|
\label{sec:uniquify-Lvar}
|
|
\label{sec:uniquify-Lvar}
|
|
|
|
|
|
-The \code{uniquify} pass compiles \LangVar{} programs into \LangVar{}
|
|
|
|
-programs in which every \key{let} binds a unique variable name. For
|
|
|
|
-example, the \code{uniquify} pass should translate the program on the
|
|
|
|
-left into the program on the right.
|
|
|
|
|
|
+The \code{uniquify} pass replaces the variable bound by each \key{let}
|
|
|
|
+with a unique name. Both the input and output of the \code{uniquify}
|
|
|
|
+pass are the \LangVar{} language. For example, the \code{uniquify} pass
|
|
|
|
+should translate the program on the left into the program on the
|
|
|
|
+right.
|
|
\begin{transformation}
|
|
\begin{transformation}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
(let ([x 32])
|
|
(let ([x 32])
|
|
@@ -3279,7 +3292,7 @@ and variables are atomic.
|
|
|
|
|
|
The atomic expressions are pure (they do not cause or depend on side
|
|
The atomic expressions are pure (they do not cause or depend on side
|
|
effects) whereas complex expressions may have side effects, such as
|
|
effects) whereas complex expressions may have side effects, such as
|
|
-\READ{}. A language with this separation between pure expression
|
|
|
|
|
|
+\READ{}. A language with this separation between pure expressions
|
|
and expressions with side effects is said to be in monadic normal
|
|
and expressions with side effects is said to be in monadic normal
|
|
form~\citep{Moggi:1991in,Danvy:2003fk}, which explains the \textit{mon}
|
|
form~\citep{Moggi:1991in,Danvy:2003fk}, which explains the \textit{mon}
|
|
in the name \LangVarANF{}. An important invariant of the
|
|
in the name \LangVarANF{}. An important invariant of the
|
|
@@ -3294,8 +3307,11 @@ Another well-known form for intermediate languages is the
|
|
(ANF)~\citep{Danvy:1991fk,Flanagan:1993cg}.
|
|
(ANF)~\citep{Danvy:1991fk,Flanagan:1993cg}.
|
|
\index{subject}{administrative normal form} \index{subject}{ANF}
|
|
\index{subject}{administrative normal form} \index{subject}{ANF}
|
|
%
|
|
%
|
|
-The \LangVarANF{} language is not quite in ANF because we allow the
|
|
|
|
-right-hand side of a \code{let} to be a complex expression.
|
|
|
|
|
|
+The \LangVarANF{} language is not quite in ANF because it allows the
|
|
|
|
+right-hand side of a \code{let} to be a complex expression, such as
|
|
|
|
+another \code{let}. The flattening of nested \code{let} expressions is
|
|
|
|
+instead one of the responsibilities of the \code{explicate\_control}
|
|
|
|
+pass.
|
|
|
|
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
We recommend implementing this pass with two mutually recursive
|
|
We recommend implementing this pass with two mutually recursive
|
|
@@ -3368,7 +3384,7 @@ Take special care of programs, such as the following, that
|
|
\python{assign an atomic expression to a variable.}
|
|
\python{assign an atomic expression to a variable.}
|
|
%
|
|
%
|
|
You should leave such \racket{variable bindings}\python{assignments}
|
|
You should leave such \racket{variable bindings}\python{assignments}
|
|
-unchanged, as shown in the program on the right\\
|
|
|
|
|
|
+unchanged, as shown in the program on the right:\\
|
|
%
|
|
%
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
\begin{transformation}
|
|
\begin{transformation}
|
|
@@ -3581,7 +3597,7 @@ input and produces a \Tail{} in \LangCVar{} (see
|
|
figure~\ref{fig:c0-syntax}).
|
|
figure~\ref{fig:c0-syntax}).
|
|
%
|
|
%
|
|
The \code{explicate\_assign} function takes an \Exp{} in \LangVar{},
|
|
The \code{explicate\_assign} function takes an \Exp{} in \LangVar{},
|
|
-the variable to which it is to be assigned to, and a \Tail{} in
|
|
|
|
|
|
+the variable to which it is to be assigned, and a \Tail{} in
|
|
\LangCVar{} for the code that comes after the assignment. The
|
|
\LangCVar{} for the code that comes after the assignment. The
|
|
\code{explicate\_assign} function returns a $\Tail$ in \LangCVar{}.
|
|
\code{explicate\_assign} function returns a $\Tail$ in \LangCVar{}.
|
|
|
|
|
|
@@ -3957,7 +3973,7 @@ the rest of the program, as shown in figure~\ref{fig:p1-x86} and
|
|
discussed in section~\ref{sec:x86}.
|
|
discussed in section~\ref{sec:x86}.
|
|
|
|
|
|
When running on Mac OS X, your compiler should prefix an underscore to
|
|
When running on Mac OS X, your compiler should prefix an underscore to
|
|
-all labels, e.g., changing \key{main} to \key{\_main}.
|
|
|
|
|
|
+all labels (e.g., changing \key{main} to \key{\_main}).
|
|
%
|
|
%
|
|
\racket{The Racket call \code{(system-type 'os)} is useful for
|
|
\racket{The Racket call \code{(system-type 'os)} is useful for
|
|
determining which operating system the compiler is running on. It
|
|
determining which operating system the compiler is running on. It
|
|
@@ -5053,7 +5069,7 @@ callq print_int
|
|
|
|
|
|
The topic of section~\ref{sec:liveness-analysis-Lvar} is how to
|
|
The topic of section~\ref{sec:liveness-analysis-Lvar} is how to
|
|
compute where a variable is in use. Once we have that information, we
|
|
compute where a variable is in use. Once we have that information, we
|
|
-compute which variables are in use at the same time, i.e., which ones
|
|
|
|
|
|
+compute which variables are in use at the same time, that is, which ones
|
|
\emph{interfere}\index{subject}{interfere} with each other, and
|
|
\emph{interfere}\index{subject}{interfere} with each other, and
|
|
represent this relation as an undirected graph whose vertices are
|
|
represent this relation as an undirected graph whose vertices are
|
|
variables and edges indicate when two variables interfere
|
|
variables and edges indicate when two variables interfere
|
|
@@ -5087,7 +5103,7 @@ approach.
|
|
|
|
|
|
As we perform register allocation, we must be aware of the
|
|
As we perform register allocation, we must be aware of the
|
|
\emph{calling conventions} \index{subject}{calling conventions} that
|
|
\emph{calling conventions} \index{subject}{calling conventions} that
|
|
-govern how functions calls are performed in x86.
|
|
|
|
|
|
+govern how function calls are performed in x86.
|
|
%
|
|
%
|
|
Even though \LangVar{} does not include programmer-defined functions,
|
|
Even though \LangVar{} does not include programmer-defined functions,
|
|
our generated code includes a \code{main} function that is called by
|
|
our generated code includes a \code{main} function that is called by
|
|
@@ -5139,10 +5155,11 @@ function are passed in the following six registers, in this order.
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
rdi rsi rdx rcx r8 r9
|
|
rdi rsi rdx rcx r8 r9
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
-If there are more than six arguments, the convention is to use
|
|
|
|
-space on the frame of the caller for the rest of the
|
|
|
|
-arguments. However, in chapter~\ref{ch:Lfun} we arrange never to
|
|
|
|
-need more than six arguments.
|
|
|
|
|
|
+If there are more than six arguments, the convention is to use space
|
|
|
|
+on the frame of the caller for the rest of the arguments. In
|
|
|
|
+chapter~\ref{ch:Lfun} we instead pass a tuple containing the sixth
|
|
|
|
+argument and the rest of the arguments, which simplifies the treatment
|
|
|
|
+of efficient tail calls.
|
|
%
|
|
%
|
|
\racket{For now, the only function we care about is \code{read\_int},
|
|
\racket{For now, the only function we care about is \code{read\_int},
|
|
which takes zero arguments.}
|
|
which takes zero arguments.}
|
|
@@ -5411,7 +5428,7 @@ $\emptyset$ because it is the last instruction
|
|
(formula~\eqref{eq:live-last-empty}). The $L_{\mathsf{before}}$ for
|
|
(formula~\eqref{eq:live-last-empty}). The $L_{\mathsf{before}}$ for
|
|
this instruction is $\{\ttm{b},\ttm{c}\}$ because it reads from
|
|
this instruction is $\{\ttm{b},\ttm{c}\}$ because it reads from
|
|
variables \code{b} and \code{c}
|
|
variables \code{b} and \code{c}
|
|
-(formula~\eqref{eq:live-before-after-minus-writes-plus-reads})
|
|
|
|
|
|
+(formula~\eqref{eq:live-before-after-minus-writes-plus-reads}):
|
|
\[
|
|
\[
|
|
L_{\mathsf{before}}(5) = (\emptyset - \{\ttm{c}\}) \cup \{ \ttm{b}, \ttm{c} \} = \{ \ttm{b}, \ttm{c} \}
|
|
L_{\mathsf{before}}(5) = (\emptyset - \{\ttm{c}\}) \cup \{ \ttm{b}, \ttm{c} \} = \{ \ttm{b}, \ttm{c} \}
|
|
\]
|
|
\]
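As a further illustration of the same formula, consider a hypothetical
instruction that is not part of the running example,
\lstinline{movq a, b}, and assume that its live-after set is
$\{\ttm{b},\ttm{c}\}$. The instruction reads \code{a} and writes
\code{b}, so a sketch of the computation is
\[
  L_{\mathsf{before}} = (\{\ttm{b},\ttm{c}\} - \{\ttm{b}\}) \cup \{\ttm{a}\} = \{\ttm{a},\ttm{c}\}
\]
The variable \code{b} drops out of the set because the instruction
overwrites it, whereas \code{c} stays live because nothing writes to it.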
|
|
@@ -5682,7 +5699,7 @@ instructions. \racket{The first instruction is \lstinline{movq $1, v},
|
|
\racket{The next instruction is \lstinline{movq x, y}, and the
|
|
\racket{The next instruction is \lstinline{movq x, y}, and the
|
|
live-after set is $\{\ttm{w},\ttm{x},\ttm{y},\ttm{rsp}\}$. Rule 1
|
|
live-after set is $\{\ttm{w},\ttm{x},\ttm{y},\ttm{rsp}\}$. Rule 1
|
|
applies, so \ttm{y} interferes with \ttm{w} and \ttm{rsp} but not
|
|
applies, so \ttm{y} interferes with \ttm{w} and \ttm{rsp} but not
|
|
- \ttm{x} because \ttm{x} is the source of the move and therefore
|
|
|
|
|
|
+ \ttm{x}, because \ttm{x} is the source of the move and therefore
|
|
\ttm{x} and \ttm{y} hold the same value.}
|
|
\ttm{x} and \ttm{y} hold the same value.}
|
|
%
|
|
%
|
|
\python{The next instruction is \lstinline{movq x, y}, and the
|
|
\python{The next instruction is \lstinline{movq x, y}, and the
|
|
@@ -5803,31 +5820,6 @@ registers in the interference graph.
|
|
\label{fig:interfere}
|
|
\label{fig:interfere}
|
|
\end{figure}
|
|
\end{figure}
|
|
|
|
|
|
-%% Our next concern is to choose a data structure for representing the
|
|
|
|
-%% interference graph. There are many choices for how to represent a
|
|
|
|
-%% graph, for example, \emph{adjacency matrix}, \emph{adjacency list},
|
|
|
|
-%% and \emph{edge set}~\citep{Cormen:2001uq}. The right way to choose a
|
|
|
|
-%% data structure is to study the algorithm that uses the data structure,
|
|
|
|
-%% determine what operations need to be performed, and then choose the
|
|
|
|
-%% data structure that provide the most efficient implementations of
|
|
|
|
-%% those operations. Often times the choice of data structure can have an
|
|
|
|
-%% effect on the time complexity of the algorithm, as it does here. If
|
|
|
|
-%% you skim the next section, you will see that the register allocation
|
|
|
|
-%% algorithm needs to ask the graph for all its vertices and, given a
|
|
|
|
-%% vertex, it needs to known all the adjacent vertices. Thus, the
|
|
|
|
-%% correct choice of graph representation is that of an adjacency
|
|
|
|
-%% list. There are helper functions in \code{utilities.rkt} for
|
|
|
|
-%% representing graphs using the adjacency list representation:
|
|
|
|
-%% \code{make-graph}, \code{add-edge}, and \code{adjacent}
|
|
|
|
-%% (Appendix~\ref{appendix:utilities}).
|
|
|
|
-%% %
|
|
|
|
-%% \margincomment{\footnotesize To do: change to use the
|
|
|
|
-%% Racket graph library. \\ --Jeremy}
|
|
|
|
-%% %
|
|
|
|
-%% In particular, those functions use a hash table to map each vertex to
|
|
|
|
-%% the set of adjacent vertices, and the sets are represented using
|
|
|
|
-%% Racket's \key{set}, which is also a hash table.
|
|
|
|
-
|
|
|
|
\begin{exercise}\normalfont\normalsize
|
|
\begin{exercise}\normalfont\normalsize
|
|
\racket{Implement the compiler pass named \code{build\_interference} according
|
|
\racket{Implement the compiler pass named \code{build\_interference} according
|
|
to the algorithm suggested here. We recommend using the Racket
|
|
to the algorithm suggested here. We recommend using the Racket
|
|
@@ -6423,7 +6415,7 @@ queue when their saturation changes.}
|
|
\small
|
|
\small
|
|
\begin{tcolorbox}[title=Priority Queue]
|
|
\begin{tcolorbox}[title=Priority Queue]
|
|
A \emph{priority queue} is a collection of items in which the
|
|
A \emph{priority queue} is a collection of items in which the
|
|
- removal of items is governed by priority. In a min queue,
|
|
|
|
|
|
+ removal of items is governed by priority. In a \emph{min} queue,
|
|
lower priority items are removed first. An implementation is in
|
|
lower priority items are removed first. An implementation is in
|
|
\code{priority\_queue.rkt} of the support code. \index{subject}{priority
|
|
\code{priority\_queue.rkt} of the support code. \index{subject}{priority
|
|
queue} \index{subject}{minimum priority queue}
|
|
queue} \index{subject}{minimum priority queue}
|
|
@@ -6490,7 +6482,7 @@ example shown next, on the left, yields the program on the right.
|
|
% why frame size of 32? -JGS
|
|
% why frame size of 32? -JGS
|
|
\begin{center}
|
|
\begin{center}
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
-\begin{minipage}{0.3\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, v
|
|
movq $1, v
|
|
movq $42, w
|
|
movq $42, w
|
|
@@ -6525,7 +6517,7 @@ jmp conclusion
|
|
\end{minipage}
|
|
\end{minipage}
|
|
\fi}
|
|
\fi}
|
|
{\if\edition\pythonEd\pythonColor
|
|
{\if\edition\pythonEd\pythonColor
|
|
-\begin{minipage}{0.3\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, v
|
|
movq $1, v
|
|
movq $42, w
|
|
movq $42, w
|
|
@@ -6604,7 +6596,7 @@ The following is the output of \code{patch\_instructions} on the
|
|
running example.
|
|
running example.
|
|
\begin{center}
|
|
\begin{center}
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
-\begin{minipage}{0.4\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, -8(%rbp)
|
|
movq $1, -8(%rbp)
|
|
movq $42, %rcx
|
|
movq $42, %rcx
|
|
@@ -6638,7 +6630,7 @@ jmp conclusion
|
|
\end{minipage}
|
|
\end{minipage}
|
|
\fi}
|
|
\fi}
|
|
{\if\edition\pythonEd\pythonColor
|
|
{\if\edition\pythonEd\pythonColor
|
|
-\begin{minipage}{0.4\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, -8(%rbp)
|
|
movq $1, -8(%rbp)
|
|
movq $42, %rcx
|
|
movq $42, %rcx
|
|
@@ -6747,7 +6739,7 @@ and \code{pushq} subtracts $8$ from the \code{rsp}.
|
|
|
|
|
|
\path[->,bend left=15] (Lvar) edge [above] node {\ttfamily\footnotesize uniquify} (Lvar-2);
|
|
\path[->,bend left=15] (Lvar) edge [above] node {\ttfamily\footnotesize uniquify} (Lvar-2);
|
|
\path[->,bend left=15] (Lvar-2) edge [above] node {\ttfamily\footnotesize remove\_complex\_operands} (Lvar-3);
|
|
\path[->,bend left=15] (Lvar-2) edge [above] node {\ttfamily\footnotesize remove\_complex\_operands} (Lvar-3);
|
|
-\path[->,bend left=15] (Lvar-3) edge [right] node {\ttfamily\footnotesize explicate\_control} (Cvar-1);
|
|
|
|
|
|
+\path[->,bend left=15] (Lvar-3) edge [right] node {\ttfamily\footnotesize \ \ explicate\_control} (Cvar-1);
|
|
\path[->,bend right=15] (Cvar-1) edge [right] node {\ttfamily\footnotesize select\_instructions} (x86-2);
|
|
\path[->,bend right=15] (Cvar-1) edge [right] node {\ttfamily\footnotesize select\_instructions} (x86-2);
|
|
\path[->,bend left=15] (x86-2) edge [right] node {\ttfamily\footnotesize uncover\_live} (x86-2-1);
|
|
\path[->,bend left=15] (x86-2) edge [right] node {\ttfamily\footnotesize uncover\_live} (x86-2-1);
|
|
\path[->,bend right=15] (x86-2-1) edge [below] node {\ttfamily\footnotesize build\_interference} (x86-2-2);
|
|
\path[->,bend right=15] (x86-2-1) edge [below] node {\ttfamily\footnotesize build\_interference} (x86-2-2);
|
|
@@ -6903,7 +6895,7 @@ Using the same assignment of variables to color numbers that was
|
|
produced by the register allocator described in the last section, we
|
|
produced by the register allocator described in the last section, we
|
|
get the following program.
|
|
get the following program.
|
|
\begin{center}
|
|
\begin{center}
|
|
-\begin{minipage}{0.3\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, v
|
|
movq $1, v
|
|
movq $42, w
|
|
movq $42, w
|
|
@@ -6956,16 +6948,21 @@ to allocate \code{y} and \code{tmp\_0} to the same register. \fi}
|
|
We say that two variables $p$ and $q$ are \emph{move
|
|
We say that two variables $p$ and $q$ are \emph{move
|
|
related}\index{subject}{move related} if they participate together in
|
|
related}\index{subject}{move related} if they participate together in
|
|
a \key{movq} instruction, that is, \key{movq} $p$\key{,} $q$ or
|
|
a \key{movq} instruction, that is, \key{movq} $p$\key{,} $q$ or
|
|
-\key{movq} $q$\key{,} $p$. In deciding which variable to color next,
|
|
|
|
-if there are multiple variables with the same saturation, prefer
|
|
|
|
-variables that can be assigned to a color that is the same as the
|
|
|
|
-color of a move-related variable. Furthermore, when the register
|
|
|
|
-allocator chooses a color for a variable, it should prefer a color
|
|
|
|
-that has already been used for a move-related variable (assuming that
|
|
|
|
-they do not interfere). Of course, this preference should not override
|
|
|
|
-the preference for registers over stack locations. So, this preference
|
|
|
|
-should be used as a tie breaker in choosing between registers and
|
|
|
|
-in choosing between stack locations.
|
|
|
|
|
|
+\key{movq} $q$\key{,} $p$.
|
|
|
|
+%
|
|
|
|
+Recall that we color variables that are more saturated before coloring
|
|
|
|
+variables that are less saturated, and in the case of equally
|
|
|
|
+saturated variables, we choose randomly. Now we break such ties by
|
|
|
|
+giving preference to variables that have an available color that is
|
|
|
|
+the same as the color of a move-related variable.
|
|
|
|
+%
|
|
|
|
+Furthermore, when the register allocator chooses a color for a
|
|
|
|
+variable, it should prefer a color that has already been used for a
|
|
|
|
+move-related variable if one exists (and assuming that they do not
|
|
|
|
+interfere). This preference should not override the preference for
|
|
|
|
+registers over stack locations. So, this preference should be used as
|
|
|
|
+a tie breaker in choosing between two registers or in choosing between
|
|
|
|
+two stack locations.
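The tie-breaking rule for choosing a color can be sketched as follows. This is a minimal illustration, not the support code: the function name \code{choose\_color}, the dictionary-based graphs, and the convention that colors below \code{num\_registers} denote registers while larger colors denote stack locations are all assumptions made for the example.

```python
def choose_color(var, interference, move_graph, color, num_registers):
    """Pick a color for `var`, using move relations only as a tie breaker.

    Hypothetical helper for illustration. Assumes colors 0..num_registers-1
    stand for registers and larger colors stand for stack locations.
    `interference` and `move_graph` map a variable to its set of neighbors;
    `color` maps already-colored variables to their colors.
    """
    # Colors taken by interfering neighbors can never be used.
    forbidden = {color[u] for u in interference.get(var, ()) if u in color}

    # Colors of move-related variables that are still legal for `var`.
    preferred = sorted({color[u] for u in move_graph.get(var, ())
                        if u in color and color[u] not in forbidden})

    # Without move biasing we would simply take the lowest legal color.
    lowest = 0
    while lowest in forbidden:
        lowest += 1

    # The move preference must not override registers-over-stack, so a
    # preferred color is accepted only if it is in the same class
    # (register or stack) as the lowest legal color.
    lowest_is_register = lowest < num_registers
    for c in preferred:
        if (c < num_registers) == lowest_is_register:
            return c
    return lowest
```

For instance, if \code{t} interferes with a variable colored 0 and is move related to a variable colored 2, and there are three registers, the sketch picks color 2 rather than the lowest legal color 1, so the corresponding \key{movq} becomes a candidate for removal.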
|
|
|
|
|
|
We recommend representing the move relationships in a graph, similarly
|
|
We recommend representing the move relationships in a graph, similarly
|
|
to how we represented interference. The following is the \emph{move
|
|
to how we represented interference. The following is the \emph{move
|
|
@@ -7256,7 +7253,8 @@ on the left, to obtain the code in the middle. The
|
|
the code on the right.
|
|
the code on the right.
|
|
|
|
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
-\begin{minipage}{0.25\textwidth}
|
|
|
|
|
|
+\begin{center}
|
|
|
|
+\begin{minipage}{0.2\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, v
|
|
movq $1, v
|
|
movq $42, w
|
|
movq $42, w
|
|
@@ -7290,7 +7288,7 @@ jmp conclusion
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
\end{minipage}
|
|
\end{minipage}
|
|
$\Rightarrow\qquad$
|
|
$\Rightarrow\qquad$
|
|
-\begin{minipage}{0.25\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.23\textwidth}
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
movq $1, %rcx
|
|
movq $1, %rcx
|
|
movq $42, %rsi
|
|
movq $42, %rsi
|
|
@@ -7303,9 +7301,11 @@ addq %rcx, %rax
|
|
jmp conclusion
|
|
jmp conclusion
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
\end{minipage}
|
|
\end{minipage}
|
|
|
|
+\end{center}
|
|
\fi}
|
|
\fi}
|
|
|
|
|
|
{\if\edition\pythonEd\pythonColor
|
|
{\if\edition\pythonEd\pythonColor
|
|
|
|
+\begin{center}
|
|
\begin{minipage}{0.20\textwidth}
|
|
\begin{minipage}{0.20\textwidth}
|
|
\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
|
|
\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
|
|
movq $1, v
|
|
movq $1, v
|
|
@@ -7324,7 +7324,7 @@ callq _print_int
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
\end{minipage}
|
|
\end{minipage}
|
|
${\Rightarrow\qquad}$
|
|
${\Rightarrow\qquad}$
|
|
-\begin{minipage}{0.30\textwidth}
|
|
|
|
|
|
+\begin{minipage}{0.35\textwidth}
|
|
\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
|
|
\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
|
|
movq $1, %rcx
|
|
movq $1, %rcx
|
|
movq $42, -16(%rbp)
|
|
movq $42, -16(%rbp)
|
|
@@ -7356,6 +7356,7 @@ movq -8(%rbp), %rdi
|
|
callq print_int
|
|
callq print_int
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
\end{minipage}
|
|
\end{minipage}
|
|
|
|
+\end{center}
|
|
\fi}
|
|
\fi}
|
|
|
|
|
|
\begin{exercise}\normalfont\normalsize
|
|
\begin{exercise}\normalfont\normalsize
|
|
@@ -7412,7 +7413,7 @@ called \emph{coalescing}. Although coalescing decreases the number of
|
|
moves, it can make the graph more difficult to
|
|
moves, it can make the graph more difficult to
|
|
color. \citet{Briggs:1994kx} proposed \emph{conservative coalescing} in
|
|
color. \citet{Briggs:1994kx} proposed \emph{conservative coalescing} in
|
|
which two variables are merged only if they have fewer than $k$
|
|
which two variables are merged only if they have fewer than $k$
|
|
-neighbors of high degree. \citet{George:1996aa} observed that
|
|
|
|
|
|
+neighbors of high degree. \citet{George:1996aa} observes that
|
|
conservative coalescing is sometimes too conservative and made it more
|
|
conservative coalescing is sometimes too conservative and made it more
|
|
aggressive by iterating the coalescing with the removal of low-degree
|
|
aggressive by iterating the coalescing with the removal of low-degree
|
|
vertices.
|
|
vertices.
|
|
@@ -7426,7 +7427,7 @@ The algorithm of \citet{Chaitin:1981vl} and its successors iteratively
|
|
performs coalescing, graph coloring, and spill code insertion until
|
|
performs coalescing, graph coloring, and spill code insertion until
|
|
all variables have been assigned a location.
|
|
all variables have been assigned a location.
|
|
|
|
|
|
-\citet{Briggs:1994kx} observed that \citet{Chaitin:1982vn} sometimes
|
|
|
|
|
|
+\citet{Briggs:1994kx} observes that \citet{Chaitin:1982vn} sometimes
|
|
spilled variables that don't have to be: a high-degree variable can be
|
|
spilled variables that don't have to be: a high-degree variable can be
|
|
colorable if many of its neighbors are assigned the same color.
|
|
colorable if many of its neighbors are assigned the same color.
|
|
\citet{Briggs:1994kx} proposed \emph{optimistic coloring}, in which a
|
|
\citet{Briggs:1994kx} proposed \emph{optimistic coloring}, in which a
|
|
@@ -7483,7 +7484,7 @@ range~\citep{Chow:1984ys,Briggs:1994kx,Cooper:1998ly}.
|
|
|
|
|
|
%Register Allocation via Usage Counts, Freiburghouse CACM
|
|
%Register Allocation via Usage Counts, Freiburghouse CACM
|
|
|
|
|
|
-\citet{Palsberg:2007si} observed that many of the interference graphs
|
|
|
|
|
|
+\citet{Palsberg:2007si} observes that many of the interference graphs
|
|
that arise from Java programs in the JoeQ compiler are \emph{chordal};
|
|
that arise from Java programs in the JoeQ compiler are \emph{chordal};
|
|
that is, every cycle with four or more edges has an edge that is not
|
|
that is, every cycle with four or more edges has an edge that is not
|
|
part of the cycle but that connects two vertices on the cycle. Such
|
|
part of the cycle but that connects two vertices on the cycle. Such
|
|
@@ -7514,10 +7515,10 @@ operations that involve Booleans (\key{and}, \key{not},
|
|
expression \python{and statement}. With the addition of \key{if},
|
|
expression \python{and statement}. With the addition of \key{if},
|
|
programs can have nontrivial control flow, which
|
|
programs can have nontrivial control flow, which
|
|
%
|
|
%
|
|
-\racket{impacts \code{explicate\_control} and liveness analysis}
|
|
|
|
|
|
+\racket{impacts \code{explicate\_control} and liveness analysis.}
|
|
%
|
|
%
|
|
\python{impacts liveness analysis and motivates a new pass named
|
|
\python{impacts liveness analysis and motivates a new pass named
|
|
- \code{explicate\_control}}.
|
|
|
|
|
|
+ \code{explicate\_control}.}%
|
|
%
|
|
%
|
|
Also, because we now have two kinds of values, we need to handle
|
|
Also, because we now have two kinds of values, we need to handle
|
|
programs that apply an operation to the wrong kind of value, such as
|
|
programs that apply an operation to the wrong kind of value, such as
|
|
@@ -7592,9 +7593,8 @@ Definitions of the concrete syntax and abstract syntax of the
|
|
\LangIf{} language are shown in figures~\ref{fig:Lif-concrete-syntax}
|
|
\LangIf{} language are shown in figures~\ref{fig:Lif-concrete-syntax}
|
|
and~\ref{fig:Lif-syntax}, respectively. The \LangIf{} language
|
|
and~\ref{fig:Lif-syntax}, respectively. The \LangIf{} language
|
|
includes all of \LangVar{} {(shown in gray)}, the Boolean literals
|
|
includes all of \LangVar{} {(shown in gray)}, the Boolean literals
|
|
-\TRUE{} and \FALSE{}, \racket{and} the \code{if} expression
|
|
|
|
-%
|
|
|
|
-\python{, and the \code{if} statement}. We expand the set of
|
|
|
|
|
|
+\TRUE{} and \FALSE{}, \racket{and} the \code{if} expression%
|
|
|
|
+\python{, and the \code{if} statement}. We expand the set of
|
|
operators to include
|
|
operators to include
|
|
\begin{enumerate}
|
|
\begin{enumerate}
|
|
\item the logical operators \key{and}, \key{or}, and \key{not},
|
|
\item the logical operators \key{and}, \key{or}, and \key{not},
|
|
@@ -7607,7 +7607,7 @@ operators to include
|
|
\racket{We reorganize the abstract syntax for the primitive
|
|
\racket{We reorganize the abstract syntax for the primitive
|
|
operations given in figure~\ref{fig:Lif-syntax}, using only one grammar
|
|
operations given in figure~\ref{fig:Lif-syntax}, using only one grammar
|
|
rule for all of them. This means that the grammar no longer checks
|
|
rule for all of them. This means that the grammar no longer checks
|
|
- whether the arity of an operators matches the number of
|
|
|
|
|
|
+ whether the arity of an operator matches the number of
|
|
arguments. That responsibility is moved to the type checker for
|
|
arguments. That responsibility is moved to the type checker for
|
|
\LangIf{} (section~\ref{sec:type-check-Lif}).}
|
|
\LangIf{} (section~\ref{sec:type-check-Lif}).}
|
|
|
|
|
|
@@ -8624,7 +8624,7 @@ depending on whether the contents of the EFLAGS register matches the
|
|
condition code \itm{cc}: \key{e} for equal, \key{l} for less, \key{le}
|
|
condition code \itm{cc}: \key{e} for equal, \key{l} for less, \key{le}
|
|
for less-or-equal, \key{g} for greater, \key{ge} for greater-or-equal.
|
|
for less-or-equal, \key{g} for greater, \key{ge} for greater-or-equal.
|
|
The \key{set} instruction has a quirk in that its destination argument
|
|
The \key{set} instruction has a quirk in that its destination argument
|
|
-must be single-byte register, such as \code{al} (\code{l} for lower bits) or
|
|
|
|
|
|
+must be a single-byte register, such as \code{al} (\code{l} for lower bits) or
|
|
\code{ah} (\code{h} for higher bits), which are part of the \code{rax}
|
|
\code{ah} (\code{h} for higher bits), which are part of the \code{rax}
|
|
register. Thankfully, the \key{movzbq} instruction can be used to
|
|
register. Thankfully, the \key{movzbq} instruction can be used to
|
|
move from a single-byte register to a normal 64-bit register. The
|
|
move from a single-byte register to a normal 64-bit register. The
|
|
@@ -8747,7 +8747,7 @@ subexpressions, recall that you should invoke \code{rco\_atom} when
|
|
the output needs to be an \Atm{} (as specified in the grammar for
|
|
the output needs to be an \Atm{} (as specified in the grammar for
|
|
\LangIfANF{}) and invoke \code{rco\_exp} when the output should be
|
|
\LangIfANF{}) and invoke \code{rco\_exp} when the output should be
|
|
\Exp{}. Regarding \code{if}, it is particularly important
|
|
\Exp{}. Regarding \code{if}, it is particularly important
|
|
-\textbf{not} to replace its condition with a temporary variable, because
|
|
|
|
|
|
+\emph{not} to replace its condition with a temporary variable, because
|
|
that would interfere with the generation of high-quality output in the
|
|
that would interfere with the generation of high-quality output in the
|
|
upcoming \code{explicate\_control} pass.
|
|
upcoming \code{explicate\_control} pass.
|
|
|
|
|
|
@@ -8888,7 +8888,7 @@ example would be as follows:
|
|
\end{minipage}
|
|
\end{minipage}
|
|
\end{center}
|
|
\end{center}
|
|
Notice that the three instructions starting with \code{setl} are
|
|
Notice that the three instructions starting with \code{setl} are
|
|
-redundant: the conditional jump could come immediately after the first
|
|
|
|
|
|
+redundant; the conditional jump could come immediately after the first
|
|
\code{cmpq}.
|
|
\code{cmpq}.
|
|
|
|
|
|
Our goal is to compile \key{if} expressions so that the relevant
|
|
Our goal is to compile \key{if} expressions so that the relevant
|
|
@@ -9307,9 +9307,18 @@ then there is no need to generate a new label and entry in
|
|
\fi}
|
|
\fi}
|
|
|
|
|
|
{\if\edition\racketEd
|
|
{\if\edition\racketEd
|
|
-
|
|
|
|
-\subsection{Explicate Predicate}
|
|
|
|
|
|
|
|
|
|
+\subsection{Explicate Predicate}
|
|
|
|
+
|
|
|
|
+The skeleton for the \code{explicate\_pred} function is given in
|
|
|
|
+figure~\ref{fig:explicate-pred}. It takes three parameters: (1)
|
|
|
|
+\code{cnd}, the condition expression of the \code{if}; (2) \code{thn},
|
|
|
|
+the code generated by explicate for the \emph{then} branch; and (3)
|
|
|
|
+\code{els}, the code generated by explicate for the \emph{else}
|
|
|
|
+branch. The \code{explicate\_pred} function should match on
|
|
|
|
+\code{cnd} with a case for every kind of expression that can have type
|
|
|
|
+\BOOLTY{}.
|
|
|
|
+
|
|
\begin{figure}[tbp]
|
|
\begin{figure}[tbp]
|
|
\begin{tcolorbox}[colback=white]
|
|
\begin{tcolorbox}[colback=white]
|
|
\begin{lstlisting}
|
|
\begin{lstlisting}
|
|
@@ -9330,23 +9339,19 @@ then there is no need to generate a new label and entry in
|
|
\caption{Skeleton for the \key{explicate\_pred} auxiliary function.}
|
|
\caption{Skeleton for the \key{explicate\_pred} auxiliary function.}
|
|
\label{fig:explicate-pred}
|
|
\label{fig:explicate-pred}
|
|
\end{figure}
|
|
\end{figure}
|
|
-\fi}
|
|
|
|
|
|
|
|
-\racket{The skeleton for the \code{explicate\_pred} function is given
|
|
|
|
- in figure~\ref{fig:explicate-pred}. It takes three parameters:
|
|
|
|
- (1) \code{cnd}, the condition expression of the \code{if};
|
|
|
|
- (2) \code{thn}, the code generated by explicate for the \emph{then} branch;
|
|
|
|
- and (3) \code{els}, the code generated by
|
|
|
|
- explicate for the \emph{else} branch. The \code{explicate\_pred}
|
|
|
|
- function should match on \code{cnd} with a case for
|
|
|
|
- every kind of expression that can have type \BOOLTY{}.}
|
|
|
|
|
|
+\fi}
|
|
%
|
|
%
|
|
-\python{The \code{explicate\_pred} function has four parameters: 1)
|
|
|
|
- the condition expression, 2) the generated statements for the
|
|
|
|
- ``then'' branch, 3) the generated statements for the ``else''
|
|
|
|
- branch, and 4) the dictionary of basic blocks. The
|
|
|
|
- \code{explicate\_pred} function returns a list of \LangCIf{}
|
|
|
|
- statements and it may add to the dictionary of basic blocks.}
|
|
|
|
|
|
+{\if\edition\pythonEd\pythonColor
|
|
|
|
+
|
|
|
|
+The \code{explicate\_pred} function has four parameters: 1) the
|
|
|
|
+condition expression, 2) the generated statements for the ``then''
|
|
|
|
+branch, 3) the generated statements for the ``else'' branch, and 4)
|
|
|
|
+the dictionary of basic blocks. The \code{explicate\_pred} function
|
|
|
|
+returns a list of \LangCIf{} statements, and it may add to the
|
|
|
|
+dictionary of basic blocks.
|
|
|
|
+
|
|
|
|
+\fi}
|
|
|
|
|
|
Consider the case for comparison operators. We translate the
|
|
Consider the case for comparison operators. We translate the
|
|
comparison to an \code{if} statement whose branches are \code{goto}
|
|
comparison to an \code{if} statement whose branches are \code{goto}
|
|
@@ -13025,7 +13030,7 @@ pass, which is \LangAlloc{} in monadic normal form.
|
|
\end{figure}
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
-\section{Explicate Control and the \LangCVec{} language}
|
|
|
|
|
|
+\section{Explicate Control and the \LangCVec{} Language}
|
|
\label{sec:explicate-control-r3}
|
|
\label{sec:explicate-control-r3}
|
|
|
|
|
|
|
|
|
|
@@ -15271,7 +15276,7 @@ FunctionDef('main', [], int, None, |$\Stmt\ldots$|Return(Constant(0)), None)
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
\fi}
|
|
\fi}
|
|
|
|
|
|
-\section{Reveal Functions and the \LangFunRef{} language}
|
|
|
|
|
|
+\section{Reveal Functions and the \LangFunRef{} Language}
|
|
\label{sec:reveal-functions-r4}
|
|
\label{sec:reveal-functions-r4}
|
|
|
|
|
|
The syntax of \LangFun{} is inconvenient for purposes of compilation
|
|
The syntax of \LangFun{} is inconvenient for purposes of compilation
|
|
@@ -15532,7 +15537,7 @@ and augments programs to include a list of function definitions.
|
|
%% \end{figure}
|
|
%% \end{figure}
|
|
|
|
|
|
|
|
|
|
-\section{Explicate Control and the \LangCFun{} language}
|
|
|
|
|
|
+\section{Explicate Control and the \LangCFun{} Language}
|
|
\label{sec:explicate-control-r4}
|
|
\label{sec:explicate-control-r4}
|
|
|
|
|
|
Figure~\ref{fig:c3-syntax} defines the abstract syntax for \LangCFun{}, the
|
|
Figure~\ref{fig:c3-syntax} defines the abstract syntax for \LangCFun{}, the
|
|
@@ -23504,8 +23509,8 @@ registers.
|
|
|
|
|
|
%% \addtocontents{toc}{\vspace{11pt}}
|
|
%% \addtocontents{toc}{\vspace{11pt}}
|
|
|
|
|
|
-
|
|
|
|
%% \nocite{*} is a way to get all the entries in the .bib file to print in the bibliography:
|
|
%% \nocite{*} is a way to get all the entries in the .bib file to print in the bibliography:
|
|
|
|
+\cleardoublepage % needed for right page number in TOC for References
|
|
\nocite{*}\let\bibname\refname
|
|
\nocite{*}\let\bibname\refname
|
|
\addcontentsline{toc}{fmbm}{\refname}
|
|
\addcontentsline{toc}{fmbm}{\refname}
|
|
\printbibliography
|
|
\printbibliography
|