|
@@ -521,9 +521,9 @@ write down and talk about programs. Inside the compiler, we use
|
|
|
that efficiently supports the operations that the compiler needs to
|
|
|
perform.\index{subject}{concrete syntax}\index{subject}{abstract
|
|
|
syntax}\index{subject}{abstract syntax
|
|
|
- tree}\index{subject}{AST}\index{subject}{program}\index{subject}{parse}
|
|
|
+ tree}\index{subject}{AST}\index{subject}{program}
|
|
|
The process of translating concrete syntax to abstract syntax is
|
|
|
-called \emph{parsing}\python{\ and is studied in
|
|
|
+called \emph{parsing}\index{subject}{parsing}\python{\ and is studied in
|
|
|
chapter~\ref{ch:parsing}}.
|
|
|
\racket{This book does not cover the theory and implementation of parsing.
|
|
|
We refer the readers interested in parsing to the thorough treatment
|
|
@@ -662,6 +662,7 @@ node of which it is the child). If a node has no children, it is a
|
|
|
{\if\edition\racketEd
|
|
|
We define a Racket \code{struct} for each kind of node. For this
|
|
|
chapter we require just two kinds of nodes: one for integer constants
|
|
|
+(aka literals\index{subject}{literals})
|
|
|
and one for primitive operations. The following is the \code{struct}
|
|
|
definition for integer constants.\footnote{All the AST structures are
|
|
|
defined in the file \code{utilities.rkt} in the support code.}
|
|
@@ -718,7 +719,8 @@ single structure.
|
|
|
{\if\edition\pythonEd\pythonColor
|
|
|
We use a Python \code{class} for each kind of node.
|
|
|
The following is the class definition for
|
|
|
-constants from the Python \code{ast} module.
|
|
|
+constants (aka literals\index{subject}{literals})
|
|
|
+from the Python \code{ast} module.
|
|
|
\begin{lstlisting}
|
|
|
class Constant:
|
|
|
def __init__(self, value):
|
|
@@ -797,7 +799,6 @@ syntax trees.
|
|
|
\section{Grammars}
|
|
|
\label{sec:grammar}
|
|
|
\index{subject}{integer}
|
|
|
-\index{subject}{literal}
|
|
|
%\index{subject}{constant}
|
|
|
|
|
|
A programming language can be thought of as a \emph{set} of programs.
|
|
@@ -1130,8 +1131,8 @@ three pattern variables \texttt{child1}, \texttt{op}, and
|
|
|
\code{case} consists of a \emph{pattern} and a
|
|
|
\emph{body}.\index{subject}{pattern} Patterns are recursively defined
|
|
|
to be either a pattern variable, a class name followed by a pattern
|
|
|
-for each of its constructor's arguments, or other literals such as
|
|
|
-strings, lists, etc.
|
|
|
+for each of its constructor's arguments, or other
|
|
|
+literals\index{subject}{literals} such as strings, lists, etc.
|
|
|
%
|
|
|
The body of each \code{case} may contain arbitrary Python code. The
|
|
|
pattern variables can be used in the body, such as \code{op} in
|
|
@@ -1220,6 +1221,8 @@ print(leaf(Constant(8)))
|
|
|
\end{lstlisting}
|
|
|
\fi}
|
|
|
\end{minipage}
|
|
|
+\index{subject}{True@\TRUE{}}
|
|
|
+\index{subject}{False@\FALSE{}}
|
|
|
\end{center}
|
|
|
|
|
|
When constructing a \code{match} expression, we refer to the grammar
|
|
@@ -1600,7 +1603,7 @@ In this section we consider a compiler that translates \LangInt{}
|
|
|
programs into \LangInt{} programs that may be more efficient. The
|
|
|
compiler eagerly computes the parts of the program that do not depend
|
|
|
on any inputs, a process known as \emph{partial
|
|
|
-evaluation}~\citep{Jones:1993uq}. \index{subject}{partial evaluation}
|
|
|
+evaluation}~\citep{Jones:1993uq}.\index{subject}{partialevaluation@partial evaluation}
|
|
|
For example, given the following program
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
@@ -1948,6 +1951,7 @@ $52$ then $10$, the following produces $42$ (not $-42$).
|
|
|
|
|
|
\subsection{Extensible Interpreters via Method Overriding}
|
|
|
\label{sec:extensible-interp}
|
|
|
+\index{subject}{method overriding}
|
|
|
|
|
|
To prepare for discussing the interpreter of \LangVar{}, we explain
|
|
|
why we implement it in an object-oriented style. Throughout this book
|
|
@@ -2038,7 +2042,7 @@ To make our interpreters extensible we need something called
|
|
|
\emph{open recursion}\index{subject}{open recursion}, in which the
|
|
|
tying of the recursive knot is delayed until the functions are
|
|
|
composed. Object-oriented languages provide open recursion via method
|
|
|
-overriding\index{subject}{method overriding}. The following code uses
|
|
|
+overriding. The following code uses
|
|
|
method overriding to interpret \LangInt{} and \LangVar{} using
|
|
|
%
|
|
|
\racket{the
|
|
@@ -2835,7 +2839,8 @@ Our compiler for \LangVar{} consists of the following passes:
|
|
|
of statements and the edges are \code{goto} statements.
|
|
|
\fi}
|
|
|
|
|
|
-\item[\key{select\_instructions}] handles the difference between
|
|
|
+\item[\key{select\_instructions}]\index{subject}{select instructions}
|
|
|
+ handles the difference between
|
|
|
\LangVar{} operations and x86 instructions. This pass converts each
|
|
|
\LangVar{} operation to a short sequence of instructions that
|
|
|
accomplishes the same task.
|
|
@@ -3628,7 +3633,7 @@ list of \code{passes} and then run the script to test your compiler.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-Lvar}
|
|
|
-\index{subject}{instruction selection}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
In the \code{select\_instructions} pass we begin the work of
|
|
|
translating \racket{from \LangCVar{}} to \LangXVar{}. The target
|
|
@@ -4019,7 +4024,7 @@ translates the x86 AST that you produce into a string by invoking the
|
|
|
|
|
|
\section{Challenge: Partial Evaluator for \LangVar{}}
|
|
|
\label{sec:pe-Lvar}
|
|
|
-\index{subject}{partial evaluation}
|
|
|
+\index{subject}{partialevaluation@partial evaluation}
|
|
|
|
|
|
This section describes two optional challenge exercises that involve
|
|
|
adapting and improving the partial evaluator for \LangInt{} that was
|
|
@@ -4988,7 +4993,8 @@ same register.
|
|
|
|
|
|
The program shown in figure~\ref{fig:reg-eg} serves as a running
|
|
|
example. The source program is on the left and the output of
|
|
|
-instruction selection is on the right. The program is almost
|
|
|
+instruction selection\index{subject}{instruction selection}
|
|
|
+is on the right. The program is almost
|
|
|
completely in the x86 assembly language, but it still uses variables.
|
|
|
Consider variables \code{x} and \code{z}. After the variable \code{x}
|
|
|
has been moved to \code{z}, it is no longer in use. Variable \code{z}, on
|
|
@@ -5150,11 +5156,11 @@ view, the caller view and the callee view, as follows:
|
|
|
In x86, registers are also used for passing arguments to a function
|
|
|
and for the return value. In particular, the first six arguments of a
|
|
|
function are passed in the following six registers, in this order.
|
|
|
-\index{subject}{argument-passing registers}
|
|
|
-\index{subject}{parameter-passing registers}
|
|
|
\begin{lstlisting}
|
|
|
rdi rsi rdx rcx r8 r9
|
|
|
\end{lstlisting}
|
|
|
+We refer to these six registers as the argument-passing registers
|
|
|
+\index{subject}{argument-passing registers}.
|
|
|
If there are more than six arguments, the convention is to use space
|
|
|
on the frame of the caller for the rest of the arguments. In
|
|
|
chapter~\ref{ch:Lfun} we instead pass a tuple containing the sixth
|
|
@@ -5387,9 +5393,12 @@ locations before instruction $I_k$. \racket{We recommend representing
|
|
|
\end{figure}
|
|
|
\fi}
|
|
|
|
|
|
-The live locations after an instruction are always the same as the
|
|
|
-live locations before the next instruction.
|
|
|
-\index{subject}{live-after} \index{subject}{live-before}
|
|
|
+The locations that are live after an instruction are its
|
|
|
+\emph{live-after}\index{subject}{live-after} set and the locations
|
|
|
+that are live before an instruction are its
|
|
|
+\emph{live-before}\index{subject}{live-before} set. The live-after
|
|
|
+set of an instruction is always the same as the live-before set of the
|
|
|
+next instruction.
|
|
|
\begin{equation} \label{eq:live-after-before-next}
|
|
|
L_{\mathsf{after}}(k) = L_{\mathsf{before}}(k+1)
|
|
|
\end{equation}
|
|
@@ -6414,11 +6423,11 @@ queue when their saturation changes.}
|
|
|
%\begin{wrapfigure}[25]{r}[0.75in]{0.55\textwidth}
|
|
|
\small
|
|
|
\begin{tcolorbox}[title=Priority Queue]
|
|
|
- A \emph{priority queue} is a collection of items in which the
|
|
|
+ A \emph{priority queue}\index{subject}{priority queue}
|
|
|
+ is a collection of items in which the
|
|
|
removal of items is governed by priority. In a \emph{min} queue,
|
|
|
lower priority items are removed first. An implementation is in
|
|
|
- \code{priority\_queue.rkt} of the support code. \index{subject}{priority
|
|
|
- queue} \index{subject}{minimum priority queue}
|
|
|
+ \code{priority\_queue.rkt} of the support code.\index{subject}{min queue}
|
|
|
\begin{description}
|
|
|
\item[$\LP\code{make-pqueue}\,\itm{cmp}\RP$] constructs an empty
|
|
|
priority queue that uses the $\itm{cmp}$ predicate to determine
|
|
@@ -6707,7 +6716,8 @@ the amount $A$ to subtract from the \code{rsp}. Let $S$ be the number
|
|
|
of stack locations used by spilled variables\footnote{Sometimes two or
|
|
|
more spilled variables are assigned to the same stack location, so
|
|
|
$S$ can be less than the number of spilled variables.} and $C$ be
|
|
|
-the number of callee-saved registers that were allocated to
|
|
|
+the number of callee-saved registers that were
|
|
|
+allocated\index{subject}{allocate} to
|
|
|
variables. The $\itm{align}$ function rounds a number up to the
|
|
|
nearest 16 bytes.
|
|
|
\[
|
|
@@ -7500,20 +7510,26 @@ be more appropriate.
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
\chapter{Booleans and Conditionals}
|
|
|
\label{ch:Lif}
|
|
|
-\index{subject}{Boolean}
|
|
|
-\index{subject}{control flow}
|
|
|
-\index{subject}{conditional expression}
|
|
|
\setcounter{footnote}{0}
|
|
|
|
|
|
The \LangVar{} language has only a single kind of value, the
|
|
|
integers. In this chapter we add a second kind of value, the Booleans,
|
|
|
to create the \LangIf{} language. In \racket{Racket}\python{Python},
|
|
|
-the Boolean values \emph{true} and \emph{false} are written \TRUE{}
|
|
|
-and \FALSE{}, respectively. The \LangIf{} language includes several
|
|
|
-operations that involve Booleans (\key{and}, \key{not},
|
|
|
-\racket{\key{eq?}}\python{==}, \key{<}, etc.) and the \key{if}
|
|
|
-expression \python{and statement}. With the addition of \key{if},
|
|
|
-programs can have nontrivial control flow which
|
|
|
+the Boolean\index{subject}{Boolean} values \emph{true} and \emph{false}
|
|
|
+are written
|
|
|
+\TRUE{}\index{subject}{True@\TRUE{}} and
|
|
|
+\FALSE{}\index{subject}{False@\FALSE{}}, respectively. The \LangIf{}
|
|
|
+language includes several operations that involve Booleans
|
|
|
+(\key{and}\index{subject}{and@\ANDNAME{}},
|
|
|
+\key{or}\index{subject}{or@\ORNAME{}},
|
|
|
+\key{not}\index{subject}{not@\NOTNAME{}},
|
|
|
+\racket{\key{eq?}\index{subject}{equal@\EQNAME{}}}\python{==},
|
|
|
+\key{<}\index{subject}{lessthan@\texttt{<}}, etc.) and the
|
|
|
+\key{if}\index{subject}{IfExp@\IFNAME{}}
|
|
|
+conditional expression\index{subject}{conditional expression}
|
|
|
+\python{ and statement\index{subject}{IfStmt@\IFSTMTNAME{}}}.
|
|
|
+With the addition of \key{if}, programs can have
|
|
|
+nontrivial control flow\index{subject}{control flow} which
|
|
|
%
|
|
|
\racket{impacts \code{explicate\_control} and liveness analysis.}
|
|
|
%
|
|
@@ -7565,7 +7581,8 @@ books~\citep{Pierce:2002hj,Pierce:2004fk,Harper2016,Pierce:SF2}.
|
|
|
This chapter is organized as follows. We begin by defining the syntax
|
|
|
and interpreter for the \LangIf{} language
|
|
|
(section~\ref{sec:lang-if}). We then introduce the idea of type
|
|
|
-checking and define a type checker for \LangIf{}
|
|
|
+checking (aka semantic analysis\index{subject}{semantic analysis})
|
|
|
+and define a type checker for \LangIf{}
|
|
|
(section~\ref{sec:type-check-Lif}).
|
|
|
%
|
|
|
\racket{To compile \LangIf{} we need to enlarge the intermediate
|
|
@@ -7592,7 +7609,8 @@ how to handle conditional \code{goto}s during liveness analysis.
|
|
|
Definitions of the concrete syntax and abstract syntax of the
|
|
|
\LangIf{} language are shown in figures~\ref{fig:Lif-concrete-syntax}
|
|
|
and~\ref{fig:Lif-syntax}, respectively. The \LangIf{} language
|
|
|
-includes all of \LangVar{} {(shown in gray)}, the Boolean literals
|
|
|
+includes all of \LangVar{} {(shown in gray)}, the Boolean
|
|
|
+literals\index{subject}{literals}
|
|
|
\TRUE{} and \FALSE{}, \racket{and} the \code{if} expression%
|
|
|
\python{, and the \code{if} statement}. We expand the set of
|
|
|
operators to include
|
|
@@ -7600,7 +7618,9 @@ operators to include
|
|
|
\item the logical operators \key{and}, \key{or}, and \key{not},
|
|
|
\item the \racket{\key{eq?} operation}\python{\key{==} and \key{!=} operations}
|
|
|
for comparing integers or Booleans for equality, and
|
|
|
-\item the \key{<}, \key{<=}, \key{>}, and \key{>=} operations for
|
|
|
+\item the \key{<}, \key{<=}\index{subject}{lessthaneq@\texttt{<=}},
|
|
|
+ \key{>}\index{subject}{greaterthan@\texttt{>}}, and
|
|
|
+ \key{>=}\index{subject}{greaterthaneq@\texttt{>=}} operations for
|
|
|
comparing integers.
|
|
|
\end{enumerate}
|
|
|
|
|
@@ -7727,20 +7747,7 @@ operators to include
|
|
|
\fi}
|
|
|
\end{tcolorbox}
|
|
|
%\end{minipage}
|
|
|
-\index{subject}{True@\TRUE{}}\index{subject}{False@\FALSE{}}
|
|
|
-\index{subject}{IfExp@\IFNAME{}}
|
|
|
-\python{\index{subject}{IfStmt@\IFSTMTNAME{}}}
|
|
|
-\index{subject}{and@\ANDNAME{}}
|
|
|
-\index{subject}{or@\ORNAME{}}
|
|
|
-\index{subject}{not@\NOTNAME{}}
|
|
|
-\index{subject}{equal@\EQNAME{}}
|
|
|
\python{\index{subject}{not equal@\NOTEQNAME{}}}
|
|
|
- \racket{
|
|
|
- \index{subject}{lessthan@\texttt{<}}
|
|
|
- \index{subject}{lessthaneq@\texttt{<=}}
|
|
|
- \index{subject}{greaterthan@\texttt{>}}
|
|
|
- \index{subject}{greaterthaneq@\texttt{>=}}
|
|
|
- }
|
|
|
\python{
|
|
|
\index{subject}{BoolOp@\texttt{BoolOp}}
|
|
|
\index{subject}{Compare@\texttt{Compare}}
|
|
@@ -7931,14 +7938,12 @@ class InterpLif(InterpLvar):
|
|
|
|
|
|
\section{Type Checking \LangIf{} Programs}
|
|
|
\label{sec:type-check-Lif}
|
|
|
-\index{subject}{type checking}
|
|
|
-\index{subject}{semantic analysis}
|
|
|
|
|
|
-It is helpful to think about type checking in two complementary
|
|
|
-ways. A type checker predicts the type of value that will be produced
|
|
|
-by each expression in the program. For \LangIf{}, we have just two types,
|
|
|
-\INTTY{} and \BOOLTY{}. So, a type checker should predict that
|
|
|
-{\if\edition\racketEd
|
|
|
+It is helpful to think about type checking\index{subject}{type
|
|
|
+ checking} in two complementary ways. A type checker predicts the
|
|
|
+type of value that will be produced by each expression in the program.
|
|
|
+For \LangIf{}, we have just two types, \INTTY{} and \BOOLTY{}. So, a
|
|
|
+type checker should predict that {\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
|
(+ 10 (- (+ 12 20)))
|
|
|
\end{lstlisting}
|
|
@@ -9421,7 +9426,7 @@ else
|
|
|
\end{center}
|
|
|
\fi}
|
|
|
Next consider the case for Boolean constants. We perform a kind of
|
|
|
-partial evaluation\index{subject}{partial evaluation} and output
|
|
|
+partial evaluation\index{subject}{partialevaluation@partial evaluation} and output
|
|
|
either the \code{thn} or \code{els} branch, depending on whether the
|
|
|
constant is \TRUE{} or \FALSE{}. Let us illustrate this with the
|
|
|
following program:
|
|
@@ -9643,7 +9648,7 @@ and then run \code{run-tests.rkt} to test your compiler.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-Lif}
|
|
|
-\index{subject}{instruction selection}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
The \code{select\_instructions} pass translates \LangCIf{} to
|
|
|
\LangXIfVar{}.
|
|
@@ -11099,8 +11104,9 @@ in reverse topological order
|
|
|
well defined only for acyclic graphs.
|
|
|
|
|
|
Let us return to the example of computing the sum of the first five
|
|
|
-positive integers. Here is the program after instruction selection but
|
|
|
-before register allocation.
|
|
|
+positive integers. Here is the program after instruction
|
|
|
+selection\index{subject}{instruction selection} but before register
|
|
|
+allocation.
|
|
|
\begin{center}
|
|
|
{\if\edition\racketEd
|
|
|
\begin{minipage}{0.45\textwidth}
|
|
@@ -11241,11 +11247,10 @@ our approximations have converged, so $m_3$ is the solution.
|
|
|
This iteration process is guaranteed to converge to a solution by the
|
|
|
Kleene fixed-point theorem, a general theorem about functions on
|
|
|
lattices~\citep{Kleene:1952aa}. Roughly speaking, a \emph{lattice} is
|
|
|
-any collection that comes with a partial ordering $\sqsubseteq$ on its
|
|
|
+any collection that comes with a partial ordering\index{subject}{partialordering@partial ordering} $\sqsubseteq$ on its
|
|
|
elements, a least element $\bot$ (pronounced \emph{bottom}), and a
|
|
|
join operator
|
|
|
-$\sqcup$.\index{subject}{lattice}\index{subject}{bottom}\index{subject}{partial
|
|
|
- ordering}\index{subject}{join}\footnote{Technically speaking, we
|
|
|
+$\sqcup$.\index{subject}{lattice}\index{subject}{bottom}\index{subject}{join}\footnote{Technically speaking, we
|
|
|
will be working with join semilattices.} When two elements are
|
|
|
ordered $m_i \sqsubseteq m_j$, it means that $m_j$ contains at least
|
|
|
as much information as $m_i$, so we can think of $m_j$ as a
|
|
@@ -11778,6 +11783,7 @@ the condition expression.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-instructions-loop}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
Only two small additions are needed in the \code{select\_instructions}
|
|
|
pass to handle the changes to \LangCLoop{}. First, to handle the
|
|
@@ -11918,8 +11924,6 @@ for the compilation of \LangLoop{}.
|
|
|
\label{ch:Lvec}
|
|
|
\index{subject}{tuple}
|
|
|
\index{subject}{vector}
|
|
|
-\index{subject}{allocate}
|
|
|
-\index{subject}{heap allocate}
|
|
|
\setcounter{footnote}{0}
|
|
|
|
|
|
%% \margincomment{\scriptsize To do: Flesh out this chapter, e.g., make sure
|
|
@@ -12439,8 +12443,9 @@ set}\index{subject}{root set} to be all the tuple addresses that are
|
|
|
in registers or on the procedure call stack. We define the \emph{live
|
|
|
objects}\index{subject}{live objects} to be the objects that are
|
|
|
reachable from the root set. Garbage collectors reclaim the space that
|
|
|
-is allocated to objects that are no longer live. That means that some
|
|
|
-objects may not get reclaimed as soon as they could be, but at least
|
|
|
+is allocated to objects that are no longer live.\index{subject}{allocate}
|
|
|
+That means that some objects may not get reclaimed as soon as they could be,
|
|
|
+but at least
|
|
|
garbage collectors do not reclaim the space dedicated to objects that
|
|
|
will be accessed in the future! The programmer can influence which
|
|
|
objects get reclaimed by causing them to become unreachable.
|
|
@@ -12507,19 +12512,21 @@ issue.
|
|
|
\subsection{Graph Copying via Cheney's Algorithm}
|
|
|
\label{sec:cheney}
|
|
|
\index{subject}{Cheney's algorithm}
|
|
|
+
|
|
|
Let us take a closer look at the copying of the live objects. The
|
|
|
-allocated objects and pointers can be viewed as a graph, and we need to
|
|
|
-copy the part of the graph that is reachable from the root set. To
|
|
|
-make sure that we copy all the reachable vertices in the graph, we need
|
|
|
-an exhaustive graph traversal algorithm, such as depth-first search or
|
|
|
-breadth-first search~\citep{Moore:1959aa,Cormen:2001uq}. Recall that
|
|
|
-such algorithms take into account the possibility of cycles by marking
|
|
|
-which vertices have already been visited, so to ensure termination
|
|
|
-of the algorithm. These search algorithms also use a data structure
|
|
|
-such as a stack or queue as a to-do list to keep track of the vertices
|
|
|
-that need to be visited. We use breadth-first search and a trick
|
|
|
-due to \citet{Cheney:1970aa} for simultaneously representing the queue
|
|
|
-and copying tuples into the ToSpace.
|
|
|
+allocated\index{subject}{allocate} objects and pointers can be viewed
|
|
|
+as a graph, and we need to copy the part of the graph that is
|
|
|
+reachable from the root set. To make sure that we copy all the
|
|
|
+reachable vertices in the graph, we need an exhaustive graph traversal
|
|
|
+algorithm, such as depth-first search or breadth-first
|
|
|
+search~\citep{Moore:1959aa,Cormen:2001uq}. Recall that such algorithms
|
|
|
+take into account the possibility of cycles by marking which vertices
|
|
|
+have already been visited, so as to ensure termination of the
|
|
|
+algorithm. These search algorithms also use a data structure such as a
|
|
|
+stack or queue as a to-do list to keep track of the vertices that need
|
|
|
+to be visited. We use breadth-first search and a trick due to
|
|
|
+\citet{Cheney:1970aa} for simultaneously representing the queue and
|
|
|
+copying tuples into the ToSpace.
|
|
|
|
|
|
Figure~\ref{fig:cheney} shows several snapshots of the ToSpace as the
|
|
|
copy progresses. The queue is represented by a chunk of contiguous
|
|
@@ -12669,7 +12676,8 @@ memory~\citep{Dijkstra:1982aa}. The \code{rootstack\_begin} variable
|
|
|
points to the first element of the root stack.
|
|
|
|
|
|
As long as there is room left in the FromSpace, your generated code
|
|
|
-can allocate tuples simply by moving the \code{free\_ptr} forward.
|
|
|
+can allocate\index{subject}{allocate} tuples simply by moving the
|
|
|
+\code{free\_ptr} forward.
|
|
|
%
|
|
|
The amount of room left in the FromSpace is the difference between the
|
|
|
\code{fromspace\_end} and the \code{free\_ptr}. The \code{collect}
|
|
@@ -12799,7 +12807,8 @@ translation of tuple creation.
|
|
|
%
|
|
|
The \CCOLLECT{$n$} form runs the garbage collector, requesting that it
|
|
|
make sure that there are $n$ bytes ready to be allocated. During
|
|
|
-instruction selection, the \CCOLLECT{$n$} form will become a call to
|
|
|
+instruction selection\index{subject}{instruction selection},
|
|
|
+the \CCOLLECT{$n$} form will become a call to
|
|
|
the \code{collect} function in \code{runtime.c}.
|
|
|
%
|
|
|
The \CALLOCATE{$n$}{$\itm{type}$} form obtains memory for $n$ elements (and
|
|
@@ -13118,7 +13127,7 @@ section.
|
|
|
|
|
|
\section{Select Instructions and the \LangXGlobal{} Language}
|
|
|
\label{sec:select-instructions-gc}
|
|
|
-\index{subject}{instruction selection}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
%% void (rep as zero)
|
|
|
%% allocate
|
|
@@ -13647,7 +13656,8 @@ structures. The definition of its concrete syntax is shown in
|
|
|
figure~\ref{fig:Lstruct-concrete-syntax}, and the abstract syntax is
|
|
|
shown in figure~\ref{fig:Lstruct-syntax}. Recall that a \code{struct}
|
|
|
in Typed Racket is a user-defined data type that contains named fields
|
|
|
-and that is heap allocated, similarly to a vector. The following is an
|
|
|
+and that is heap allocated\index{subject}{heap allocated},
|
|
|
+similarly to a vector. The following is an
|
|
|
example of a structure definition, in this case the definition of a
|
|
|
\code{point} type:
|
|
|
\begin{lstlisting}
|
|
@@ -14332,6 +14342,7 @@ Add cases for \code{AllocateArray} to \code{explicate\_tail} and
|
|
|
\code{explicate\_assign}.
|
|
|
|
|
|
\subsection{Select Instructions}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
Generate instructions for \code{AllocateArray} similar to those for
|
|
|
\code{Allocate} given in section~\ref{sec:select-instructions-gc}
|
|
@@ -14485,8 +14496,8 @@ This chapter studies the compilation of a subset of \racket{Typed
|
|
|
Racket}\python{Python} in which only top-level function definitions
|
|
|
are allowed. This kind of function appears in the C programming
|
|
|
language, and it serves as an important stepping-stone to implementing
|
|
|
-lexically scoped functions in the form of \key{lambda} abstractions,
|
|
|
-which is the topic of chapter~\ref{ch:Llambda}.
|
|
|
+lexically scoped functions in the form of \key{lambda}\index{subject}{lambda}
|
|
|
+abstractions, which is the topic of chapter~\ref{ch:Llambda}.
|
|
|
|
|
|
\section{The \LangFun{} Language}
|
|
|
|
|
@@ -14725,9 +14736,9 @@ top-level function definitions.
|
|
|
and makes two passes over the function
|
|
|
definitions~\citep{Kelsey:1998di}. In the first pass we set up the
|
|
|
top-level environment using a mutable cons cell for each function
|
|
|
- definition. Note that the \code{lambda} value for each function is
|
|
|
- incomplete; it does not yet include the environment. Once the
|
|
|
- top-level environment has been constructed, we iterate over it and
|
|
|
+ definition. Note that the \code{lambda}\index{subject}{lambda} value
|
|
|
+ for each function is incomplete; it does not yet include the environment.
|
|
|
+ Once the top-level environment has been constructed, we iterate over it and
|
|
|
update the \code{lambda} values to use the top-level environment.}
|
|
|
%
|
|
|
\python{We create a dictionary named \code{env} and fill it in
|
|
@@ -15049,8 +15060,6 @@ arguments.
|
|
|
%
|
|
|
The return value of the function is stored in register \code{rax}.
|
|
|
|
|
|
-\index{subject}{prelude}\index{subject}{conclusion}
|
|
|
-
|
|
|
Regarding frames \index{subject}{frame} and the procedure call stack,
|
|
|
\index{subject}{procedure call stack} recall from
|
|
|
section~\ref{sec:x86} that the stack grows down and each function call
|
|
@@ -15061,14 +15070,15 @@ is, anything that is at or above the stack pointer. The callee is free
|
|
|
to use locations that are below the stack pointer.
|
|
|
|
|
|
Recall that we store variables of tuple type on the root stack. So,
|
|
|
-the prelude of a function needs to move the root stack pointer
|
|
|
-\code{r15} up according to the number of variables of tuple type and
|
|
|
-the conclusion needs to move the root stack pointer back down. Also,
|
|
|
-the prelude must initialize to \code{0} this frame's slots in the root
|
|
|
-stack to signal to the garbage collector that those slots do not yet
|
|
|
-contain a valid pointer. Otherwise the garbage collector will
|
|
|
-interpret the garbage bits in those slots as memory addresses and try
|
|
|
-to traverse them, causing serious mayhem!
|
|
|
+the prelude\index{subject}{prelude} of a function needs to move the
|
|
|
+root stack pointer \code{r15} up according to the number of variables
|
|
|
+of tuple type and the conclusion\index{subject}{conclusion} needs to
|
|
|
+move the root stack pointer back down. Also, the prelude must
|
|
|
+initialize to \code{0} this frame's slots in the root stack to signal
|
|
|
+to the garbage collector that those slots do not yet contain a valid
|
|
|
+pointer. Otherwise the garbage collector will interpret the garbage
|
|
|
+bits in those slots as memory addresses and try to traverse them,
|
|
|
+causing serious mayhem!
|
|
|
|
|
|
Regarding the sharing of registers between different functions, recall
|
|
|
from section~\ref{sec:calling-conventions} that the registers are
|
|
@@ -15631,7 +15641,7 @@ appropriate explicate functions for the various contexts.
|
|
|
|
|
|
\section{Select Instructions and the \LangXIndCall{} Language}
|
|
|
\label{sec:select-r4}
|
|
|
-\index{subject}{instruction selection}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
The output of select instructions is a program in the \LangXIndCall{}
|
|
|
language; the definition of its concrete syntax is shown in
|
|
@@ -16323,31 +16333,30 @@ mainconclusion:
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
\chapter{Lexically Scoped Functions}
|
|
|
\label{ch:Llambda}
|
|
|
-\index{subject}{lambda}
|
|
|
-\index{subject}{lexical scoping}
|
|
|
\setcounter{footnote}{0}
|
|
|
|
|
|
-This chapter studies lexically scoped functions. Lexical scoping means
|
|
|
-that a function's body may refer to variables whose binding site is
|
|
|
-outside of the function, in an enclosing scope.
|
|
|
+This chapter studies lexically scoped functions. Lexical
|
|
|
+scoping\index{subject}{lexical scoping} means that a function's body
|
|
|
+may refer to variables whose binding site is outside of the function,
|
|
|
+in an enclosing scope.
|
|
|
%
|
|
|
Consider the example shown in figure~\ref{fig:lexical-scoping} written
|
|
|
-in \LangLam{}, which extends \LangFun{} with the \key{lambda} form for
|
|
|
-creating lexically scoped functions. The body of the \key{lambda}
|
|
|
-refers to three variables: \code{x}, \code{y}, and \code{z}. The
|
|
|
-binding sites for \code{x} and \code{y} are outside of the
|
|
|
-\key{lambda}. Variable \code{y} is \racket{bound by the enclosing
|
|
|
- \key{let}}\python{a local variable of function \code{f}}, and
|
|
|
-\code{x} is a parameter of function \code{f}. Note that function
|
|
|
-\code{f} returns the \key{lambda} as its result value. The main
|
|
|
-expression of the program includes two calls to \code{f} with
|
|
|
-different arguments for \code{x}: first \code{5} and then \code{3}. The
|
|
|
-functions returned from \code{f} are bound to variables \code{g} and
|
|
|
-\code{h}. Even though these two functions were created by the same
|
|
|
-\code{lambda}, they are really different functions because they use
|
|
|
-different values for \code{x}. Applying \code{g} to \code{11} produces
|
|
|
-\code{20} whereas applying \code{h} to \code{15} produces \code{22},
|
|
|
-so the result of the program is \code{42}.
|
|
|
+in \LangLam{}, which extends \LangFun{} with the
|
|
|
+\key{lambda}\index{subject}{lambda} form for creating lexically scoped
|
|
|
+functions. The body of the \key{lambda} refers to three variables:
|
|
|
+\code{x}, \code{y}, and \code{z}. The binding sites for \code{x} and
|
|
|
+\code{y} are outside of the \key{lambda}. Variable \code{y} is
|
|
|
+\racket{bound by the enclosing \key{let}}\python{a local variable of
|
|
|
+ function \code{f}}, and \code{x} is a parameter of function
|
|
|
+\code{f}. Note that function \code{f} returns the \key{lambda} as its
|
|
|
+result value. The main expression of the program includes two calls to
|
|
|
+\code{f} with different arguments for \code{x}: first \code{5} and
|
|
|
+then \code{3}. The functions returned from \code{f} are bound to
|
|
|
+variables \code{g} and \code{h}. Even though these two functions were
|
|
|
+created by the same \code{lambda}, they are really different functions
|
|
|
+because they use different values for \code{x}. Applying \code{g} to
|
|
|
+\code{11} produces \code{20} whereas applying \code{h} to \code{15}
|
|
|
+produces \code{22}, so the result of the program is \code{42}.
|
|
|
|
|
|
\begin{figure}[btp]
|
|
|
\begin{tcolorbox}[colback=white]
|
|
@@ -16470,11 +16479,12 @@ parameter for the closure and insert an initialization at the beginning
|
|
|
of the function for each free variable, to bind those variables to the
|
|
|
appropriate elements from the closure parameter.
|
|
|
%
|
|
|
-This three-step dance is known as \emph{closure conversion}. We
|
|
|
-discuss the details of closure conversion in
|
|
|
-section~\ref{sec:closure-conversion} and show the code generated from
|
|
|
-the example in section~\ref{sec:example-lambda}. First, we define
|
|
|
-the syntax and semantics of \LangLam{} in section~\ref{sec:r5}.
|
|
|
+This three-step dance is known as \emph{closure
|
|
|
+ conversion}\index{subject}{closure conversion}. We discuss the
|
|
|
+details of closure conversion in section~\ref{sec:closure-conversion}
|
|
|
+and show the code generated from the example in
|
|
|
+section~\ref{sec:example-lambda}. First, we define the syntax and
|
|
|
+semantics of \LangLam{} in section~\ref{sec:r5}.
|
|
|
|
|
|
\section{The \LangLam{} Language}
|
|
|
\label{sec:r5}
|
|
@@ -16615,15 +16625,13 @@ for \LangFun{}, which already has syntax for function application.
|
|
|
\label{fig:Llam-syntax}
|
|
|
\end{figure}
|
|
|
|
|
|
-\index{subject}{interpreter}
|
|
|
-\label{sec:interp-Llambda}
|
|
|
-
|
|
|
-Figure~\ref{fig:interp-Llambda} shows the definitional interpreter for
|
|
|
-\LangLam{}. The case for \key{Lambda} saves the current environment
|
|
|
-inside the returned function value. Recall that during function
|
|
|
-application, the environment stored in the function value, extended
|
|
|
-with the mapping of parameters to argument values, is used to
|
|
|
-interpret the body of the function.
|
|
|
+Figure~\ref{fig:interp-Llambda} shows the definitional
|
|
|
+interpreter\index{subject}{interpreter} for \LangLam{}. The case for
|
|
|
+\key{Lambda} saves the current environment inside the returned
|
|
|
+function value. Recall that during function application, the
|
|
|
+environment stored in the function value, extended with the mapping of
|
|
|
+parameters to argument values, is used to interpret the body of the
|
|
|
+function.
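The role of the saved environment can be illustrated with a much smaller interpreter than the one in the figure. The following is a minimal sketch, not the book's \code{InterpLlambda} class: it uses tagged tuples instead of \code{ast} nodes, and the function name \code{interp\_exp} and the \code{"closure"} tuple layout are assumptions made for this example.

```python
def interp_exp(e, env):
    """Interpret a tiny expression language encoded as tagged tuples."""
    tag = e[0]
    if tag == "const":
        return e[1]
    if tag == "var":
        return env[e[1]]
    if tag == "add":
        return interp_exp(e[1], env) + interp_exp(e[2], env)
    if tag == "lambda":
        _, params, body = e
        # Save the environment that is current when the lambda is evaluated.
        return ("closure", params, body, env)
    if tag == "call":
        _, fun, args = e
        _, params, body, saved_env = interp_exp(fun, env)
        arg_vals = [interp_exp(a, env) for a in args]
        # Extend the saved environment (not the caller's) with the parameters.
        new_env = {**saved_env, **dict(zip(params, arg_vals))}
        return interp_exp(body, new_env)
    raise ValueError(f"unknown expression tag: {tag}")

# lambda x: x + y, created where y is 4
add_y = ("lambda", ["x"], ("add", ("var", "x"), ("var", "y")))
clos = interp_exp(add_y, {"y": 4})

# Called from an environment where y is 100; the saved y = 4 is the one used.
answer = interp_exp(("call", ("var", "f"), [("const", 5)]), {"f": clos, "y": 100})
print(answer)
```

Because the body is interpreted in the extended saved environment, the caller's binding of \code{y} has no effect on the result.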
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\begin{tcolorbox}[colback=white]
|
|
@@ -16693,9 +16701,6 @@ class InterpLlambda(InterpLfun):
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
-\label{sec:type-check-r5}
|
|
|
-\index{subject}{type checking}
|
|
|
-
|
|
|
{\if\edition\racketEd
|
|
|
%
|
|
|
Figure~\ref{fig:type-check-Llambda} shows how to type check the new
|
|
@@ -17618,6 +17623,7 @@ figure~\ref{fig:Clam-syntax}.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-instructions-Llambda}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
Compile \ALLOCCLOS{\itm{len}}{\itm{type}}{\itm{arity}} in almost the
|
|
|
same way as the \ALLOC{\itm{len}}{\itm{type}} form
|
|
@@ -18038,7 +18044,8 @@ Languages that allow expressions to produce different kinds of values
|
|
|
are called \emph{polymorphic}, a word composed of the Greek roots
|
|
|
\emph{poly}, meaning \emph{many}, and \emph{morph}, meaning \emph{form}.
|
|
|
There are several kinds of polymorphism in programming languages, such as
|
|
|
-subtype polymorphism and parametric polymorphism
|
|
|
+subtype polymorphism\index{subject}{subtype polymorphism} and
|
|
|
+parametric polymorphism\index{subject}{parametric polymorphism}
|
|
|
(aka generics)~\citep{Cardelli:1985kx}. The kind of polymorphism that we
|
|
|
study in this chapter does not have a special name; it is the kind
|
|
|
that arises in dynamically typed languages.
|
|
@@ -19526,6 +19533,7 @@ whose syntax definition is shown in figure~\ref{fig:c5-syntax}.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-Lany}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
In the \code{select\_instructions} pass, we translate the primitive
|
|
|
operations on the \ANYTY{} type to x86 instructions that manipulate
|
|
@@ -19563,10 +19571,10 @@ orq $|$\itm{tag}$|, |\itm{lhs'}|
|
|
|
\end{lstlisting}
|
|
|
\fi}
|
|
|
%
|
|
|
-The instruction selection for tuples and procedures is different
|
|
|
-because there is no need to shift them to the left. The rightmost 3
|
|
|
-bits are already zeros, so we simply combine the value and the tag
|
|
|
-using \key{orq}. \\
|
|
|
+The instruction selection\index{subject}{instruction selection} for
|
|
|
+tuples and procedures is different because there is no need to shift
|
|
|
+them to the left. The rightmost 3 bits are already zeros, so we simply
|
|
|
+combine the value and the tag using \key{orq}. \\
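The difference between the two tagging strategies can be modeled with ordinary integer arithmetic. The sketch below is only an illustration in Python, not x86 code: the function names are hypothetical, and the specific tag values \code{0b001} and \code{0b010} are assumptions made for the example.

```python
TAG_BITS = 3
TAG_MASK = 0b111   # the constant 7 used to isolate the tag

INT_TAG = 0b001    # assumed tag value for integers
TUPLE_TAG = 0b010  # assumed tag value for tuples

def inject_scalar(value, tag):
    # Integers and Booleans: shift left by 3 (salq), then combine with the tag (orq).
    return (value << TAG_BITS) | tag

def inject_pointer(address, tag):
    # Tuples and procedures: the low 3 bits of the address are already zero,
    # so no shift is needed; just combine with the tag (orq).
    if address & TAG_MASK != 0:
        raise ValueError("pointer is not 8-byte aligned")
    return address | tag

def tag_of(tagged):
    # Extract the tag with a bitwise and against 7 (andq).
    return tagged & TAG_MASK

tagged_int = inject_scalar(42, INT_TAG)
tagged_ptr = inject_pointer(0x1000, TUPLE_TAG)
print(tagged_int, hex(tagged_ptr), tag_of(tagged_int), tag_of(tagged_ptr))
```

The alignment check in \code{inject\_pointer} makes explicit the assumption that the text relies on: heap addresses are multiples of 8, so the tag can occupy the low bits without losing information.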
|
|
|
%
|
|
|
{\if\edition\racketEd
|
|
|
\begin{center}
|
|
@@ -19620,9 +19628,10 @@ andq $7, |\itm{lhs'}|
|
|
|
|
|
|
The instructions for \key{ValueOf} also differ, depending on whether
|
|
|
the type $T$ is a pointer (tuple or function) or not (integer or
|
|
|
-Boolean). The following shows the instruction selection for integers
|
|
|
-and Booleans, in which we produce an untagged value by shifting it to
|
|
|
-the right by 3 bits:
|
|
|
+Boolean). The following shows the instruction
|
|
|
+selection for integers and
|
|
|
+Booleans, in which we produce an untagged value by shifting it to the
|
|
|
+right by 3 bits:
|
|
|
%
|
|
|
{\if\edition\racketEd
|
|
|
\begin{lstlisting}
|
|
@@ -21651,6 +21660,7 @@ AST nodes.
|
|
|
|
|
|
\section{Select Instructions}
|
|
|
\label{sec:select-instructions-gradual}
|
|
|
+\index{subject}{select instructions}
|
|
|
|
|
|
Recall that the \code{select\_instructions} pass is responsible for
|
|
|
lowering the primitive operations into x86 instructions. So, we need
|
|
@@ -22040,8 +22050,6 @@ online gradual typing bibliography for more material:
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
\chapter{Generics}
|
|
|
\label{ch:Lpoly}
|
|
|
-\index{subject}{parametric polymorphism}
|
|
|
-\index{subject}{generics}
|
|
|
\setcounter{footnote}{0}
|
|
|
|
|
|
This chapter studies the compilation of
|