|
@@ -128,17 +128,18 @@ moredelim=[is][\color{red}]{~}{~}
|
|
|
\title{\Huge \textbf{Essentials of Compilation} \\
|
|
|
\huge An Incremental Approach}
|
|
|
|
|
|
-\author{\textsc{Jeremy G. Siek, Ryan R. Newton} \\
|
|
|
+\author{\textsc{Jeremy G. Siek} \\
|
|
|
%\thanks{\url{http://homes.soic.indiana.edu/jsiek/}} \\
|
|
|
Indiana University \\
|
|
|
\\
|
|
|
with contributions from: \\
|
|
|
Carl Factora \\
|
|
|
Andre Kuhlenschmidt \\
|
|
|
- Michael M. Vitousek \\
|
|
|
- Michael Vollmer \\
|
|
|
+ Ryan R. Newton \\
|
|
|
Ryan Scott \\
|
|
|
- Cameron Swords
|
|
|
+ Cameron Swords \\
|
|
|
+ Michael M. Vitousek \\
|
|
|
+ Michael Vollmer
|
|
|
}
|
|
|
|
|
|
\begin{document}
|
|
@@ -238,9 +239,7 @@ allocation to Sudoku (Chapter~\ref{ch:register-allocation-r1}).
|
|
|
\section*{Prerequisites}
|
|
|
|
|
|
The material in this book is challenging but rewarding. It is meant to
|
|
|
-prepare students for a lifelong career in programming languages. We do
|
|
|
-not recommend this book for students who want to dabble in programming
|
|
|
-languages.
|
|
|
+prepare students for a lifelong career in programming languages.
|
|
|
|
|
|
The book uses the Racket language both for the implementation of the
|
|
|
compiler and for the language that is compiled, so a student should be
|
|
@@ -295,14 +294,14 @@ following people.
|
|
|
|
|
|
In this chapter we review the basic tools that are needed to implement
|
|
|
a compiler. We use \emph{abstract syntax trees} (ASTs), which are data
|
|
|
-structures in computer memory, rather than programs as they are
|
|
|
+structures in computer memory, in contrast to how programs are
|
|
|
typically stored in text files on disk, as \emph{concrete syntax}.
|
|
|
%
|
|
|
ASTs can be represented in many different ways, depending on the programming
|
|
|
language used to write the compiler.
|
|
|
%
|
|
|
Because this book uses Racket (\url{http://racket-lang.org}), a
|
|
|
-descendant of Lisp, we use S-expressions to conveniently represent
|
|
|
+descendant of Lisp, we can use S-expressions to conveniently represent
|
|
|
ASTs (Section~\ref{sec:ast}). We use grammars to define the abstract
|
|
|
syntax of programming languages (Section~\ref{sec:grammar}) and
|
|
|
pattern matching to inspect individual nodes in an AST
|
|
@@ -361,7 +360,7 @@ followed by the textual representation of the S-expression. It is
|
|
|
quite common to use S-expressions to represent a list, such as $a, b
|
|
|
,c$ in the following way:
|
|
|
\begin{lstlisting}
|
|
|
- `(a . (b . (c . ())))
|
|
|
+`(a . (b . (c . ())))
|
|
|
\end{lstlisting}
|
|
|
Each element of the list is in the first slot of a pair, and the
|
|
|
second slot is either the rest of the list or the null value, to mark
|
|
@@ -369,13 +368,13 @@ the end of the list. Such lists are so common that Racket provides
|
|
|
special notation for them that removes the need for the periods
|
|
|
and so many parentheses:
|
|
|
\begin{lstlisting}
|
|
|
- `(a b c)
|
|
|
+`(a b c)
|
|
|
\end{lstlisting}
|
|
|
The following expression creates an S-expression that represents AST
|
|
|
\eqref{eq:arith-prog}.
|
|
|
-\begin{center}
|
|
|
-\texttt{`(+ (read) (- 8))}
|
|
|
-\end{center}
|
|
|
+\begin{lstlisting}
|
|
|
+`(+ (read) (- 8))
|
|
|
+\end{lstlisting}
|
|
|
When using S-expressions to represent ASTs, the convention is to
|
|
|
represent each AST node as a list and to put the operation symbol at
|
|
|
the front of the list. The rest of the list contains the children. So
|
|
@@ -391,8 +390,8 @@ we could have first created an S-expression for AST
|
|
|
\eqref{eq:arith-neg8} and then spliced that into the addition
|
|
|
S-expression.
|
|
|
\begin{lstlisting}
|
|
|
- (define ast1.4 `(- 8))
|
|
|
- (define ast1.1 `(+ (read) ,ast1.4))
|
|
|
+(define ast1.4 `(- 8))
|
|
|
+(define ast1.1 `(+ (read) ,ast1.4))
|
|
|
\end{lstlisting}
|
|
|
In general, the Racket expression that follows the comma (splice)
|
|
|
can be any expression that produces an S-expression.
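As a small illustration of this point, the following sketch splices in
the result of a function call instead of a variable. The helper
\code{make-neg} and the name \code{ast-sketch} are hypothetical,
introduced only for this example.
\begin{lstlisting}
#lang racket
;; make-neg is a hypothetical helper, not part of the book's code.
;; It builds the AST node that negates the expression e.
(define (make-neg e) `(- ,e))

;; The expression after the comma is an ordinary Racket call.
;; Its result, the S-expression (- 8), is spliced into the addition.
(define ast-sketch `(+ (read) ,(make-neg 8)))

;; Compare against the same AST written with a plain quote.
(equal? ast-sketch '(+ (read) (- 8)))
\end{lstlisting}
The final comparison checks that the spliced AST is the same
S-expression as the one written out directly.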
|
|
@@ -418,7 +417,7 @@ the programs in the language. Instead we write down a set of rules, a
|
|
|
variant of Backus-Naur Form (BNF)~\citep{Backus:1960aa,Knuth:1964aa}.
|
|
|
As an example, we describe a small language, named $R_0$, that
|
|
|
consists of integers and arithmetic operations. The first grammar rule
|
|
|
-says that any integer ($\Int$) is an expression ($\Exp$):
|
|
|
+says that any integer is an expression:
|
|
|
\begin{equation}
|
|
|
\Exp ::= \Int \label{eq:arith-int}
|
|
|
\end{equation}
|
|
@@ -751,20 +750,20 @@ clause per grammar rule for $R_0$ expressions.
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\begin{lstlisting}
|
|
|
- (define (interp-exp e)
|
|
|
- (match e
|
|
|
- [(? fixnum?) e]
|
|
|
- [`(read)
|
|
|
- (let ([r (read)])
|
|
|
- (cond [(fixnum? r) r]
|
|
|
- [else (error 'interp-R0 "input not an integer" r)]))]
|
|
|
- [`(- ,e1) (fx- 0 (interp-exp e1))]
|
|
|
- [`(+ ,e1 ,e2) (fx+ (interp-exp e1) (interp-exp e2))]
|
|
|
- ))
|
|
|
-
|
|
|
- (define (interp-R0 p)
|
|
|
- (match p
|
|
|
- [`(program ,e) (interp-exp e)]))
|
|
|
+(define (interp-exp e)
|
|
|
+ (match e
|
|
|
+ [(? fixnum?) e]
|
|
|
+ [`(read)
|
|
|
+ (let ([r (read)])
|
|
|
+ (cond [(fixnum? r) r]
|
|
|
+ [else (error 'interp-R0 "input not an integer" r)]))]
|
|
|
+ [`(- ,e1) (fx- 0 (interp-exp e1))]
|
|
|
+ [`(+ ,e1 ,e2) (fx+ (interp-exp e1) (interp-exp e2))]
|
|
|
+ ))
|
|
|
+
|
|
|
+(define (interp-R0 p)
|
|
|
+ (match p
|
|
|
+ [`(program ,e) (interp-exp e)]))
|
|
|
\end{lstlisting}
|
|
|
\caption{Interpreter for the $R_0$ language.}
|
|
|
\label{fig:interp-R0}
|
|
@@ -773,7 +772,7 @@ clause per grammar rule for $R_0$ expressions.
|
|
|
Let us consider the result of interpreting a few $R_0$ programs. The
|
|
|
following program adds two integers.
|
|
|
\begin{lstlisting}
|
|
|
- (+ 10 32)
|
|
|
+(+ 10 32)
|
|
|
\end{lstlisting}
|
|
|
The result is \key{42}. (We wrote the above program in concrete syntax,
|
|
|
whereas the parsed abstract syntax is \lstinline{(program (+ 10 32))}.)
|
|
@@ -781,7 +780,7 @@ whereas the parsed abstract syntax is \lstinline{(program (+ 10 32))}.)
|
|
|
The next example demonstrates that expressions may be nested within
|
|
|
each other, in this case nesting several additions and negations.
|
|
|
\begin{lstlisting}
|
|
|
- (+ 10 (- (+ 12 20)))
|
|
|
+(+ 10 (- (+ 12 20)))
|
|
|
\end{lstlisting}
|
|
|
What is the result of the above program?
|
|
|
|
|
@@ -790,13 +789,13 @@ arbitrarily-large integers, but only $63$-bit integers, so we
|
|
|
interpret the arithmetic operations of $R_0$ using fixnum arithmetic
|
|
|
in Racket. What happens when we run the following program?
|
|
|
\begin{lstlisting}
|
|
|
- (define large 999999999999999999)
|
|
|
- (interp-R0 `(program (+ (+ (+ ,large ,large) (+ ,large ,large))
|
|
|
- (+ (+ ,large ,large) (+ ,large ,large)))))
|
|
|
+(define large 999999999999999999)
|
|
|
+(interp-R0 `(program (+ (+ (+ ,large ,large) (+ ,large ,large))
|
|
|
+ (+ (+ ,large ,large) (+ ,large ,large)))))
|
|
|
\end{lstlisting}
|
|
|
It produces an error:
|
|
|
\begin{lstlisting}
|
|
|
- fx+: result is not a fixnum
|
|
|
+fx+: result is not a fixnum
|
|
|
\end{lstlisting}
|
|
|
We establish the convention that if running the definitional
|
|
|
interpreter on a program produces an error, then the meaning of that
|
|
@@ -812,7 +811,7 @@ operation prompts the user of the program for an integer. Recall that
|
|
|
program \eqref{eq:arith-prog} performs a \key{read} and then subtracts
|
|
|
\code{8}. So if we run
|
|
|
\begin{lstlisting}
|
|
|
- (interp-R0 ast1.1)
|
|
|
+(interp-R0 ast1.1)
|
|
|
\end{lstlisting}
|
|
|
and then input the integer \code{50} we get the answer to life, the
|
|
|
universe, and everything: \code{42}.
|
|
@@ -856,89 +855,92 @@ this compiler is an optimizer. Our optimizer will accomplish this by
|
|
|
trying to eagerly compute the parts of the program that do not depend
|
|
|
on any inputs. For example, given the following program
|
|
|
\begin{lstlisting}
|
|
|
- (+ (read) (- (+ 5 3)))
|
|
|
+(+ (read) (- (+ 5 3)))
|
|
|
\end{lstlisting}
|
|
|
our compiler will translate it into the program
|
|
|
\begin{lstlisting}
|
|
|
- (+ (read) -8)
|
|
|
+(+ (read) -8)
|
|
|
\end{lstlisting}
|
|
|
|
|
|
Figure~\ref{fig:pe-arith} gives the code for a simple partial
|
|
|
evaluator for the $R_0$ language. The output of the partial evaluator
|
|
|
-is an $R_0$ program. In Figure~\ref{fig:pe-arith}, the normal
|
|
|
-structural recursion is captured in the main \texttt{pe-arith}
|
|
|
-function whereas the code for partially evaluating negation and
|
|
|
-addition is factored into two separate helper functions:
|
|
|
-\texttt{pe-neg} and \texttt{pe-add}. The input to these helper
|
|
|
-functions is the output of partially evaluating the children nodes.
|
|
|
+is an $R_0$ program. In Figure~\ref{fig:pe-arith}, the structural
|
|
|
+recursion over $\Exp$ is captured in the \code{pe-exp} function
|
|
|
+whereas the code for partially evaluating the negation and addition
|
|
|
+operations is factored into two separate helper functions:
|
|
|
+\code{pe-neg} and \code{pe-add}. The input to these helper
|
|
|
+functions is the output of partially evaluating the children.
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\begin{lstlisting}
|
|
|
- (define (pe-neg r)
|
|
|
- (cond [(fixnum? r) (fx- 0 r)]
|
|
|
- [else `(- ,r)]))
|
|
|
+(define (pe-neg r)
|
|
|
+ (cond [(fixnum? r) (fx- 0 r)]
|
|
|
+ [else `(- ,r)]))
|
|
|
|
|
|
- (define (pe-add r1 r2)
|
|
|
- (cond [(and (fixnum? r1) (fixnum? r2)) (fx+ r1 r2)]
|
|
|
- [else `(+ ,r1 ,r2)]))
|
|
|
+(define (pe-add r1 r2)
|
|
|
+ (cond [(and (fixnum? r1) (fixnum? r2)) (fx+ r1 r2)]
|
|
|
+ [else `(+ ,r1 ,r2)]))
|
|
|
|
|
|
- (define (pe-arith e)
|
|
|
- (match e
|
|
|
- [(? fixnum?) e]
|
|
|
- [`(read) `(read)]
|
|
|
- [`(- ,e1)
|
|
|
- (pe-neg (pe-arith e1))]
|
|
|
- [`(+ ,e1 ,e2)
|
|
|
- (pe-add (pe-arith e1) (pe-arith e2))]))
|
|
|
+(define (pe-exp e)
|
|
|
+ (match e
|
|
|
+ [(? fixnum?) e]
|
|
|
+ [`(read) `(read)]
|
|
|
+ [`(- ,e1) (pe-neg (pe-exp e1))]
|
|
|
+ [`(+ ,e1 ,e2) (pe-add (pe-exp e1) (pe-exp e2))]
|
|
|
+ ))
|
|
|
+
|
|
|
+(define (pe-R0 p)
|
|
|
+ (match p
|
|
|
+ [`(program ,e) `(program ,(pe-exp e))]
|
|
|
+ ))
|
|
|
\end{lstlisting}
|
|
|
\caption{A partial evaluator for $R_0$ expressions.}
|
|
|
\label{fig:pe-arith}
|
|
|
\end{figure}
|
|
|
|
|
|
-Our code for \texttt{pe-neg} and \texttt{pe-add} implements the simple
|
|
|
-idea of checking whether their arguments are integers and if they are,
|
|
|
-to go ahead and perform the arithmetic. Otherwise, we use quasiquote
|
|
|
-to create an AST node for the appropriate operation (either negation
|
|
|
-or addition) and use comma to splice in the child nodes.
|
|
|
+The \texttt{pe-neg} and \texttt{pe-add} functions check whether their
|
|
|
+arguments are integers and if they are, perform the appropriate
|
|
|
+arithmetic. Otherwise, they use quasiquote to create an AST node for
|
|
|
+the operation (either negation or addition) and use comma to splice in
|
|
|
+the children.
|
|
|
|
|
|
To gain some confidence that the partial evaluator is correct, we can
|
|
|
test whether it produces programs that get the same result as the
|
|
|
-input program. That is, we can test whether it satisfies Diagram
|
|
|
+input programs. That is, we can test whether it satisfies Diagram
|
|
|
\eqref{eq:compile-correct}. The following code runs the partial
|
|
|
evaluator on several examples and tests the output program. The
|
|
|
\texttt{assert} function is defined in Appendix~\ref{appendix:utilities}.
|
|
|
\begin{lstlisting}
|
|
|
(define (test-pe p)
|
|
|
- (assert "testing pe-arith"
|
|
|
- (equal? (interp-R0 p) (interp-R0 (pe-arith p)))))
|
|
|
+ (assert "testing pe-R0"
|
|
|
+ (equal? (interp-R0 p) (interp-R0 (pe-R0 p)))))
|
|
|
|
|
|
(test-pe `(program (+ (read) (- (+ 5 3)))))
|
|
|
(test-pe `(program (+ 1 (+ (read) 1))))
|
|
|
(test-pe `(program (- (+ (read) (- 5)))))
|
|
|
\end{lstlisting}
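One caveat, offered as a sketch and not as the book's test harness:
programs that contain \key{read} consume input, so the two interpreter
runs being compared see the same values only if the same input is
supplied to each. The code below repeats the definitions from
Figures~\ref{fig:interp-R0} and \ref{fig:pe-arith} so that it is
self-contained, and adds two hypothetical helpers,
\code{interp-with-input} and \code{same-result?}, that feed a string
to \code{(read)} through Racket's \code{current-input-port} parameter.
\begin{lstlisting}
#lang racket
(require racket/fixnum)

;; Interpreter from the figure; the error call is simplified here.
(define (interp-exp e)
  (match e
    [(? fixnum?) e]
    [`(read)
     (let ([r (read)])
       (cond [(fixnum? r) r]
             [else (error "input not an integer:" r)]))]
    [`(- ,e1) (fx- 0 (interp-exp e1))]
    [`(+ ,e1 ,e2) (fx+ (interp-exp e1) (interp-exp e2))]))

(define (interp-R0 p)
  (match p
    [`(program ,e) (interp-exp e)]))

;; Partial evaluator from the figure.
(define (pe-neg r)
  (cond [(fixnum? r) (fx- 0 r)]
        [else `(- ,r)]))

(define (pe-add r1 r2)
  (cond [(and (fixnum? r1) (fixnum? r2)) (fx+ r1 r2)]
        [else `(+ ,r1 ,r2)]))

(define (pe-exp e)
  (match e
    [(? fixnum?) e]
    [`(read) `(read)]
    [`(- ,e1) (pe-neg (pe-exp e1))]
    [`(+ ,e1 ,e2) (pe-add (pe-exp e1) (pe-exp e2))]))

(define (pe-R0 p)
  (match p
    [`(program ,e) `(program ,(pe-exp e))]))

;; Hypothetical helper: run program p with (read) taking its
;; values from the string input instead of the keyboard.
(define (interp-with-input p input)
  (parameterize ([current-input-port (open-input-string input)])
    (interp-R0 p)))

;; Hypothetical helper: do p and its partially evaluated form
;; produce the same result when given the same input?
(define (same-result? p input)
  (equal? (interp-with-input p input)
          (interp-with-input (pe-R0 p) input)))

(same-result? `(program (+ (read) (- (+ 5 3)))) "50")
(same-result? `(program (+ 1 (+ (read) 1))) "40")
\end{lstlisting}
Each call gives both runs a fresh string port with identical contents,
which makes the comparison repeatable without typing any input.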
|
|
|
|
|
|
-\rn{Do we like the explicit whitespace? I've never been fond of it, in part
|
|
|
- because it breaks copy/pasting. But, then again, so do most of the quotes.}
|
|
|
-
|
|
|
\begin{exercise}
|
|
|
-\normalfont % I don't like the italics for exercises. -Jeremy
|
|
|
-We challenge the reader to improve on the simple partial evaluator in
|
|
|
-Figure~\ref{fig:pe-arith} by replacing the \texttt{pe-neg} and
|
|
|
-\texttt{pe-add} helper functions with functions that know more about
|
|
|
-arithmetic. For example, your partial evaluator should translate
|
|
|
+\normalfont
|
|
|
+% I don't like the italics for exercises. -Jeremy
|
|
|
+Improve on the partial evaluator in Figure~\ref{fig:pe-arith} by
|
|
|
+replacing the \texttt{pe-neg} and \texttt{pe-add} helper functions
|
|
|
+with functions that know more about arithmetic. For example, your
|
|
|
+partial evaluator should translate
|
|
|
\begin{lstlisting}
|
|
|
- (+ 1 (+ (read) 1))
|
|
|
+(+ 1 (+ (read) 1))
|
|
|
\end{lstlisting}
|
|
|
into
|
|
|
\begin{lstlisting}
|
|
|
- (+ 2 (read))
|
|
|
+(+ 2 (read))
|
|
|
\end{lstlisting}
|
|
|
To accomplish this, we recommend that your partial evaluator produce
|
|
|
-output that takes the form of the $\itm{residual}$ non-terminal in the
|
|
|
+output in the form of the $\itm{residual}$ non-terminal of the
|
|
|
following grammar.
|
|
|
\[
|
|
|
\begin{array}{lcl}
|
|
|
-\Exp &::=& \Int \mid (\key{read}) \mid (\key{-} \;(\key{read})) \mid (\key{+} \; \Exp \; \Exp)\\
|
|
|
+\Exp &::=& \Int \mid (\key{read}) \mid (\key{-} \;(\key{read}))
|
|
|
+ \mid (\key{+} \; \Exp \; \Exp)\\
|
|
|
\itm{residual} &::=& \Int \mid (\key{+}\; \Int\; \Exp) \mid \Exp
|
|
|
\end{array}
|
|
|
\]
|
|
@@ -949,22 +951,25 @@ following grammar.
|
|
|
\chapter{Integers and Variables}
|
|
|
\label{ch:int-exp}
|
|
|
|
|
|
-This chapter concerns the challenge of compiling a subset of Racket
|
|
|
-that includes integer arithmetic and local variable binding, which we
|
|
|
-name $R_1$, to x86-64 assembly code~\citep{Intel:2015aa}. Henceforth
|
|
|
-we shall refer to x86-64 simply as x86. The chapter begins with a
|
|
|
-description of the $R_1$ language (Section~\ref{sec:s0}) followed by a
|
|
|
-description of x86 (Section~\ref{sec:x86}). The x86 assembly language
|
|
|
-is quite large, so we only discuss what is needed for compiling
|
|
|
-$R_1$. We introduce more of x86 in later chapters. Once we have
|
|
|
-introduced $R_1$ and x86, we reflect on their differences and come up
|
|
|
-with a plan to break down the translation from $R_1$ to x86 into a
|
|
|
-handful of steps (Section~\ref{sec:plan-s0-x86}). The rest of the
|
|
|
-sections in this Chapter give detailed hints regarding each step
|
|
|
+This chapter is about compiling the subset of Racket that includes
|
|
|
+integer arithmetic and local variable binding, which we name $R_1$, to
|
|
|
+x86-64 assembly code~\citep{Intel:2015aa}. Henceforth we shall refer
|
|
|
+to x86-64 simply as x86. The chapter begins with a description of the
|
|
|
+$R_1$ language (Section~\ref{sec:s0}) followed by a description of x86
|
|
|
+(Section~\ref{sec:x86}). The x86 assembly language is quite large, so
|
|
|
+we discuss only what is needed for compiling $R_1$. We introduce more
|
|
|
+of x86 in later chapters. Once we have introduced $R_1$ and x86, we
|
|
|
+reflect on their differences and come up with a plan to break down the
|
|
|
+translation from $R_1$ to x86 into a handful of steps
|
|
|
+(Section~\ref{sec:plan-s0-x86}). The rest of the sections in this
|
|
|
+chapter give detailed hints regarding each step
|
|
|
(Sections~\ref{sec:uniquify-s0} through \ref{sec:patch-s0}). We hope
|
|
|
-to give enough hints that the well-prepared reader can implement a
|
|
|
-compiler from $R_1$ to x86 while at the same time leaving room for
|
|
|
-some fun and creativity.
|
|
|
+to give enough hints that the well-prepared reader, together with some
|
|
|
+friends, can implement a compiler from $R_1$ to x86 in a couple of weeks
|
|
|
+while at the same time leaving room for some fun and creativity. To
|
|
|
+give the reader a feeling for the scale of this first compiler, the
|
|
|
+instructor solution for the $R_1$ compiler is approximately 500 lines
|
|
|
+of code.
|
|
|
|
|
|
\section{The $R_1$ Language}
|
|
|
\label{sec:s0}
|
|
@@ -976,17 +981,11 @@ Figure~\ref{fig:r1-syntax}. The non-terminal \Var{} may be any Racket
|
|
|
identifier. As in $R_0$, \key{read} is a nullary operator, \key{-} is
|
|
|
a unary operator, and \key{+} is a binary operator. Similar to $R_0$,
|
|
|
the $R_1$ language includes the \key{program} construct to mark the
|
|
|
-top of the program, which is helpful in parts of the compiler. The
|
|
|
-$\itm{info}$ field of the \key{program} construct contains an
|
|
|
+top of the program, which is helpful in some of the compiler passes.
|
|
|
+The $\itm{info}$ field of the \key{program} construct contains an
|
|
|
association list that is used to communicate auxiliary data from one
|
|
|
-step of the compiler to the next.
|
|
|
-
|
|
|
-The $R_1$ language is rich enough to exhibit several compilation
|
|
|
-techniques but simple enough so that the reader, together with couple
|
|
|
-friends, can implement a compiler for it in a week or two of part-time
|
|
|
-work. To give the reader a feeling for the scale of this first
|
|
|
-compiler, the instructor solution for the $R_1$ compiler is less than
|
|
|
-500 lines of code.
|
|
|
+compiler pass to the next. Despite the simplicity of the $R_1$ language,
|
|
|
+it is rich enough to exhibit several compilation techniques.
|
|
|
|
|
|
\begin{figure}[btp]
|
|
|
\centering
|
|
@@ -1005,30 +1004,30 @@ R_1 &::=& (\key{program} \;\itm{info}\; \Exp)
|
|
|
\label{fig:r1-syntax}
|
|
|
\end{figure}
|
|
|
|
|
|
-Let us dive into the description of the $R_1$ language. The \key{let}
|
|
|
-construct defines a variable for use within its body and initializes
|
|
|
-the variable with the value of an expression. So the following
|
|
|
-program initializes \code{x} to \code{32} and then evaluates the body
|
|
|
-\code{(+ 10 x)}, producing \code{42}.
|
|
|
+Let us dive further into the syntax and semantics of the $R_1$
|
|
|
+language. The \key{let} construct defines a variable for use within
|
|
|
+its body and initializes the variable with the value of an expression.
|
|
|
+So the following program initializes \code{x} to \code{32} and then
|
|
|
+evaluates the body \code{(+ 10 x)}, producing \code{42}.
|
|
|
\begin{lstlisting}
|
|
|
- (program ()
|
|
|
- (let ([x (+ 12 20)]) (+ 10 x)))
|
|
|
+(program ()
|
|
|
+ (let ([x (+ 12 20)]) (+ 10 x)))
|
|
|
\end{lstlisting}
|
|
|
When there are multiple \key{let}'s for the same variable, the closest
|
|
|
enclosing \key{let} is used. That is, variable definitions overshadow
|
|
|
prior definitions. Consider the following program with two \key{let}'s
|
|
|
that define variables named \code{x}. Can you figure out the result?
|
|
|
\begin{lstlisting}
|
|
|
- (program ()
|
|
|
- (let ([x 32]) (+ (let ([x 10]) x) x)))
|
|
|
+(program ()
|
|
|
+ (let ([x 32]) (+ (let ([x 10]) x) x)))
|
|
|
\end{lstlisting}
|
|
|
For the purposes of showing which variable uses correspond to which
|
|
|
definitions, the following shows the \code{x}'s annotated with subscripts
|
|
|
to distinguish them. Double check that your answer for the above is
|
|
|
the same as your answer for this annotated version of the program.
|
|
|
\begin{lstlisting}
|
|
|
- (program ()
|
|
|
- (let ([x|$_1$| 32]) (+ (let ([x|$_2$| 10]) x|$_2$|) x|$_1$|)))
|
|
|
+(program ()
|
|
|
+ (let ([x|$_1$| 32]) (+ (let ([x|$_2$| 10]) x|$_2$|) x|$_1$|)))
|
|
|
\end{lstlisting}
|
|
|
The initializing expression is always evaluated before the body of the
|
|
|
\key{let}, so in the following, the \key{read} for \code{x} is
|
|
@@ -1036,24 +1035,24 @@ performed before the \key{read} for \code{y}. Given the input
|
|
|
\code{52} then \code{10}, the following produces \code{42} (and not
|
|
|
\code{-42}).
|
|
|
\begin{lstlisting}
|
|
|
- (program ()
|
|
|
- (let ([x (read)]) (let ([y (read)]) (+ x (- y)))))
|
|
|
+(program ()
|
|
|
+ (let ([x (read)]) (let ([y (read)]) (+ x (- y)))))
|
|
|
\end{lstlisting}
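Because $R_1$ is a subset of Racket, this evaluation order can be
checked directly in Racket. The sketch below is only an illustration:
\code{run-with-input} is a hypothetical helper that makes
\code{(read)} take its values from a string, and the body is the
expression from the program above without the \key{program} wrapper.
\begin{lstlisting}
#lang racket
;; Hypothetical helper: call thunk with (read) drawing its
;; values from the string input.
(define (run-with-input thunk input)
  (parameterize ([current-input-port (open-input-string input)])
    (thunk)))

;; The let for x is initialized first, so x receives 52 and
;; y receives 10.
(define result
  (run-with-input
   (lambda ()
     (let ([x (read)]) (let ([y (read)]) (+ x (- y)))))
   "52 10"))

result
\end{lstlisting}
Swapping the two numbers in the input string flips the sign of the
result, which is one way to observe the order of the two reads.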
|
|
|
|
|
|
-Figure~\ref{fig:interp-R1} shows the interpreter for the $R_1$
|
|
|
-language. It extends the interpreter for $R_0$ with two new
|
|
|
+Figure~\ref{fig:interp-R1} shows the definitional interpreter for the
|
|
|
+$R_1$ language. It extends the interpreter for $R_0$ with two new
|
|
|
\key{match} clauses for variables and for \key{let}. For \key{let},
|
|
|
-we will need a way to communicate the initializing value of a variable
|
|
|
-to all the uses of a variable. To accomplish this, we maintain a
|
|
|
-mapping from variables to values, which is traditionally called an
|
|
|
-\emph{environment}. For simplicity, here we use an association list to
|
|
|
-represent the environment. The \code{interp-R1} function takes the
|
|
|
-current environment, \code{env}, as an extra parameter. When the
|
|
|
-interpreter encounters a variable, it finds the corresponding value
|
|
|
-using the \code{lookup} function (Appendix~\ref{appendix:utilities}).
|
|
|
-When the interpreter encounters a \key{let}, it evaluates the
|
|
|
-initializing expression, extends the environment with the result bound
|
|
|
-to the variable, then evaluates the body of the \key{let}.
|
|
|
+we need a way to communicate the value of a variable to all the uses
|
|
|
+of a variable. To accomplish this, we maintain a mapping from
|
|
|
+variables to values, which is called an \emph{environment}. For
|
|
|
+simplicity, here we use an association list to represent the
|
|
|
+environment. The \code{interp-R1} function takes the current
|
|
|
+environment, \code{env}, as an extra parameter. When the interpreter
|
|
|
+encounters a variable, it finds the corresponding value using the
|
|
|
+\code{lookup} function (Appendix~\ref{appendix:utilities}). When the
|
|
|
+interpreter encounters a \key{let}, it evaluates the initializing
|
|
|
+expression, extends the environment with the result value bound to the
|
|
|
+variable, then evaluates the body of the \key{let}.
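The following sketch shows how an association list can play the role
of an environment. It is an assumption-laden illustration: the
function \code{lookup-sketch} is a hypothetical stand-in for the
\code{lookup} function of Appendix~\ref{appendix:utilities}, whose
actual definition may differ, and the names \code{env0} through
\code{env2} exist only for this example.
\begin{lstlisting}
#lang racket
;; Hypothetical stand-in for lookup: return the value bound
;; to x by the nearest entry in the association list env.
(define (lookup-sketch x env)
  (match (assq x env)
    [(cons _ v) v]
    [#f (error "lookup-sketch: unbound variable:" x)]))

;; Extending the environment puts the new binding at the front,
;; so an inner let overshadows an outer let of the same variable.
(define env0 '())
(define env1 (cons (cons 'x 32) env0)) ; outer (let ([x 32]) ...)
(define env2 (cons (cons 'x 10) env1)) ; inner (let ([x 10]) ...)

(lookup-sketch 'x env2) ; inside the inner let
(lookup-sketch 'x env1) ; back in the outer let
\end{lstlisting}
Because \code{env1} is left unchanged when \code{env2} is built, the
outer binding is visible again once the body of the inner \key{let}
has been evaluated.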
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\begin{lstlisting}
|
|
@@ -1088,10 +1087,11 @@ to the variable, then evaluates the body of the \key{let}.
|
|
|
\end{figure}
|
|
|
|
|
|
The goal for this chapter is to implement a compiler that translates
|
|
|
-any program $P_1$ in the $R_1$ language into an x86 assembly
|
|
|
-program $P_2$ such that $P_2$ exhibits the same behavior on an x86
|
|
|
-computer as the $R_1$ program running in a Racket implementation.
|
|
|
-That is, they both output the same integer $n$.
|
|
|
+any program $P_1$ written in the $R_1$ language into an x86 assembly
|
|
|
+program $P_2$ such that $P_2$ exhibits the same behavior when run on a
|
|
|
+computer as the $P_1$ program interpreted by \code{interp-R1}. That
|
|
|
+is, they both output the same integer $n$. We depict this correctness
|
|
|
+criterion in the following diagram.
|
|
|
\[
|
|
|
\begin{tikzpicture}[baseline=(current bounding box.center)]
|
|
|
\node (p1) at (0, 0) {$P_1$};
|
|
@@ -1109,29 +1109,33 @@ language to compile $R_1$.
|
|
|
\section{The x86 Assembly Language}
|
|
|
\label{sec:x86}
|
|
|
|
|
|
-An x86 program is a sequence of instructions. The program is stored in the
|
|
|
-computer's memory and the \emph{program counter} points to the address of the
|
|
|
-next instruction to be executed. For most instructions, once the instruction is
|
|
|
-executed, the program counter is incremented to point to the immediately
|
|
|
-following instruction in memory. Each instruction may refer to integer
|
|
|
-constants (called \emph{immediate values}), variables called \emph{registers},
|
|
|
-and instructions may load and store values into memory. For our purposes, we
|
|
|
-can think of the computer's memory as a mapping of 64-bit addresses to 64-bit
|
|
|
-values%
|
|
|
+Figure~\ref{fig:x86-a} defines the syntax for the
|
|
|
+subset of the x86 assembly language needed for this chapter.
|
|
|
+%
|
|
|
+An x86 program is a sequence of instructions. The program is stored in
|
|
|
+the computer's memory and the computer has a \emph{program counter}
|
|
|
+that points to the address of the next instruction to be executed. For
|
|
|
+most instructions, once the instruction is executed, the program
|
|
|
+counter is incremented to point to the immediately following
|
|
|
+instruction in memory. Most x86 instructions take two operands, where
|
|
|
+each operand is either an integer constant (called an \emph{immediate
|
|
|
+ value}), a \emph{register}, or a \emph{memory} location. A register
|
|
|
+is a special kind of variable. Each one holds a 64-bit value; there
|
|
|
+are 16 registers in the computer and their names are given in
|
|
|
+Figure~\ref{fig:x86-a}. The computer's memory is a mapping of 64-bit
|
|
|
+addresses to 64-bit values%
|
|
|
\footnote{This simple story suffices for describing how sequential
|
|
|
programs access memory but is not sufficient for multi-threaded
|
|
|
programs. However, multi-threaded execution is beyond the scope of
|
|
|
this book.}.
|
|
|
%
|
|
|
-Figure~\ref{fig:x86-a} defines the syntax for the
|
|
|
-subset of the x86 assembly language needed for this chapter.
|
|
|
-%
|
|
|
We use the AT\&T syntax expected by the GNU assembler, which comes
|
|
|
with the \key{gcc} compiler that we use for compiling assembly code to
|
|
|
machine code.
|
|
|
%
|
|
|
-Also, Appendix~\ref{sec:x86-quick-reference} includes a quick-reference of all
|
|
|
-the x86 instructions used in this book and a short explanation of what they do.
|
|
|
+Appendix~\ref{sec:x86-quick-reference} is a quick-reference of all the
|
|
|
+x86 instructions used in this book with a short explanation of what
|
|
|
+they do.
|
|
|
|
|
|
|
|
|
% to do: finish treatment of imulq
|
|
@@ -1172,21 +1176,21 @@ A register is written with a \key{\%} followed by the register name,
|
|
|
such as \key{\%rax}.
|
|
|
%
|
|
|
An access to memory is specified using the syntax $n(\key{\%}r)$,
|
|
|
-which obtains the address stored in register $r$ and then
|
|
|
-offsets the address by $n$ bytes
|
|
|
-(8 bits). The address is then used to either load or store to memory
|
|
|
-depending on whether it occurs as a source or destination argument of
|
|
|
-an instruction.
|
|
|
+which obtains the address stored in register $r$ and then adds $n$
|
|
|
+bytes to the address. The resulting address is used to either load or
|
|
|
+store to memory depending on whether it occurs as a source or
|
|
|
+destination argument of an instruction.
|
|
|
|
|
|
-An arithmetic instruction, such as $\key{addq}\,s,\,d$, reads from the
|
|
|
+An arithmetic instruction such as $\key{addq}\,s,\,d$ reads from the
|
|
|
source $s$ and destination $d$, applies the arithmetic operation, then
|
|
|
-writes the result in $d$.
|
|
|
+writes the result back to the destination $d$.
|
|
|
%
|
|
|
-The move instruction, $\key{movq}\,s\,d$ reads from $s$ and stores the
|
|
|
+The move instruction $\key{movq}\,s,\,d$ reads from $s$ and stores the
|
|
|
result in $d$.
|
|
|
%
|
|
|
The $\key{callq}\,\mathit{label}$ instruction executes the procedure
|
|
|
-specified by the label.
|
|
|
+specified by the label. We discuss procedure calls in more detail
|
|
|
+later in this chapter and in Chapter~\ref{ch:functions}.
|
|
|
|
|
|
Figure~\ref{fig:p0-x86} depicts an x86 program that is equivalent
|
|
|
to \code{(+ 10 32)}. The \key{globl} directive says that the
|
|
@@ -1198,7 +1202,7 @@ the operating system starts executing this program. The instruction
|
|
|
following instruction \lstinline{addq $32, %rax} adds $32$ to the
|
|
|
$10$ in \key{rax} and puts the result, $42$, back into
|
|
|
\key{rax}.
|
|
|
-
|
|
|
+%
|
|
|
The last instruction, \key{retq}, finishes the \key{main} function by
|
|
|
returning the integer in \key{rax} to the operating system. The
|
|
|
operating system interprets this integer as the program's exit
|
|
@@ -1237,10 +1241,10 @@ shown in Figure~\ref{fig:frame}. The register \key{rsp} is called the
|
|
|
stack. The stack grows downward in memory, so we increase the size of
|
|
|
the stack by subtracting from the stack pointer. The frame size is
|
|
|
required to be a multiple of 16 bytes. In the context of a procedure
|
|
|
-call, the \emph{return address} is the next instruction on the caller
|
|
|
-side that comes after the call instruction. During a function call,
|
|
|
-the return address is pushed onto the stack. The register \key{rbp}
|
|
|
-is the \emph{base pointer} which serves two purposes: 1) it saves the
|
|
|
+call, the \emph{return address} is the address of the next instruction after the call
|
|
|
+instruction on the caller side. During a function call, the return
|
|
|
+address is pushed onto the stack. The register \key{rbp} is the
|
|
|
+\emph{base pointer} which serves two purposes: 1) it saves the
|
|
|
location of the stack pointer for the calling procedure and 2) it is
|
|
|
used to access variables associated with the current procedure. The
|
|
|
base pointer of the calling procedure is pushed onto the stack after
|
|
@@ -1292,33 +1296,33 @@ Position & Contents \\ \hline
|
|
|
Getting back to the program in Figure~\ref{fig:p1-x86}, the first
|
|
|
three instructions are the typical \emph{prelude} for a procedure.
|
|
|
The instruction \key{pushq \%rbp} saves the base pointer for the
|
|
|
-procedure that called the current one onto the stack and subtracts $8$
|
|
|
-from the stack pointer. The second instruction \key{movq \%rsp, \%rbp}
|
|
|
-changes the base pointer to the top of the stack. The instruction
|
|
|
-\key{subq \$16, \%rsp} moves the stack pointer down to make enough
|
|
|
-room for storing variables. This program just needs one variable ($8$
|
|
|
-bytes) but because the frame size is required to be a multiple of 16
|
|
|
-bytes, it rounds to 16 bytes.
|
|
|
+caller onto the stack and subtracts $8$ from the stack pointer. The
|
|
|
+second instruction \key{movq \%rsp, \%rbp} changes the base pointer to
|
|
|
+the top of the stack. The instruction \key{subq \$16, \%rsp} moves the
|
|
|
+stack pointer down to make enough room for storing variables. This
|
|
|
+program needs one variable ($8$ bytes) but because the frame size is
|
|
|
+required to be a multiple of 16 bytes, the space for variables is
|
|
|
+rounded to 16 bytes.
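The rounding described above can be sketched as a small Racket
function. This is an illustration under two assumptions: each
variable occupies $8$ bytes, and \code{frame-size-bytes} is a
hypothetical name, not part of the book's support code.
\begin{lstlisting}
#lang racket
;; Hypothetical helper: the number of bytes to subtract from
;; the stack pointer for num-vars variables of 8 bytes each,
;; rounded up to the next multiple of 16.
(define (frame-size-bytes num-vars)
  (let ([raw (* 8 num-vars)])
    (* 16 (quotient (+ raw 15) 16))))

(frame-size-bytes 1) ; one variable, as in the program above
(frame-size-bytes 2)
(frame-size-bytes 3)
\end{lstlisting}
One variable and two variables both need 16 bytes, whereas three
variables need 32 bytes.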
|
|
|
|
|
|
The four instructions under the label \code{start} carry out the work
|
|
|
of computing $\BINOP{+}{52}{\UNIOP{-}{10} }$. The first instruction
|
|
|
\key{movq \$10, -8(\%rbp)} stores $10$ in variable $1$. The
|
|
|
instruction \key{negq -8(\%rbp)} changes variable $1$ to $-10$. The
|
|
|
-\key{movq \$52, \%rax} places $52$ in the register \key{rax} and
|
|
|
-\key{addq -8(\%rbp), \%rax} adds the contents of variable $1$ to
|
|
|
+instruction \key{movq \$52, \%rax} places $52$ in the register \key{rax} and
|
|
|
+finally \key{addq -8(\%rbp), \%rax} adds the contents of variable $1$ to
|
|
|
\key{rax}, at which point \key{rax} contains $42$.
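To make the effect of these four instructions concrete, here is a
minimal simulation in Racket. It is a sketch under several
assumptions, not the book's x86 interpreter: operands are encoded as
\code{(int n)}, \code{(reg r)}, and \code{(deref r n)}, registers and
memory are hash tables, the helper names are hypothetical, and the
base pointer is set to the arbitrary address $1000$.
\begin{lstlisting}
#lang racket
;; Assumed machine state: registers and memory as hash tables.
(define regs (make-hash))
(define mem (make-hash))

;; Address denoted by n(r): the contents of r plus n bytes.
(define (address r n) (+ (hash-ref regs r) n))

;; Read the value of an operand.
(define (read-arg a)
  (match a
    [`(int ,n) n]
    [`(reg ,r) (hash-ref regs r)]
    [`(deref ,r ,n) (hash-ref mem (address r n))]))

;; Write v to a register or memory operand.
(define (write-arg! a v)
  (match a
    [`(reg ,r) (hash-set! regs r v)]
    [`(deref ,r ,n) (hash-set! mem (address r n) v)]))

;; Execute one instruction: addq reads both operands and
;; writes the sum back to the destination.
(define (step! instr)
  (match instr
    [`(movq ,s ,d) (write-arg! d (read-arg s))]
    [`(addq ,s ,d) (write-arg! d (+ (read-arg d) (read-arg s)))]
    [`(negq ,d) (write-arg! d (- (read-arg d)))]))

(hash-set! regs 'rbp 1000) ; arbitrary base pointer
(for-each step!
          '((movq (int 10) (deref rbp -8))
            (negq (deref rbp -8))
            (movq (int 52) (reg rax))
            (addq (deref rbp -8) (reg rax))))

(hash-ref regs 'rax)
\end{lstlisting}
After the four steps, the simulated \key{rax} holds $42$ and the
memory cell at offset $-8$ from the base pointer holds $-10$.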
|
|
|
|
|
|
The three instructions under the label \code{conclusion} are the
|
|
|
-typical finale of a procedure. The first two are necessary to get the
|
|
|
-state of the machine back to where it was at the beginning of the
|
|
|
-procedure. The \key{addq \$16, \%rsp} instruction moves the stack
|
|
|
-pointer back to point at the old base pointer. The amount added here
|
|
|
-needs to match the amount that was subtracted in the prelude of the
|
|
|
-procedure. Then \key{popq \%rbp} returns the old base pointer to
|
|
|
-\key{rbp} and adds $8$ to the stack pointer. The last instruction,
|
|
|
-\key{retq}, jumps back to the procedure that called this one and adds
|
|
|
-8 to the stack pointer, which returns the stack pointer to where it
|
|
|
-was prior to the procedure call.
|
|
|
+typical \emph{finale} of a procedure. The first two instructions are
|
|
|
+necessary to get the state of the machine back to where it was at the
|
|
|
+beginning of the procedure. The instruction \key{addq \$16, \%rsp}
|
|
|
+moves the stack pointer back to point at the old base pointer. The
|
|
|
+amount added here needs to match the amount that was subtracted in the
|
|
|
+prelude of the procedure. Then \key{popq \%rbp} returns the old base
|
|
|
+pointer to \key{rbp} and adds $8$ to the stack pointer. The last
|
|
|
+instruction, \key{retq}, jumps back to the procedure that called this
|
|
|
+one and adds $8$ to the stack pointer, which returns the stack pointer
|
|
|
+to where it was prior to the procedure call.
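The bookkeeping on the stack pointer can be traced step by step. The Python sketch below models only the value of \key{rsp}; it assumes that the prelude consists of the usual \key{pushq \%rbp}, \key{movq \%rsp, \%rbp}, and \key{subq \$16, \%rsp}, and the function name and starting address are hypothetical.

```python
def trace_stack_pointer(rsp_before_call: int, frame_bytes: int = 16) -> list:
    """Return (instruction, rsp after it) pairs for one procedure call."""
    steps = [
        ("callq", -8),                                 # pushes the return address
        ("pushq %rbp", -8),                            # prelude: save old base pointer
        ("movq %rsp, %rbp", 0),                        # prelude: rsp is unchanged
        (f"subq ${frame_bytes}, %rsp", -frame_bytes),  # prelude: room for variables
        (f"addq ${frame_bytes}, %rsp", +frame_bytes),  # undo the subq
        ("popq %rbp", +8),                             # restore old base pointer
        ("retq", +8),                                  # pop the return address
    ]
    rsp = rsp_before_call
    trace = []
    for instr, delta in steps:
        rsp += delta
        trace.append((instr, rsp))
    return trace


for instr, rsp in trace_stack_pointer(1000):
    print(f"{instr:20} rsp = {rsp}")
```

After the \key{addq}, the stack pointer has the same value it had right after \key{pushq \%rbp}, and after \key{retq} it has the value it had before the call.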
|
|
|
|
|
|
The compiler will need a convenient representation for manipulating
|
|
|
x86 programs, so we define an abstract syntax for x86 in
|
|
@@ -1327,7 +1331,7 @@ a subscript $0$ because later we introduce extended versions of this
|
|
|
assembly language. The main difference compared to the concrete syntax
|
|
|
of x86 (Figure~\ref{fig:x86-a}) is that it does not allow labeled
|
|
|
instructions to appear anywhere, but instead organizes instructions
|
|
|
-into groups called \emph{blocks} and a label is associated with every
|
|
|
+into groups called \emph{blocks} and associates a label with every
|
|
|
block, which is why the \key{program} form includes an association
|
|
|
list mapping labels to blocks. The reason for this organization
|
|
|
becomes apparent in Chapter~\ref{ch:bool-types}.
|
|
@@ -1363,8 +1367,8 @@ x86_0 &::= & (\key{program} \;\itm{info} \; ((\itm{label} \,\key{.}\, \Block)^{+
|
|
|
|
|
|
To compile one language to another it helps to focus on the
|
|
|
differences between the two languages because the compiler will need
|
|
|
-to bridge them. What are the differences between $R_1$ and x86
|
|
|
-assembly? Here we list some of the most important ones.
|
|
|
+to bridge those differences. What are the differences between $R_1$
|
|
|
+and x86 assembly? Here we list some of the most important ones.
|
|
|
|
|
|
\begin{enumerate}
|
|
|
\item[(a)] x86 arithmetic instructions typically have two arguments
|
|
@@ -1375,14 +1379,13 @@ assembly? Here we list some of the most important ones.
|
|
|
arguments.
|
|
|
|
|
|
\item[(b)] An argument to an $R_1$ operator can be any expression,
|
|
|
- whereas x86 instructions restrict their arguments to be \emph{simple
|
|
|
- expressions} like integers, registers, and memory locations. (All
|
|
|
- the other kinds are called \emph{complex expressions}.)
|
|
|
+  whereas x86 instructions restrict their arguments to be integer
|
|
|
+ constants, registers, and memory locations.
|
|
|
|
|
|
\item[(c)] The order of execution in x86 is explicit in the syntax: a
|
|
|
sequence of instructions and jumps to labeled positions, whereas in
|
|
|
- $R_1$ it is a left-to-right depth-first traversal of the abstract
|
|
|
- syntax tree.
|
|
|
+ $R_1$ the order of evaluation is a left-to-right depth-first
|
|
|
+ traversal of the abstract syntax tree.
|
|
|
|
|
|
\item[(d)] An $R_1$ program can have any number of variables whereas
|
|
|
  x86 has 16 registers and the procedure call stack.
|
|
@@ -1397,15 +1400,15 @@ the problem into several steps, dealing with the above differences one
|
|
|
at a time. Each of these steps is called a \emph{pass} of the
|
|
|
compiler, because each step traverses (passes over) the AST of the program.
|
|
|
%
|
|
|
-We begin by giving a sketch about how we might implement each pass,
|
|
|
-and give them names. We shall then figure out an ordering of the
|
|
|
-passes and the input/output language for each pass. The very first
|
|
|
-pass has $R_1$ as its input language and the last pass has x86 as its
|
|
|
-output language. In between we can choose whichever language is most
|
|
|
+We begin by sketching how we might implement each pass and giving each
|
|
|
+pass a name. We then figure out an ordering of the passes and the
|
|
|
+input/output language for each pass. The very first pass has $R_1$ as
|
|
|
+its input language and the last pass has x86 as its output
|
|
|
+language. In between we can choose whichever language is most
|
|
|
convenient for expressing the output of each pass, whether that be
|
|
|
$R_1$, x86, or new \emph{intermediate languages} of our own design.
|
|
|
-Finally, to implement the compiler, we shall write one function,
|
|
|
-typically a structural recursive function, per pass.
|
|
|
+Finally, to implement each pass we write one recursive function per
|
|
|
+non-terminal in the grammar of the input language of the pass.
|
|
|
|
|
|
\begin{description}
|
|
|
\item[Pass \key{select-instructions}] To handle the difference between
|
|
@@ -1415,32 +1418,30 @@ typically a structural recursive function, per pass.
|
|
|
|
|
|
\item[Pass \key{remove-complex-opera*}] To ensure that each
|
|
|
subexpression (i.e. operator and operand, and hence \key{opera*}) is
|
|
|
- a simple expression, we shall introduce temporary variables to hold
|
|
|
- the results of subexpressions.
|
|
|
+ a \emph{simple expression} (a variable or integer), we shall
|
|
|
+ introduce temporary variables to hold the results of subexpressions.
|
|
|
|
|
|
\item[Pass \key{explicate-control}] To make the execution order of the
|
|
|
program explicit, we shall convert from the abstract syntax tree
|
|
|
representation into a graph representation in which each node
|
|
|
- contains a sequence of actions and the edges say where to go after
|
|
|
- the sequence is complete.
|
|
|
+ contains a sequence of statements and the edges say where to go
|
|
|
+ after the sequence is complete.
|
|
|
|
|
|
\item[Pass \key{assign-homes}] To handle the difference between the
|
|
|
  variables in $R_1$ versus the registers and stack locations in x86,
|
|
|
- we shall come up with an assignment of each variable to its
|
|
|
- \emph{home}, that is, to a register or stack location.
|
|
|
+  we assign each variable to a register or stack location.
|
|
|
|
|
|
\item[Pass \key{uniquify}] This pass deals with the shadowing of variables
|
|
|
by renaming every variable to a unique name, so that shadowing no
|
|
|
longer occurs.
|
|
|
-
|
|
|
\end{description}
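As a concrete illustration of one of these passes, the following Python sketch shows the idea behind \key{remove-complex-opera*}. It is a simplified stand-in, not the Racket implementation developed in this book: expressions are modeled as nested tuples such as \code{("+", 52, ("-", 10))}, only operator applications are handled (an input that already contains \key{let} is out of scope), and the temporary names \code{tmp.1}, \code{tmp.2}, and so on are hypothetical.

```python
import itertools


def remove_complex_operands(expr, fresh=None):
    """Return an equivalent expression whose operator arguments are all atoms.

    An atom is an int or a variable name (str). Every non-atomic argument
    is bound to a fresh temporary with ("let", name, rhs, body).
    """
    if fresh is None:
        fresh = (f"tmp.{i}" for i in itertools.count(1))
    if isinstance(expr, (int, str)):
        return expr
    op, *args = expr
    bindings = []
    atoms = []
    for arg in args:
        arg = remove_complex_operands(arg, fresh)
        if isinstance(arg, (int, str)):
            atoms.append(arg)
        else:
            name = next(fresh)
            bindings.append((name, arg))
            atoms.append(name)
    result = (op, *atoms)
    # Wrap innermost-first so the leftmost argument is evaluated first.
    for name, rhs in reversed(bindings):
        result = ("let", name, rhs, result)
    return result


print(remove_complex_operands(("+", 52, ("-", 10))))
# ('let', 'tmp.1', ('-', 10), ('+', 52, 'tmp.1'))
```

Because the sketch invents new variable names, it only behaves well if those names cannot clash with variables already in the program, which is one reason the ordering relative to \key{uniquify} matters.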
|
|
|
|
|
|
The next question is: in what order should we apply these passes? This
|
|
|
-question can be a challenging one to answer because it is difficult to
|
|
|
-know ahead of time which orders will be better (easier to implement,
|
|
|
-produce more efficient code, etc.) so often some trial-and-error is
|
|
|
+question can be challenging because it is difficult to know ahead of
|
|
|
+time which orderings will be better (easier to implement, produce more
|
|
|
+efficient code, etc.), so some trial-and-error is often
|
|
|
involved. Nevertheless, we can try to plan ahead and make educated
|
|
|
-choices regarding the orderings.
|
|
|
+choices regarding the ordering.
|
|
|
|
|
|
Let us consider the ordering of \key{uniquify} and
|
|
|
\key{remove-complex-opera*}. The assignment of subexpressions to
|
|
@@ -1466,10 +1467,14 @@ With respect to \key{remove-complex-opera*}, it perhaps does not
|
|
|
matter very much, but it works well to place \key{explicate-control}
|
|
|
after removing complex subexpressions.
|
|
|
|
|
|
-The \key{assign-homes} pass should come after
|
|
|
-\key{remove-complex-opera*} and \key{explicate-control}. The
|
|
|
-\key{remove-complex-opera*} pass generates temporary variables, which
|
|
|
-also need to be assigned homes. The \key{explicate-control} pass
|
|
|
+Next consider the ordering of the \key{assign-homes} pass with respect
|
|
|
+to \key{remove-complex-opera*} and \key{explicate-control}.
|
|
|
+%
|
|
|
+Removing complex subexpressions requires generating temporary variables,
|
|
|
+which then need to be assigned homes, so \key{assign-homes} should
|
|
|
+come after \key{remove-complex-opera*}.
|
|
|
+%
|
|
|
+The \key{explicate-control} pass
|
|
|
deletes branches that will never be executed, which can remove
|
|
|
variables. Thus it is good to place \key{explicate-control} prior to
|
|
|
\key{assign-homes} so that there are fewer variables that need to be
|
|
@@ -1492,7 +1497,9 @@ approach in which \key{select-instructions} comes first, followed by
|
|
|
\key{assign-homes}, followed by a third pass, named
|
|
|
\key{patch-instructions}, that uses a reserved register (\key{rax}) to
|
|
|
patch-up outstanding problems regarding instructions with too many
|
|
|
-memory accesses.
|
|
|
+memory accesses. The disadvantage of this approach is a reduction in
|
|
|
+runtime efficiency.
|
|
|
+
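The patching step can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions, not the book's implementation: instructions are modeled as tuples such as \code{("addq", src, dst)}, arguments as \code{("imm", n)}, \code{("reg", name)}, or \code{("deref", reg, offset)}, and the only problem handled is a two-argument instruction in which both arguments are memory references.

```python
def is_memory(arg) -> bool:
    """True if the argument is a memory reference such as -8(%rbp)."""
    return arg[0] == "deref"


def patch_instructions(instrs: list) -> list:
    """Split instructions with two memory arguments by going through rax."""
    patched = []
    for instr in instrs:
        if len(instr) == 3 and is_memory(instr[1]) and is_memory(instr[2]):
            op, src, dst = instr
            # Load the source into the reserved register first ...
            patched.append(("movq", src, ("reg", "rax")))
            # ... then perform the original operation from the register.
            patched.append((op, ("reg", "rax"), dst))
        else:
            patched.append(instr)
    return patched


var1 = ("deref", "rbp", -8)
var2 = ("deref", "rbp", -16)
for instr in patch_instructions([("addq", var1, var2)]):
    print(instr)
```

The extra \key{movq} through \key{rax} is the source of the runtime cost: each patched instruction becomes two.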
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\begin{tikzpicture}[baseline=(current bounding box.center)]
|