|
@@ -371,7 +371,7 @@ those programs that the rules allow.
|
|
|
It is common to have many rules with the same left-hand side, so there
|
|
|
is a vertical bar notation for gathering several rules, as shown in
|
|
|
Figure~\ref{fig:r0-syntax}. Each clause between a vertical bar is
|
|
|
-called an ``alternative''.
|
|
|
+called an \emph{alternative}.
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\fbox{
|
|
@@ -508,7 +508,7 @@ general, when a recursive function is defined using a sequence of
|
|
|
match clauses that correspond to a grammar, and each clause body makes
|
|
|
a recursive call on each child node, then we say the function is
|
|
|
defined by structural recursion.
|
|
|
-
|
|
|
+\marginpar{Should this be \(R_{0}\) and not {\tt arith}?}
|
|
|
\begin{center}
|
|
|
\begin{minipage}{0.7\textwidth}
|
|
|
\begin{lstlisting}
|
|
@@ -699,7 +699,7 @@ functions is the output of partially evaluating the children nodes.
|
|
|
|
|
|
Our code for \texttt{pe-neg} and \texttt{pe-add} implements the simple
|
|
|
idea of checking whether the inputs are integers and if they are, to
|
|
|
-go ahead perform the arithmetic. Otherwise, we use quasiquote to
|
|
|
+go ahead and perform the arithmetic. Otherwise, we use quasiquote to
|
|
|
create an AST node for the appropriate operation (either negation or
|
|
|
addition) and use comma to splice in the child nodes.
|
|
|
|
|
@@ -912,14 +912,14 @@ A register is written with a \key{\%} followed by the register name,
|
|
|
such as \key{\%rax}.
|
|
|
%
|
|
|
An access to memory is specified using the syntax $n(\key{\%}r)$,
|
|
|
-which reads register $r$ and then offsets the address by $n$ bytes (8
|
|
|
-bits). The address is then used to either load or store to memory
|
|
|
+which reads register $r$ and then offsets the address by $n$ bytes
|
|
|
+($8\times n$ bits). The address is then used to either load from or store to memory
|
|
|
depending on whether it occurs as a source or destination argument of
|
|
|
an instruction.
|
|
|
|
|
|
An arithmetic instruction, such as $\key{addq}\,s,\,d$, reads from the
|
|
|
source $s$ and destination $d$, applies the arithmetic operation, then
|
|
|
-write the result in $d$.
|
|
|
+writes the result in $d$.
|
|
|
%
|
|
|
The move instruction, $\key{movq}\,s\,d$ reads from $s$ and stores the
|
|
|
result in $d$.
|
|
@@ -966,13 +966,15 @@ _main:
|
|
|
\caption{An x86-64 program equivalent to $\BINOP{+}{10}{32}$.}
|
|
|
\label{fig:p0-x86}
|
|
|
\end{wrapfigure}
|
|
|
+\marginpar{Consider using italics for the texts in these figures.
|
|
|
+ It can get confusing to differentiate them from the main text.}
|
|
|
|
|
|
Figure~\ref{fig:p0-x86} depicts an x86-64 program that is equivalent
|
|
|
to \code{(+ 10 32)}. The \key{globl} directive says that the
|
|
|
\key{\_main} procedure is externally visible, which is necessary so
|
|
|
that the operating system can call it. The label \key{\_main:}
|
|
|
indicates the beginning of the \key{\_main} procedure which is where
|
|
|
-the operating system starting executing this program. The instruction
|
|
|
+the operating system starts executing this program. The instruction
|
|
|
\lstinline{movq $10, %rax} puts $10$ into register \key{rax}. The
|
|
|
following instruction \lstinline{addq $32, %rax} adds $32$ to the
|
|
|
$10$ in \key{rax} and puts the result, $42$, back into
|
|
@@ -1071,6 +1073,8 @@ x86 programs, so we define an abstract syntax for x86 in
|
|
|
Figure~\ref{fig:x86-ast-a}. The \itm{info} field of the \key{program}
|
|
|
AST node is for storing auxiliary information that needs to be
|
|
|
communicated from one step of the compiler to the next.
|
|
|
+\marginpar{Consider mentioning PseudoX86, since I think that's what
|
|
|
+ you actually are referring to.}
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\fbox{
|
|
@@ -1095,6 +1099,7 @@ x86_0 &::= & (\key{program} \;\itm{info} \; \Instr^{+})
|
|
|
\caption{Abstract syntax for x86-64 assembly.}
|
|
|
\label{fig:x86-ast-a}
|
|
|
\end{figure}
|
|
|
+\marginpar{I think this is PseudoX86, not x86-64.}
|
|
|
|
|
|
\section{Planning the trip from $R_1$ to x86-64}
|
|
|
\label{sec:plan-s0-x86}
|
|
@@ -1182,7 +1187,7 @@ $C_0$.
|
|
|
Each of these steps in the compiler is implemented by a function,
|
|
|
typically a structurally recursive function that translates an input
|
|
|
AST into an output AST. We refer to such a function as a \emph{pass}
|
|
|
-because it makes a pass over, i.e. traverses, the entire AST.
|
|
|
+because it makes a pass over (that is, it traverses) the entire AST.
|
|
|
|
|
|
The syntax for $C_0$ is defined in Figure~\ref{fig:c0-syntax}. The
|
|
|
$C_0$ language supports the same operators as $R_1$ but the arguments
|
|
@@ -1191,10 +1196,10 @@ of operators are now restricted to just variables and integers. The
|
|
|
and there is a \key{return} construct to specify the return value of
|
|
|
the program. A program consists of a sequence of statements that
|
|
|
include at least one \key{return} statement. Each program is also
|
|
|
-annotated with a list of variables. At the start of the program, these
|
|
|
-variables are uninitialized (they contain garbage) and each variable
|
|
|
-becomes initialized on its first assignment. All of the variables used
|
|
|
-in the program must be present in this list.
|
|
|
+annotated with a list of variables (viz. {\tt (var*)}). At the start
|
|
|
+of the program, these variables are uninitialized (they contain garbage)
|
|
|
+and each variable becomes initialized on its first assignment. All of
|
|
|
+the variables used in the program must be present in this list.
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
|
\fbox{
|
|
@@ -1388,6 +1393,7 @@ except for a different integer at the end of the name, followed by the
|
|
|
ending \key{.scm}. Use the \key{interp-tests} function
|
|
|
(Appendix~\ref{appendix:utilities}) from \key{utilities.rkt} to test
|
|
|
your \key{uniquify} pass on the example programs.
|
|
|
+\marginpar{Should the tests be {\tt .scm} files or {\tt .rkt} files?}
|
|
|
|
|
|
%% You can use the interpreter \key{interpret-S0} defined in the
|
|
|
%% \key{interp.rkt} file. The entire sequence of tests should be a short
|
|
@@ -1401,8 +1407,8 @@ your \key{uniquify} pass on the example programs.
|
|
|
|
|
|
The \code{flatten} pass will transform $R_1$ programs into $C_0$
|
|
|
programs. In particular, the purpose of the \code{flatten} pass is to
|
|
|
-get rid of nested expressions, such as the \code{(- 10)} in the below
|
|
|
-program. This can be accomplished by introducing a new variable,
|
|
|
+get rid of nested expressions, such as the \code{(- 10)} in the program
|
|
|
+below. This can be accomplished by introducing a new variable,
|
|
|
assigning the nested expression to the new variable, and then using
|
|
|
the new variable in place of the nested expressions, as shown in the
|
|
|
output of \code{flatten} on the right.\\
|
|
@@ -1577,7 +1583,7 @@ language, with the function \code{read\_int} in the file
|
|
|
functionality in this file as the \emph{runtime system}, or simply
|
|
|
\emph{runtime} for short. When compiling your generated x86-64
|
|
|
assembly code, you will need to compile \code{runtime.c} and link it
|
|
|
-in. For for purposes of code generation, all you need to do is
|
|
|
+in. For the purposes of code generation, all you need to do is
|
|
|
translate an assignment of \key{read} to some left-hand side
|
|
|
$\itm{lhs}$ into call to the \code{read\_int} function followed by a
|
|
|
move from \code{rax} into $\itm{lhs}$. (Recall that the return value
|
|
@@ -1697,7 +1703,7 @@ your passes on the example programs.
|
|
|
|
|
|
\section{Print x86-64}
|
|
|
\label{sec:print-x86}
|
|
|
-
|
|
|
+\marginpar{The input isn't quite x86-64, right? It's PseudoX86.}
|
|
|
The last step of the compiler from $R_1$ to x86-64 is to convert the
|
|
|
x86-64 AST (defined in Figure~\ref{fig:x86-ast-a}) to the string
|
|
|
representation (defined in Figure~\ref{fig:x86-a}). The Racket
|