@@ -1150,7 +1150,7 @@ code~\citep{Intel:2015aa}. The subset, named \LangVar{}, includes
 integer arithmetic and local variable binding. We often refer to
 x86-64 simply as x86. The chapter begins with a description of the
 \LangVar{} language (Section~\ref{sec:s0}) followed by an introduction
-to of x86 assembly (Section~\ref{sec:x86}). The x86 assembly language
+to x86 assembly (Section~\ref{sec:x86}). The x86 assembly language
 is large so we discuss only the instructions needed for compiling
 \LangVar{}. We introduce more x86 instructions in later chapters.
 After introducing \LangVar{} and x86, we reflect on their differences
@@ -1498,14 +1498,14 @@ $\ldots$ are used to indicate a sequence of items, e.g., $\Instr
 \ldots$ is a sequence of instructions.\index{instruction}
 %
 An x86 program is stored in the computer's memory. For our purposes,
-the computer's memory is as a mapping of 64-bit addresses to 64-bit
+the computer's memory is a mapping of 64-bit addresses to 64-bit
 values. The computer has a \emph{program counter} (PC)\index{program
 counter}\index{PC} stored in the \code{rip} register that points to
 the address of the next instruction to be executed. For most
 instructions, the program counter is incremented after the instruction
 is executed, so it points to the next instruction in memory. Most x86
 instructions take two operands, where each operand is either an
-integer constant (called \emph{immediate value}\index{immediate
+integer constant (called an \emph{immediate value}\index{immediate
 value}), a \emph{register}\index{register}, or a memory location.

 \newcommand{\allregisters}{\key{rsp} \mid \key{rbp} \mid \key{rax} \mid \key{rbx} \mid \key{rcx}
@@ -1598,7 +1598,7 @@ main:
 \label{fig:p0-x86}
 \end{figure}

-The x86 assembly language varies in a couple ways depending on what
+The x86 assembly language varies in a couple of ways depending on what
 operating system it is assembled in. The code examples shown here are
 correct on Linux and most Unix-like platforms, but when assembled on
 Mac OS X, labels like \key{main} must be prefixed with an underscore,
@@ -1714,7 +1714,7 @@ programs, so we define an abstract syntax for x86 in
 Figure~\ref{fig:x86-int-ast}. We refer to this language as
 \LangXInt{}. The main difference compared to the concrete syntax of
 \LangXInt{} (Figure~\ref{fig:x86-int-concrete}) is that labels are not
-allowed in front of every instructions. Instead instructions are
+allowed in front of every instruction. Instead instructions are
 grouped into \emph{blocks}\index{block}\index{basic block} with a
 label associated with every block, which is why the \key{X86Program}
 struct includes an alist mapping labels to blocks. The reason for this
@@ -1771,7 +1771,7 @@ and x86 assembly? Here are some of the most important ones:

 \item[(b)] An argument of an \LangVar{} operator can be a deeply-nested
 expression, whereas x86 instructions restrict their arguments to be
- integers constants, registers, and memory locations.
+ integer constants, registers, and memory locations.

 \item[(c)] The order of execution in x86 is explicit in the syntax: a
 sequence of instructions and jumps to labeled positions, whereas in
@@ -1781,7 +1781,7 @@ and x86 assembly? Here are some of the most important ones:
 \item[(d)] A program in \LangVar{} can have any number of variables
 whereas x86 has 16 registers and the procedure calls stack.

-\item[(e)] Variables in \LangVar{} can overshadow other variables with the
+\item[(e)] Variables in \LangVar{} can shadow other variables with the
 same name. In x86, registers have unique names and memory locations
 have unique addresses.
 \end{enumerate}
@@ -2083,10 +2083,10 @@ in the support code.

 \begin{exercise}
 \normalfont % I don't like the italics for exercises. -Jeremy
-
+\label{ex:Rvar}
 Create five \LangVar{} programs that exercise the most interesting
 parts of the \key{uniquify} pass, that is, the programs should include
-\key{let} forms, variables, and variables that overshadow each other.
+\key{let} forms, variables, and variables that shadow each other.
 The five programs should be placed in the subdirectory named
 \key{tests} and the file names should start with \code{var\_test\_}
 followed by a unique integer and end with the file extension
@@ -2175,7 +2175,7 @@ functions, \code{rco-atom} and \code{rco-exp}. The idea is to apply
 apply \code{rco-exp} to subexpressions that do not. Both functions
 take an \LangVar{} expression as input. The \code{rco-exp} function
 returns an expression. The \code{rco-atom} function returns two
-things: an atomic expression and alist mapping temporary variables to
+things: an atomic expression and an alist mapping temporary variables to
 complex subexpressions. You can return multiple things from a function
 using Racket's \key{values} form and you can receive multiple things
 from a function call using the \key{define-values} form. If you are
@@ -2250,8 +2250,8 @@ Implement the \code{remove-complex-opera*} function in
 \code{compiler.rkt}.
 %
 Create three new \LangInt{} programs that exercise the interesting
-code in the \code{remove-complex-opera*} pass (Following the same file
-name guidelines as before.).
+code in the \code{remove-complex-opera*} pass. Follow the guidelines
+regarding file names described in Exercise~\ref{ex:Rvar}.
 %
 In the \code{run-tests.rkt} script, add the following entry to the
 list of \code{passes} and then run the script to test your compiler.
@@ -2260,7 +2260,7 @@ list of \code{passes} and then run the script to test your compiler.
 \end{lstlisting}
 While debugging your compiler, it is often useful to see the
 intermediate programs that are output from each pass. To print the
-intermeidate programs, place the following before the call to
+intermediate programs, place the following before the call to
 \code{interp-tests} in \code{run-tests.rkt}.
 \begin{lstlisting}
 (debug-level 1)
@@ -2401,7 +2401,7 @@ recommend implementing the \code{select-instructions} with
 three auxiliary functions, one for each of the non-terminals of
 \LangCVar{}: $\Atm$, $\Stmt$, and $\Tail$.

-The cases for $\Atm$ are straightforward, variables stay
+The cases for $\Atm$ are straightforward; variables stay
 the same and integer constants are changed to immediates:
 $\INT{n}$ changes to $\IMM{n}$.