3 years ago · 03f527d719
--- a/book.tex
+++ b/book.tex
@@ -3783,8 +3783,8 @@ all, fast code is useless if it produces incorrect results!
 
				 \index{subject}{register allocation}
			
 
				 
			
 
				 In Chapter~\ref{ch:Lvar} we learned how to store variables on the
			
 
				-stack. In this Chapter we learn how to improve the performance of the
			
 
				-generated code by placing some variables into registers.  The CPU can
			
 
				+stack. In this chapter we learn how to improve the performance of the
			
 
				+generated code by assigning some variables to registers.  The CPU can
			
 
				 access a register in a single cycle, whereas accessing the stack can
			
 
				 take 10s to 100s of cycles.  The program in Figure~\ref{fig:reg-eg}
			
 
				 serves as a running example. The source program is on the left and the
			
@@ -3882,10 +3882,9 @@ then model register allocation as a graph coloring problem
 
				 
			
 
				 If we run out of registers despite these efforts, we place the
			
 
				 remaining variables on the stack, similar to what we did in
			
 
				-Chapter~\ref{ch:Lvar}. It is common to use the verb \emph{spill}
			
 
				-for assigning a variable to a stack location. The decision to spill a
			
 
				-variable is handled as part of the graph coloring process
			
 
				-(Section~\ref{sec:graph-coloring}).
			
 
				+Chapter~\ref{ch:Lvar}. It is common to use the verb \emph{spill} for
			
 
				+assigning a variable to a stack location. The decision to spill a
			
 
				+variable is handled as part of the graph coloring process.
			
 
				 
			
 
				 We make the simplifying assumption that each variable is assigned to
			
 
				 one location (a register or stack address). A more sophisticated
			
@@ -3895,7 +3894,8 @@ in short sequence and then only used again after many other
 
				 instructions, it could be more efficient to assign the variable to a
			
 
				 register during the initial sequence and then move it to the stack for
			
 
				 the rest of its lifetime. We refer the interested reader to
			
 
				-\citet{Cooper:2011aa} for more information about that approach.
			
 
				+\citet{Cooper:2011aa} Chapter 13 for more information about that
			
 
				+approach.
			
 
				 
			
 
				 % discuss prioritizing variables based on how much they are used.
			
 
				 
			
@@ -3959,8 +3959,13 @@ rdi rsi rdx rcx r8 r9
 
				 If there are more than six arguments, then the convention is to use
			
 
				 space on the frame of the caller for the rest of the
			
 
				 arguments. However, in Chapter~\ref{ch:Rfun} we arrange never to
			
 
				-need more than six arguments. For now, the only function we care about
			
 
				-is \code{read\_int} and it takes zero arguments.
			
 
				+need more than six arguments.
			
 
				+%
			
 
				+\racket{For now, the only function we care about is \code{read\_int}
			
 
				+  and it takes zero arguments.}
			
 
				+%
			
 
				+\python{For now, the only functions we care about are \code{read\_int}
			
 
				+  and \code{print\_int}, which take zero and one argument, respectively.}
			
 
				 %
			
 
				 The register \code{rax} is used for the return value of a function.
			
 
				 
			
@@ -3970,20 +3975,19 @@ Figure~\ref{fig:example-calling-conventions}.  We first analyze this
 
				 example from the caller point of view and then from the callee point
			
 
				 of view.
			
 
				 
			
 
				-The program makes two calls to the \code{read} function.  Also, the
			
 
				-variable \code{x} is in use during the second call to \code{read}, so
			
 
				-we need to make sure that the value in \code{x} does not get
			
 
				-accidentally wiped out by the call to \code{read}.  One obvious
			
 
				-approach is to save all the values in caller-saved registers to the
			
 
				-stack prior to each function call, and restore them after each
			
 
				-call. That way, if the register allocator chooses to assign \code{x}
			
 
				-to a caller-saved register, its value will be preserved across the
			
 
				-call to \code{read}.  However, saving and restoring to the stack is
			
 
				-relatively slow. If \code{x} is not used many times, it may be better
			
 
				-to assign \code{x} to a stack location in the first place. Or better
			
 
				-yet, if we can arrange for \code{x} to be placed in a callee-saved
			
 
				-register, then it won't need to be saved and restored during function
			
 
				-calls.
			
 
				+The program makes two calls to \READOP{}.  Also, the variable \code{x}
			
 
				+is in use during the second call to \READOP{}, so we need to make sure
			
 
				+that the value in \code{x} does not get accidentally wiped out by the
			
 
				+call to \READOP{}.  One obvious approach is to save all the values in
			
 
				+caller-saved registers to the stack prior to each function call, and
			
 
				+restore them after each call. That way, if the register allocator
			
 
				+chooses to assign \code{x} to a caller-saved register, its value will
			
 
				+be preserved across the call to \READOP{}.  However, saving and
			
 
				+restoring to the stack is relatively slow. If \code{x} is not used
			
 
				+many times, it may be better to assign \code{x} to a stack location in
			
 
				+the first place. Or better yet, if we can arrange for \code{x} to be
			
 
				+placed in a callee-saved register, then it won't need to be saved and
			
 
				+restored during function calls.
			
 
				 
			
 
				 The approach that we recommend for variables that are in use during a
			
 
				 function call is to either assign them to callee-saved registers or to
			
@@ -3996,21 +4000,21 @@ callee-saved register, and 3) spill the variable to the stack.
 
				 It is straightforward to implement this approach in a graph coloring
			
 
				 register allocator. First, we know which variables are in use during
			
 
				 every function call because we compute that information for every
			
 
				-instruction (Section~\ref{sec:liveness-analysis-Lvar}). Second, when we
			
 
				-build the interference graph (Section~\ref{sec:build-interference}),
			
 
				-we can place an edge between each of these variables and the
			
 
				-caller-saved registers in the interference graph. This will prevent
			
 
				-the graph coloring algorithm from assigning those variables to
			
 
				-caller-saved registers.
			
 
				+instruction (Section~\ref{sec:liveness-analysis-Lvar}). Second, when
			
 
				+we build the interference graph
			
 
				+(Section~\ref{sec:build-interference}), we can place an edge between
			
 
				+each of these call-live variables and the caller-saved registers in
			
 
				+the interference graph. This will prevent the graph coloring algorithm
			
 
				+from assigning them to caller-saved registers.
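The second step can be sketched in a few lines of Python. This is only an illustration: the name \code{add\_call\_edges}, the constant \code{CALLER\_SAVED}, and the dictionary-of-sets graph representation are assumptions made for this sketch and are not part of the support code.

```python
# Caller-saved registers of the x86-64 System V calling convention.
CALLER_SAVED = ["rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"]

def add_call_edges(graph, live_after_call):
    """Add an edge between each call-live variable and each caller-saved register.

    graph maps every vertex (variable or register name) to the set of
    its neighbors. This representation is a simplifying assumption.
    """
    for v in live_after_call:
        for r in CALLER_SAVED:
            if v != r:  # never create a self edge
                graph.setdefault(v, set()).add(r)
                graph.setdefault(r, set()).add(v)
    return graph

# The variable x is live across a call, so it must avoid caller-saved registers.
graph = add_call_edges({}, {"x"})
```

After this step \code{x} interferes with every caller-saved register, so the coloring can only give it a callee-saved register such as \code{rbx} or a stack location.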
			
 
				 
			
 
				 Returning to the example in
			
 
				 Figure~\ref{fig:example-calling-conventions}, let us analyze the
			
 
				-generated x86 code on the right-hand side, focusing on the
			
 
				-\code{start} block. Notice that variable \code{x} is assigned to
			
 
				-\code{rbx}, a callee-saved register. Thus, it is already in a safe
			
 
				-place during the second call to \code{read\_int}. Next, notice that
			
 
				-variable \code{y} is assigned to \code{rcx}, a caller-saved register,
			
 
				-because there are no function calls in the remainder of the block.
			
 
				+generated x86 code on the right-hand side. Notice that variable
			
 
				+\code{x} is assigned to \code{rbx}, a callee-saved register. Thus, it
			
 
				+is already in a safe place during the second call to
			
 
				+\code{read\_int}. Next, notice that variable \code{y} is assigned to
			
 
				+\code{rcx}, a caller-saved register, because there are no function
			
 
				+calls in the remainder of the block.
			
 
				 
			
 
				 Next we analyze the example from the callee point of view, focusing on
			
 
				 the prelude and conclusion of the \code{main} function. As usual the
			
@@ -4114,8 +4118,9 @@ is, it discovers which variables are in-use in different regions of a
 
				 program.
			
 
				 %
			
 
				 A variable or register is \emph{live} at a program point if its
			
 
				-current value is used at some later point in the program.  We 
			
 
				-refer to variables and registers collectively as \emph{locations}.
			
 
				+current value is used at some later point in the program.  We refer to
			
 
				+variables, stack locations, and registers collectively as
			
 
				+\emph{locations}.
			
 
				 %
			
 
				 Consider the following code fragment in which there are two writes to
			
 
				 \code{b}. Are \code{a} and \code{b} both live at the same time?
			
@@ -4420,13 +4425,13 @@ if they interfere with each other.
 
				   data structures in the file \code{graph.py} of the support code.}
			
 
				 
			
 
				 A straightforward way to compute the interference graph is to look at
			
 
				-the set of live locations between each instruction and the next and
			
 
				-add an edge to the graph for every pair of variables in the same set.
			
 
				-This approach is less than ideal for two reasons. First, it can be
			
 
				-expensive because it takes $O(n^2)$ time to consider at every pair in
			
 
				-a set of $n$ live locations. Second, in the special case where two
			
 
				-locations hold the same value (because one was assigned to the other),
			
 
				-they can be live at the same time without interfering with each other.
			
 
				+the set of live locations between consecutive instructions and add an edge to
			
 
				+the graph for every pair of variables in the same set.  This approach
			
 
				+is less than ideal for two reasons. First, it can be expensive because
			
 
				+it takes $O(n^2)$ time to consider every pair in a set of $n$ live
			
 
				+locations. Second, in the special case where two locations hold the
			
 
				+same value (because one was assigned to the other), they can be live
			
 
				+at the same time without interfering with each other.
			
 
				 
			
 
				 A better way to compute the interference graph is to focus on
			
 
				 writes~\citep{Appel:2003fk}. The writes performed by an instruction
			
@@ -4435,13 +4440,16 @@ instruction, we create an edge between the locations being written to
 
				 and the live locations. (Except that one should not create self
			
 
				 edges.)  Note that for the \key{callq} instruction, we consider all of
			
 
				 the caller-saved registers as being written to, so an edge is added
			
 
				-between every live variable and every caller-saved register. For
			
 
				-\key{movq}, we deal with the above-mentioned special case by not
			
 
				-adding an edge between a live variable $v$ and the destination if $v$
			
 
				-matches the source. So we have the following two rules.
			
 
				+between every live variable and every caller-saved register. Also, for
			
 
				+\key{movq} there is the above-mentioned special case to deal with. If
			
 
				+a live variable $v$ is the same as the source of the \key{movq}, then
			
 
				+there is no need to add an edge between $v$ and the destination,
			
 
				+because they both hold the same value.
			
 
				+%
			
 
				+So we have the following two rules.
			
 
				 
			
 
				 \begin{enumerate}
			
 
				-\item If instruction $I_k$ is a move such as \key{movq} $s$\key{,}
			
 
				+\item If instruction $I_k$ is a move instruction, \key{movq} $s$\key{,}
			
 
				   $d$, then add the edge $(d,v)$ for every $v \in
			
 
				   L_{\mathsf{after}}(k)$ unless $v = d$ or $v = s$.
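The rule for \key{movq} can be read as a set comprehension. The following Python sketch is one possible rendering; the function name \code{move\_edges} and the use of plain strings for locations are assumptions made for illustration.

```python
def move_edges(s, d, live_after):
    """Edges contributed by the instruction `movq s, d`.

    live_after is the set of locations that are live immediately after
    the instruction. The destination d interferes with every live
    location except itself and the source s, which holds the same value.
    """
    return {(d, v) for v in live_after if v != d and v != s}

# For `movq v, x` with v, w, and x live afterwards, only w interferes with x.
edges = move_edges("v", "x", {"v", "w", "x"})
```

Omitting the edge between \code{x} and \code{v} is what allows the two variables to share a register later on.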
			
 
				 
			
@@ -4693,15 +4701,15 @@ definition:
 
				 where $\mathrm{adjacent}(u)$ is the set of vertices that share an
			
 
				 edge with $u$.
			
 
				 
			
 
				-Using the Pencil Marks technique leads to a simple strategy for
			
 
				-filling in numbers: if there is a square with only one possible number
			
 
				-left, then choose that number! But what if there are no squares with
			
 
				-only one possibility left? One brute-force approach is to try them
			
 
				-all: choose the first one and if that ultimately leads to a solution,
			
 
				-great.  If not, backtrack and choose the next possibility.  One good
			
 
				-thing about Pencil Marks is that it reduces the degree of branching in
			
 
				-the search tree. Nevertheless, backtracking can be terribly time
			
 
				-consuming. One way to reduce the amount of backtracking is to use the
			
 
				+The Pencil Marks technique leads to a simple strategy for filling in
			
 
				+numbers: if there is a square with only one possible number left, then
			
 
				+choose that number! But what if there are no squares with only one
			
 
				+possibility left? One brute-force approach is to try them all: choose
			
 
				+the first one and if that ultimately leads to a solution, great.  If
			
 
				+not, backtrack and choose the next possibility.  One good thing about
			
 
				+Pencil Marks is that it reduces the degree of branching in the search
			
 
				+tree. Nevertheless, backtracking can be terribly time consuming. One
			
 
				+way to reduce the amount of backtracking is to use the
			
 
				 most-constrained-first heuristic (a.k.a. minimum remaining
			
 
				 values)~\citep{Russell2003}.  That is, when choosing a square, always
			
 
				 choose one with the fewest possibilities left (the vertex with the
			
@@ -4710,16 +4718,15 @@ squares earlier rather than later is better because later on there may
 
				 not be any possibilities left in the highly saturated squares.
			
 
				 
			
 
				 However, register allocation is easier than Sudoku because the
			
 
				-register allocator can map variables to stack locations when the
			
 
				-registers run out. Thus, it makes sense to replace backtracking with
			
 
				-greedy search: make the best choice at the time and keep going. We
			
 
				-still wish to minimize the number of colors needed, so we use the
			
 
				-most-constrained-first heuristic in the greedy search.
			
 
				+register allocator can fall back to assigning variables to stack
			
 
				+locations when the registers run out. Thus, it makes sense to replace
			
 
				+backtracking with greedy search: make the best choice at the time and
			
 
				+keep going. We still wish to minimize the number of colors needed, so
			
 
				+we use the most-constrained-first heuristic in the greedy search.
			
 
				 Figure~\ref{fig:satur-algo} gives the pseudo-code for a simple greedy
			
 
				 algorithm for register allocation based on saturation and the
			
 
				 most-constrained-first heuristic. It is roughly equivalent to the
			
 
				-DSATUR
			
 
				-algorithm~\citep{Brelaz:1979eu}.
			
 
				+DSATUR graph coloring algorithm~\citep{Brelaz:1979eu}.
			
 
				 %,Gebremedhin:1999fk,Omari:2006uq
			
 
				 Just as in Sudoku, the algorithm represents colors with integers. The
			
 
				 integers $0$ through $k-1$ correspond to the $k$ registers that we use
			
@@ -5030,10 +5037,11 @@ general there can be.)
 
				 \draw (v) to (w);
			
 
				 \end{tikzpicture}
			
 
				 \]
			
 
				-The algorithm says to select a maximally saturated vertex. So we pick
			
 
				-$\ttm{tmp\_0}$ and color it with the first available integer, which is
			
 
				-$0$. We mark $0$ as no longer available for $\ttm{tmp\_1}$ and $\ttm{z}$
			
 
				-because they interfere with $\ttm{tmp\_0}$.
			
 
				+The algorithm says to select a maximally saturated vertex, but they
			
 
				+are all equally saturated. So we flip a coin and pick $\ttm{tmp\_0}$
			
 
				+then color it with the first available integer, which is $0$. We mark
			
 
				+$0$ as no longer available for $\ttm{tmp\_1}$ and $\ttm{z}$ because
			
 
				+they interfere with $\ttm{tmp\_0}$.
			
 
				 \[
			
 
				 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				 \node (t0) at (0,2) {$\ttm{tmp\_0}: 0, \{\}$};
			
@@ -5053,10 +5061,10 @@ because they interfere with $\ttm{tmp\_0}$.
 
				 \draw (v) to (w);
			
 
				 \end{tikzpicture}
			
 
				 \]
			
 
				-We repeat the process, selecting a maximally saturated vertex,
			
 
				-choosing \code{z}, and color it with the first available number, which
			
 
				-is $1$. We add $1$ to the saturation for the neighboring vertices
			
 
				-\code{tmp\_0}, \code{y}, and \code{w}.
			
 
				+We repeat the process. The most saturated vertices are \code{z} and
			
 
				+\code{tmp\_1}, so we choose \code{z} and color it with the first
			
 
				+available number, which is $1$. We add $1$ to the saturation for the
			
 
				+neighboring vertices \code{tmp\_0}, \code{y}, and \code{w}.
			
 
				 \[
			
 
				 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				 \node (t0) at (0,2) {$\ttm{tmp\_0}: 0, \{1\}$};
			
@@ -5118,7 +5126,7 @@ Now \code{y} is the most saturated, so we color it with $2$.
 
				 \draw (v) to (w);
			
 
				 \end{tikzpicture}
			
 
				 \]
			
 
				-Now \code{tmp\_1}, \code{x}, and \code{v} are equally saturated.
			
 
				+The most saturated vertices are \code{tmp\_1}, \code{x}, and \code{v}.
			
 
				 We choose to color \code{v} with $1$.
			
 
				 \[
			
 
				 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
@@ -5335,8 +5343,8 @@ callq print_int
 
				 %
			
 
				 Implement the compiler pass \code{allocate\_registers}.
			
 
				 %
			
 
				-Create five programs that exercise all of the register allocation
			
 
				-algorithm, including spilling variables to the stack.
			
 
				+Create five programs that exercise all aspects of the register
			
 
				+allocation algorithm, including spilling variables to the stack.
			
 
				 %
			
 
				 \racket{Replace \code{assign\_homes} in the list of \code{passes} in the
			
 
				 \code{run-tests.rkt} script with the three new passes:
			
@@ -5365,7 +5373,7 @@ In the running example, the instruction \code{movq -8(\%rbp),
 
				   then move \code{rax} into \code{-16(\%rbp)}.
			
 
				 %
			
 
				 The moves from \code{-8(\%rbp)} to \code{-8(\%rbp)} are also
			
 
				-problematic, but it can simply be deleted. In general, we recommend
			
 
				+problematic, but they can simply be deleted. In general, we recommend
			
 
				 deleting all the trivial moves whose source and destination are the
			
 
				 same location.
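Deleting trivial moves amounts to a filter over the instruction list. The sketch below assumes, purely for illustration, that each instruction is a tuple whose first element is the opcode and whose remaining elements are operand strings; the actual instruction representation in the support code may differ.

```python
def remove_trivial_moves(instrs):
    """Drop every movq whose source and destination are the same location."""
    return [i for i in instrs
            if not (i[0] == "movq" and i[1] == i[2])]

instrs = [
    ("movq", "$1", "%rcx"),
    ("movq", "-8(%rbp)", "-8(%rbp)"),   # trivial: same stack location
    ("movq", "%rcx", "%rcx"),           # trivial: same register
    ("negq", "%rcx"),
]
patched = remove_trivial_moves(instrs)
```

Only the first and last instructions remain in \code{patched}.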
			
 
				 %
			
@@ -5481,11 +5489,11 @@ prelude, make sure to take into account the space used for saving the
 
				 callee-saved registers. Also, don't forget that the frame needs to be
			
 
				 a multiple of 16 bytes!
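One way to compute the amount to subtract from \code{rsp} is sketched below. It assumes that every spilled variable and every saved register occupies 8 bytes and that the callee-saved registers are pushed before the adjustment. The names \code{align} and \code{stack\_adjustment} are hypothetical.

```python
def align(n, alignment=16):
    """Round n up to the nearest multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment

def stack_adjustment(num_spills, num_callee_saved):
    """Bytes to subtract from rsp after pushing the callee-saved registers.

    The pushes already account for 8 * num_callee_saved bytes, so only
    the remainder of the aligned total is subtracted explicitly.
    """
    total = align(8 * num_spills + 8 * num_callee_saved)
    return total - 8 * num_callee_saved

# One spilled variable and one callee-saved register (rbx): subtract 8 bytes.
adjust = stack_adjustment(1, 1)
```

With no callee-saved registers and one spill, the same function yields 16, because the frame is padded up to the next multiple of 16.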
			
 
				 
			
 
				-An overview of all of the passes involved in register allocation is
			
 
				-shown in Figure~\ref{fig:reg-alloc-passes}.
			
 
				+\racket{An overview of all of the passes involved in register
			
 
				+  allocation is shown in Figure~\ref{fig:reg-alloc-passes}.}
			
 
				 
			
 
				-\begin{figure}[tbp]
			
 
				 {\if\edition\racketEd\color{olive}      
			
 
				+\begin{figure}[tbp]
			
 
				 \begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				 \node (Lvar) at (0,2)  {\large \LangVar{}};
			
 
				 \node (Lvar-2) at (3,2)  {\large \LangVar{}};
			
@@ -5510,37 +5518,21 @@ shown in Figure~\ref{fig:reg-alloc-passes}.
 
				 \path[->,bend left=15] (x86-3) edge [above] node {\ttfamily\footnotesize patch\_instr.} (x86-4);
			
 
				 \path[->,bend left=15] (x86-4) edge [right] node {\ttfamily\footnotesize print\_x86} (x86-5);
			
 
				 \end{tikzpicture}
			
 
				-\fi}
			
 
				-{\if\edition\pythonEd
			
 
				-\begin{tikzpicture}[baseline=(current  bounding  box.center)]
			
 
				-\node (Lvar-1) at (0,2)  {\large \LangVar{}};
			
 
				-\node (Lvar-2) at (3,2)  {\large \LangVar{}};
			
 
				-\node (x86-1) at (3,0)  {\large \LangXVar{}};
			
 
				-\node (x86-2) at (6,0)  {\large \LangXVar{}};
			
 
				-\node (x86-3) at (9,0) {\large \LangXInt{}};
			
 
				-\node (x86-4) at (11,0) {\large \LangXInt{}};
			
 
				-
			
 
				-\path[->,bend left=15] (Lvar-1) edge [above] node {\ttfamily\footnotesize remove\_complex.} (Lvar-2);
			
 
				-\path[->,bend right=15] (Lvar-2) edge [left] node {\ttfamily\footnotesize select\_instr.} (x86-1);
			
 
				-\path[->,bend right=15] (x86-1) edge [below] node {\ttfamily\footnotesize allocate\_reg.} (x86-2);
			
 
				-\path[->,bend left=15] (x86-2) edge [above] node {\ttfamily\footnotesize patch\_instr.} (x86-3);
			
 
				-\path[->,bend right=15] (x86-3) edge [below] node {\ttfamily\footnotesize print\_x86} (x86-4);
			
 
				-\end{tikzpicture}
			
 
				-\fi}
			
 
				 \caption{Diagram of the passes for \LangVar{} with register allocation.}
			
 
				 \label{fig:reg-alloc-passes}
			
 
				 \end{figure}
			
 
				+\fi}
			
 
				 
			
 
				 Figure~\ref{fig:running-example-x86} shows the x86 code generated for
			
 
				 the running example (Figure~\ref{fig:reg-eg}). To demonstrate both the
			
 
				-use of registers and the stack, we have limited the register allocator
			
 
				-to use just two registers: \code{rbx} and \code{rcx}.  In the
			
 
				-prelude\index{subject}{prelude} of the \code{main} function, we push
			
 
				-\code{rbx} onto the stack because it is a callee-saved register and it
			
 
				-was assigned to variable by the register allocator.  We subtract
			
 
				-\code{8} from the \code{rsp} at the end of the prelude to reserve
			
 
				-space for the one spilled variable.  After that subtraction, the
			
 
				-\code{rsp} is aligned to 16 bytes.
			
 
				+use of registers and the stack, we limit the register allocator for
			
 
				+this example to use just two registers: \code{rbx} and \code{rcx}.  In
			
 
				+the prelude\index{subject}{prelude} of the \code{main} function, we
			
 
				+push \code{rbx} onto the stack because it is a callee-saved register
			
 
				+and it was assigned to a variable by the register allocator.  We
			
 
				+subtract \code{8} from the \code{rsp} at the end of the prelude to
			
 
				+reserve space for the one spilled variable.  After that subtraction,
			
 
				+the \code{rsp} is aligned to 16 bytes.
			
 
				 
			
 
				 Moving on to the program proper, we see how the registers were
			
 
				 allocated.
			
@@ -5629,7 +5621,9 @@ main:
 
				 	retq
			
 
				 \end{lstlisting}
			
 
				 \fi}
			
 
				-\caption{The x86 output from the running example (Figure~\ref{fig:reg-eg}).}
			
 
				+\caption{The x86 output from the running example
			
 
				+  (Figure~\ref{fig:reg-eg}), limiting allocation to just \code{rbx}
			
 
				+  and \code{rcx}.}
			
 
				 \label{fig:running-example-x86}
			
 
				 \end{figure}
			
 
				 
			
@@ -5648,9 +5642,9 @@ performs register allocation.
 
				 \label{sec:move-biasing}
			
 
				 \index{subject}{move biasing}
			
 
				 
			
 
				-This section describes an enhancement to the register allocator for
			
 
				-students looking for an extra challenge or who have a deeper interest
			
 
				-in register allocation.
			
 
				+This section describes an enhancement to the register allocator,
			
 
				+called move biasing, for students who are looking for an extra
			
 
				+challenge.
			
 
				 
			
 
				 {\if\edition\racketEd\color{olive}      
			
 
				 To motivate the need for move biasing we return to the running example
			
@@ -5708,25 +5702,24 @@ can accomplish this by taking into account which variables appear in
 
				 {\if\edition\pythonEd
			
 
				 %
			
 
				 To motivate the need for move biasing we return to the running example
			
 
				-and recall that Section~\ref{sec:patch-instructions} we were able to
			
 
				+and recall that in Section~\ref{sec:patch-instructions} we were able to
			
 
				 remove three trivial move instructions from the running
			
 
				 example. However, we could remove another trivial move if we were able
			
 
				 to allocate \code{y} and \code{tmp\_0} to the same register.  \fi}
			
 
				 
			
 
				 We say that two variables $p$ and $q$ are \emph{move
			
 
				-  related}\index{subject}{move related} if they participate together in a
			
 
				-\key{movq} instruction, that is, \key{movq} $p$\key{,} $q$ or
			
 
				-\key{movq} $q$\key{,} $p$. When deciding which variable to
			
 
				-color next, when there are multiple variables with the same
			
 
				-saturation, prefer variables that can be assigned the same
			
 
				-color as a move related variable that has already been colored.
			
 
				-Furthermore,  when the register allocator chooses a color
			
 
				-for a variable, it should prefer a color that has already been used
			
 
				-for a move-related variable (assuming that they do not interfere). Of
			
 
				-course, this preference should not override the preference for
			
 
				-registers over stack locations. This preference should be used as a
			
 
				-tie breaker when choosing between registers or when choosing between
			
 
				-stack locations.
			
 
				+related}\index{subject}{move related} if they participate together in
			
 
				+a \key{movq} instruction, that is, \key{movq} $p$\key{,} $q$ or
			
 
				+\key{movq} $q$\key{,} $p$. When deciding which variable to color next,
			
 
				+when there are multiple variables with the same saturation, prefer
			
 
				+variables that can be assigned to a color that is the same as the
			
 
				+color of a move-related variable.  Furthermore, when the register
			
 
				+allocator chooses a color for a variable, it should prefer a color
			
 
				+that has already been used for a move-related variable (assuming that
			
 
				+they do not interfere). Of course, this preference should not override
			
 
				+the preference for registers over stack locations. So this preference
			
 
				+should be used as a tie breaker when choosing between registers or
			
 
				+when choosing between stack locations.
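The color-selection half of this policy might look as follows in Python. This is a sketch under stated assumptions: colors are non-negative integers, colors below \code{num\_registers} stand for registers, and larger colors stand for stack locations. The function name and parameters are hypothetical.

```python
import itertools

def choose_color(forbidden, move_related_colors, num_registers):
    """Pick a color, using move-relatedness only as a tie breaker.

    forbidden: colors of interfering neighbors (the saturation set).
    move_related_colors: colors of already-colored move-related vertices.
    """
    # The default choice is the lowest color that is not forbidden.
    lowest = next(c for c in itertools.count() if c not in forbidden)
    candidates = sorted(c for c in move_related_colors if c not in forbidden)
    for c in candidates:
        # Accept a move-related color only if doing so does not trade
        # a register for a stack location.
        if c < num_registers or lowest >= num_registers:
            return c
    return lowest

# Register 1 is free, but register 2 is shared with a move-related variable.
biased = choose_color({0}, {2}, num_registers=3)
```

Here \code{biased} is 2. If the only move-related color were a stack location while a register was still free, the function would return the free register instead.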
			
 
				 
			
 
				 We recommend representing the move relationships in a graph, similar
			
 
				 to how we represented interference.  The following is the \emph{move
			
@@ -6065,8 +6058,8 @@ jmp conclusion
 
				 \fi}
			
 
				 
			
 
				 {\if\edition\pythonEd
			
 
				-\begin{minipage}{0.25\textwidth}
			
 
				-\begin{lstlisting}
			
 
				+\begin{minipage}{0.20\textwidth}
			
 
				+\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
			
 
				 movq $1, v
			
 
				 movq $42, w
			
 
				 movq v, x
			
@@ -6079,11 +6072,12 @@ negq tmp_0
 
				 movq z, tmp_1
			
 
				 addq tmp_0, tmp_1
			
 
				 movq tmp_1, %rdi
			
 
				-callq _print_int\end{lstlisting}
			
 
				+callq _print_int
			
 
				+\end{lstlisting}
			
 
				 \end{minipage}
			
 
				-$\Rightarrow\qquad$
			
 
				-\begin{minipage}{0.25\textwidth}
			
 
				-\begin{lstlisting}
			
 
				+${\Rightarrow\qquad}$
			
 
				+\begin{minipage}{0.30\textwidth}
			
 
				+\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
			
 
				 movq $1, %rcx
			
 
				 movq $42, -16(%rbp)
			
 
				 movq %rcx, %rcx
			
@@ -6099,9 +6093,9 @@ movq -8(%rbp), %rdi
 
				 callq _print_int
			
 
				 \end{lstlisting}
			
 
				 \end{minipage}
			
 
				-$\Rightarrow\qquad$
			
 
				-\begin{minipage}{0.25\textwidth}
			
 
				-\begin{lstlisting}
			
 
				+${\Rightarrow\qquad}$
			
 
				+\begin{minipage}{0.20\textwidth}
			
 
				+\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]
			
 
				 movq $1, %rcx
			
 
				 movq $42, -16(%rbp)
			
 
				 addq $7, %rcx
			
@@ -6148,11 +6142,11 @@ compilers in the 1950s~\citep{Horwitz:1966aa,Backus:1978aa}.  The use
 
				 of graph coloring began in the late 1970s and early 1980s with the
			
 
				 work of \citet{Chaitin:1981vl} on an optimizing compiler for PL/I. The
			
 
				 algorithm is based on the following observation of
			
 
				-\citet{Kempe:1879aa} from the 1870s.  If a graph $G$ has a vertex $v$
			
 
				-with degree lower than $k$, then $G$ is $k$ colorable if the subgraph
			
 
				-of $G$ with $v$ removed is also $k$ colorable. Suppose that the
			
 
				+\citet{Kempe:1879aa}. If a graph $G$ has a vertex $v$ with degree
			
 
				+lower than $k$, then $G$ is $k$ colorable if the subgraph of $G$ with
			
 
				+$v$ removed is also $k$ colorable. To see why, suppose that the
			
 
				 subgraph is $k$ colorable.  At worst the neighbors of $v$ are assigned
			
 
				-different colors, but since there are less than $k$ of them, there
			
 
				+different colors, but since there are fewer than $k$ neighbors, there
			
 
				 will be one or more colors left over to use for coloring $v$ in $G$.
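The observation translates directly into a recursive procedure. The following Python sketch is our own illustration, not the algorithm of any cited paper; the name \code{kempe\_color} and the dictionary-of-sets graph representation are assumptions.

```python
def kempe_color(graph, k):
    """Try to k-color graph by removing vertices of degree lower than k.

    graph maps each vertex to the set of its neighbors. Returns a dict
    from vertex to color, or None when no low-degree vertex remains.
    """
    if not graph:
        return {}
    v = next((u for u in sorted(graph) if len(graph[u]) < k), None)
    if v is None:
        return None  # the observation does not apply to this graph
    # Color the subgraph with v removed.
    rest = {u: adj - {v} for u, adj in graph.items() if u != v}
    coloring = kempe_color(rest, k)
    if coloring is None:
        return None
    # Fewer than k neighbors, so at least one of the k colors is left over.
    used = {coloring[u] for u in graph[v]}
    coloring[v] = next(c for c in range(k) if c not in used)
    return coloring

triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
three = kempe_color(triangle, 3)
two = kempe_color(triangle, 2)
```

The triangle is colored with three colors, whereas with $k = 2$ every vertex has degree 2 and the procedure gives up by returning \code{None}.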
			
 
				 
			
 
				 The algorithm of \citet{Chaitin:1981vl} removes a vertex $v$ of degree
			
@@ -6205,17 +6199,16 @@ vertices in a particular order and assigns each one the first
 
				 available color. An \emph{offline} greedy algorithm chooses the
			
 
				 ordering up-front, prior to assigning colors. The algorithm of
			
 
				 \citet{Chaitin:1981vl} should be considered offline because the vertex
			
 
				-ordering does not depend on the colors assigned, so the algorithm
			
 
				-could be split into two phases.  Other orderings are possible. For
			
 
				-example, \citet{Chow:1984ys} order variables according to an estimate
			
 
				-of runtime cost.
			
 
				+ordering does not depend on the colors assigned.  Other orderings are
			
 
				+possible. For example, \citet{Chow:1984ys} order variables according
			
 
				+to an estimate of runtime cost.
			
 
				 
			
 
				 An \emph{online} greedy coloring algorithm uses information about the
			
 
				 current assignment of colors to influence the order in which the
			
 
				 remaining vertices are colored. The saturation-based algorithm
			
 
				 described in this chapter is one such algorithm. We choose to use
			
 
				-saturation-based coloring is because it is fun to introduce graph
			
 
				-coloring via Sudoku.
			
 
				+saturation-based coloring because it is fun to introduce graph
			
 
				+coloring via Sudoku!
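To make the online character of saturation-based coloring concrete, here is a minimal Python sketch. It assumes the graph is a dictionary from vertex names to sets of neighbors and breaks ties alphabetically. It omits spilling, pre-colored registers, and move biasing, so it should be read as an illustration rather than as the algorithm of Figure~\ref{fig:satur-algo}.

```python
def color_graph(graph):
    """Greedy coloring that always picks a most-saturated vertex next."""
    color = {}
    uncolored = set(graph)

    def saturation(u):
        # Colors already taken by the neighbors of u.
        return {color[v] for v in graph[u] if v in color}

    while uncolored:
        # The ordering depends on the colors assigned so far (online).
        u = max(sorted(uncolored), key=lambda w: len(saturation(w)))
        used = saturation(u)
        c = 0
        while c in used:  # lowest color not used by a neighbor
            c += 1
        color[u] = c
        uncolored.remove(u)
    return color

g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
coloring = color_graph(g)
```

Because the next vertex is chosen by consulting \code{color}, the vertex ordering cannot be computed up front, which is exactly what distinguishes this algorithm from an offline one.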
			
 
				 
			
 
				 A register allocator may choose to map each variable to just one
			
 
				 location, as in \citet{Chaitin:1981vl}, or it may choose to map a
			
@@ -6251,8 +6244,8 @@ ordering determined by maximum cardinality search.
 
				 
			
 
				 In situations where compile time is of utmost importance, such as in
			
 
				 just-in-time compilers, graph coloring algorithms can be too expensive
			
 
				-and the linear scan of \citet{Poletto:1999uq} may be more appropriate.
			
 
				-
			
 
				+and the linear scan algorithm of \citet{Poletto:1999uq} may be more
			
 
				+appropriate.
			
 
				 
			
 
				 
			
 
				 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%