Jeremy Siek 4 years ago
parent
commit
5da98bfb9d
1 changed file with 51 additions and 56 deletions
+ 51 - 56
book.tex

@@ -2602,17 +2602,14 @@ all, fast code is useless if it produces incorrect results!
 
 \index{register allocation}
 
-In Chapter~\ref{ch:int-exp} we placed all variables on the stack to
-make our life easier. However, we can improve the performance of the
-generated code if we instead place some variables into registers.  The
-CPU can access a register in a single cycle, whereas accessing the
-stack takes many cycles if the relevant data is in cache or many more
-to access main memory if the data is not in cache.
-Figure~\ref{fig:reg-eg} shows a program with four variables that
-serves as a running example. We show the source program and also the
-output of instruction selection. At that point the program is almost
-x86 assembly but not quite; it still contains variables instead of
-stack locations or registers.
+In Chapter~\ref{ch:int-exp} we learned how to store variables on the
+stack. In this chapter we learn how to improve the performance of the
+generated code by placing some variables into registers.  The CPU can
+access a register in a single cycle, whereas accessing the stack can
+take tens to hundreds of cycles.  The program in Figure~\ref{fig:reg-eg}
+serves as a running example. The source program is on the left and the
+output of instruction selection is on the right. The program is almost
+in the x86 assembly language but it still uses variables.
 
 \begin{figure}
 \begin{minipage}{0.45\textwidth}
@@ -2654,30 +2651,30 @@ start:
 \end{figure}
 
 The goal of register allocation is to fit as many variables into
-registers as possible. A program sometimes has more variables than
-registers, so we cannot always map each variable to a different
+registers as possible. Some programs have more variables than
+registers, so we cannot always map each variable to a different
 register. Fortunately, it is common for different variables to be
 needed during different periods of time during program execution, and
 in such cases several variables can be mapped to the same register.
-Consider variables \code{x} and \code{y} in Figure~\ref{fig:reg-eg}.
+Consider variables \code{x} and \code{z} in Figure~\ref{fig:reg-eg}.
 After the variable \code{x} is moved to \code{z} it is no longer
-needed.  Variable \code{y}, on the other hand, is used only after this
-point, so \code{x} and \code{y} could share the same register. The
+needed.  Variable \code{z}, on the other hand, is used only after this
+point, so \code{x} and \code{z} could share the same register. The
 topic of Section~\ref{sec:liveness-analysis-r1} is how to compute
 where a variable is needed.  Once we have that information, we compute
 which variables are needed at the same time, i.e., which ones
 \emph{interfere} with each other, and represent this relation as an
 undirected graph whose vertices are variables and edges indicate when
 two variables interfere (Section~\ref{sec:build-interference}). We
-then model register allocation as a graph coloring problem, which we
-discuss in Section~\ref{sec:graph-coloring}.
+then model register allocation as a graph coloring problem
+(Section~\ref{sec:graph-coloring}).
 
 If we run out of registers despite these efforts, we place the
 remaining variables on the stack, similar to what we did in
 Chapter~\ref{ch:int-exp}. It is common to use the verb \emph{spill}
 for assigning a variable to a stack location. The decision to spill a
-variable is handled as part of the graph coloring process described in
-Section~\ref{sec:graph-coloring}.
+variable is handled as part of the graph coloring process
+(Section~\ref{sec:graph-coloring}).
 
 We make the simplifying assumption that each variable is assigned to
 one location (a register or stack address). A more sophisticated
@@ -2687,8 +2684,7 @@ in short sequence and then only used again after many other
 instructions, it could be more efficient to assign the variable to a
 register during the initial sequence and then move it to the stack for
 the rest of its lifetime. We refer the interested reader to
-\citet{Cooper:1998ly} and \citet{Cooper:2011aa} for more information
-about that approach.
+\citet{Cooper:2011aa} for more information about that approach.
 
 % discuss prioritizing variables based on how much they are used.
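The pipeline outlined above (live sets, then an interference graph, then graph coloring with spilling) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the live sets are made up rather than taken from Figure~\ref{fig:reg-eg}, the interference rule is the simplest possible one (two variables interfere if they are live at the same point), and the names `build_interference`, `color_graph`, and the `spillN` labels are hypothetical, not the book's API.

```python
from itertools import combinations


def build_interference(live_sets):
    """Undirected graph: an edge joins two variables that are live together."""
    graph = {}
    for live in live_sets:
        for v in live:
            graph.setdefault(v, set())
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def color_graph(graph, registers):
    """Greedy coloring. Colors below len(registers) are registers;
    any higher color means the variable is spilled to a stack slot."""
    color = {}
    # Visit high-degree vertices first; break ties by name for determinism.
    for v in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        taken = {color[n] for n in graph[v] if n in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c

    def location(c):
        if c < len(registers):
            return registers[c]
        return f"spill{c - len(registers)}"  # hypothetical label for a stack slot

    return {v: location(c) for v, c in color.items()}


# Hypothetical live sets: a and c are never live at the same time.
live_sets = [{"a"}, {"a", "b"}, {"b", "c"}, {"c"}]
homes = color_graph(build_interference(live_sets), ["rbx", "rcx"])
print(homes)
```

In this made-up example `a` and `c` end up in the same register because they never interfere, which is the sharing opportunity the text describes. With fewer registers than mutually interfering variables, the extra colors become spills.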
 
@@ -2698,18 +2694,19 @@ about that approach.
 
 As we perform register allocation, we need to be aware of the
 \emph{calling conventions} \index{calling conventions} that govern how
-functions calls are performed in x86. Function calls require
-coordination between the caller and the callee, which is often
-assembly code written by different programmers or generated by
-different compilers. Here we follow the System V calling conventions
-that are used by the \code{gcc} compiler on Linux and
+function calls are performed in x86.
+%
+Even though \LangVar{} does not include programmer-defined functions,
+our generated code includes a \code{main} function that is called by
+the operating system and our generated code contains calls to the
+\code{read\_int} function.
+
+Function calls require coordination between two pieces of code that
+may be written by different programmers or generated by different
+compilers. Here we follow the System V calling conventions that are
+used by the GNU C compiler on Linux and
 MacOS~\citep{Bryant:2005aa,Matz:2013aa}.
 %
-Even though \LangVar{} does not include programmer-defined functions, our
-generated code will 1) include a \code{main} function that the
-operating system will call to initiate execution, and 2) make calls to
-the \code{read\_int} function in our runtime system.
-
 The calling conventions include rules about how functions share the
 use of registers. In particular, the caller is responsible for freeing
 up some registers prior to the function call for use by the callee.
@@ -2737,15 +2734,14 @@ view, the caller view and the callee view:
 \item The callee can freely use any of the caller-saved registers.
   However, if the callee wants to use a callee-saved register, the
   callee must arrange to put the original value back in the register
-  prior to returning to the caller, which is usually accomplished by
-  saving the value to the stack in the prelude of the function and
-  restoring the value in the conclusion of the function.
+  prior to returning to the caller. This can be accomplished by saving
+  the value to the stack in the prelude of the function and restoring
+  the value in the conclusion of the function.
 \end{itemize}
 
 In x86, registers are also used for passing arguments to a function
 and for the return value.  In particular, the first six arguments to a
-function are passed in the following six registers, in the order
-given.
+function are passed in the following six registers, in this order.
 \begin{lstlisting}
 rdi rsi rdx rcx r8 r9
 \end{lstlisting}
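As a small illustration of the argument-passing rule, the sketch below maps an argument position to its location. The function name `argument_location` and the `"stack"` placeholder are assumptions made for this example; the register order is the one listed above, and under System V any further integer arguments are passed on the stack.

```python
# Register order for the first six integer arguments, as listed above.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def argument_location(index):
    """Return where the argument at 0-based position `index` is passed."""
    if index < 0:
        raise ValueError("argument index must be non-negative")
    if index < len(ARG_REGISTERS):
        return ARG_REGISTERS[index]
    # Placeholder: the seventh and later arguments go on the stack.
    return "stack"


print([argument_location(i) for i in range(7)])
```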
@@ -2811,14 +2807,13 @@ prelude begins with saving the \code{rbp} register to the stack and
 setting the \code{rbp} to the current stack pointer. We now know why
 it is necessary to save the \code{rbp}: it is a callee-saved register.
 The prelude then pushes \code{rbx} to the stack because 1) \code{rbx}
-is also a callee-saved register and 2) \code{rbx} is assigned to a
-variable (\code{x}). There are several more callee-saved registers
-that are not saved in the prelude because they were not used. The
-prelude subtracts 8 bytes from the \code{rsp} to make it 16-byte
-aligned and then jumps to the \code{start} block. Shifting attention
-to the \code{conclusion}, we see that \code{rbx} is restored from the
-stack with a \code{popq} instruction.
-\index{prelude}\index{conclusion}
+is a callee-saved register and 2) \code{rbx} is assigned to a variable
+(\code{x}). The other callee-saved registers are not saved in the
+prelude because they are not used. The prelude subtracts 8 bytes from
+the \code{rsp} to make it 16-byte aligned and then jumps to the
+\code{start} block. Shifting attention to the \code{conclusion}, we
+see that \code{rbx} is restored from the stack with a \code{popq}
+instruction.  \index{prelude}\index{conclusion}
 
 \begin{figure}[tp]
 \begin{minipage}{0.45\textwidth}
@@ -2861,6 +2856,7 @@ conclusion:
   \label{fig:example-calling-conventions}
 \end{figure}
 
+\clearpage
 
 \section{Liveness Analysis}
 \label{sec:liveness-analysis-r1}
@@ -4081,17 +4077,16 @@ programs to make sure that your move biasing is working properly.
 \margincomment{\footnotesize To do: another neat challenge would be to do
   live range splitting~\citep{Cooper:1998ly}. \\ --Jeremy}
 
-\section{Output of the Running Example}
-\label{sec:reg-alloc-output}
-\index{prelude}\index{conclusion}
+%% \subsection{Output of the Running Example}
+%% \label{sec:reg-alloc-output}
 
 Figure~\ref{fig:running-example-x86} shows the x86 code generated for
 the running example (Figure~\ref{fig:reg-eg}) with register allocation
 and move biasing. To demonstrate both the use of registers and the
 stack, we have limited the register allocator to use just two
-registers: \code{rbx} and \code{rcx}.  In the prelude of the
-\code{main} function, we push \code{rbx} onto the stack because it is
-a callee-saved register and it was assigned to variable by the
+registers: \code{rbx} and \code{rcx}.  In the prelude\index{prelude}
+of the \code{main} function, we push \code{rbx} onto the stack because
+it is a callee-saved register and it was assigned to a variable by the
 register allocator.  We subtract \code{8} from the \code{rsp} at the
 end of the prelude to reserve space for the one spilled variable.
 After that subtraction, the \code{rsp} is aligned to 16 bytes.
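The stack arithmetic described here can be checked with a short sketch. It assumes 8-byte slots, that `rbp` has already been pushed (so the stack pointer is 16-byte aligned at that moment), and that the callee-saved registers other than `rbp` are pushed before space for spills is reserved. The names `align` and `frame_layout` are hypothetical.

```python
def align(n, alignment=16):
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def frame_layout(num_callee_saved, num_spills):
    """Return (bytes to subtract from rsp, rbp-relative offsets of the spills).

    num_callee_saved counts the pushed callee-saved registers, excluding rbp.
    """
    saved_bytes = 8 * num_callee_saved
    spill_bytes = 8 * num_spills
    # Everything below rbp must add up to a multiple of 16 bytes.
    total = align(saved_bytes + spill_bytes)
    adjust = total - saved_bytes
    # Spilled variables sit below the saved callee-saved registers.
    offsets = [-(saved_bytes + 8 * (i + 1)) for i in range(num_spills)]
    return adjust, offsets


# One saved register (rbx) and one spilled variable, as in the running example.
print(frame_layout(1, 1))
```

With one saved register and one spill, the sketch gives a subtraction of 8 bytes and a spill slot at offset -16 from `rbp`, which agrees with the description of the running example.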
@@ -4105,11 +4100,11 @@ stack. The spilled variables must be placed lower on the stack than
 the saved callee-saved registers, so in this case \code{w} is placed at
 \code{-16(\%rbp)}.
 
-In the \code{conclusion}, we undo the work that was done in the
-prelude. We move the stack pointer up by \code{8} bytes (the room for
-spilled variables), then we pop the old values of \code{rbx} and
-\code{rbp} (callee-saved registers), and finish with \code{retq} to
-return control to the operating system.
+In the \code{conclusion}\index{conclusion}, we undo the work that was
+done in the prelude. We move the stack pointer up by \code{8} bytes
+(the room for spilled variables), then we pop the old values of
+\code{rbx} and \code{rbp} (callee-saved registers), and finish with
+\code{retq} to return control to the operating system.
 
   
 \begin{figure}[tbp]