
expanded sec 3.1

Jeremy Siek, 4 years ago
Parent
Commit
8ef46f5fd2
2 files changed, with 1417 insertions and 31 deletions
  1. 1258 1
      all.bib
  2. 159 30
      book.tex

File diff suppressed because it is too large
+ 1258 - 1
all.bib


+ 159 - 30
book.tex

@@ -2535,9 +2535,9 @@ point, so \code{x} and \code{y} could share the same register. The
 topic of Section~\ref{sec:liveness-analysis-r1} is how to compute
 where a variable is needed.  Once we have that information, we compute
 which variables are needed at the same time, i.e., which ones
-\emph{interfere}, and represent this relation as an undirected graph
-whose vertices are variables and edges indicate when two variables
-interfere with each other (Section~\ref{sec:build-interference}). We
+\emph{interfere} with each other, and represent this relation as an
+undirected graph whose vertices are variables and edges indicate when
+two variables interfere (Section~\ref{sec:build-interference}). We
 then model register allocation as a graph coloring problem, which we
 discuss in Section~\ref{sec:graph-coloring}.
 
@@ -2548,17 +2548,32 @@ for assigning a variable to a stack location. The process of spilling
 variables is handled as part of the graph coloring process described
 in \ref{sec:graph-coloring}.
 
+We make the simplifying assumption that each variable is assigned to
+one location (a register or stack address). A more sophisticated
+approach is to assign a variable to one or more locations in different
+regions of the program.  For example, if a variable is used many times
+in short sequence and then only used again after many other
+instructions, it could be more efficient to assign the variable to a
+register during the initial sequence and then move it to the stack for
+the rest of its lifetime. We refer the interested reader to
+\citet{Cooper:1998ly} and \citet{Cooper:2011aa} for more information
+about this approach.
+
+% discuss prioritizing variables based on how much they are used.
+
 \section{Registers and Calling Conventions}
 \label{sec:calling-conventions}
 
 As we perform register allocation, we need to be aware of the
 conventions that govern the way in which registers interact with
-function calls, such as calls to the \code{read\_int} function. The
-convention for x86 is that the caller is responsible for freeing up
-some registers, the \emph{caller-saved registers}, prior to the
-function call, and the callee is responsible for saving and restoring
-some other registers, the \emph{callee-saved registers}, before and
-after using them. The caller-saved registers are
+function calls, such as calls to the \code{read\_int} function in our
+generated code and even the call that the operating system makes to
+execute our \code{main} function.  The convention for x86 is that the
+caller is responsible for freeing up some registers, the
+\emph{caller-saved registers}, prior to the function call, and the
+callee is responsible for preserving the values in some other
+registers, the \emph{callee-saved registers}. The caller-saved
+registers are
 \begin{lstlisting}
   rax rcx rdx rsi rdi r8 r9 r10 r11
 \end{lstlisting}
@@ -2566,16 +2581,129 @@ while the callee-saved registers are
 \begin{lstlisting}
   rsp rbp rbx r12 r13 r14 r15
 \end{lstlisting}
-Another way to think about this caller/callee convention is the
-following. The caller should assume that all the caller-saved registers
-get overwritten with arbitrary values by the callee.  On the other
-hand, the caller can safely assume that all the callee-saved registers
-contain the same values after the call that they did before the call.
-The callee can freely use any of the caller-saved registers.  However,
-if the callee wants to use a callee-saved register, the callee must
-arrange to put the original value back in the register prior to
-returning to the caller, which is usually accomplished by saving and
-restoring the value from the stack.
+
+We can think about this caller/callee convention from two points of
+view, the caller view and the callee view:
+\begin{itemize}
+\item The caller should assume that all the caller-saved registers get
+  overwritten with arbitrary values by the callee.  On the other hand,
+  the caller can safely assume that all the callee-saved registers
+  contain the same values after the call that they did before the
+  call.
+\item The callee can freely use any of the caller-saved registers.
+  However, if the callee wants to use a callee-saved register, the
+  callee must arrange to put the original value back in the register
+  prior to returning to the caller, which is usually accomplished by
+  saving the value to the stack in the prelude of the function and
+  restoring the value in the conclusion of the function.
+\end{itemize}
+
+The next question is how these calling conventions impact register
+allocation. Consider the $R_1$ program in
+Figure~\ref{fig:example-calling-conventions}.  We first analyze this
+example from the caller point of view and then from the callee point
+of view.
+
+The program makes two calls to the \code{read} function.  Also, the
+variable \code{x} is in-use during the second call to \code{read}, so
+we need to make sure that the value in \code{x} does not get
+accidentally wiped out by the call to \code{read}.  One obvious
+approach is to save all the values in caller-saved registers to the
+stack prior to each function call, and restore them after each
+call. That way, if the register allocator chooses to assign \code{x}
+to a caller-saved register, its value will be preserved across the
+call to \code{read}.  However, the disadvantage of this approach is
+that saving and restoring to the stack is relatively slow. If \code{x}
+is not used many times, it may be better to assign \code{x} to a stack
+location in the first place. Or better yet, if we can arrange for
+\code{x} to be placed in a callee-saved register, then it won't need
+to be saved and restored during function calls.
+
+The approach that we recommend is to treat variables differently
+depending on whether they are in-use during a function call.  If a
+variable is in-use during a function call, then we never assign it to
+a caller-saved register: we either assign it to a callee-saved
+register or we spill it to the stack. If a variable is not in-use
+during any function call, then we try the following alternatives in
+order: 1) look for an available caller-saved register (to leave room
+for other variables in the callee-saved registers), 2) look for a
+callee-saved register, and 3) spill the variable to the stack.
+
+It is straightforward to implement this approach in a graph coloring
+register allocator. First, we know which variables are in-use during
+every function call because we compute that information for every
+instruction (Section~\ref{sec:liveness-analysis-r1}). Second, when we
+build the interference graph (Section~\ref{sec:build-interference}),
+we can place an edge between each of these variables and the
+caller-saved registers in the interference graph. This will prevent
+the graph coloring algorithm from assigning those variables to
+caller-saved registers.
+
+Returning to the example in
+Figure~\ref{fig:example-calling-conventions}, let us analyze the
+generated x86 code on the right-hand side, focusing on the
+\code{start} block. Notice that variable \code{x} is assigned to
+\code{rbx}, a callee-saved register. Thus, it is already in a safe
+place during the second call to \code{read\_int}. Next, notice that
+variable \code{y} is assigned to \code{rcx}, a caller-saved register,
+because there are no function calls in the remainder of the block.
+
+Next we analyze the example from the callee point of view, focusing on
+the prelude and conclusion of the \code{main} function. As usual the
+prelude begins with saving the \code{rbp} register to the stack and
+setting the \code{rbp} to the current stack pointer. We now know why
+it is necessary to save the \code{rbp}: it is a callee-saved register.
+The prelude then pushes \code{rbx} to the stack because 1) \code{rbx}
+is also a callee-saved register and 2) a variable (\code{x}) is assigned
+to \code{rbx}. There are several more callee-saved registers that
+are not saved in the prelude because they were not assigned to
+variables. The prelude subtracts 8 bytes from the \code{rsp} to make
+it 16-byte aligned and then jumps to the \code{start} block. Shifting
+attention to the \code{conclusion}, we see that \code{rbx} is restored
+from the stack with a \code{popq} instruction.
+
+\begin{figure}[tp]
+\begin{minipage}{0.45\textwidth}
+Example $R_1$ program:
+%s0_14.rkt
+\begin{lstlisting}
+(let ([x (read)])
+  (let ([y (read)])
+    (+ (+ x y) 42)))
+\end{lstlisting}
+\end{minipage}
+\begin{minipage}{0.45\textwidth}
+Generated x86 assembly:
+\begin{lstlisting}
+start:
+	callq	read_int
+	movq	%rax, %rbx
+	callq	read_int
+	movq	%rax, %rcx
+	addq	%rcx, %rbx
+	movq	%rbx, %rax
+	addq	$42, %rax
+	jmp conclusion
+
+	.globl main
+main:
+	pushq	%rbp
+	movq	%rsp, %rbp
+	pushq	%rbx
+	subq	$8, %rsp
+	jmp start
+conclusion:
+	addq	$8, %rsp
+	popq	%rbx
+	popq	%rbp
+	retq
+\end{lstlisting}
+\end{minipage}
+\caption{Example with function calls.}
+  \label{fig:example-calling-conventions}
+\end{figure}
+
+
 
 
 \section{Liveness Analysis}
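
The interference rule added in this hunk (every variable that is in use during a call gets an edge to every caller-saved register) can be sketched as follows. This is a minimal illustration in Python rather than the book's implementation language. The tuple encoding of instructions, the function name `add_call_interference`, and the hand-written live-after sets are assumptions made for the sketch; only the caller-saved register list is taken from the text.

```python
# Caller-saved registers as listed in the section above.
CALLER_SAVED = ["rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"]


def add_call_interference(graph, instructions, live_after):
    """Add an edge between each variable live after a call and each
    caller-saved register. `graph` maps a vertex to its set of neighbors."""
    for instr, live in zip(instructions, live_after):
        if instr[0] != "callq":
            continue
        for var in live:
            for reg in CALLER_SAVED:
                graph.setdefault(var, set()).add(reg)
                graph.setdefault(reg, set()).add(var)
    return graph


# Hypothetical pseudo-x86 for the start block of the example, before
# registers are assigned (x and y are still variables).
instructions = [
    ("callq", "read_int"),
    ("movq", "rax", "x"),
    ("callq", "read_int"),
    ("movq", "rax", "y"),
    ("addq", "y", "x"),
    ("movq", "x", "rax"),
    ("addq", 42, "rax"),
]
# Live-after sets written by hand for this sketch (variables only).
live_after = [set(), {"x"}, {"x"}, {"x", "y"}, {"x"}, set(), set()]

graph = add_call_interference({}, instructions, live_after)
print(sorted(graph["x"]))  # neighbors of x after the call edges are added
print("y" in graph)        # whether y received any call-related edges
```

Because `x` is live after the second `callq`, it ends up adjacent to all nine caller-saved registers, so a coloring algorithm can only give it a callee-saved register or a stack slot. `y` is never live across a call and stays free to take a caller-saved register such as `rcx`.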
@@ -2685,7 +2813,7 @@ Figure~\ref{fig:live-eg} shows the results of live variables analysis
 for the running example program, with the live-before and live-after
 sets shown between each instruction to make the figure easy to read.
 
-\begin{figure}[tbp]
+\begin{figure}[tp]
 \hspace{20pt}
 \begin{minipage}{0.45\textwidth}
 \begin{lstlisting}
@@ -3278,15 +3406,15 @@ reason for this is that our \code{main} function must adhere to the
 x86 calling conventions that we described in
 Section~\ref{sec:calling-conventions}.  Furthermore, if your register
 allocator assigned variables to other callee-saved registers
-(e.g. rbx, r12, etc.), then those variables must also be saved to the
-stack in the prelude and restored in the conclusion.  The simplest
-approach is to save and restore all of the callee-saved registers. The
-more efficient approach is to keep track of which callee-saved
-registers were used and only save and restore them. Either way, make
-sure to take this use of stack space into account when you are
-calculating the size of the frame and adjusting the \code{rsp} in the
-prelude. Also, don't forget that the size of the frame needs to be a
-multiple of 16 bytes!
+(e.g. \code{rbx}, \code{r12}, etc.), then those variables must also be
+saved to the stack in the prelude and restored in the conclusion.  The
+simplest approach is to save and restore all of the callee-saved
+registers. The more efficient approach is to keep track of which
+callee-saved registers were used and only save and restore
+them. Either way, make sure to take this use of stack space into
+account when you are calculating the size of the frame and adjusting
+the \code{rsp} in the prelude. Also, don't forget that the size of the
+frame needs to be a multiple of 16 bytes!
 
 
 \section{Challenge: Move Biasing}
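
The frame-size bookkeeping described in this hunk can be made concrete with a small calculation. The sketch below is in Python and rests on stated assumptions: `rbp` is pushed first and is not counted among the used callee-saved registers, every push and every spill slot is 8 bytes, and the helper names `align` and `rsp_adjustment` are hypothetical. It computes how much the prelude should subtract from `rsp` after the pushes so that the pushed registers plus spill space total a multiple of 16 bytes.

```python
def align(n, alignment=16):
    """Round n up to the nearest multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def rsp_adjustment(num_spilled, used_callee_saved):
    """Bytes to subtract from rsp after pushing the used callee-saved
    registers (rbp excluded), keeping the frame a multiple of 16 bytes."""
    pushed = 8 * len(used_callee_saved)  # space taken by the pushq instructions
    spilled = 8 * num_spilled            # space needed for spilled variables
    return align(pushed + spilled) - pushed


# The example in this commit: rbx is the only callee-saved register in
# use and no variables are spilled.
print(rsp_adjustment(0, ["rbx"]))
```

For the example above this yields 8, which agrees with the `subq $8, %rsp` in the generated prelude and the matching `addq $8, %rsp` in the conclusion.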
@@ -3566,7 +3694,7 @@ return control to the operating system.
   % s0_28.rkt
   % (use-minimal-set-of-registers! #t)
   % and only rbx rcx
-% tmp 0
+% tmp 0 rbx
 % z 1  rcx
 % y 0  rbx
 % w 2  16(%rbp)
@@ -3591,6 +3719,7 @@ main:
 	pushq	%rbx
 	subq	$8, %rsp
 	jmp start
+        
 conclusion:
 	addq	$8, %rsp
 	popq	%rbx

Some files were not shown because too many files have changed