
expanded sec 3.1

Jeremy Siek 4 years ago
parent
commit
8ef46f5fd2
2 changed files with 1417 additions and 31 deletions
  1. 1258 1
      all.bib
  2. 159 30
      book.tex

File diff suppressed because it is too large
+ 1258 - 1
all.bib


+ 159 - 30
book.tex

@@ -2535,9 +2535,9 @@ point, so \code{x} and \code{y} could share the same register. The
 topic of Section~\ref{sec:liveness-analysis-r1} is how to compute
 where a variable is needed.  Once we have that information, we compute
 which variables are needed at the same time, i.e., which ones
-\emph{interfere}, and represent this relation as an undirected graph
-whose vertices are variables and edges indicate when two variables
-interfere with each other (Section~\ref{sec:build-interference}). We
+\emph{interfere} with each other, and represent this relation as an
+undirected graph whose vertices are variables and edges indicate when
+two variables interfere (Section~\ref{sec:build-interference}). We
 then model register allocation as a graph coloring problem, which we
 discuss in Section~\ref{sec:graph-coloring}.
 
@@ -2548,17 +2548,32 @@ for assigning a variable to a stack location. The process of spilling
 variables is handled as part of the graph coloring process described
 in Section~\ref{sec:graph-coloring}.
 
+We make the simplifying assumption that each variable is assigned to
+one location (a register or stack address). A more sophisticated
+approach is to assign a variable to one or more locations in different
+regions of the program.  For example, if a variable is used many times
+in short sequence and then only used again after many other
+instructions, it could be more efficient to assign the variable to a
+register during the initial sequence and then move it to the stack for
+the rest of its lifetime. We refer the interested reader to
+\citet{Cooper:1998ly} and \citet{Cooper:2011aa} for more information
+about this approach.
+
+% discuss prioritizing variables based on how much they are used.
+
 \section{Registers and Calling Conventions}
 \label{sec:calling-conventions}
 
 As we perform register allocation, we need to be aware of the
 conventions that govern the way in which registers interact with
-function calls, such as calls to the \code{read\_int} function. The
-convention for x86 is that the caller is responsible for freeing up
-some registers, the \emph{caller-saved registers}, prior to the
-function call, and the callee is responsible for saving and restoring
-some other registers, the \emph{callee-saved registers}, before and
-after using them. The caller-saved registers are
+function calls, such as calls to the \code{read\_int} function in our
+generated code and even the call that the operating system makes to
+execute our \code{main} function.  The convention for x86 is that the
+caller is responsible for freeing up some registers, the
+\emph{caller-saved registers}, prior to the function call, and the
+callee is responsible for preserving the values in some other
+registers, the \emph{callee-saved registers}. The caller-saved
+registers are
 \begin{lstlisting}
   rax rcx rdx rsi rdi r8 r9 r10 r11
 \end{lstlisting}
@@ -2566,16 +2581,129 @@ while the callee-saved registers are
 \begin{lstlisting}
   rsp rbp rbx r12 r13 r14 r15
 \end{lstlisting}
-Another way to think about this caller/callee convention is the
-following. The caller should assume that all the caller-saved registers
-get overwritten with arbitrary values by the callee.  On the other
-hand, the caller can safely assume that all the callee-saved registers
-contain the same values after the call that they did before the call.
-The callee can freely use any of the caller-saved registers.  However,
-if the callee wants to use a callee-saved register, the callee must
-arrange to put the original value back in the register prior to
-returning to the caller, which is usually accomplished by saving and
-restoring the value from the stack.
+
+We can think about this caller/callee convention from two points of
+view, the caller view and the callee view:
+\begin{itemize}
+\item The caller should assume that all the caller-saved registers get
+  overwritten with arbitrary values by the callee.  On the other hand,
+  the caller can safely assume that all the callee-saved registers
+  contain the same values after the call that they did before the
+  call.
+\item The callee can freely use any of the caller-saved registers.
+  However, if the callee wants to use a callee-saved register, the
+  callee must arrange to put the original value back in the register
+  prior to returning to the caller, which is usually accomplished by
+  saving the value to the stack in the prelude of the function and
+  restoring the value in the conclusion of the function.
+\end{itemize}
+
+The next question is how these calling conventions impact register
+allocation. Consider the $R_1$ program in
+Figure~\ref{fig:example-calling-conventions}.  We first analyze this
+example from the caller point of view and then from the callee point
+of view.
+
+The program makes two calls to the \code{read} function.  Also, the
+variable \code{x} is in-use during the second call to \code{read}, so
+we need to make sure that the value in \code{x} does not get
+accidentally wiped out by the call to \code{read}.  One obvious
+approach is to save all the values in caller-saved registers to the
+stack prior to each function call, and restore them after each
+call. That way, if the register allocator chooses to assign \code{x}
+to a caller-saved register, its value will be preserved across the
+call to \code{read}.  However, the disadvantage of this approach is
+that saving and restoring to the stack is relatively slow. If \code{x}
+is not used many times, it may be better to assign \code{x} to a stack
+location in the first place. Or better yet, if we can arrange for
+\code{x} to be placed in a callee-saved register, then it won't need
+to be saved and restored during function calls.
+
+The approach that we recommend is to treat variables differently
+depending on whether they are in-use during a function call.  If a
+variable is in-use during a function call, then we never assign it to
+a caller-saved register: we either assign it to a callee-saved
+register or we spill it to the stack. If a variable is not in-use
+during any function call, then we try the following alternatives in
+order: 1) look for an available caller-saved register (to leave room
+for other variables in the callee-saved registers), 2) look for a
+callee-saved register, and 3) spill the variable to the stack.
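
The preference order described above can be sketched in a few lines of Python. This is only an illustration, not the book's allocator: the function name `choose_location` and the `taken` parameter are hypothetical, and the sketch assumes that `rax` is kept free for return values and that `rsp` and `rbp` are never handed out to variables.

```python
# Registers the sketch may hand out (assumption: rax is reserved for
# return values, and rsp/rbp are reserved for the stack and frame pointers).
CALLER_SAVED = ["rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"]
CALLEE_SAVED = ["rbx", "r12", "r13", "r14", "r15"]


def choose_location(live_across_call, taken):
    """Pick a home for one variable.

    live_across_call: True if the variable is in use during some call.
    taken: set of registers already given to interfering variables.
    Returns a register name, or "spill" if no register is available.
    """
    if live_across_call:
        # Never use a caller-saved register for such a variable.
        candidates = CALLEE_SAVED
    else:
        # Prefer caller-saved registers, leaving callee-saved ones free.
        candidates = CALLER_SAVED + CALLEE_SAVED
    for reg in candidates:
        if reg not in taken:
            return reg
    return "spill"


# x is in use during the second call to read; y is not.
x_home = choose_location(True, set())
y_home = choose_location(False, {x_home})
```

Under these assumptions `x` lands in `rbx` and `y` in `rcx`, which agrees with the assignment shown in the generated assembly of the example figure.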
+
+It is straightforward to implement this approach in a graph coloring
+register allocator. First, we know which variables are in-use during
+every function call because we compute that information for every
+instruction (Section~\ref{sec:liveness-analysis-r1}). Second, when we
+build the interference graph (Section~\ref{sec:build-interference}),
+we can place an edge between each of these variables and the
+caller-saved registers in the interference graph. This will prevent
+the graph coloring algorithm from assigning those variables to
+caller-saved registers.
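
A minimal sketch of the edge-adding step, assuming a hypothetical instruction representation (tuples whose first element is the opcode) and a list of live-after sets that mentions only variables. The names `add_call_interference`, `instructions`, and `live_after` are illustrative and do not come from the book's support code.

```python
from collections import defaultdict

CALLER_SAVED = ["rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"]


def add_call_interference(instructions, live_after, graph):
    """For every call, connect each variable that is live after the call
    to every caller-saved register in the undirected graph."""
    for instr, live in zip(instructions, live_after):
        if instr[0] == "callq":
            for var in live:
                for reg in CALLER_SAVED:
                    graph[var].add(reg)
                    graph[reg].add(var)
    return graph


# The start block of the example before register assignment
# (hypothetical encoding; live-after sets list variables only).
instructions = [
    ("callq", "read_int"),
    ("movq", "rax", "x"),
    ("callq", "read_int"),
    ("movq", "rax", "y"),
    ("addq", "y", "x"),
]
live_after = [set(), {"x"}, {"x"}, {"x", "y"}, {"x"}]

graph = add_call_interference(instructions, live_after, defaultdict(set))
```

Here `x` is live after the second call, so it ends up adjacent to every caller-saved register, while `y` receives no such edges and remains free to use `rcx`.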
+
+Returning to the example in
+Figure~\ref{fig:example-calling-conventions}, let us analyze the
+generated x86 code on the right-hand side, focusing on the
+\code{start} block. Notice that variable \code{x} is assigned to
+\code{rbx}, a callee-saved register. Thus, it is already in a safe
+place during the second call to \code{read\_int}. Next, notice that
+variable \code{y} is assigned to \code{rcx}, a caller-saved register,
+because there are no function calls in the remainder of the block.
+
+Next we analyze the example from the callee point of view, focusing on
+the prelude and conclusion of the \code{main} function. As usual the
+prelude begins with saving the \code{rbp} register to the stack and
+setting the \code{rbp} to the current stack pointer. We now know why
+it is necessary to save the \code{rbp}: it is a callee-saved register.
+The prelude then pushes \code{rbx} to the stack because 1) \code{rbx}
+is also a callee-saved register and 2) \code{rbx} is assigned to a
+variable (\code{x}). There are several more callee-saved registers that
+are not saved in the prelude because they were not assigned to
+variables. The prelude subtracts 8 bytes from the \code{rsp} to make
+it 16-byte aligned and then jumps to the \code{start} block. Shifting
+attention to the \code{conclusion}, we see that \code{rbx} is restored
+from the stack with a \code{popq} instruction.
+
+\begin{figure}[tp]
+\begin{minipage}{0.45\textwidth}
+Example $R_1$ program:
+%s0_14.rkt
+\begin{lstlisting}
+(let ([x (read)])
+  (let ([y (read)])
+    (+ (+ x y) 42)))
+\end{lstlisting}
+\end{minipage}
+\begin{minipage}{0.45\textwidth}
+Generated x86 assembly:
+\begin{lstlisting}
+start:
+	callq	read_int
+	movq	%rax, %rbx
+	callq	read_int
+	movq	%rax, %rcx
+	addq	%rcx, %rbx
+	movq	%rbx, %rax
+	addq	$42, %rax
+	jmp conclusion
+
+	.globl main
+main:
+	pushq	%rbp
+	movq	%rsp, %rbp
+	pushq	%rbx
+	subq	$8, %rsp
+	jmp start
+conclusion:
+	addq	$8, %rsp
+	popq	%rbx
+	popq	%rbp
+	retq
+\end{lstlisting}
+\end{minipage}
+\caption{Example with function calls.}
+  \label{fig:example-calling-conventions}
+\end{figure}
+
+
 
 
 \section{Liveness Analysis}
@@ -2685,7 +2813,7 @@ Figure~\ref{fig:live-eg} shows the results of live variables analysis
 for the running example program, with the live-before and live-after
 sets shown between each instruction to make the figure easy to read.
 
-\begin{figure}[tbp]
+\begin{figure}[tp]
 \hspace{20pt}
 \begin{minipage}{0.45\textwidth}
 \begin{lstlisting}
@@ -3278,15 +3406,15 @@ reason for this is that our \code{main} function must adhere to the
 x86 calling conventions that we described in
 Section~\ref{sec:calling-conventions}.  Furthermore, if your register
 allocator assigned variables to other callee-saved registers
-(e.g. rbx, r12, etc.), then those variables must also be saved to the
-stack in the prelude and restored in the conclusion.  The simplest
-approach is to save and restore all of the callee-saved registers. The
-more efficient approach is to keep track of which callee-saved
-registers were used and only save and restore them. Either way, make
-sure to take this use of stack space into account when you are
-calculating the size of the frame and adjusting the \code{rsp} in the
-prelude. Also, don't forget that the size of the frame needs to be a
-multiple of 16 bytes!
+(e.g. \code{rbx}, \code{r12}, etc.), then those variables must also be
+saved to the stack in the prelude and restored in the conclusion.  The
+simplest approach is to save and restore all of the callee-saved
+registers. The more efficient approach is to keep track of which
+callee-saved registers were used and only save and restore
+them. Either way, make sure to take this use of stack space into
+account when you are calculating the size of the frame and adjusting
+the \code{rsp} in the prelude. Also, don't forget that the size of the
+frame needs to be a multiple of 16 bytes!
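
One way to work out the `rsp` adjustment is sketched below. It assumes that every saved register and every spilled variable occupies 8 bytes, and that the stack is 16-byte aligned immediately after `pushq %rbp`. The helper names `align` and `stack_adjustment` are hypothetical.

```python
def align(n, alignment=16):
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def stack_adjustment(num_callee_saved_used, num_spilled):
    """Bytes to subtract from rsp in the prelude, after pushing rbp and
    the callee-saved registers that the allocator actually used."""
    pushed = 8 * num_callee_saved_used   # space taken by the pushq instructions
    spills = 8 * num_spilled             # space needed for spilled variables
    total = align(pushed + spills)       # whole frame, a multiple of 16
    return total - pushed                # the pushes already moved rsp


# The example with function calls: rbx is pushed and nothing is spilled.
example = stack_adjustment(1, 0)
```

For the example with function calls this yields 8, matching the `subq $8, %rsp` in its prelude. The conclusion would add the same amount back to `rsp` before popping the saved registers.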
 
 
 \section{Challenge: Move Biasing}
@@ -3566,7 +3694,7 @@ return control to the operating system.
   % s0_28.rkt
   % (use-minimal-set-of-registers! #t)
   % and only rbx rcx
-% tmp 0
+% tmp 0 rbx
 % z 1  rcx
 % y 0  rbx
 % w 2  16(%rbp)
@@ -3591,6 +3719,7 @@ main:
 	pushq	%rbx
 	subq	$8, %rsp
 	jmp start
+        
 conclusion:
 	addq	$8, %rsp
 	popq	%rbx

Some files were not shown because too many files changed in this diff