4 years ago · 4e87c57495
--- a/book.tex
+++ b/book.tex
@@ -6459,12 +6459,12 @@ from the set.
 
				 \chapter{Functions}
			
 
				 \label{ch:functions}
			
 
				 
			
 
				-This chapter studies the compilation of functions similar to those of
			
 
				-the C language. This corresponds to a subset of Typed Racket in which
			
 
				-only top-level function definitions are allowed. These kind of
			
 
				-functions are an important stepping stone to implementing
			
 
				-lexically-scoped functions in the form of \key{lambda} abstractions,
			
 
				-which is the topic of Chapter~\ref{ch:lambdas}.
			
 
				+This chapter studies the compilation of functions similar to those
			
 
				+found in the C language. This corresponds to a subset of Typed Racket
			
 
				+in which only top-level function definitions are allowed. This kind of
			
 
				+function is an important stepping stone to implementing
			
 
				+lexically-scoped functions, that is, \key{lambda} abstractions, which
			
 
				+is the topic of Chapter~\ref{ch:lambdas}.
			
 
				 
			
 
				 \section{The $R_4$ Language}
			
 
				 
			
@@ -6660,13 +6660,21 @@ address of the \code{add1} label into the \code{rbx} register.
 
				 \begin{lstlisting}
			
 
				    leaq add1(%rip), %rbx
			
 
				 \end{lstlisting}
			
 
				-
			
 
				-In Section~\ref{sec:x86} we saw the use of the \code{callq}
			
 
				-instruction for jumping to a function whose location is given by a
			
 
				-label. Here we instead will be jumping to a function whose location is
			
 
				-given by an address, that is, we need to make an \emph{indirect
			
 
				-  function call}. The x86 syntax for this is a \code{callq}
			
 
				-instruction but with an asterisk before the register name.
			
 
				+The instruction pointer register \key{rip} (a.k.a. the program counter
			
 
				+or PC) always points to the next instruction to be executed. When
			
 
				+combined with a label, as in \code{add1(\%rip)}, the linker computes
			
 
				+the distance $d$ between the address of \code{add1} and where the
			
 
				+\code{rip} would be at that moment and then changes \code{add1(\%rip)}
			
 
				+to \code{$d$(\%rip)}, which at runtime will compute the address of
			
 
				+\code{add1}.
			
 
				+
			
 
				+In Section~\ref{sec:x86} we used the \code{callq} instruction to
			
 
				+jump to a function whose location is given by a label. To support
			
 
				+function calls in this chapter we instead will be jumping to a
			
 
				+function whose location is given by an address in a register, that is,
			
 
				+we need to make an \emph{indirect function call}. The x86 syntax for
			
 
				+this is a \code{callq} instruction but with an asterisk before the
			
 
				+register name.
			
 
				 \begin{lstlisting}
			
 
				    callq *%rbx
			
 
				 \end{lstlisting}
			
@@ -6675,54 +6683,65 @@ instruction but with an asterisk before the register name.
 
				 \subsection{Calling Conventions}
			
 
				 
			
 
				 The \code{callq} instruction provides partial support for implementing
			
 
				-functions, but it does not handle (1) parameter passing, (2) saving
			
 
				-and restoring frames on the procedure call stack, or (3) determining
			
 
				-how registers are shared by different functions. These issues require
			
 
				-coordination between the caller and the callee, which is often
			
 
				-assembly code written by different programmers or generated by
			
 
				-different compilers. As a result, people have developed
			
 
				+functions: it pushes the return address on the stack and it jumps to
			
 
				+the target. However, \code{callq} does not handle
			
 
				+\begin{enumerate}
			
 
				+\item parameter passing,
			
 
				+\item saving and restoring frames on the procedure call stack, or
			
 
				+\item determining how registers are shared by different functions.
			
 
				+\end{enumerate}
			
 
				+These issues require coordination between the caller and the callee,
			
 
				+which is often assembly code written by different programmers or
			
 
				+generated by different compilers. As a result, people have developed
			
 
				 \emph{conventions} that govern how function calls are performed.
			
 
				-Here we shall use conventions that are compatible with those used by
			
 
				-the \code{gcc} compiler~\citep{Matz:2013aa}.
			
 
				+Here we use conventions that are compatible with those of the
			
 
				+\code{gcc} compiler~\citep{Matz:2013aa}.
			
 
				 
			
 
				 Regarding (1) parameter passing, the convention is to use the
			
 
				 following six registers: \code{rdi}, \code{rsi}, \code{rdx},
			
 
				-\code{rcx}, \code{r8}, and \code{r9}, in that order. If there are more
			
 
				-than six arguments, then the convention is to use space on the frame
			
 
				-of the caller for the rest of the arguments. However, to ease the
			
 
				-implementation of efficient tail calls (Section~\ref{sec:tail-call}),
			
 
				-we shall arrange to never have more than six arguments.
			
 
				+\code{rcx}, \code{r8}, and \code{r9}, in that order, to pass arguments
			
 
				+to a function. If there are more than six arguments, then the
			
 
				+convention is to use space on the frame of the caller for the rest of
			
 
				+the arguments. However, to ease the implementation of efficient tail
			
 
				+calls (Section~\ref{sec:tail-call}), we arrange to never need more
			
 
				+than six arguments.
			
 
				 %
			
 
				 The register \code{rax} is for the return value of the function.
			
 
				 
			
 
				-Regarding (2) frames and the procedure call stack, the convention is
			
 
				-that the stack grows down, with each function call using a chunk of
			
 
				-space called a frame. The caller sets the stack pointer, register
			
 
				-\code{rsp}, to the last data item in its frame. The callee must not
			
 
				-change anything in the caller's frame, that is, anything that is at or
			
 
				-above the stack pointer. The callee is free to use locations that are
			
 
				-below the stack pointer.
			
 
				+Regarding (2) frames and the procedure call stack, recall from
			
 
				+Section~\ref{sec:x86} that the stack grows down, with each function
			
 
				+call using a chunk of space called a frame. The caller sets the stack
			
 
				+pointer, register \code{rsp}, to the last data item in its frame. The
			
 
				+callee must not change anything in the caller's frame, that is,
			
 
				+anything that is at or above the stack pointer. The callee is free to
			
 
				+use locations that are below the stack pointer.
			
 
				 
			
 
				 Regarding (3) the sharing of registers between different functions,
			
 
				 recall from Section~\ref{sec:calling-conventions} that the registers
			
 
				 are divided into two groups, the caller-saved registers and the
			
 
				 callee-saved registers. The caller should assume that all the
			
 
				 caller-saved registers get overwritten with arbitrary values by the
			
 
				-callee. Thus, the caller should either 1) not put values that are live
			
 
				-across a call in caller-saved registers, or 2) save and restore values
			
 
				-that are live across calls. We shall recommend option 1).  On the flip
			
 
				-side, if the callee wants to use a callee-saved register, the callee
			
 
				-must save the contents of those registers on their stack frame and
			
 
				-then put them back prior to returning to the caller.  The base
			
 
				-pointer, register \code{rbp}, is used as a point-of-reference within a
			
 
				-frame, so that each local variable can be accessed at a fixed offset
			
 
				-from the base pointer.
			
 
				-%
			
 
				-Figure~\ref{fig:call-frames} shows the layout of the caller and callee
			
 
				-frames.
			
 
				-%% If we were to use stack arguments, they would be between the
			
 
				-%% caller locals and the callee return address. 
			
 
				+callee. That is why we recommended in
			
 
				+Section~\ref{sec:calling-conventions} that variables that are live
			
 
				+during a function call should not be assigned to caller-saved
			
 
				+registers.
			
 
				 
			
 
				+On the flip side, if the callee wants to use a callee-saved register,
			
 
				+the callee must save the contents of that register on its stack
			
 
				+frame and then put them back prior to returning to the caller.  That
			
 
				+is why we recommended in Section~\ref{sec:calling-conventions} that if
			
 
				+the register allocator assigns a variable to a callee-saved register,
			
 
				+then the prelude of the \code{main} function must save that register
			
 
				+to the stack and the conclusion of \code{main} must restore it.  This
			
 
				+recommendation now generalizes to all functions.
			
 
				+
			
 
				+Also recall that the base pointer, register \code{rbp}, is used as a
			
 
				+point-of-reference within a frame, so that each local variable can be
			
 
				+accessed at a fixed offset from the base pointer
			
 
				+(Section~\ref{sec:x86}).
			
 
				+%
			
 
				+Figure~\ref{fig:call-frames} shows the general layout of the caller
			
 
				+and callee frames.
			
 
				 
			
 
				 
			
 
				 \begin{figure}[tbp]
			
@@ -6734,9 +6753,9 @@ Caller View & Callee View & Contents       & Frame \\ \hline
 
				 -8(\key{\%rbp}) &  & callee-saved $1$ \\
			
 
				 \ldots & & \ldots \\
			
 
				 $-8j$(\key{\%rbp}) &  & callee-saved $j$ \\
			
 
				-$-8(j+1)$(\key{\%rbp}) &  & local $1$ \\
			
 
				+$-8(j+1)$(\key{\%rbp}) &  & local variable $1$ \\
			
 
				 \ldots & & \ldots \\
			
 
				-$-8(j+k)$(\key{\%rbp}) &  & local $k$ \\
			
 
				+$-8(j+k)$(\key{\%rbp}) &  & local variable $k$ \\
			
 
				  %% & &  \\
			
 
				 %% $8n-8$\key{(\%rsp)} & $8n+8$(\key{\%rbp})& argument $n$ \\
			
 
				 %% & \ldots           & \ldots \\
			
@@ -6747,9 +6766,9 @@ $-8(j+k)$(\key{\%rbp}) &  & local $k$ \\
 
				 & -8(\key{\%rbp}) & callee-saved $1$ \\
			
 
				 & \ldots & \ldots \\
			
 
				 & $-8n$(\key{\%rbp})  & callee-saved $n$ \\
			
 
				-& $-8(n+1)$(\key{\%rbp})  & local $1$ \\
			
 
				+& $-8(n+1)$(\key{\%rbp})  & local variable $1$ \\
			
 
				 &  \ldots          & \ldots \\
			
 
				-& $-8(n+m)$(\key{\%rsp})   & local $m$\\ \hline
			
 
				+& $-8(n+m)$(\key{\%rsp})   & local variable $m$\\ \hline
			
 
				 \end{tabular}
			
 
				 \caption{Memory layout of caller and callee frames.}
			
 
				 \label{fig:call-frames}
			
@@ -6806,15 +6825,15 @@ optimized in this way.
 
				 However, some care is needed with regard to argument passing in tail
			
 
				 calls.  As mentioned above, for arguments beyond the sixth, the
			
 
				 convention is to use space in the caller's frame for passing
			
 
				-arguments.  But here we've popped the caller's frame and can no longer
			
 
				-use it.  Another alternative is to use space in the callee's frame for
			
 
				-passing arguments. However, this option is also problematic because
			
 
				-the caller and callee's frame overlap in memory.  As we begin to copy
			
 
				-the arguments from their sources in the caller's frame, the target
			
 
				-locations in the callee's frame might overlap with the sources for
			
 
				-later arguments! We solve this problem by not using the stack for
			
 
				-parameter passing but instead use the heap, as we describe in the
			
 
				-Section~\ref{sec:limit-functions-r4}.
			
 
				+arguments.  But for a tail call we pop the caller's frame and can no
			
 
				+longer use it.  Another alternative is to use space in the callee's
			
 
				+frame for passing arguments. However, this option is also problematic
			
 
				+because the caller's and callee's frames overlap in memory.  As we begin
			
 
				+to copy the arguments from their sources in the caller's frame, the
			
 
				+target locations in the callee's frame might overlap with the sources
			
 
				+for later arguments! We solve this problem by not using the stack for
			
 
				+passing more than six arguments but instead using the heap, as we
			
 
				+describe in Section~\ref{sec:limit-functions-r4}.
			
 
				 
			
 
				 As mentioned above, for a tail call we pop the caller's frame prior to
			
 
				 making the tail call. The instructions for popping a frame are the
			
@@ -6833,17 +6852,11 @@ that would unnecessarily overwrite the return address. Instead we can
 
				 simply use the \key{jmp} instruction. Like the indirect function call,
			
 
				 we write an indirect jump with a register prefixed with an asterisk.
			
 
				 We recommend using \code{rax} to hold the jump target because the
			
 
				-preceding ``conclusion'' overwrites just about everything else.
			
 
				+preceding conclusion overwrites just about everything else.
			
 
				 \begin{lstlisting}
			
 
				    jmp *%rax
			
 
				 \end{lstlisting}
			
 
				 
			
 
				-%% Now that we have a good understanding of functions as they appear in
			
 
				-%% $R_4$ and the support for functions in x86, we need to plan the
			
 
				-%% changes to our compiler, that is, do we need any new passes and/or do
			
 
				-%% we need to change any existing passes? Also, do we need to add new
			
 
				-%% kinds of AST nodes to any of the intermediate languages?
			
 
				-
			
 
				 \section{Shrink $R_4$}
			
 
				 \label{sec:shrink-r4}