|
@@ -6459,12 +6459,12 @@ from the set.
|
|
|
\chapter{Functions}
|
|
|
\label{ch:functions}
|
|
|
|
|
|
-This chapter studies the compilation of functions similar to those of
|
|
|
-the C language. This corresponds to a subset of Typed Racket in which
|
|
|
-only top-level function definitions are allowed. These kind of
|
|
|
-functions are an important stepping stone to implementing
|
|
|
-lexically-scoped functions in the form of \key{lambda} abstractions,
|
|
|
-which is the topic of Chapter~\ref{ch:lambdas}.
|
|
|
+This chapter studies the compilation of functions similar to those
|
|
|
+found in the C language. This corresponds to a subset of Typed Racket
|
|
|
+in which only top-level function definitions are allowed. This kind of
|
|
|
+function is an important stepping stone to implementing
|
|
|
+lexically-scoped functions, that is, \key{lambda} abstractions, which
|
|
|
+is the topic of Chapter~\ref{ch:lambdas}.
|
|
|
|
|
|
\section{The $R_4$ Language}
|
|
|
|
|
@@ -6660,13 +6660,21 @@ address of the \code{add1} label into the \code{rbx} register.
|
|
|
\begin{lstlisting}
|
|
|
leaq add1(%rip), %rbx
|
|
|
\end{lstlisting}
|
|
|
-
|
|
|
-In Section~\ref{sec:x86} we saw the use of the \code{callq}
|
|
|
-instruction for jumping to a function whose location is given by a
|
|
|
-label. Here we instead will be jumping to a function whose location is
|
|
|
-given by an address, that is, we need to make an \emph{indirect
|
|
|
- function call}. The x86 syntax for this is a \code{callq}
|
|
|
-instruction but with an asterisk before the register name.
|
|
|
+The instruction pointer register \key{rip} (a.k.a. the program counter
|
|
|
+or PC) always points to the next instruction to be executed. When
|
|
|
+combined with a label, as in \code{add1(\%rip)}, the linker computes
|
|
|
+the distance $d$ between the address of \code{add1} and where the
|
|
|
+\code{rip} would be at that moment and then changes \code{add1(\%rip)}
|
|
|
+to \code{$d$(\%rip)}, which at runtime will compute the address of
|
|
|
+\code{add1}.
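
As a worked instance with made-up addresses (assumptions for
illustration only, not taken from a real binary): suppose the
\code{leaq} instruction occupies the seven bytes starting at address
\code{0x1000}, so \code{rip} holds \code{0x1007} while it executes, and
suppose \code{add1} is placed at \code{0x1200}. Then the linker would
compute
\[
  d = \mathtt{0x1200} - \mathtt{0x1007} = \mathtt{0x1F9},
  \qquad
  \mathtt{rip} + d = \mathtt{0x1007} + \mathtt{0x1F9} = \mathtt{0x1200},
\]
so at runtime the instruction recovers the address of \code{add1}.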
|
|
|
+
|
|
|
+In Section~\ref{sec:x86} we used the \code{callq} instruction to
|
|
|
+jump to a function whose location is given by a label. To support
|
|
|
+function calls in this chapter we instead will be jumping to a
|
|
|
+function whose location is given by an address in a register, that is,
|
|
|
+we need to make an \emph{indirect function call}. The x86 syntax for
|
|
|
+this is a \code{callq} instruction but with an asterisk before the
|
|
|
+register name.
|
|
|
\begin{lstlisting}
|
|
|
callq *%rbx
|
|
|
\end{lstlisting}
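
To contrast the two forms side by side, the following sketch calls the
same function twice, first directly and then indirectly. It assumes, as
above, that \code{add1} is a label defined elsewhere in the program.
\begin{lstlisting}
callq add1              # direct call: the target is a label
leaq add1(%rip), %rbx   # load the address of add1 into rbx
callq *%rbx             # indirect call: the target is the address in rbx
\end{lstlisting}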
|
|
@@ -6675,54 +6683,65 @@ instruction but with an asterisk before the register name.
|
|
|
\subsection{Calling Conventions}
|
|
|
|
|
|
The \code{callq} instruction provides partial support for implementing
|
|
|
-functions, but it does not handle (1) parameter passing, (2) saving
|
|
|
-and restoring frames on the procedure call stack, or (3) determining
|
|
|
-how registers are shared by different functions. These issues require
|
|
|
-coordination between the caller and the callee, which is often
|
|
|
-assembly code written by different programmers or generated by
|
|
|
-different compilers. As a result, people have developed
|
|
|
+functions: it pushes the return address on the stack and it jumps to
|
|
|
+the target. However, \code{callq} does not handle
|
|
|
+\begin{enumerate}
|
|
|
+\item parameter passing,
|
|
|
+\item saving and restoring frames on the procedure call stack, or
|
|
|
+\item determining how registers are shared by different functions.
|
|
|
+\end{enumerate}
|
|
|
+These issues require coordination between the caller and the callee,
|
|
|
+which is often assembly code written by different programmers or
|
|
|
+generated by different compilers. As a result, people have developed
|
|
|
\emph{conventions} that govern how function calls are performed.
|
|
|
-Here we shall use conventions that are compatible with those used by
|
|
|
-the \code{gcc} compiler~\citep{Matz:2013aa}.
|
|
|
+Here we use conventions that are compatible with those of the
|
|
|
+\code{gcc} compiler~\citep{Matz:2013aa}.
|
|
|
|
|
|
Regarding (1) parameter passing, the convention is to use the
|
|
|
following six registers: \code{rdi}, \code{rsi}, \code{rdx},
|
|
|
-\code{rcx}, \code{r8}, and \code{r9}, in that order. If there are more
|
|
|
-than six arguments, then the convention is to use space on the frame
|
|
|
-of the caller for the rest of the arguments. However, to ease the
|
|
|
-implementation of efficient tail calls (Section~\ref{sec:tail-call}),
|
|
|
-we shall arrange to never have more than six arguments.
|
|
|
+\code{rcx}, \code{r8}, and \code{r9}, in that order, to pass arguments
|
|
|
+to a function. If there are more than six arguments, then the
|
|
|
+convention is to use space on the frame of the caller for the rest of
|
|
|
+the arguments. However, to ease the implementation of efficient tail
|
|
|
+calls (Section~\ref{sec:tail-call}), we arrange to never need more
|
|
|
+than six arguments.
|
|
|
%
|
|
|
The register \code{rax} is for the return value of the function.
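
As a minimal sketch of this convention, the following calls a
hypothetical three-argument function \code{sum3} (the label, the
constants, and the stack slot are assumptions made for illustration)
and then saves the result in a local-variable slot of the caller.
\begin{lstlisting}
movq $10, %rdi          # first argument
movq $20, %rsi          # second argument
movq $30, %rdx          # third argument
leaq sum3(%rip), %rbx   # address of the hypothetical function sum3
callq *%rbx             # indirect call
movq %rax, -16(%rbp)    # the return value arrives in rax
\end{lstlisting}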
|
|
|
|
|
|
-Regarding (2) frames and the procedure call stack, the convention is
|
|
|
-that the stack grows down, with each function call using a chunk of
|
|
|
-space called a frame. The caller sets the stack pointer, register
|
|
|
-\code{rsp}, to the last data item in its frame. The callee must not
|
|
|
-change anything in the caller's frame, that is, anything that is at or
|
|
|
-above the stack pointer. The callee is free to use locations that are
|
|
|
-below the stack pointer.
|
|
|
+Regarding (2) frames and the procedure call stack, recall from
|
|
|
+Section~\ref{sec:x86} that the stack grows down, with each function
|
|
|
+call using a chunk of space called a frame. The caller sets the stack
|
|
|
+pointer, register \code{rsp}, to the last data item in its frame. The
|
|
|
+callee must not change anything in the caller's frame, that is,
|
|
|
+anything that is at or above the stack pointer. The callee is free to
|
|
|
+use locations that are below the stack pointer.
|
|
|
|
|
|
Regarding (3) the sharing of registers between different functions,
|
|
|
recall from Section~\ref{sec:calling-conventions} that the registers
|
|
|
are divided into two groups, the caller-saved registers and the
|
|
|
callee-saved registers. The caller should assume that all the
|
|
|
caller-saved registers get overwritten with arbitrary values by the
|
|
|
-callee. Thus, the caller should either 1) not put values that are live
|
|
|
-across a call in caller-saved registers, or 2) save and restore values
|
|
|
-that are live across calls. We shall recommend option 1). On the flip
|
|
|
-side, if the callee wants to use a callee-saved register, the callee
|
|
|
-must save the contents of those registers on their stack frame and
|
|
|
-then put them back prior to returning to the caller. The base
|
|
|
-pointer, register \code{rbp}, is used as a point-of-reference within a
|
|
|
-frame, so that each local variable can be accessed at a fixed offset
|
|
|
-from the base pointer.
|
|
|
-%
|
|
|
-Figure~\ref{fig:call-frames} shows the layout of the caller and callee
|
|
|
-frames.
|
|
|
-%% If we were to use stack arguments, they would be between the
|
|
|
-%% caller locals and the callee return address.
|
|
|
+callee. That is why we recommended in
|
|
|
+Section~\ref{sec:calling-conventions} that variables that are live
|
|
|
+during a function call should not be assigned to caller-saved
|
|
|
+registers.
|
|
|
|
|
|
+On the flip side, if the callee wants to use a callee-saved register,
|
|
|
+the callee must save the contents of that register in its stack
|
|
|
+frame and then restore them prior to returning to the caller. That
|
|
|
+is why we recommended in Section~\ref{sec:calling-conventions} that if
|
|
|
+the register allocator assigns a variable to a callee-saved register,
|
|
|
+then the prelude of the \code{main} function must save that register
|
|
|
+to the stack and the conclusion of \code{main} must restore it. This
|
|
|
+recommendation now generalizes to all functions.
|
|
|
+
|
|
|
+Also recall that the base pointer, register \code{rbp}, is used as a
|
|
|
+point-of-reference within a frame, so that each local variable can be
|
|
|
+accessed at a fixed offset from the base pointer
|
|
|
+(Section~\ref{sec:x86}).
|
|
|
+%
|
|
|
+Figure~\ref{fig:call-frames} shows the general layout of the caller
|
|
|
+and callee frames.
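
For a concrete instance of this layout, here is a sketch of the
prelude and conclusion of a hypothetical function \code{f} that uses
one callee-saved register (\code{rbx}) and one local variable, that
is, $n = 1$ and $m = 1$ in the callee view. The choice of register and
frame size is an assumption for illustration.
\begin{lstlisting}
f:
    pushq %rbp          # save the caller's base pointer
    movq %rsp, %rbp     # rbp now marks the start of f's frame
    pushq %rbx          # callee-saved 1, stored at -8(%rbp)
    subq $8, %rsp       # local variable 1, stored at -16(%rbp)
    # the body of f goes here
    addq $8, %rsp       # discard the local variable
    popq %rbx           # restore the callee-saved register
    popq %rbp           # restore the caller's base pointer
    retq                # pop the return address and jump to it
\end{lstlisting}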
|
|
|
|
|
|
|
|
|
\begin{figure}[tbp]
|
|
@@ -6734,9 +6753,9 @@ Caller View & Callee View & Contents & Frame \\ \hline
|
|
|
-8(\key{\%rbp}) & & callee-saved $1$ \\
|
|
|
\ldots & & \ldots \\
|
|
|
$-8j$(\key{\%rbp}) & & callee-saved $j$ \\
|
|
|
-$-8(j+1)$(\key{\%rbp}) & & local $1$ \\
|
|
|
+$-8(j+1)$(\key{\%rbp}) & & local variable $1$ \\
|
|
|
\ldots & & \ldots \\
|
|
|
-$-8(j+k)$(\key{\%rbp}) & & local $k$ \\
|
|
|
+$-8(j+k)$(\key{\%rbp}) & & local variable $k$ \\
|
|
|
%% & & \\
|
|
|
%% $8n-8$\key{(\%rsp)} & $8n+8$(\key{\%rbp})& argument $n$ \\
|
|
|
%% & \ldots & \ldots \\
|
|
@@ -6747,9 +6766,9 @@ $-8(j+k)$(\key{\%rbp}) & & local $k$ \\
|
|
|
& -8(\key{\%rbp}) & callee-saved $1$ \\
|
|
|
& \ldots & \ldots \\
|
|
|
& $-8n$(\key{\%rbp}) & callee-saved $n$ \\
|
|
|
-& $-8(n+1)$(\key{\%rbp}) & local $1$ \\
|
|
|
+& $-8(n+1)$(\key{\%rbp}) & local variable $1$ \\
|
|
|
& \ldots & \ldots \\
|
|
|
-& $-8(n+m)$(\key{\%rsp}) & local $m$\\ \hline
|
|
|
+& $-8(n+m)$(\key{\%rbp}) & local variable $m$\\ \hline
|
|
|
\end{tabular}
|
|
|
\caption{Memory layout of caller and callee frames.}
|
|
|
\label{fig:call-frames}
|
|
@@ -6806,15 +6825,15 @@ optimized in this way.
|
|
|
However, some care is needed with regard to argument passing in tail
|
|
|
calls. As mentioned above, for arguments beyond the sixth, the
|
|
|
convention is to use space in the caller's frame for passing
|
|
|
-arguments. But here we've popped the caller's frame and can no longer
|
|
|
-use it. Another alternative is to use space in the callee's frame for
|
|
|
-passing arguments. However, this option is also problematic because
|
|
|
-the caller and callee's frame overlap in memory. As we begin to copy
|
|
|
-the arguments from their sources in the caller's frame, the target
|
|
|
-locations in the callee's frame might overlap with the sources for
|
|
|
-later arguments! We solve this problem by not using the stack for
|
|
|
-parameter passing but instead use the heap, as we describe in the
|
|
|
-Section~\ref{sec:limit-functions-r4}.
|
|
|
+arguments. But for a tail call we pop the caller's frame and can no
|
|
|
+longer use it. Another alternative is to use space in the callee's
|
|
|
+frame for passing arguments. However, this option is also problematic
|
|
|
+because the caller's and callee's frames overlap in memory. As we begin
|
|
|
+to copy the arguments from their sources in the caller's frame, the
|
|
|
+target locations in the callee's frame might overlap with the sources
|
|
|
+for later arguments! We solve this problem by not using the stack for
|
|
|
+passing more than six arguments but instead using the heap, as we
|
|
|
+describe in Section~\ref{sec:limit-functions-r4}.
|
|
|
|
|
|
As mentioned above, for a tail call we pop the caller's frame prior to
|
|
|
making the tail call. The instructions for popping a frame are the
|
|
@@ -6833,17 +6852,11 @@ that would unnecessarily overwrite the return address. Instead we can
|
|
|
simply use the \key{jmp} instruction. Like the indirect function call,
|
|
|
we write an indirect jump with a register prefixed with an asterisk.
|
|
|
We recommend using \code{rax} to hold the jump target because the
|
|
|
-preceding ``conclusion'' overwrites just about everything else.
|
|
|
+preceding conclusion overwrites just about everything else.
|
|
|
\begin{lstlisting}
|
|
|
jmp *%rax
|
|
|
\end{lstlisting}
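
Putting the pieces together, a tail call to a hypothetical function
\code{g} with one argument might be sketched as follows. The frame
being popped is assumed to hold one callee-saved register (\code{rbx})
and 8 bytes of local variables; a real conclusion depends on the frame
that the compiler built.
\begin{lstlisting}
movq $42, %rdi          # argument for g
leaq g(%rip), %rax      # the jump target goes in rax
addq $8, %rsp           # conclusion: discard the local variables
popq %rbx               # conclusion: restore the callee-saved register
popq %rbp               # conclusion: restore the caller's base pointer
jmp *%rax               # g's retq returns to our caller
\end{lstlisting}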
|
|
|
|
|
|
-%% Now that we have a good understanding of functions as they appear in
|
|
|
-%% $R_4$ and the support for functions in x86, we need to plan the
|
|
|
-%% changes to our compiler, that is, do we need any new passes and/or do
|
|
|
-%% we need to change any existing passes? Also, do we need to add new
|
|
|
-%% kinds of AST nodes to any of the intermediate languages?
|
|
|
-
|
|
|
\section{Shrink $R_4$}
|
|
|
\label{sec:shrink-r4}
|
|
|
|