Explorar el Código

closure conversion

Jeremy Siek hace 4 años
padre
commit
d821125124
Se han modificado 2 ficheros con 183 adiciones y 111 borrados
  1. 182 111
      book.tex
  2. 1 0
      defs.tex

+ 182 - 111
book.tex

@@ -5580,6 +5580,13 @@ Subsequently, in the \code{uncover-locals} pass
 propagated to all variables (including the temporaries generated by
 \code{remove-complex-opera*}).
 
+To create the s-expression for the \code{Vector} type in
+Figure~\ref{fig:typecheck-R3}, we use the
+\href{https://docs.racket-lang.org/reference/quasiquote.html}{unquote-splicing
+  operator} \code{,@} to insert the list \code{t*} without its usual
+start and end parentheses.  \index{unquote-slicing}
+
+
 \begin{figure}[tp]
 \begin{lstlisting}
 (define (type-check-exp env)
@@ -7668,26 +7675,26 @@ $\Downarrow$
 $\Rightarrow$
 \begin{minipage}{0.5\textwidth}
 \begin{lstlisting}[basicstyle=\ttfamily\scriptsize]
-(program ()
- (define (add86)
-   ((locals (x87 . Integer) (y88 . Integer)) 
-    (num-params . 2))
-   ((add86start .
-      (block ()
-        (movq (reg rcx) (var x87))
-        (movq (reg rdx) (var y88))
-        (movq (var x87) (reg rax))
-        (addq (var y88) (reg rax))
-        (jmp add86conclusion)))))
- (define (main)
-   ((locals . ((tmp89 . (Integer Integer -> Integer))))
-    (num-params . 0))
-   ((mainstart .
-      (block ()
-        (leaq (fun-ref add86) (var tmp89))
-        (movq (int 40) (reg rcx))
-        (movq (int 2) (reg rdx))
-        (tail-jmp (var tmp89))))))
+(define (add86)
+  ((locals (x87 . Integer) (y88 . Integer)) 
+   (num-params . 2))
+  ((add86start .
+     (block ()
+       (movq (reg rcx) (var x87))
+       (movq (reg rdx) (var y88))
+       (movq (var x87) (reg rax))
+       (addq (var y88) (reg rax))
+       (jmp add86conclusion)))))
+(define (main)
+  ((locals . ((tmp89 . (Integer Integer
+                        -> Integer))))
+   (num-params . 0))
+  ((mainstart .
+     (block ()
+       (leaq (fun-ref add86) (var tmp89))
+       (movq (int 40) (reg rcx))
+       (movq (int 2) (reg rdx))
+       (tail-jmp (var tmp89)))))
 \end{lstlisting}
 $\Downarrow$
 \end{minipage}
@@ -7851,21 +7858,21 @@ functional languages such as Racket. By lexical scoping we mean that a
 function's body may refer to variables whose binding site is outside
 of the function, in an enclosing scope.
 %
-Consider the example in Figure~\ref{fig:lexical-scoping} featuring an
-anonymous function defined using the \key{lambda} form.  The body of
-the \key{lambda}, refers to three variables: \code{x}, \code{y}, and
-\code{z}. The binding sites for \code{x} and \code{y} are outside of
-the \key{lambda}. Variable \code{y} is bound by the enclosing
-\key{let} and \code{x} is a parameter of \code{f}. The \key{lambda} is
-returned from the function \code{f}. Below the definition of \code{f},
-we have two calls to \code{f} with different arguments for \code{x},
-first \code{5} then \code{3}. The functions returned from \code{f} are
-bound to variables \code{g} and \code{h}. Even though these two
-functions were created by the same \code{lambda}, they are really
-different functions because they use different values for
-\code{x}. Finally, we apply \code{g} to \code{11} (producing
-\code{20}) and apply \code{h} to \code{15} (producing \code{22}) so
-the result of this program is \code{42}.
+Consider the example in Figure~\ref{fig:lexical-scoping} written in
+$R_5$, which extends $R_4$ with anonymous functions using the
+\key{lambda} form.  The body of the \key{lambda}, refers to three
+variables: \code{x}, \code{y}, and \code{z}. The binding sites for
+\code{x} and \code{y} are outside of the \key{lambda}. Variable
+\code{y} is bound by the enclosing \key{let} and \code{x} is a
+parameter of function \code{f}. The \key{lambda} is returned from the
+function \code{f}. The main expression of the program includes two
+calls to \code{f} with different arguments for \code{x}, first
+\code{5} then \code{3}. The functions returned from \code{f} are bound
+to variables \code{g} and \code{h}. Even though these two functions
+were created by the same \code{lambda}, they are really different
+functions because they use different values for \code{x}. Applying
+\code{g} to \code{11} produces \code{20} whereas applying \code{h} to
+\code{15} produces \code{22}. The result of this program is \code{42}.
 
 \begin{figure}[btp]
 % s4_6.rkt
@@ -7884,20 +7891,102 @@ the result of this program is \code{42}.
 \end{figure}
 
 
+The approach that we shall take for implementing lexically scoped
+functions is to compile them into top-level function definitions,
+translating from $R_5$ into $R_4$.  However, the compiler will need to
+provide special treatment for variable occurrences such as \code{x}
+and \code{y} in the body of the \code{lambda} of
+Figure~\ref{fig:lexical-scoping}. After all, an $R_4$ function may not
+refer to variables defined outside of it. To identify such variable
+occurrences, we review the standard notion of free variable.
+
+\begin{definition}
+A variable is \emph{free in expression} $e$ if the variable occurs
+inside $e$ but does not have an enclosing binding in $e$.\index{free
+  variable}
+\end{definition}
+
+For example, in the expression \code{(+ x (+ y z))} the variables
+\code{x}, \code{y}, and \code{z} are all free.  On the other hand,
+only \code{x} and \code{y} are free in the following expression
+because \code{z} is bound by the \code{lambda}.
+\begin{lstlisting}
+   (lambda: ([z : Integer]) : Integer
+      (+ x (+ y z)))
+\end{lstlisting}
+
+So the free variables of a \code{lambda} are the ones that will need
+special treatment. We need to arrange for some way to transport, at
+runtime, the values of those variables from the point where the
+\code{lambda} was created to the point where the \code{lambda} is
+applied. An efficient solution to the problem, due to
+\citet{Cardelli:1983aa}, is to bundle into a vector the values of the
+free variables together with the function pointer for the lambda's
+code, an arrangement called a \emph{flat closure} (which we shorten to
+just ``closure'').  \index{closure}\index{flat closure} Fortunately,
+we have all the ingredients to make closures, Chapter~\ref{ch:tuples}
+gave us vectors and Chapter~\ref{ch:functions} gave us function
+pointers. The function pointer shall reside at index $0$ and the
+values for the free variables will fill in the rest of the vector.
+
+Let us revisit the example in Figure~\ref{fig:lexical-scoping} to see
+how closures work. It's a three-step dance. The program first calls
+function \code{f}, which creates a closure for the \code{lambda}. The
+closure is a vector whose first element is a pointer to the top-level
+function that we will generate for the \code{lambda}, the second
+element is the value of \code{x}, which is \code{5}, and the third
+element is \code{4}, the value of \code{y}. The closure does not
+contain an element for \code{z} because \code{z} is not a free
+variable of the \code{lambda}. Creating the closure is step 1 of the
+dance. The closure is returned from \code{f} and bound to \code{g}, as
+shown in Figure~\ref{fig:closures}.
+%
+The second call to \code{f} creates another closure, this time with
+\code{3} in the second slot (for \code{x}). This closure is also
+returned from \code{f} but bound to \code{h}, which is also shown in
+Figure~\ref{fig:closures}.
+
+\begin{figure}[tbp]
+\centering \includegraphics[width=0.6\textwidth]{figs/closures}
+\caption{Example closure representation for the \key{lambda}'s
+  in Figure~\ref{fig:lexical-scoping}.}
+\label{fig:closures}
+\end{figure}
+
+Continuing with the example, consider the application of \code{g} to
+\code{11} in Figure~\ref{fig:lexical-scoping}.  To apply a closure, we
+obtain the function pointer in the first element of the closure and
+call it, passing in the closure itself and then the regular arguments,
+in this case \code{11}. This technique for applying a closure is step
+2 of the dance.
+%
+But doesn't this \code{lambda} only take 1 argument, for parameter
+\code{z}? The third and final step of the dance is generating a
+top-level function for a \code{lambda}.  We add an additional
+parameter for the closure and we insert a \code{let} at the beginning
+of the function for each free variable, to bind those variables to the
+appropriate elements from the closure parameter.
+%
+This three-step dance is known as \emph{closure conversion}.  We
+discuss the details of closure conversion in
+Section~\ref{sec:closure-conversion} and the code generated from the
+example in Section~\ref{sec:example-lambda}. But first we define the
+syntax and semantics of $R_5$ in Section~\ref{sec:r5}.
+
 \section{The $R_5$ Language}
+\label{sec:r5}
 
-The syntax for this language with anonymous functions and lexical
-scoping, $R_5$, is defined in Figure~\ref{fig:r5-syntax}. It adds the
-\key{lambda} form to the grammar for $R_4$, which already has syntax
-for function application.  In this chapter we shall describe how to
-compile $R_5$ back into $R_4$, compiling lexically-scoped functions
-into a combination of functions (as in $R_4$) and tuples (as in
-$R_3$).
+The concrete and abstract syntax for $R_5$, a language with anonymous
+functions and lexical scoping, is defined in
+Figures~\ref{fig:r5-concrete-syntax} and ~\ref{fig:r5-syntax}. It adds
+the \key{lambda} form to the grammar for $R_4$, which already has
+syntax for function application.
 
 \begin{figure}[tp]
 \centering
 \fbox{
-\begin{minipage}{0.96\textwidth}
+  \begin{minipage}{0.96\textwidth}
+    \small
 \[
 \begin{array}{lcl}
   \Type &::=& \gray{\key{Integer} \mid \key{Boolean}
@@ -7913,8 +8002,8 @@ $R_3$).
     &\mid& \gray{(\key{eq?}\;\Exp\;\Exp) \mid \IF{\Exp}{\Exp}{\Exp}} \\
     &\mid& \gray{(\key{vector}\;\Exp\ldots) \mid
           (\key{vector-ref}\;\Exp\;\Int)} \\
-    &\mid& \gray{(\key{vector-set!}\;\Exp\;\Int\;\Exp)\mid (\key{void})} \\
-    &\mid& \gray{(\Exp \; \Exp\ldots)} \\
+    &\mid& \gray{(\key{vector-set!}\;\Exp\;\Int\;\Exp)\mid (\key{void})
+     \mid (\Exp \; \Exp\ldots) } \\
     &\mid& (\key{lambda:}\; ([\Var \key{:} \Type]\ldots) \key{:} \Type \; \Exp) \\
   \Def &::=& \gray{(\key{define}\; (\Var \; [\Var \key{:} \Type]\ldots) \key{:} \Type \; \Exp)} \\
   R_5 &::=& \gray{(\key{program} \; \Def\ldots \; \Exp)}
@@ -7922,66 +8011,46 @@ $R_3$).
 \]
 \end{minipage}
 }
-\caption{Syntax of $R_5$, extending $R_4$ (Figure~\ref{fig:r4-syntax}) 
+\caption{Concrete syntax of $R_5$, extending $R_4$ (Figure~\ref{fig:r4-syntax}) 
   with \key{lambda}.}
-\label{fig:r5-syntax}
+\label{fig:r5-concrete-syntax}
 \end{figure}
 
-To compile lexically-scoped functions to top-level function
-definitions, the compiler will need to provide special treatment to
-variable occurrences such as \code{x} and \code{y} in the body of the
-\code{lambda} of Figure~\ref{fig:lexical-scoping}, for the functions
-of $R_4$ may not refer to variables defined outside the function. To
-identify such variable occurrences, we review the standard notion of
-free variable.
-
-\begin{definition}
-A variable is \emph{free with respect to an expression} $e$ if the
-variable occurs inside $e$ but does not have an enclosing binding in
-$e$.\index{free variable}
-\end{definition}
-
-For example, the variables \code{x}, \code{y}, and \code{z} are all
-free with respect to the expression \code{(+ x (+ y z))}.  On the
-other hand, only \code{x} and \code{y} are free with respect to the
-following expression because \code{z} is bound by the \code{lambda}.
-\begin{lstlisting}
-   (lambda: ([z : Integer]) : Integer
-      (+ x (+ y z)))
-\end{lstlisting}
-
-Once we have identified the free variables of a \code{lambda}, we need
-to arrange for some way to transport, at runtime, the values of those
-variables from the point where the \code{lambda} was created to the
-point where the \code{lambda} is applied. Referring again to
-Figure~\ref{fig:lexical-scoping}, the binding of \code{x} to \code{5}
-needs to be used in the application of \code{g} to \code{11}, but the
-binding of \code{x} to \code{3} needs to be used in the application of
-\code{h} to \code{15}. An efficient solution to the problem, due to
-\citet{Cardelli:1983aa}, is to bundle into a vector the values of the
-free variables together with the function pointer for the lambda's
-code, an arrangement called a \emph{flat closure} (which we shorten to
-just ``closure'').
-\index{closure}\index{flat closure}
-Fortunately, we have all the ingredients to make
-closures, Chapter~\ref{ch:tuples} gave us vectors and
-Chapter~\ref{ch:functions} gave us function pointers. The function
-pointer shall reside at index $0$ and the values for free variables
-will fill in the rest of the vector. Figure~\ref{fig:closures} depicts
-the two closures created by the two calls to \code{f} in
-Figure~\ref{fig:lexical-scoping}.  Because the two closures came from
-the same \key{lambda}, they share the same function pointer but differ
-in the values for the free variable \code{x}.
-
-\begin{figure}[tbp]
-\centering \includegraphics[width=0.6\textwidth]{figs/closures}
-\caption{Example closure representation for the \key{lambda}'s
-  in Figure~\ref{fig:lexical-scoping}.}
-\label{fig:closures}
+\begin{figure}[tp]
+\centering
+\fbox{
+  \begin{minipage}{0.96\textwidth}
+    \small
+\[
+\begin{array}{lcl}
+\Exp &::=& \gray{ \INT{\Int} \mid \READ{} \mid \NEG{\Exp} } \\
+     &\mid& \gray{ \ADD{\Exp}{\Exp} 
+      \mid \BINOP{\code{'-}}{\Exp}{\Exp} } \\
+     &\mid& \gray{ \VAR{\Var} \mid \LET{\Var}{\Exp}{\Exp} } \\
+     &\mid& \gray{ \BOOL{\itm{bool}} 
+      \mid \AND{\Exp}{\Exp} }\\
+     &\mid& \gray{ \OR{\Exp}{\Exp}
+      \mid \NOT{\Exp} } \\
+     &\mid& \gray{ \BINOP{\itm{cmp}}{\Exp}{\Exp}
+      \mid \IF{\Exp}{\Exp}{\Exp} } \\
+     &\mid& \gray{ \VECTOR{\Exp} } \\
+     &\mid& \gray{ \VECREF{\Exp}{\Int} }\\
+     &\mid& \gray{ \VECSET{\Exp}{\Int}{\Exp}} \\
+     &\mid& \gray{ \VOID{} \mid \LP\key{HasType}~\Exp~\Type \RP 
+     \mid \APPLY{\Exp}{\Exp\ldots} }\\
+     &\mid& \LAMBDA{[\Var\code{:}\Type]\ldots}{\Type}{\Exp}\\
+ \Def &::=& \gray{ \FUNDEF{\Var}{([\Var \code{:} \Type]\ldots)}{\Type}{\code{'()}}{\Exp} }\\
+  R_5 &::=& \gray{ \PROGRAMDEFSEXP{\code{'()}}{(\Def\ldots)}{\Exp} }
+\end{array}
+\]
+\end{minipage}
+}
+\caption{The abstract syntax of $R_5$, extending $R_4$ (Figure~\ref{fig:r4-syntax}).}
+\label{fig:r5-syntax}
 \end{figure}
 
-
-\section{Interpreting $R_5$}
+\index{interpreter}
+\label{sec:interp-R5}
 
 Figure~\ref{fig:interp-R5} shows the definitional interpreter for
 $R_5$. The clause for \key{lambda} saves the current environment
@@ -7998,13 +8067,13 @@ values.
     (define recur (interp-exp env))
     (match e
       ...
-      [`(lambda: ([,xs : ,Ts] ...) : ,rT ,body)
+      [(Lambda (list `[,xs : ,Ts] ...) rT body)
        `(lambda ,xs ,body ,env)]
-      [`(app ,fun ,args ...)
+      [(Apply fun args)
        (define fun-val ((interp-exp env) fun))
        (define arg-vals (map (interp-exp env) args))
        (match fun-val
-	 [`(lambda (,xs ...) ,body ,lam-env)
+	 [`(lambda ,xs ,body ,lam-env)
 	  (define new-env (append (map cons xs arg-vals) lam-env))
 	  ((interp-exp new-env) body)]
 	 [else (error "interp-exp, expected function, not" fun-val)])]
@@ -8015,7 +8084,7 @@ values.
 \label{fig:interp-R5}
 \end{figure}
 
-\section{Type Checking $R_5$}
+
 \label{sec:type-check-r5}
 \index{type checking}
 
@@ -8030,13 +8099,15 @@ require the body's type to match the declared return type.
 (define (typecheck-R5 env)
   (lambda (e)
     (match e
-      [`(lambda: ([,xs : ,Ts] ...) : ,rT ,body)
-       (define new-env (append (map cons xs Ts) env))
-       (define bodyT ((typecheck-R5 new-env) body))
-       (cond [(equal? rT bodyT)
-              `(,@Ts -> ,rT)]
-             [else
-              (error "mismatch in return type" bodyT rT)])]
+      [(Lambda (and bnd `([,xs : ,Ts] ...)) rT body)
+       (define-values (new-body bodyT) 
+          ((type-check-exp (append (map cons xs Ts) env)) body))
+       (define ty `(,@Ts -> ,rT))
+       (cond
+         [(equal? rT bodyT)
+           (values (HasType (Lambda bnd rT new-body) ty) ty)]
+         [else
+           (error "mismatch in return type" bodyT rT)])]
       ...
       )))
 \end{lstlisting}

+ 1 - 0
defs.tex

@@ -47,6 +47,7 @@
 \newcommand{\APPLY}[2]{\key{(Apply}\;#1\;#2\code{)}}
 \newcommand{\FUNDEF}[5]{\key{(Def}~#1~#2~#3~#4~#5\code{)}}
 \newcommand{\FUNREF}[1]{\key{(FunRef}\;#1\code{)}}
+\newcommand{\LAMBDA}[3]{\key{(Lambda}~#1~#2~#3\code{)}}
 
 \newcommand{\ASSIGN}[2]{\key{(Assign}~#1\;#2\key{)}}
 \newcommand{\RETURN}[1]{\key{(Return}~#1\key{)}}