|
@@ -166,100 +166,163 @@ University.
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter*{Preface}
|
|
\chapter*{Preface}
|
|
|
|
|
|
-The tradition of compiler writing at Indiana University goes back to
|
|
|
|
-research and courses on programming languages by Professor Daniel
|
|
|
|
-Friedman in the 1970's and 1980's. Friedman conducted research on lazy
|
|
|
|
-evaluation~\citep{Friedman:1976aa} in the context of
|
|
|
|
-Lisp~\citep{McCarthy:1960dz} and then studied
|
|
|
|
-continuations~\citep{Felleisen:kx} and
|
|
|
|
-macros~\citep{Kohlbecker:1986dk} in the context of the
|
|
|
|
-Scheme~\citep{Sussman:1975ab}, a dialect of Lisp. One of the students
|
|
|
|
-of those courses, Kent Dybvig, went on to build Chez
|
|
|
|
-Scheme~\citep{Dybvig:2006aa}, a production-quality and efficient
|
|
|
|
-compiler for Scheme. After completing his Ph.D. at the University of
|
|
|
|
-North Carolina, he returned to teach at Indiana University.
|
|
|
|
-Throughout the 1990's and 2000's, Professor Dybvig continued
|
|
|
|
-development of Chez Scheme and taught the compiler course.
|
|
|
|
|
|
+There is a magical moment when a programmer presses the ``run'' button
|
|
|
|
+and the software begins to execute. Somehow a program written in a
|
|
|
|
+high-level language is running on a computer that is only capable of
|
|
|
|
+shuffling bits. This book reveals the wizardry that makes that
|
|
|
|
+transformation possible. Beginning with the groundbreaking work of
|
|
|
|
+Backus and colleagues in the 1950s, computer scientists discovered
|
|
|
|
+techniques for constructing programs, called \emph{compilers}, that
|
|
|
|
+automatically translate high-level programs into machine code.
|
|
|
|
+
|
|
|
|
+This book guides the reader on a journey, constructing their own
|
|
|
|
+compiler for a small but powerful language. Along the way the reader
|
|
|
|
+learns the essential concepts, algorithms, and data structures that
|
|
|
|
+underlie modern compilers. They develop a clear understanding of how
|
|
|
|
+programs are mapped onto computer hardware, which is helpful when
|
|
|
|
+reasoning about execution time, debugging errors across layers of the
|
|
|
|
+software stack, and understanding security vulnerabilities in a piece
|
|
|
|
+of code.
|
|
|
|
+%
|
|
|
|
+For readers interested in a career in compiler construction, this book
|
|
|
|
+serves as a stepping-stone to more advanced topics such as just-in-time
|
|
|
|
+compilation, program analysis, and program optimization.
|
|
|
|
+%
|
|
|
|
+For readers interested in the creation of programming languages, this
|
|
|
|
+book connects language design choices to their impact on compiler
|
|
|
|
+organization and the generated code.
|
|
|
|
+
|
|
|
|
+Compilers are typically organized into a pipeline with a handful of
|
|
|
|
+stages, called passes, that translate a program into lower-level
|
|
|
|
+abstractions. We take this approach to the extreme by splitting the
|
|
|
|
+compiler into a large number of \emph{nanopasses}, each of which
|
|
|
|
+performs a single task. This makes the compiler easier to debug,
|
|
|
|
+because we test the output of each pass, and it makes the compiler
|
|
|
|
+easier to understand, because each pass involves fewer concepts.
|
|
|
|
+
|
|
|
|
+Most books about compiler construction are structured in the same way
|
|
|
|
+as the compiler, with each chapter describing how to construct one
|
|
|
|
+pass. The problem with that structure is that it becomes easy to lose
|
|
|
|
+sight of which features of the input language motivate the design
|
|
|
|
+choices in a particular pass of the compiler. We instead take an
|
|
|
|
+\emph{incremental} approach in which we build a complete compiler in
|
|
|
|
+each chapter, starting with a tiny language and adding new features in
|
|
|
|
+each subsequent chapter.
|
|
|
|
+
|
|
|
|
+Our choice of language features is designed to elicit the fundamental
|
|
|
|
+concepts and algorithms used in compilers for modern programming
|
|
|
|
+languages.
|
|
|
|
+\begin{itemize}
|
|
|
|
+\item We begin with integer arithmetic and local variables. The
|
|
|
|
+ reader becomes acquainted with the basic tools of compiler
|
|
|
|
+ construction, \emph{abstract syntax trees} and \emph{recursive
|
|
|
|
+ functions}, in Chapter~\ref{ch:trees-recur} and applies them to a
|
|
|
|
+ language with integers and variables in Chapter~\ref{ch:Rvar}. In
|
|
|
|
+ Chapter~\ref{ch:register-allocation-Rvar} we apply \emph{graph
|
|
|
|
+ coloring} to assign variables to registers.
|
|
|
|
+\item Chapter~\ref{ch:Rif} adds conditional control flow, which
|
|
|
|
+ motivates the need for \emph{control-flow graphs}.
|
|
|
|
+\item Chapter~\ref{ch:Rvec} adds heap-allocated tuples, motivating
|
|
|
|
+ \emph{garbage collection}.
|
|
|
|
+\item Chapter~\ref{ch:Rfun} adds functions similar to those in the C
|
|
|
|
+ programming language~\citep{Kernighan:1988nx}: first-class values
|
|
|
|
+ without lexical scoping. The reader learns about the procedure call
|
|
|
|
+ stack, \emph{calling conventions}, and their interaction with
|
|
|
|
+ register allocation and garbage collection.
|
|
|
|
+\item Chapter~\ref{ch:Rlam} adds anonymous functions with lexical
|
|
|
|
+ scoping, i.e., \emph{lambda abstraction}. The reader learns about
|
|
|
|
+ \emph{closure conversion}, in which lambdas are translated into a
|
|
|
|
+ combination of functions and tuples.
|
|
|
|
+\item Chapter~\ref{ch:Rdyn} adds \emph{dynamic typing}. Up until this
|
|
|
|
+ point the input languages are statically typed. The reader extends
|
|
|
|
+ the statically typed language with an \code{Any} type which serves
|
|
|
|
+ as a target for compiling the dynamically typed language.
|
|
|
|
+\item Chapter~\ref{ch:Rwhile} fleshes out support for imperative
|
|
|
|
+ programming languages with the addition of loops and mutable
|
|
|
|
+ variables. These additions elicit the need for \emph{dataflow
|
|
|
|
+ analysis} in the register allocator.
|
|
|
|
+\item Chapter~\ref{ch:Rgrad} uses the \code{Any} type of
|
|
|
|
+ Chapter~\ref{ch:Rdyn} to implement a \emph{gradually typed language}
|
|
|
|
+  in which different regions of a program may be statically or dynamically
|
|
|
|
+ typed. The reader implements runtime support for \emph{proxies} that
|
|
|
|
+ allow values to safely move between regions.
|
|
|
|
+\item Chapter~\ref{ch:Rpoly} adds \emph{generics} with autoboxing,
|
|
|
|
+ leveraging the \code{Any} type and type casts developed in Chapters
|
|
|
|
+ \ref{ch:Rdyn} and \ref{ch:Rgrad}.
|
|
|
|
+\end{itemize}
|
|
|
|
+Alas, there are many language features that we do not include. Our
|
|
|
|
+choices are informed by a cost-benefit analysis in which we weigh the
|
|
|
|
+incidental complexity of a feature against the number of fundamental
|
|
|
|
+concepts that it exposes. For example, we include tuples and not
|
|
|
|
+records because they both elicit the study of heap allocation and
|
|
|
|
+garbage collection but records come with more incidental complexity.
|
|
|
|
+
|
|
|
|
+Since 2016 this book has served as the textbook for the compiler
|
|
|
|
+course at Indiana University, a 16-week course for upper-level
|
|
|
|
+undergraduates and first-year graduate students. Prior to this
|
|
|
|
+course, students learn to program in both imperative and functional
|
|
|
|
+languages, study data structures and algorithms, and take discrete
|
|
|
|
+mathematics.
|
|
|
|
+%
|
|
|
|
+The students form groups of 2--4 people and complete one chapter every
|
|
|
|
+two weeks, starting with Chapter~\ref{ch:Rvar} and finishing with
|
|
|
|
+Chapter~\ref{ch:Rdyn}. Most chapters include a challenge problem that
|
|
|
|
+we assign to the graduate students. The last two weeks of the course
|
|
|
|
+are reserved for a final project in which students design and
|
|
|
|
+implement an extension to the compiler of their choosing.
|
|
|
|
+Chapters~\ref{ch:Rwhile}, \ref{ch:Rgrad}, and \ref{ch:Rpoly} can be
|
|
|
|
+used in support of these projects or can be swapped in to replace some
|
|
|
|
+of the earlier chapters. For example, a course with an emphasis on
|
|
|
|
+statically typed imperative languages would skip Chapter~\ref{ch:Rdyn}
|
|
|
|
+in favor of
|
|
|
|
+Chapter~\ref{ch:Rwhile}. Figure~\ref{fig:chapter-dependences} depicts
|
|
|
|
+the dependencies between chapters.
|
|
|
|
+
|
|
|
|
+This book has also been used in compiler courses at California
|
|
|
|
+Polytechnic State University, Rose-Hulman Institute of Technology, and
|
|
|
|
+University of Massachusetts Lowell.
|
|
|
|
|
|
-The compiler course evolved to incorporate novel pedagogical ideas
|
|
|
|
-while also including elements of effective real-world compilers. One
|
|
|
|
-of Friedman's ideas was to split the compiler into many small
|
|
|
|
-``passes'' so that the code for each pass would be easy to understood
|
|
|
|
-in isolation. In contrast, most compilers of the time were organized
|
|
|
|
-into only a few monolithic passes for reasons of compile-time
|
|
|
|
-efficiency. Another idea, called ``the game'', was to test the code
|
|
|
|
-generated by each pass on interpreters for each intermediate language,
|
|
|
|
-thereby helping to pinpoint errors in individual passes.
|
|
|
|
-%
|
|
|
|
-Dybvig, with later help from his students Dipanwita Sarkar and Andrew
|
|
|
|
-Keep, developed infrastructure to support this approach and evolved
|
|
|
|
-the course, first to use smaller micro-passes and then into even
|
|
|
|
-smaller nano-passes~\citep{Sarkar:2004fk,Keep:2012aa}. I was a student
|
|
|
|
-in this compiler course in the early 2000's as part of my
|
|
|
|
-Ph.D. studies at Indiana University. Needless to say, I enjoyed the
|
|
|
|
-course immensely!
|
|
|
|
-
|
|
|
|
-During that time, another graduate student named Abdulaziz Ghuloum
|
|
|
|
-observed that the front-to-back organization of the course made it
|
|
|
|
-difficult for students to understand the rationale for the compiler
|
|
|
|
-design. Ghuloum proposed an incremental approach in which the students
|
|
|
|
-start by implementing a complete compiler for a very small subset of
|
|
|
|
-the language. In each subsequent stage they add a feature to the
|
|
|
|
-language and then add or modify passes to handle the new
|
|
|
|
-feature~\citep{Ghuloum:2006bh}. In this way, the students see how the
|
|
|
|
-language features motivate aspects of the compiler design.
|
|
|
|
-
|
|
|
|
-After graduating from Indiana University in 2005, I went on to teach
|
|
|
|
-at the University of Colorado. I adapted the nano-pass and incremental
|
|
|
|
-approaches to compiling a subset of the Python
|
|
|
|
-language~\citep{Siek:2012ab}.
|
|
|
|
-%% Python and Scheme are quite different
|
|
|
|
-%% on the surface but there is a large overlap in the compiler techniques
|
|
|
|
-%% required for the two languages. Thus, I was able to teach much of the
|
|
|
|
-%% same content from the Indiana compiler course.
|
|
|
|
-I very much enjoyed teaching the course organized in this way, and
|
|
|
|
-even better, many of the students learned a lot and got excited about
|
|
|
|
-compilers.
|
|
|
|
-
|
|
|
|
-I returned to Indiana University in 2013. In my absence the compiler
|
|
|
|
-course had switched from the front-to-back organization to a
|
|
|
|
-back-to-front~\citep{Dybvig:2010aa}. While that organization also works
|
|
|
|
-well, I prefer the incremental approach and started porting and
|
|
|
|
-adapting the structure of the Colorado course back into the land of
|
|
|
|
-Scheme. In the meantime Indiana University had moved on from Scheme to
|
|
|
|
-Racket~\citep{plt-tr}, so the course is now about compiling a subset
|
|
|
|
-of Racket (and Typed Racket) to the x86 assembly language.
|
|
|
|
-
|
|
|
|
-This is the textbook for the incremental version of the compiler
|
|
|
|
-course at Indiana University (Spring 2016 - present). With this book
|
|
|
|
-I hope to make the Indiana compiler course available to people that
|
|
|
|
-have not had the chance to study compilers at Indiana University.
|
|
|
|
-
|
|
|
|
-%% I have captured what
|
|
|
|
-%% I think are the most important topics from \cite{Dybvig:2010aa} but
|
|
|
|
-%% have omitted topics that are less interesting conceptually. I have
|
|
|
|
-%% also made simplifications to reduce complexity. In this way, this
|
|
|
|
-%% book leans more towards pedagogy than towards the efficiency of the
|
|
|
|
-%% generated code. Also, the book differs in places where we I the
|
|
|
|
-%% opportunity to make the topics more fun, such as in relating register
|
|
|
|
-%% allocation to Sudoku (Chapter~\ref{ch:register-allocation-Rvar}).
|
|
|
|
-
|
|
|
|
-\section*{Prerequisites}
|
|
|
|
-
|
|
|
|
-The material in this book is challenging but rewarding. It is meant to
|
|
|
|
-prepare students for a lifelong career in programming languages.
|
|
|
|
-
|
|
|
|
-The book uses the Racket language both for the implementation of the
|
|
|
|
-compiler and for the language that is compiled, so a student should be
|
|
|
|
-proficient with Racket or Scheme prior to reading this book. There are
|
|
|
|
-many excellent resources for learning Scheme and
|
|
|
|
|
|
+
|
|
|
|
+\begin{figure}[tp]
|
|
|
|
+\begin{tikzpicture}[baseline=(current bounding box.center)]
|
|
|
|
+ \node (C1) at (0,1) {\small Ch.~\ref{ch:trees-recur} Preliminaries};
|
|
|
|
+ \node (C2) at (4,1) {\small Ch.~\ref{ch:Rvar} Variables};
|
|
|
|
+ \node (C3) at (8,1) {\small Ch.~\ref{ch:register-allocation-Rvar} Registers};
|
|
|
|
+ \node (C4) at (0,0) {\small Ch.~\ref{ch:Rif} Control Flow};
|
|
|
|
+ \node (C5) at (4,0) {\small Ch.~\ref{ch:Rvec} Tuples};
|
|
|
|
+ \node (C6) at (8,0) {\small Ch.~\ref{ch:Rfun} Functions};
|
|
|
|
+ \node (C9) at (0,-1) {\small Ch.~\ref{ch:Rwhile} Loops};
|
|
|
|
+ \node (C8) at (4,-1) {\small Ch.~\ref{ch:Rdyn} Dynamic};
|
|
|
|
+ \node (C7) at (8,-1) {\small Ch.~\ref{ch:Rlam} Lambda};
|
|
|
|
+ \node (C10) at (4,-2) {\small Ch.~\ref{ch:Rgrad} Gradual};
|
|
|
|
+ \node (C11) at (8,-2) {\small Ch.~\ref{ch:Rpoly} Generics};
|
|
|
|
+
|
|
|
|
+ \path[->] (C1) edge [above] node {} (C2);
|
|
|
|
+ \path[->] (C2) edge [above] node {} (C3);
|
|
|
|
+ \path[->] (C3) edge [above] node {} (C4);
|
|
|
|
+ \path[->] (C4) edge [above] node {} (C5);
|
|
|
|
+ \path[->] (C5) edge [above] node {} (C6);
|
|
|
|
+ \path[->] (C6) edge [above] node {} (C7);
|
|
|
|
+ \path[->] (C4) edge [above] node {} (C8);
|
|
|
|
+ \path[->] (C4) edge [above] node {} (C9);
|
|
|
|
+ \path[->] (C8) edge [above] node {} (C10);
|
|
|
|
+ \path[->] (C10) edge [above] node {} (C11);
|
|
|
|
+\end{tikzpicture}
|
|
|
|
+ \caption{Diagram of chapter dependencies.}
|
|
|
|
+ \label{fig:chapter-dependences}
|
|
|
|
+\end{figure}
|
|
|
|
+
|
|
|
|
+This book uses the \href{https://racket-lang.org/}{Racket} language
|
|
|
|
+both for the implementation of the compiler and for the input
|
|
|
|
+language, so the reader should be proficient with Racket or Scheme
|
|
|
|
+prior to reading this book. There are many excellent resources for
|
|
|
|
+learning Scheme and
|
|
Racket~\citep{Dybvig:1987aa,Abelson:1996uq,Friedman:1996aa,Felleisen:2001aa,Felleisen:2013aa,Flatt:2014aa}.
|
|
Racket~\citep{Dybvig:1987aa,Abelson:1996uq,Friedman:1996aa,Felleisen:2001aa,Felleisen:2013aa,Flatt:2014aa}.
|
|
|
|
|
|
-It is helpful but not necessary for the student to have prior exposure
|
|
|
|
-to the x86 assembly language~\citep{Intel:2015aa}, as one might obtain
|
|
|
|
-from a computer systems
|
|
|
|
-course~\citep{Bryant:2010aa}. This book introduces the
|
|
|
|
-parts of x86-64 assembly language that are needed.
|
|
|
|
|
|
+The compiler targets x86 assembly language~\citep{Intel:2015aa}, so it
|
|
|
|
+is helpful but not necessary for the reader to have taken a computer
|
|
|
|
+systems course~\citep{Bryant:2010aa}. This book introduces the parts
|
|
|
|
+of x86-64 assembly language that are needed.
|
|
%
|
|
%
|
|
We follow the System V calling
|
|
We follow the System V calling
|
|
conventions~\citep{Bryant:2005aa,Matz:2013aa}, which means that the
|
|
conventions~\citep{Bryant:2005aa,Matz:2013aa}, which means that the
|
|
@@ -275,30 +338,139 @@ code that we generate will \emph{not} work properly with our runtime
|
|
system on Windows. One option to consider for using a Windows computer
|
|
system on Windows. One option to consider for using a Windows computer
|
|
is to run a virtual machine with Linux as the guest operating system.
|
|
is to run a virtual machine with Linux as the guest operating system.
|
|
|
|
|
|
-%\section*{Structure of book}
|
|
|
|
-% You might want to add short description about each chapter in this book.
|
|
|
|
-
|
|
|
|
-%\section*{About the companion website}
|
|
|
|
-%The website\footnote{\url{https://github.com/amberj/latex-book-template}} for %this file contains:
|
|
|
|
-%\begin{itemize}
|
|
|
|
-% \item A link to (freely downlodable) latest version of this document.
|
|
|
|
-% \item Link to download LaTeX source for this document.
|
|
|
|
-% \item Miscellaneous material (e.g. suggested readings etc).
|
|
|
|
-%\end{itemize}
|
|
|
|
|
|
+% TODO: point to support code on github
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+%% The tradition of compiler writing at Indiana University goes back to
|
|
|
|
+%% research and courses on programming languages by Professor Daniel
|
|
|
|
+%% Friedman in the 1970's and 1980's. Friedman conducted research on lazy
|
|
|
|
+%% evaluation~\citep{Friedman:1976aa} in the context of
|
|
|
|
+%% Lisp~\citep{McCarthy:1960dz} and then studied
|
|
|
|
+%% continuations~\citep{Felleisen:kx} and
|
|
|
|
+%% macros~\citep{Kohlbecker:1986dk} in the context of the
|
|
|
|
+%% Scheme~\citep{Sussman:1975ab}, a dialect of Lisp. One of the students
|
|
|
|
+%% of those courses, Kent Dybvig, went on to build Chez
|
|
|
|
+%% Scheme~\citep{Dybvig:2006aa}, a production-quality and efficient
|
|
|
|
+%% compiler for Scheme. After completing his Ph.D. at the University of
|
|
|
|
+%% North Carolina, he returned to teach at Indiana University.
|
|
|
|
+%% Throughout the 1990's and 2000's, Professor Dybvig continued
|
|
|
|
+%% development of Chez Scheme and taught the compiler course.
|
|
|
|
+
|
|
|
|
+%% The compiler course evolved to incorporate novel pedagogical ideas
|
|
|
|
+%% while also including elements of effective real-world compilers. One
|
|
|
|
+%% of Friedman's ideas was to split the compiler into many small
|
|
|
|
+%% ``passes'' so that the code for each pass would be easy to understood
|
|
|
|
+%% in isolation. In contrast, most compilers of the time were organized
|
|
|
|
+%% into only a few monolithic passes for reasons of compile-time
|
|
|
|
+%% efficiency. Another idea, called ``the game'', was to test the code
|
|
|
|
+%% generated by each pass on interpreters for each intermediate language,
|
|
|
|
+%% thereby helping to pinpoint errors in individual passes.
|
|
|
|
+%% %
|
|
|
|
+%% Dybvig, with later help from his students Dipanwita Sarkar and Andrew
|
|
|
|
+%% Keep, developed infrastructure to support this approach and evolved
|
|
|
|
+%% the course, first to use smaller micro-passes and then into even
|
|
|
|
+%% smaller nano-passes~\citep{Sarkar:2004fk,Keep:2012aa}. I was a student
|
|
|
|
+%% in this compiler course in the early 2000's as part of my
|
|
|
|
+%% Ph.D. studies at Indiana University. Needless to say, I enjoyed the
|
|
|
|
+%% course immensely!
|
|
|
|
+
|
|
|
|
+%% During that time, another graduate student named Abdulaziz Ghuloum
|
|
|
|
+%% observed that the front-to-back organization of the course made it
|
|
|
|
+%% difficult for students to understand the rationale for the compiler
|
|
|
|
+%% design. Ghuloum proposed an incremental approach in which the students
|
|
|
|
+%% start by implementing a complete compiler for a very small subset of
|
|
|
|
+%% the language. In each subsequent stage they add a feature to the
|
|
|
|
+%% language and then add or modify passes to handle the new
|
|
|
|
+%% feature~\citep{Ghuloum:2006bh}. In this way, the students see how the
|
|
|
|
+%% language features motivate aspects of the compiler design.
|
|
|
|
+
|
|
|
|
+%% After graduating from Indiana University in 2005, I went on to teach
|
|
|
|
+%% at the University of Colorado. I adapted the nano-pass and incremental
|
|
|
|
+%% approaches to compiling a subset of the Python
|
|
|
|
+%% language~\citep{Siek:2012ab}.
|
|
|
|
+%% %% Python and Scheme are quite different
|
|
|
|
+%% %% on the surface but there is a large overlap in the compiler techniques
|
|
|
|
+%% %% required for the two languages. Thus, I was able to teach much of the
|
|
|
|
+%% %% same content from the Indiana compiler course.
|
|
|
|
+%% I very much enjoyed teaching the course organized in this way, and
|
|
|
|
+%% even better, many of the students learned a lot and got excited about
|
|
|
|
+%% compilers.
|
|
|
|
+
|
|
|
|
+%% I returned to Indiana University in 2013. In my absence the compiler
|
|
|
|
+%% course had switched from the front-to-back organization to a
|
|
|
|
+%% back-to-front~\citep{Dybvig:2010aa}. While that organization also works
|
|
|
|
+%% well, I prefer the incremental approach and started porting and
|
|
|
|
+%% adapting the structure of the Colorado course back into the land of
|
|
|
|
+%% Scheme. In the meantime Indiana University had moved on from Scheme to
|
|
|
|
+%% Racket~\citep{plt-tr}, so the course is now about compiling a subset
|
|
|
|
+%% of Racket (and Typed Racket) to the x86 assembly language.
|
|
|
|
+
|
|
|
|
+%% This is the textbook for the incremental version of the compiler
|
|
|
|
+%% course at Indiana University (Spring 2016 - present). With this book
|
|
|
|
+%% I hope to make the Indiana compiler course available to people that
|
|
|
|
+%% have not had the chance to study compilers at Indiana University.
|
|
|
|
+
|
|
|
|
+%% %% I have captured what
|
|
|
|
+%% %% I think are the most important topics from \cite{Dybvig:2010aa} but
|
|
|
|
+%% %% have omitted topics that are less interesting conceptually. I have
|
|
|
|
+%% %% also made simplifications to reduce complexity. In this way, this
|
|
|
|
+%% %% book leans more towards pedagogy than towards the efficiency of the
|
|
|
|
+%% %% generated code. Also, the book differs in places where we I the
|
|
|
|
+%% %% opportunity to make the topics more fun, such as in relating register
|
|
|
|
+%% %% allocation to Sudoku (Chapter~\ref{ch:register-allocation-Rvar}).
|
|
|
|
+
|
|
|
|
+%% \section*{Prerequisites}
|
|
|
|
+
|
|
|
|
+%% The material in this book is challenging but rewarding. It is meant to
|
|
|
|
+%% prepare students for a lifelong career in programming languages.
|
|
|
|
+
|
|
|
|
+%% %\section*{Structure of book}
|
|
|
|
+%% % You might want to add short description about each chapter in this book.
|
|
|
|
+
|
|
|
|
+%% %\section*{About the companion website}
|
|
|
|
+%% %The website\footnote{\url{https://github.com/amberj/latex-book-template}} for %this file contains:
|
|
|
|
+%% %\begin{itemize}
|
|
|
|
+%% % \item A link to (freely downlodable) latest version of this document.
|
|
|
|
+%% % \item Link to download LaTeX source for this document.
|
|
|
|
+%% % \item Miscellaneous material (e.g. suggested readings etc).
|
|
|
|
+%% %\end{itemize}
|
|
|
|
|
|
\section*{Acknowledgments}
|
|
\section*{Acknowledgments}
|
|
|
|
|
|
-Many people have contributed to the ideas, techniques, and
|
|
|
|
-organization of this book and have taught courses based on it. Many
|
|
|
|
-of the compiler design decisions in this book are drawn from the
|
|
|
|
-assignment descriptions of \cite{Dybvig:2010aa}. We also would like
|
|
|
|
-to thank John Clements, Bor-Yuh Evan Chang, Daniel P. Friedman, Ronald
|
|
|
|
|
|
+The tradition of compiler writing at Indiana University goes back to
|
|
|
|
+research and courses on programming languages by Professor Daniel
|
|
|
|
+Friedman in the 1970s and 1980s. One of his students, Kent Dybvig,
|
|
|
|
+built Chez Scheme~\citep{Dybvig:2006aa}, a production-quality and
|
|
|
|
+efficient compiler for Scheme. Throughout the 1990s and 2000s,
|
|
|
|
+Professor Dybvig taught the compiler course and continued development
|
|
|
|
+of Chez Scheme.
|
|
|
|
+%
|
|
|
|
+The compiler course evolved to incorporate novel pedagogical ideas
|
|
|
|
+while also including elements of efficient real-world compilers. One
|
|
|
|
+of Friedman's ideas was to split the compiler into many small
|
|
|
|
+passes. Another idea, called ``the game'', was to test the code
|
|
|
|
+generated by each pass on interpreters.
|
|
|
|
+
|
|
|
|
+Dybvig, with later help from his students Dipanwita Sarkar and Andrew
|
|
|
|
+Keep, developed infrastructure to support this approach and evolved
|
|
|
|
+the course to use even smaller
|
|
|
|
+nanopasses~\citep{Sarkar:2004fk,Keep:2012aa}. Many of the compiler
|
|
|
|
+design decisions in this book are drawn from the assignment
|
|
|
|
+descriptions of \citet{Dybvig:2010aa}. A graduate student named
|
|
|
|
+Abdulaziz Ghuloum observed that the front-to-back organization of the
|
|
|
|
+course made it difficult for students to understand the rationale for
|
|
|
|
+the compiler design. Ghuloum proposed the incremental
|
|
|
|
+approach~\citep{Ghuloum:2006bh}.
|
|
|
|
+
|
|
|
|
+We thank John Clements, Bor-Yuh Evan Chang, Daniel P. Friedman, Ronald
|
|
Garcia, Abdulaziz Ghuloum, Jay McCarthy, Nate Nystrom, Dipanwita
|
|
Garcia, Abdulaziz Ghuloum, Jay McCarthy, Nate Nystrom, Dipanwita
|
|
Sarkar, Oscar Waddell, and Michael Wollowski.
|
|
Sarkar, Oscar Waddell, and Michael Wollowski.
|
|
|
|
|
|
\mbox{}\\
|
|
\mbox{}\\
|
|
\noindent Jeremy G. Siek \\
|
|
\noindent Jeremy G. Siek \\
|
|
-\noindent \url{http://homes.soic.indiana.edu/jsiek} \\
|
|
|
|
|
|
+Indiana University
|
|
|
|
+%\noindent \url{http://homes.soic.indiana.edu/jsiek} \\
|
|
%\noindent Spring 2016
|
|
%\noindent Spring 2016
|
|
|
|
|
|
|
|
|
|
@@ -937,7 +1109,7 @@ do anything. On the other hand, if the error is a
|
|
\code{trapped-error}, then the compiler must produce an executable and
|
|
\code{trapped-error}, then the compiler must produce an executable and
|
|
it is required to report that an error occurred. To signal an error,
|
|
it is required to report that an error occurred. To signal an error,
|
|
exit with a return code of \code{255}. The interpreters in chapters
|
|
exit with a return code of \code{255}. The interpreters in chapters
|
|
-\ref{ch:type-dynamic} and \ref{ch:gradual-typing} use
|
|
|
|
|
|
+\ref{ch:type-dynamic} and \ref{ch:Rgrad} use
|
|
\code{trapped-error}.
|
|
\code{trapped-error}.
|
|
|
|
|
|
%% This convention applies to the languages defined in this
|
|
%% This convention applies to the languages defined in this
|
|
@@ -1066,7 +1238,7 @@ Appendix~\ref{appendix:utilities}.\\
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Integers and Variables}
|
|
\chapter{Integers and Variables}
|
|
-\label{ch:int-exp}
|
|
|
|
|
|
+\label{ch:Rvar}
|
|
|
|
|
|
This chapter is about compiling a subset of Racket to x86-64 assembly
|
|
This chapter is about compiling a subset of Racket to x86-64 assembly
|
|
code~\citep{Intel:2015aa}. The subset, named \LangVar{}, includes
|
|
code~\citep{Intel:2015aa}. The subset, named \LangVar{}, includes
|
|
@@ -1488,7 +1660,7 @@ specified by the label and $\key{retq}$ returns from a procedure to
|
|
its caller.
|
|
its caller.
|
|
%
|
|
%
|
|
We discuss procedure calls in more detail later in this chapter and in
|
|
We discuss procedure calls in more detail later in this chapter and in
|
|
-Chapter~\ref{ch:functions}. The instruction $\key{jmp}\,\itm{label}$
|
|
|
|
|
|
+Chapter~\ref{ch:Rfun}. The instruction $\key{jmp}\,\itm{label}$
|
|
updates the program counter to the address of the instruction after
|
|
updates the program counter to the address of the instruction after
|
|
the specified label.
|
|
the specified label.
|
|
|
|
|
|
@@ -1641,7 +1813,7 @@ allowed in front of every instructions. Instead instructions are
|
|
grouped into \emph{blocks}\index{block}\index{basic block} with a
|
|
grouped into \emph{blocks}\index{block}\index{basic block} with a
|
|
label associated with every block, which is why the \key{X86Program}
|
|
label associated with every block, which is why the \key{X86Program}
|
|
struct includes an alist mapping labels to blocks. The reason for this
|
|
struct includes an alist mapping labels to blocks. The reason for this
|
|
-organization becomes apparent in Chapter~\ref{ch:bool-types} when we
|
|
|
|
|
|
+organization becomes apparent in Chapter~\ref{ch:Rif} when we
|
|
introduce conditional branching. The \code{Block} structure includes
|
|
introduce conditional branching. The \code{Block} structure includes
|
|
an $\itm{info}$ field that is not needed for this chapter, but becomes
|
|
an $\itm{info}$ field that is not needed for this chapter, but becomes
|
|
useful in Chapter~\ref{ch:register-allocation-Rvar}. For now, the
|
|
useful in Chapter~\ref{ch:register-allocation-Rvar}. For now, the
|
|
@@ -1779,7 +1951,7 @@ The ordering of \key{uniquify} with respect to
|
|
\key{uniquify} to come first.
|
|
\key{uniquify} to come first.
|
|
|
|
|
|
Last, we consider \key{select-instructions} and \key{assign-homes}.
|
|
Last, we consider \key{select-instructions} and \key{assign-homes}.
|
|
-These two passes are intertwined. In Chapter~\ref{ch:functions} we
|
|
|
|
|
|
+These two passes are intertwined. In Chapter~\ref{ch:Rfun} we
|
|
learn that, in x86, registers are used for passing arguments to
|
|
learn that, in x86, registers are used for passing arguments to
|
|
functions and it is preferable to assign parameters to their
|
|
functions and it is preferable to assign parameters to their
|
|
corresponding registers. On the other hand, by selecting instructions
|
|
corresponding registers. On the other hand, by selecting instructions
|
|
@@ -1867,7 +2039,7 @@ A \LangCVar{} program consists of a control-flow graph represented as
|
|
an alist mapping labels to tails. This is more general than necessary
|
|
an alist mapping labels to tails. This is more general than necessary
|
|
for the present chapter, as we do not yet introduce \key{goto} for
|
|
for the present chapter, as we do not yet introduce \key{goto} for
|
|
jumping to labels, but it saves us from having to change the syntax in
|
|
jumping to labels, but it saves us from having to change the syntax in
|
|
-Chapter~\ref{ch:bool-types}. For now there will be just one label,
|
|
|
|
|
|
+Chapter~\ref{ch:Rif}. For now there will be just one label,
|
|
\key{start}, and the whole program is its tail.
|
|
\key{start}, and the whole program is its tail.
|
|
%
|
|
%
|
|
The $\itm{info}$ field of the \key{CProgram} form, after the
|
|
The $\itm{info}$ field of the \key{CProgram} form, after the
|
|
@@ -2297,7 +2469,7 @@ output. The reader might be tempted to instead organize
|
|
\code{cont} parameter and perhaps using \code{append} to combine
|
|
\code{cont} parameter and perhaps using \code{append} to combine
|
|
statements. We warn against that alternative because the
|
|
statements. We warn against that alternative because the
|
|
accumulator-passing style is key to how we generate high-quality code
|
|
accumulator-passing style is key to how we generate high-quality code
|
|
-for conditional expressions in Chapter~\ref{ch:bool-types}.
|
|
|
|
|
|
+for conditional expressions in Chapter~\ref{ch:Rif}.
|
|
|
|
|
|
\begin{exercise}\normalfont
|
|
\begin{exercise}\normalfont
|
|
%
|
|
%
|
|
@@ -2655,7 +2827,7 @@ all, fast code is useless if it produces incorrect results!
|
|
|
|
|
|
\index{register allocation}
|
|
\index{register allocation}
|
|
|
|
|
|
-In Chapter~\ref{ch:int-exp} we learned how to store variables on the
|
|
|
|
|
|
+In Chapter~\ref{ch:Rvar} we learned how to store variables on the
|
|
stack. In this chapter we learn how to improve the performance of the
|
|
stack. In this chapter we learn how to improve the performance of the
|
|
generated code by placing some variables into registers. The CPU can
|
|
generated code by placing some variables into registers. The CPU can
|
|
access a register in a single cycle, whereas accessing the stack can
|
|
access a register in a single cycle, whereas accessing the stack can
|
|
@@ -2724,7 +2896,7 @@ then model register allocation as a graph coloring problem
|
|
|
|
|
|
If we run out of registers despite these efforts, we place the
|
|
If we run out of registers despite these efforts, we place the
|
|
remaining variables on the stack, similar to what we did in
|
|
remaining variables on the stack, similar to what we did in
|
|
-Chapter~\ref{ch:int-exp}. It is common to use the verb \emph{spill}
|
|
|
|
|
|
+Chapter~\ref{ch:Rvar}. It is common to use the verb \emph{spill}
|
|
for assigning a variable to a stack location. The decision to spill a
|
|
for assigning a variable to a stack location. The decision to spill a
|
|
variable is handled as part of the graph coloring process
|
|
variable is handled as part of the graph coloring process
|
|
(Section~\ref{sec:graph-coloring}).
|
|
(Section~\ref{sec:graph-coloring}).
|
|
@@ -2800,7 +2972,7 @@ rdi rsi rdx rcx r8 r9
|
|
\end{lstlisting}
|
|
\end{lstlisting}
|
|
If there are more than six arguments, then the convention is to use
|
|
If there are more than six arguments, then the convention is to use
|
|
space on the frame of the caller for the rest of the
|
|
space on the frame of the caller for the rest of the
|
|
-arguments. However, in Chapter~\ref{ch:functions} we arrange never to
|
|
|
|
|
|
+arguments. However, in Chapter~\ref{ch:Rfun} we arrange never to
|
|
need more than six arguments. For now, the only function we care about
|
|
need more than six arguments. For now, the only function we care about
|
|
is \code{read\_int} and it takes zero arguments.
|
|
is \code{read\_int} and it takes zero arguments.
|
|
%
|
|
%
|
|
@@ -3407,7 +3579,7 @@ particular, we assign $-1$ to \code{rax} and $-2$ to \code{rsp}.
|
|
%% One might wonder why we include registers at all in the liveness
|
|
%% One might wonder why we include registers at all in the liveness
|
|
%% analysis and interference graph. For example, we never allocate a
|
|
%% analysis and interference graph. For example, we never allocate a
|
|
%% variable to \code{rax} and \code{rsp}, so it would be harmless to
|
|
%% variable to \code{rax} and \code{rsp}, so it would be harmless to
|
|
-%% leave them out. As we see in Chapter~\ref{ch:tuples}, when we begin
|
|
|
|
|
|
+%% leave them out. As we see in Chapter~\ref{ch:Rvec}, when we begin
|
|
%% to use registers for passing arguments to functions, it will be
|
|
%% to use registers for passing arguments to functions, it will be
|
|
%% necessary for those registers to appear in the interference graph
|
|
%% necessary for those registers to appear in the interference graph
|
|
%% because those registers will also be assigned to variables, and we
|
|
%% because those registers will also be assigned to variables, and we
|
|
@@ -3692,7 +3864,7 @@ We recommend creating an auxiliary function named \code{color-graph}
|
|
that takes an interference graph and a list of all the variables in
|
|
that takes an interference graph and a list of all the variables in
|
|
the program. This function should return a mapping of variables to
|
|
the program. This function should return a mapping of variables to
|
|
their colors (represented as natural numbers). By creating this helper
|
|
their colors (represented as natural numbers). By creating this helper
|
|
-function, you will be able to reuse it in Chapter~\ref{ch:functions}
|
|
|
|
|
|
+function, you will be able to reuse it in Chapter~\ref{ch:Rfun}
|
|
when we add support for functions.
|
|
when we add support for functions.
|
|
|
|
|
|
To prioritize the processing of highly saturated nodes inside the
|
|
To prioritize the processing of highly saturated nodes inside the
|
|
@@ -4270,7 +4442,7 @@ conclusion:
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Booleans and Control Flow}
|
|
\chapter{Booleans and Control Flow}
|
|
-\label{ch:bool-types}
|
|
|
|
|
|
+\label{ch:Rif}
|
|
\index{Boolean}
|
|
\index{Boolean}
|
|
\index{control flow}
|
|
\index{control flow}
|
|
\index{conditional expression}
|
|
\index{conditional expression}
|
|
@@ -5543,7 +5715,7 @@ Use the \code{tsort} and \code{transpose} functions of the Racket
|
|
As an aside, a topological ordering is only guaranteed to exist if the
|
|
As an aside, a topological ordering is only guaranteed to exist if the
|
|
graph does not contain any cycles. That is indeed the case for the
|
|
graph does not contain any cycles. That is indeed the case for the
|
|
control-flow graphs that we generate from \LangIf{} programs.
|
|
control-flow graphs that we generate from \LangIf{} programs.
|
|
-However, in Chapter~\ref{ch:loop} we add loops to \LangLoop{} and
|
|
|
|
|
|
+However, in Chapter~\ref{ch:Rwhile} we add loops to \LangLoop{} and
|
|
learn how to handle cycles in the control-flow graph.
|
|
learn how to handle cycles in the control-flow graph.
|
|
|
|
|
|
You'll need to construct a directed graph to represent the
|
|
You'll need to construct a directed graph to represent the
|
|
@@ -5956,7 +6128,7 @@ blocks on several test programs.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Tuples and Garbage Collection}
|
|
\chapter{Tuples and Garbage Collection}
|
|
-\label{ch:tuples}
|
|
|
|
|
|
+\label{ch:Rvec}
|
|
\index{tuple}
|
|
\index{tuple}
|
|
\index{vector}
|
|
\index{vector}
|
|
|
|
|
|
@@ -5976,7 +6148,7 @@ no longer needed, which is why we also study \emph{garbage collection}
|
|
|
|
|
|
Section~\ref{sec:r3} introduces the \LangVec{} language including its
|
|
Section~\ref{sec:r3} introduces the \LangVec{} language including its
|
|
interpreter and type checker. The \LangVec{} language extends the \LangIf{}
|
|
interpreter and type checker. The \LangVec{} language extends the \LangIf{}
|
|
-language of Chapter~\ref{ch:bool-types} with vectors and Racket's
|
|
|
|
|
|
+language of Chapter~\ref{ch:Rif} with vectors and Racket's
|
|
\code{void} value. The reason for including the latter is that the
|
|
\code{void} value. The reason for including the latter is that the
|
|
\code{vector-set!} operation returns a value of type
|
|
\code{vector-set!} operation returns a value of type
|
|
\code{Void}\footnote{Racket's \code{Void} type corresponds to what is
|
|
\code{Void}\footnote{Racket's \code{Void} type corresponds to what is
|
|
@@ -7333,7 +7505,7 @@ from the set.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Functions}
|
|
\chapter{Functions}
|
|
-\label{ch:functions}
|
|
|
|
|
|
+\label{ch:Rfun}
|
|
\index{function}
|
|
\index{function}
|
|
|
|
|
|
This chapter studies the compilation of functions similar to those
|
|
This chapter studies the compilation of functions similar to those
|
|
@@ -7341,7 +7513,7 @@ found in the C language. This corresponds to a subset of Typed Racket
|
|
in which only top-level function definitions are allowed. This kind of
|
|
in which only top-level function definitions are allowed. This kind of
|
|
function is an important stepping stone to implementing
|
|
function is an important stepping stone to implementing
|
|
lexically-scoped functions, that is, \key{lambda} abstractions, which
|
|
lexically-scoped functions, that is, \key{lambda} abstractions, which
|
|
-is the topic of Chapter~\ref{ch:lambdas}.
|
|
|
|
|
|
+is the topic of Chapter~\ref{ch:Rlam}.
|
|
|
|
|
|
\section{The \LangFun{} Language}
|
|
\section{The \LangFun{} Language}
|
|
|
|
|
|
@@ -8294,7 +8466,7 @@ except the \code{retq} is replaced with \code{jmp *$\itm{arg}$}.
|
|
Regarding function definitions, you will need to generate a prelude
|
|
Regarding function definitions, you will need to generate a prelude
|
|
and conclusion for each one. This code is similar to the prelude and
|
|
and conclusion for each one. This code is similar to the prelude and
|
|
conclusion that you generated for the \code{main} function in
|
|
conclusion that you generated for the \code{main} function in
|
|
-Chapter~\ref{ch:tuples}. To review, the prelude of every function
|
|
|
|
|
|
+Chapter~\ref{ch:Rvec}. To review, the prelude of every function
|
|
should carry out the following steps.
|
|
should carry out the following steps.
|
|
\begin{enumerate}
|
|
\begin{enumerate}
|
|
\item Start with \code{.global} and \code{.align} directives followed
|
|
\item Start with \code{.global} and \code{.align} directives followed
|
|
@@ -8502,7 +8674,7 @@ mainconclusion:
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Lexically Scoped Functions}
|
|
\chapter{Lexically Scoped Functions}
|
|
-\label{ch:lambdas}
|
|
|
|
|
|
+\label{ch:Rlam}
|
|
\index{lambda}
|
|
\index{lambda}
|
|
\index{lexical scoping}
|
|
\index{lexical scoping}
|
|
|
|
|
|
@@ -8577,8 +8749,8 @@ applied. An efficient solution to the problem, due to
|
|
free variables together with the function pointer for the lambda's
|
|
free variables together with the function pointer for the lambda's
|
|
code, an arrangement called a \emph{flat closure} (which we shorten to
|
|
code, an arrangement called a \emph{flat closure} (which we shorten to
|
|
just ``closure''). \index{closure}\index{flat closure} Fortunately,
|
|
just ``closure''). \index{closure}\index{flat closure} Fortunately,
|
|
-we have all the ingredients to make closures, Chapter~\ref{ch:tuples}
|
|
|
|
-gave us vectors and Chapter~\ref{ch:functions} gave us function
|
|
|
|
|
|
+we have all the ingredients to make closures, Chapter~\ref{ch:Rvec}
|
|
|
|
+gave us vectors and Chapter~\ref{ch:Rfun} gave us function
|
|
pointers. The function pointer resides at index $0$ and the
|
|
pointers. The function pointer resides at index $0$ and the
|
|
values for the free variables will fill in the rest of the vector.
|
|
values for the free variables will fill in the rest of the vector.
|
|
|
|
|
|
@@ -9214,7 +9386,7 @@ work of \citet{Keep:2012ab}.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Dynamic Typing}
|
|
\chapter{Dynamic Typing}
|
|
-\label{ch:type-dynamic}
|
|
|
|
|
|
+\label{ch:Rdyn}
|
|
\index{dynamic typing}
|
|
\index{dynamic typing}
|
|
|
|
|
|
In this chapter we discuss the compilation of \LangDyn{}, a dynamically
|
|
In this chapter we discuss the compilation of \LangDyn{}, a dynamically
|
|
@@ -9317,7 +9489,7 @@ There is no type checker for \LangDyn{} because it is not a statically
|
|
typed language (it's dynamically typed!).
|
|
typed language (it's dynamically typed!).
|
|
|
|
|
|
The definitional interpreter for \LangDyn{} is presented in
|
|
The definitional interpreter for \LangDyn{} is presented in
|
|
-Figure~\ref{fig:interp-Rdyn} and its auxiliary functions are defined in
|
|
|
|
|
|
+Figure~\ref{fig:interp-Rdyn} and its auxiliary functions are defined i
|
|
Figure~\ref{fig:interp-Rdyn-aux}. Consider the match case for
|
|
Figure~\ref{fig:interp-Rdyn-aux}. Consider the match case for
|
|
\code{(Int n)}. Instead of simply returning the integer \code{n} (as
|
|
\code{(Int n)}. Instead of simply returning the integer \code{n} (as
|
|
in the interpreter for \LangVar{} in Figure~\ref{fig:interp-Rvar}), the
|
|
in the interpreter for \LangVar{} in Figure~\ref{fig:interp-Rvar}), the
|
|
@@ -10284,7 +10456,7 @@ for the compilation of \LangDyn{}.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Loops and Assignment}
|
|
\chapter{Loops and Assignment}
|
|
-\label{ch:loop}
|
|
|
|
|
|
+\label{ch:Rwhile}
|
|
|
|
|
|
% TODO: define R'_8
|
|
% TODO: define R'_8
|
|
|
|
|
|
@@ -10893,7 +11065,7 @@ in $A_b \cap F_b$ with \code{AssignedFree} to produce \itm{params'}.
|
|
Let $P$ be the set of parameter names in \itm{params}. The result is
|
|
Let $P$ be the set of parameter names in \itm{params}. The result is
|
|
$\LAMBDA{\itm{params'}}{T}{\itm{body'}}$, $A_b - P$, and $(F_b \cup
|
|
$\LAMBDA{\itm{params'}}{T}{\itm{body'}}$, $A_b - P$, and $(F_b \cup
|
|
\mathrm{FV}(\itm{body})) - P$, where $\mathrm{FV}$ computes the free
|
|
\mathrm{FV}(\itm{body})) - P$, where $\mathrm{FV}$ computes the free
|
|
-variables of an expression (see Chapter~\ref{ch:lambdas}).
|
|
|
|
|
|
+variables of an expression (see Chapter~\ref{ch:Rlam}).
|
|
|
|
|
|
\paragraph{Convert Assignments}
|
|
\paragraph{Convert Assignments}
|
|
|
|
|
|
@@ -11211,7 +11383,7 @@ for the compilation of \LangLoop{}.
|
|
\section{Challenge: Arrays}
|
|
\section{Challenge: Arrays}
|
|
\label{sec:arrays}
|
|
\label{sec:arrays}
|
|
|
|
|
|
-In Chapter~\ref{ch:tuples} we studied tuples, that is, sequences of
|
|
|
|
|
|
+In Chapter~\ref{ch:Rvec} we studied tuples, that is, sequences of
|
|
elements whose length is determined at compile-time and where each
|
|
elements whose length is determined at compile-time and where each
|
|
element of a tuple may have a different type (they are
|
|
element of a tuple may have a different type (they are
|
|
heterogeneous). This challenge is also about sequences, but this time
|
|
heterogeneous). This challenge is also about sequences, but this time
|
|
@@ -11435,7 +11607,7 @@ an array:
|
|
\end{itemize}
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
-Recall that in Chapter~\ref{ch:type-dynamic}, we use a $3$-bit tag to
|
|
|
|
|
|
+Recall that in Chapter~\ref{ch:Rdyn}, we use a $3$-bit tag to
|
|
differentiate the kinds of values that have been injected into the
|
|
differentiate the kinds of values that have been injected into the
|
|
\code{Any} type. We use the bit pattern \code{110} (or $6$ in decimal)
|
|
\code{Any} type. We use the bit pattern \code{110} (or $6$ in decimal)
|
|
to indicate that the value is an array.
|
|
to indicate that the value is an array.
|
|
@@ -11448,7 +11620,7 @@ the passes to handle arrays.
|
|
|
|
|
|
The array-access operators \code{vectorof-ref} and
|
|
The array-access operators \code{vectorof-ref} and
|
|
\code{vectorof-set!} are similar to the \code{any-vector-ref} and
|
|
\code{vectorof-set!} are similar to the \code{any-vector-ref} and
|
|
-\code{any-vector-set!} operators of Chapter~\ref{ch:type-dynamic} in
|
|
|
|
|
|
+\code{any-vector-set!} operators of Chapter~\ref{ch:Rdyn} in
|
|
that the type checker cannot tell whether the index will be in bounds,
|
|
that the type checker cannot tell whether the index will be in bounds,
|
|
so the bounds check must be performed at run time. Recall that the
|
|
so the bounds check must be performed at run time. Recall that the
|
|
\code{reveal-casts} pass (Section~\ref{sec:reveal-casts-Rany}) wraps
|
|
\code{reveal-casts} pass (Section~\ref{sec:reveal-casts-Rany}) wraps
|
|
@@ -11538,7 +11710,7 @@ arrays by laying out each row in the array, one after the next.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Gradual Typing}
|
|
\chapter{Gradual Typing}
|
|
-\label{ch:gradual-typing}
|
|
|
|
|
|
+\label{ch:Rgrad}
|
|
\index{gradual typing}
|
|
\index{gradual typing}
|
|
|
|
|
|
This chapter studies a language, \LangGrad{}, in which the programmer
|
|
This chapter studies a language, \LangGrad{}, in which the programmer
|
|
@@ -11630,7 +11802,7 @@ syntax.
|
|
Both the type checker and the interpreter for \LangGrad{} require some
|
|
Both the type checker and the interpreter for \LangGrad{} require some
|
|
interesting changes to enable gradual typing, which we discuss in the
|
|
interesting changes to enable gradual typing, which we discuss in the
|
|
next two sections in the context of the \code{map-vec} example from
|
|
next two sections in the context of the \code{map-vec} example from
|
|
-Chapter~\ref{ch:functions}. In Figure~\ref{fig:gradual-map-vec} we
|
|
|
|
|
|
+Chapter~\ref{ch:Rfun}. In Figure~\ref{fig:gradual-map-vec} we
|
|
revised the \code{map-vec} example, omitting the type annotations from
|
|
revised the \code{map-vec} example, omitting the type annotations from
|
|
the \code{add1} function.
|
|
the \code{add1} function.
|
|
|
|
|
|
@@ -12706,7 +12878,7 @@ recommend the reader to the online gradual typing bibliography:
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\chapter{Parametric Polymorphism}
|
|
\chapter{Parametric Polymorphism}
|
|
-\label{ch:parametric-polymorphism}
|
|
|
|
|
|
+\label{ch:Rpoly}
|
|
\index{parametric polymorphism}
|
|
\index{parametric polymorphism}
|
|
\index{generics}
|
|
\index{generics}
|
|
|
|
|
|
@@ -12752,7 +12924,7 @@ declaration comes before the \code{define}. In the abstract syntax,
|
|
the return type in the \code{Def} is \code{Any}, but that should be
|
|
the return type in the \code{Def} is \code{Any}, but that should be
|
|
ignored in favor of the return type in the type declaration. (The
|
|
ignored in favor of the return type in the type declaration. (The
|
|
\code{Any} comes from using the same parser as in
|
|
\code{Any} comes from using the same parser as in
|
|
-Chapter~\ref{ch:type-dynamic}.) The presence of a type declaration
|
|
|
|
|
|
+Chapter~\ref{ch:Rdyn}.) The presence of a type declaration
|
|
enables the use of an \code{All} type for a function, thereby making
|
|
enables the use of an \code{All} type for a function, thereby making
|
|
it polymorphic. The grammar for types is extended to include
|
|
it polymorphic. The grammar for types is extended to include
|
|
polymorphic types and type variables.
|
|
polymorphic types and type variables.
|
|
@@ -13154,7 +13326,7 @@ add just one new pass, \code{erase-types}, to compile \LangInst{} to
|
|
\section{Erase Types}
|
|
\section{Erase Types}
|
|
\label{sec:erase-types}
|
|
\label{sec:erase-types}
|
|
|
|
|
|
-We use the \code{Any} type from Chapter~\ref{ch:type-dynamic} to
|
|
|
|
|
|
+We use the \code{Any} type from Chapter~\ref{ch:Rdyn} to
|
|
represent type variables. For example, Figure~\ref{fig:map-vec-erase}
|
|
represent type variables. For example, Figure~\ref{fig:map-vec-erase}
|
|
shows the output of the \code{erase-types} pass on the polymorphic
|
|
shows the output of the \code{erase-types} pass on the polymorphic
|
|
\code{map-vec} (Figure~\ref{fig:map-vec-poly}). The occurrences of
|
|
\code{map-vec} (Figure~\ref{fig:map-vec-poly}). The occurrences of
|