Jeremy Siek, 4 years ago
parent commit d736185f80
1 changed file with 43 additions and 144 deletions
      book.tex


@@ -169,56 +169,55 @@ University.
 There is a magical moment when a programmer presses the ``run'' button
 and the software begins to execute. Somehow a program written in a
 high-level language is running on a computer that is only capable of
-shuffling bits. This book reveals the wizardry that makes that moment
+shuffling bits. Here we reveal the wizardry that makes that moment
 possible. Beginning with the groundbreaking work of Backus and
 colleagues in the 1950s, computer scientists discovered techniques for
 constructing programs, called \emph{compilers}, that automatically
 translate high-level programs into machine code.
 
-This book guides the reader on the journey of constructing their own
-compiler for a small but powerful language. Along the way the reader
-learns the essential concepts, algorithms, and data structures that
-underlie modern compilers. They develop an understanding of how
-programs are mapped onto computer hardware which is helpful when
-reasoning about execution time, debugging errors across layers of the
-software stack, and finding security vulnerabilities.
-%
-For readers interested in a career in compiler construction, this book
-is a stepping-stone to advanced topics such as just-in-time
-compilation, program analysis, and program optimization.
-%
-For readers interested in the design of programming languages, this
-book connects language design choices to their impact on the compiler
-and generated code.
-
-A compiler is typically organized as a pipeline with a handful of
-passes that translate a program into ever lower levels of
-abstraction. We take this approach to the extreme by partitioning our
-compiler into a large number of \emph{nanopasses}, each of which
-performs a single task. This makes the compiler easier to debug,
-because we test the output of each pass, and it makes the compiler
-easier to understand, because each pass involves fewer concepts.
-
-Most books about compiler construction are structured like the
-compiler, with each chapter describing one pass. The problem with that
-structure is that it obfuscates how language features motivate design
-choices in the compiler. We take an \emph{incremental} approach in
-which we build a complete compiler in each chapter, starting with a
-tiny language and adding new features in subsequent chapters.
+We take you on a journey by constructing your own compiler for a small
+but powerful language. Along the way we explain the essential
+concepts, algorithms, and data structures that underlie compilers. We
+develop your understanding of how programs are mapped onto computer
+hardware, which is helpful when reasoning about properties at the
+junction between hardware and software such as execution time,
+software errors, and security vulnerabilities.  For those interested
+in pursuing compiler construction, our goal is to provide a
+stepping-stone to advanced topics such as just-in-time compilation,
+program analysis, and program optimization.  For those interested in
+designing and implementing their own programming languages, we connect
+language design choices to their impact on the compiler and its
+generated code.
+
+A compiler is typically organized as a sequence of stages that
+progressively translate a program into code that runs on hardware. We
+take this approach to the extreme by partitioning our compiler into a
+large number of \emph{nanopasses}, each of which performs a single
+task. This allows us to test the output of each pass in isolation,
+and furthermore, to focus our attention on one task at a time, making
+the compiler far easier to understand.
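The nanopass idea can be illustrated with a toy sketch (our own invented example in Python; the book itself works in Racket): each pass is a small function that performs one task, and the compiler is simply their composition, so the output of every pass can be tested on its own.

```python
# Toy nanopass pipeline: two tiny passes over expressions built from
# nested tuples like ('+', 1, ('+', 2, 3)). This is an illustrative
# sketch, not the book's actual code.

def fold_constants(e):
    """Pass 1: collapse additions whose operands are both literals."""
    if isinstance(e, tuple):
        op, l, r = e
        l, r = fold_constants(l), fold_constants(r)
        if isinstance(l, int) and isinstance(r, int):
            return l + r
        return (op, l, r)
    return e

def select_instructions(e):
    """Pass 2: flatten an expression into accumulator-style
    pseudo-instructions, assuming right operands are literals."""
    if isinstance(e, int):
        return [('movq', e, 'rax')]
    _, l, r = e
    return select_instructions(l) + [('addq', r, 'rax')]

def compile_expr(e):
    # The pipeline: each pass feeds the next, and each can be
    # tested in isolation against its own input/output languages.
    return select_instructions(fold_constants(e))

print(compile_expr(('+', ('+', 1, 2), 3)))  # → [('movq', 6, 'rax')]
```

Because each pass has its own well-defined input and output language, a bug can be pinpointed to the single small pass that produced the bad output.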
+
+%% [TODO: easier to understand/debug for those maintaining the compiler,
+%%   proving correctness]
+
+The most familiar approach to describing compilers is with one pass
+per chapter.  The problem with that approach is that it obfuscates how
+language features motivate design choices in a compiler. We take an
+\emph{incremental} approach in which we build a complete compiler in
+each chapter, starting with arithmetic and variables and adding new
+features in subsequent chapters.
 
 Our choice of language features is designed to elicit the fundamental
-concepts and algorithms used in compilers for modern programming
-languages.
+concepts and algorithms used in compilers.
 \begin{itemize}
-\item We begin with integer arithmetic and local variables.  The
-  reader becomes acquainted with the basic tools of compiler
-  construction, \emph{abstract syntax trees} and \emph{recursive
-    functions}, in Chapter~\ref{ch:trees-recur} and applies them to a
-  language with integers and variables in Chapter~\ref{ch:Rvar}. In
-  Chapter~\ref{ch:register-allocation-Rvar} we apply \emph{graph
-    coloring} to assign variables to registers.
-\item Chapter~\ref{ch:Rif} adds conditional control-flow, which
-  motivates an elegant recursive algorithm for mapping expressions to
+\item We begin with integer arithmetic and local variables in
+  Chapters~\ref{ch:trees-recur} and \ref{ch:Rvar}, where we introduce
+  the fundamental tools of compiler construction: \emph{abstract
+    syntax trees} and \emph{recursive functions}. 
+\item In Chapter~\ref{ch:register-allocation-Rvar} we apply
+  \emph{graph coloring} to assign variables to machine registers.
+\item Chapter~\ref{ch:Rif} adds \code{if} expressions, which motivates
+  an elegant recursive algorithm for mapping expressions to
   \emph{control-flow graphs}.
 \item Chapter~\ref{ch:Rvec} adds heap-allocated tuples, motivating
   \emph{garbage collection}.
@@ -339,104 +338,6 @@ assembly code that we generate does \emph{not} work with the runtime
 system on Windows. One workaround is to use a virtual machine with
 Linux as the guest operating system.
 
-% TODO: point to support code on github
-
-
-
-%% The tradition of compiler writing at Indiana University goes back to
-%% research and courses on programming languages by Professor Daniel
-%% Friedman in the 1970's and 1980's. Friedman conducted research on lazy
-%% evaluation~\citep{Friedman:1976aa} in the context of
-%% Lisp~\citep{McCarthy:1960dz} and then studied
-%% continuations~\citep{Felleisen:kx} and
-%% macros~\citep{Kohlbecker:1986dk} in the context of the
-%% Scheme~\citep{Sussman:1975ab}, a dialect of Lisp.  One of the students
-%% of those courses, Kent Dybvig, went on to build Chez
-%% Scheme~\citep{Dybvig:2006aa}, a production-quality and efficient
-%% compiler for Scheme. After completing his Ph.D. at the University of
-%% North Carolina, he returned to teach at Indiana University.
-%% Throughout the 1990's and 2000's, Professor Dybvig continued
-%% development of Chez Scheme and taught the compiler course.
-
-%% The compiler course evolved to incorporate novel pedagogical ideas
-%% while also including elements of effective real-world compilers.  One
-%% of Friedman's ideas was to split the compiler into many small
-%% ``passes'' so that the code for each pass would be easy to understood
-%% in isolation.  In contrast, most compilers of the time were organized
-%% into only a few monolithic passes for reasons of compile-time
-%% efficiency. Another idea, called ``the game'', was to test the code
-%% generated by each pass on interpreters for each intermediate language,
-%% thereby helping to pinpoint errors in individual passes.
-%% %
-%% Dybvig, with later help from his students Dipanwita Sarkar and Andrew
-%% Keep, developed infrastructure to support this approach and evolved
-%% the course, first to use smaller micro-passes and then into even
-%% smaller nano-passes~\citep{Sarkar:2004fk,Keep:2012aa}. I was a student
-%% in this compiler course in the early 2000's as part of my
-%% Ph.D. studies at Indiana University. Needless to say, I enjoyed the
-%% course immensely!
-
-%% During that time, another graduate student named Abdulaziz Ghuloum
-%% observed that the front-to-back organization of the course made it
-%% difficult for students to understand the rationale for the compiler
-%% design. Ghuloum proposed an incremental approach in which the students
-%% start by implementing a complete compiler for a very small subset of
-%% the language. In each subsequent stage they add a feature to the
-%% language and then add or modify passes to handle the new
-%% feature~\citep{Ghuloum:2006bh}.  In this way, the students see how the
-%% language features motivate aspects of the compiler design.
-
-%% After graduating from Indiana University in 2005, I went on to teach
-%% at the University of Colorado. I adapted the nano-pass and incremental
-%% approaches to compiling a subset of the Python
-%% language~\citep{Siek:2012ab}.
-%% %% Python and Scheme are quite different
-%% %% on the surface but there is a large overlap in the compiler techniques
-%% %% required for the two languages. Thus, I was able to teach much of the
-%% %% same content from the Indiana compiler course.
-%% I very much enjoyed teaching the course organized in this way, and
-%% even better, many of the students learned a lot and got excited about
-%% compilers.
-
-%% I returned to Indiana University in 2013.  In my absence the compiler
-%% course had switched from the front-to-back organization to a
-%% back-to-front~\citep{Dybvig:2010aa}. While that organization also works
-%% well, I prefer the incremental approach and started porting and
-%% adapting the structure of the Colorado course back into the land of
-%% Scheme. In the meantime Indiana University had moved on from Scheme to
-%% Racket~\citep{plt-tr}, so the course is now about compiling a subset
-%% of Racket (and Typed Racket) to the x86 assembly language.
-
-%% This is the textbook for the incremental version of the compiler
-%% course at Indiana University (Spring 2016 - present).  With this book
-%% I hope to make the Indiana compiler course available to people that
-%% have not had the chance to study compilers at Indiana University.
-
-%% %% I have captured what
-%% %% I think are the most important topics from \cite{Dybvig:2010aa} but
-%% %% have omitted topics that are less interesting conceptually. I have
-%% %% also made simplifications to reduce complexity.  In this way, this
-%% %% book leans more towards pedagogy than towards the efficiency of the
-%% %% generated code. Also, the book differs in places where we I the
-%% %% opportunity to make the topics more fun, such as in relating register
-%% %% allocation to Sudoku (Chapter~\ref{ch:register-allocation-Rvar}).
-
-%% \section*{Prerequisites}
-
-%% The material in this book is challenging but rewarding. It is meant to
-%% prepare students for a lifelong career in programming languages.
-
-%% %\section*{Structure of book}
-%% % You might want to add short description about each chapter in this book.
-
-%% %\section*{About the companion website}
-%% %The website\footnote{\url{https://github.com/amberj/latex-book-template}} for %this file contains:
-%% %\begin{itemize}
-%% %  \item A link to (freely downlodable) latest version of this document.
-%% %  \item Link to download LaTeX source for this document.
-%% %  \item Miscellaneous material (e.g. suggested readings etc).
-%% %\end{itemize}
-
 \section*{Acknowledgments}
 
 The tradition of compiler construction at Indiana University goes back
@@ -471,13 +372,11 @@ We thank Ronald Garcia for being Jeremy's partner when they took the
 compiler course in the early 2000's and especially for finding the bug
 that sent the garbage collector on a wild goose chase!
 
-%Oscar Waddell ??
-
 \mbox{}\\
 \noindent Jeremy G. Siek \\
 Bloomington, Indiana
-%\noindent \url{http://homes.soic.indiana.edu/jsiek} \\
-%\noindent Spring 2016
+
+%Oscar Waddell ??