
uniform notion of block, introduce basic block terminology in the right places

Jeremy Siek, 3 years ago
Commit 1e89e2c626
1 file changed, with 31 insertions and 23 deletions

book.tex

@@ -2587,9 +2587,9 @@ Figure~\ref{fig:x86-int-ast}. We refer to this language as
 The main difference compared to the concrete syntax of \LangXInt{}
 (Figure~\ref{fig:x86-int-concrete}) is that labels are not allowed in
 front of every instruction. Instead instructions are grouped into
-\emph{blocks}\index{subject}{block} with a
-label associated with every block, which is why the \key{X86Program}
-struct includes an alist mapping labels to blocks. The reason for this
+\emph{basic blocks}\index{subject}{basic block} with a
+label associated with every basic block, which is why the \key{X86Program}
+struct includes an alist mapping labels to basic blocks. The reason for this
 organization becomes apparent in Chapter~\ref{ch:Lif} when we
 introduce conditional branching. The \code{Block} structure includes
 an $\itm{info}$ field that is not needed for this chapter but becomes
@@ -2741,9 +2741,8 @@ Our compiler for \LangVar{} consists of the following passes.
 {\if\edition\racketEd
 \item[\key{explicate\_control}] makes the execution order of the
   program explicit. It converts the abstract syntax tree
-  representation into a graph in which each node contains a sequence
-  of statements and the edges between nodes say which nodes contain
-  jumps to other nodes.
+  representation into a graph in which each node is a labeled sequence
+  of statements and the edges are \code{goto} statements.
 \fi}
 
 \item[\key{select\_instructions}] handles the difference between
@@ -7285,6 +7284,7 @@ making it straightforward to compile \code{if} statements to x86.  The
 \key{CProgram} construct contains an alist mapping labels to $\Tail$
 expressions. A \code{goto} statement transfers control to the $\Tail$
 expression corresponding to its label.
+%
 Figure~\ref{fig:c1-concrete-syntax} defines the concrete syntax of the
 \LangCIf{} intermediate language and Figure~\ref{fig:c1-syntax}
 defines its abstract syntax.
@@ -7308,13 +7308,14 @@ statement to finish the program with a specified value.
 %
 The \key{CProgram} construct contains a dictionary mapping labels to
 lists of statements that end with a \code{return} statement, a
-\code{goto}, or a conditional \code{goto}.  Statement lists of this
-form are called \emph{basic blocks}\index{subject}{basic block}: there
-is a control transfer at the end and control only enters at the
-beginning of the list, which is marked by the label.
+\code{goto}, or a conditional \code{goto}.
+%% Statement lists of this
+%% form are called \emph{basic blocks}\index{subject}{basic block}: there
+%% is a control transfer at the end and control only enters at the
+%% beginning of the list, which is marked by the label.
 %
-A \code{goto} statement transfers control to basic block corresponding
-to its label.
+A \code{goto} statement transfers control to the sequence of statements
+associated with its label.
 %
 The concrete syntax for \LangCIf{} is defined in
 Figure~\ref{fig:c1-concrete-syntax} and the abstract syntax is defined
@@ -7366,7 +7367,7 @@ in Figure~\ref{fig:c1-syntax}.
 \Stmt &::=& \PRINT{\Atm} \MID \EXPR{\Exp} \\
      &\MID& \ASSIGN{\VAR{\Var}}{\Exp}
      \MID \RETURN{\Exp} \MID \GOTO{\itm{label}} \\
-    &\MID& \IFSTMT{\CMP{\Atm}{\itm{cmp}}{\Atm}}{\LS\GOTO{\itm{label}}\RS}{\LS\GOTO{\itm{label}}\RS} 
+     &\MID& \IFSTMT{\CMP{\Atm}{\itm{cmp}}{\Atm}}{\LS\GOTO{\itm{label}}\RS}{\LS\GOTO{\itm{label}}\RS}
 \end{array}
 }
   
@@ -7444,6 +7445,11 @@ language. Figures~\ref{fig:x86-1-concrete} and \ref{fig:x86-1} define
 the concrete and abstract syntax for the \LangXIf{} subset of x86,
 which includes instructions for logical operations, comparisons, and
 \racket{conditional} jumps.
+%
+\python{The abstract syntax for an \LangXIf{} program contains a
+  dictionary mapping labels to sequences of instructions, each of
+  which we refer to as a \emph{basic block}\index{subject}{basic
+    block}.}
 
 
 One challenge is that x86 does not provide an instruction that
 directly implements logical negation (\code{not} in \LangIf{} and
@@ -7484,7 +7490,7 @@ $\Atm$ to x86.
    \MID \key{cmpq}~\Arg\key{,}~\Arg
     \MID  \key{set}cc~\Arg
     \MID \key{movzbq}~\Arg\key{,}~\Arg \\
-    &\MID& \key{j}cc~\itm{label}
+    &\MID& \key{j}cc~\itm{label} \\
 \end{array}
 }
 
@@ -7558,7 +7564,8 @@ $\Atm$ to x86.
        &\MID& \BININSTR{\scode{set}}{\itm{cc}}{\Arg}
        \MID \BININSTR{\scode{movzbq}}{\Arg}{\Arg}\\
        &\MID&  \JMPIF{\itm{cc}}{\itm{label}} \\
-\LangXIfM{} &::= & \XPROGRAM{\itm{info}}{\LC\itm{label} \,\key{:}\, \Instr^{*} \key{,} \ldots \RC }
+\Block &::= & \Instr^{+} \\
+\LangXIfM{} &::= & \XPROGRAM{\itm{info}}{\LC\itm{label} \,\key{:}\, \Block \key{,} \ldots \RC }
 \end{array}
 \]
 \fi}
@@ -7897,16 +7904,17 @@ Unfortunately, this approach duplicates the two branches from the
 outer \code{if} and a compiler must never duplicate code!  After all,
 the two branches could be very large expressions.
 
-We need a way to perform the above transformation but without
-duplicating code. That is, we need a way for different parts of a
-program to refer to the same piece of code.
+How can we apply the above transformation without duplicating
+code? In other words, how can two different parts of a program refer
+to one piece of code?
 %
-Put another way, we need to move away from abstract syntax
-\emph{trees} and instead use \emph{graphs}.
+The answer is that we must move away from abstract syntax \emph{trees}
+and instead use \emph{graphs}.
 %
 At the level of x86 assembly this is straightforward because we can
 label the code for each branch and insert jumps in all the places that
-need to execute the branch.
+need to execute the branch. In this way, jump instructions are edges
+in the graph and the basic blocks are the nodes.
 %
 Likewise, our language \LangCIf{} provides the ability to label a
 sequence of statements and to jump to a label via \code{goto}.
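Reviewer note: the picture added in this hunk, in which basic blocks are nodes and jump instructions are edges, can be made concrete with a small Python sketch. It assumes a hypothetical encoding in which a program is a dictionary from labels to instruction lists and each instruction is a tuple; `("jmp", label)` and `("jcc", cc, label)` are stand-ins for the jump instructions, and `build_cfg` is an illustrative name rather than a function from the book.

```python
# Hypothetical encoding: ("jmp", label) is an unconditional jump and
# ("jcc", cc, label) is a conditional jump; the target label comes last.
JUMPS = ("jmp", "jcc")


def build_cfg(blocks):
    """Map each block label to the labels its jump instructions target."""
    edges = {label: [] for label in blocks}
    for label, instrs in blocks.items():
        for instr in instrs:
            if instr[0] in JUMPS:
                target = instr[-1]
                if target not in edges[label]:
                    edges[label].append(target)
    return edges


blocks = {
    "start": [("cmpq", "$1", "%rax"), ("jcc", "e", "then"), ("jmp", "else")],
    "then": [("movq", "$42", "%rbx"), ("jmp", "join")],
    "else": [("movq", "$0", "%rbx"), ("jmp", "join")],
    "join": [("movq", "%rbx", "%rax"), ("jmp", "conclusion")],
}
cfg = build_cfg(blocks)
```

Both `then` and `else` have an edge to `join`, so the code in `join` appears once but is reachable from two places, which a tree cannot express without duplication.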
@@ -14101,8 +14109,8 @@ language, whose syntax is defined in Figure~\ref{fig:x86-3}.
 \Instr &::=& \ldots
      \MID \key{callq}\;\key{*}\Arg \MID \key{tailjmp}\;\Arg
      \MID \key{leaq}\;\Arg\key{,}\;\key{\%}\Reg \\
-\Block &::= & \itm{label}\key{:}\, \Instr^{*} \\
-\Def &::= & \key{.globl}\,\itm{label}\; \Block^{*} \\
+\Block &::= & \Instr^{+} \\
+\Def &::= & \key{.globl}\,\itm{label}\; (\itm{label}\key{:}\, \Block)^{*} \\
 \LangXIndCallM{} &::= & \Def\ldots
 \end{array}
 \]