@@ -4858,14 +4858,14 @@ rules. This process called \emph{state closure} is continued
until there are no more rules to add (similar to the prediction
actions of an Earley parser). We then examine each dotted rule in the
current state $I$. Suppose a dotted rule has the form $A ::=
-\alpha.X\beta$, where $A$ and $X$ are symbols and $\alpha$ and $\beta$
+s_1.\,X s_2$, where $A$ and $X$ are symbols and $s_1$ and $s_2$
are sequences of symbols. We create a new state, call it $J$. If $X$
is a terminal, we create a shift edge from $I$ to $J$ (analogous to
scanning in Earley), whereas if $X$ is a nonterminal, we create a
goto edge from $I$ to $J$. We then need to add some dotted rules to
state $J$. We start by adding all dotted rules from state $I$ that
-have the form $B ::= \gamma.X\kappa$ (where $B$ is any nonterminal and
-$\gamma$ and $\kappa$ are arbitrary sequences of symbols), but with
+have the form $B ::= s_1.\,Xs_2$ (where $B$ is any nonterminal and
+$s_1$ and $s_2$ are arbitrary sequences of symbols), but with
the period moved past the $X$. (This is analogous to completion in
the Earley algorithm.) We then perform state closure on $J$. This
process repeats until there are no more states or edges to add.
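The construction described in this hunk can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the toy grammar, the `(rule, dot)` item encoding, and the names `closure`, `goto`, and `build_automaton` are all assumptions made for the example, and the sketch ignores lookahead. It also assumes that a state with an already-seen set of dotted rules is reused rather than duplicated.

```python
# Hypothetical toy grammar (not the one in the book's figure).
# Each rule is (left-hand side, right-hand side); rule 0 is the start rule.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "PLUS", "INT")),
    ("E", ("INT",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """State closure: whenever the period sits before a nonterminal X,
    add every rule for X with the period at the start, until no change.
    A dotted rule is encoded as (rule index, position of the period)."""
    state = set(items)
    worklist = list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in state:
                    state.add((i, 0))
                    worklist.append((i, 0))
    return frozenset(state)


def goto(state, symbol):
    """Collect the dotted rules of `state` with the period before `symbol`,
    move the period past it, then perform state closure on the result."""
    moved = [(rule, dot + 1) for rule, dot in state
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol]
    return closure(moved)


def build_automaton():
    """Repeat until there are no more states or edges to add."""
    start = closure([(0, 0)])
    states = [start]
    index = {start: 0}
    edges = {}  # (source state number, symbol) -> (edge kind, target number)
    worklist = [start]
    while worklist:
        I = worklist.pop()
        symbols = {GRAMMAR[r][1][d] for r, d in I if d < len(GRAMMAR[r][1])}
        for X in sorted(symbols):
            J = goto(I, X)
            if J not in index:  # assumption: identical item sets share a state
                index[J] = len(states)
                states.append(J)
                worklist.append(J)
            kind = "goto" if X in NONTERMINALS else "shift"
            edges[(index[I], X)] = (kind, index[J])
    return states, edges


states, edges = build_automaton()
```

Representing a state as a `frozenset` of dotted rules makes the "have we seen this state before" check a dictionary lookup, which is what lets the loop terminate.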
@@ -4878,40 +4878,55 @@ rule. We then put a reduce $n$ action into that state for every token
$Y$. For example, in Figure~\ref{fig:shift-reduce} state 4 has an
dotted rule with a period at the end. We therefore put a reduce by
rule 3 action into state 4 for every
-token. (Figure~\ref{fig:shift-reduce} does not show a reduce rule for
-\code{INT} in state 4 because this grammar does not allow two
-consecutive \code{INT} tokens in the input. We will not go into how
-this can be figured out, but in any event it does no harm to have a
-reduce rule for \code{INT} in state 4; it just means the input will be
-rejected at a later point in the parsing process.)
+token.
+%% (Figure~\ref{fig:shift-reduce} does not show a reduce rule for
+%% \code{INT} in state 4 because this grammar does not allow two
+%% consecutive \code{INT} tokens in the input. We will not go into how
+%% this can be figured out, but in any event it does no harm to have a
+%% reduce rule for \code{INT} in state 4; it just means the input will be
+%% rejected at a later point in the parsing process.)
+
+When inserting reduce actions, take care to spot any shift/reduce or
+reduce/reduce conflicts. If there are any, abort the construction of
+the parse table.
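One way to picture the conflict check is the following Python sketch. It is only an illustration under stated assumptions: the table layout (a dictionary from state number to a row of token actions), the token names, and the helper `add_reduce_actions` are hypothetical and do not come from the book or from Lark.

```python
class ParseTableConflict(Exception):
    """Raised when two actions compete for the same table entry."""


def add_reduce_actions(table, state, rule_number, tokens):
    """Put a `reduce rule_number` action into `state` for every token.
    Abort as soon as an entry is already occupied by another action."""
    row = table.setdefault(state, {})
    for token in tokens:
        if token in row:
            kind = "shift/reduce" if row[token][0] == "shift" else "reduce/reduce"
            raise ParseTableConflict(
                f"{kind} conflict in state {state} on token {token}")
        row[token] = ("reduce", rule_number)


# Hypothetical token set and partially filled table, for illustration only.
TOKENS = ["INT", "PLUS", "EOF"]
table = {1: {"PLUS": ("shift", 3)}}

# State 4 has no shift actions, so "reduce by rule 3" fits under every token.
add_reduce_actions(table, 4, 3, TOKENS)

# State 1 already shifts on PLUS, so adding a reduce there is a conflict
# and construction of the table stops.
try:
    add_reduce_actions(table, 1, 0, TOKENS)
    conflict = None
except ParseTableConflict as err:
    conflict = str(err)
```

Raising an exception on the first occupied entry mirrors the instruction to abort the construction rather than pick one of the competing actions.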
+

\begin{exercise}
-On a piece of paper, walk through the parse table generation
-process for the grammar in Figure~\ref{fig:parser1} and check
-your results against Figure~\ref{fig:shift-reduce}.
+ \normalfont\normalsize
+%
+On a piece of paper, walk through the parse table generation process
+for the grammar at the top of figure~\ref{fig:shift-reduce} and check
+your results against the parse table in figure~\ref{fig:shift-reduce}.
\end{exercise}


\begin{exercise}
+ \normalfont\normalsize
+%
Change the parser in your compiler for \LangVar{} to set the
\code{parser} option of Lark to \code{'lalr'}. Test your compiler on
all the \LangVar{} programs that you have created. In doing so, Lark
may signal an error due to shift/reduce or reduce/reduce conflicts
in your grammar. If so, change your Lark grammar for \LangVar{} to
remove those conflicts.
-
\end{exercise}


\section{Further Reading}

-UNDER CONSTRUCTION
-
-finite automata
-
-
-
-
+In this chapter we have just scratched the surface of the field of
+parsing, with the study of a very general but less efficient algorithm
+(Earley) and with a more limited but highly efficient algorithm
+(LALR). There are many more algorithms, and classes of grammars, that
+fall between these two. We refer the reader to \citet{Aho:2006wb}
+for a thorough treatment of parsing.
+
+Regarding lexical analysis, we described the specification language,
+the regular expressions, but not the algorithms for recognizing them.
+In short, regular expressions can be translated to nondeterministic
+finite automata, which in turn are translated to deterministic finite automata. We
+refer the reader again to \citet{Aho:2006wb} for all the details of
+lexical analysis.
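To give a flavor of the second translation step, here is a rough Python sketch of the classic subset construction, which turns a nondeterministic automaton into a deterministic one. The NFA encoding (a dictionary from `(state, symbol)` to a set of states, with `None` standing for an epsilon move), the example automaton for the regular expression `(a|b)*b`, and all function names are assumptions made for this illustration. Real lexer generators differ in many details.

```python
def epsilon_closure(nfa, states):
    """All NFA states reachable from `states` using only epsilon moves."""
    seen = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def subset_construction(nfa, start, alphabet):
    """Build a DFA whose states are sets of NFA states."""
    start_set = epsilon_closure(nfa, {start})
    dfa = {}  # (set of NFA states, symbol) -> set of NFA states
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        S = worklist.pop()
        for a in alphabet:
            targets = {t for s in S for t in nfa.get((s, a), ())}
            if not targets:
                continue
            T = epsilon_closure(nfa, targets)
            dfa[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    return start_set, dfa


def accepts(dfa, start_set, accepting, text):
    """Run the DFA on `text`; accept if the final set holds an accepting state."""
    S = start_set
    for ch in text:
        S = dfa.get((S, ch))
        if S is None:
            return False
    return any(s in accepting for s in S)


# Hypothetical NFA for (a|b)*b: state "s" has an epsilon move to state 0,
# state 0 loops on a and b, and may also step to accepting state 1 on b.
NFA = {("s", None): {0}, (0, "a"): {0}, (0, "b"): {0, 1}}
START_SET, DFA = subset_construction(NFA, "s", "ab")
```

With these definitions, `accepts(DFA, START_SET, {1}, "aab")` follows one deterministic path through the table instead of tracking several NFA paths at once.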

\fi}