CSCI 8610 Topics in Theoretical Computer Science
Fall 2006
Errata for the Course Notes
by Jeff Shallit
________________________________________________________________________________

p. 29, line 0 (header): the header runs into the page number on the right.
p. 48, line 4: insert space before "for"; optionally, replace "in" with "$\in$".
p. 52, def. of $Q'$: braces needed around $q_0'$.
p. 52, def. of $\delta'(q_0',a)$: insert space before "for".
p. 52, def. of $\delta'(\tau_w,a)$: insert space before "for" and add "and $a \in \Sigma$" at the end.
pp. 80-81: as stated and proved, Ogden's lemma doesn't fully support all of its later uses. Some applications of it require conclusions to be reached about the CFG itself (as opposed to the language it generates), and one of these is strengthened if no requirements (such as Chomsky normal form) are imposed on the CFG. This can be fixed as follows. First, prove Lemma 4.3.1 directly for an arbitrary CFG, rather than an equivalent one in Chomsky normal form. If $G$ is a CFG with $k$ variables and no production with a body longer than $m$, then $n = 2m^k$ is a pumping length for $L(G)$, and the proof goes through with little change except for replacing most occurrences of $2$ with $m$. Second, add as a new Corollary that under the hypotheses of the lemma, there is a subtree of the derivation tree for $z$ with yield $vwx$, rooted at some variable $A$ which also occurs inside the subtree with yield $w$ (exactly as in Figure 4.1). This requires no additional proof, since the proof of Lemma 4.3.1 shows that such a configuration exists. This Corollary can be referred to as "Ogden's lemma for $G$". Here are the applications of the modified Ogden's lemma and Corollary:
p. 89, line 6: the configuration needed for the contradiction is guaranteed by Ogden's lemma for $G$, so the argument here no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the stated conclusions of the lemma);
pp. 90-91: the proof of the optimality of the triple construction requires the configurations which are guaranteed by the new Corollary, so it no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the stated conclusions); moreover, the optimality now applies to any CFG, not just those in Chomsky normal form (so the result actually proved is strengthened);
pp. 92-94: the proof of Parikh's Theorem requires the configurations which are guaranteed by the new Corollary, so again it no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the statement of it).
p. 89, line 11: delete "and".
p. 89, 3rd paragraph of the proof: end the displayed portion at the comma.
p. 89, 3rd paragraph of the proof, next line: should start "which gives an ...".
p. 90, line -12: if the new corollary to Lemma 4.3.1 is referred to as "Ogden's lemma for $G$", replace "applied to $G$" with "for $G$".
p. 92, line -7: delete the last letter of "derivationj".
p. 92, line -7: in view of the modified Ogden's lemma and Corollary (pp. 80-81), the restriction here to Chomsky normal form may be omitted.
p. 95, line -3: just after the definition of $F'$, insert ", $q_0', d, f \notin Q, X_0 \notin \Gamma$,".
p. 97, line -5: replace "after processing $u$." with "after processing $u$ (including any $\varepsilon$-moves needed to minimize the stack height)."
p. 98, line -4: replace "linear-ness" with "linearity".
p. 100, 2nd line of exc. 4: insert space between "some" and "$n$".
p. 100, 2nd line of exc. 4: move the right brace over to just before the period.
p. 105, line 5: should read "4.7" instead of "??".
p. 105, last line: the addition of notes for Section 4.9 would be in order.
p. 111, 3rd line above Lemma 5.3.1: should read "... iff A --> \alpha\beta \in P and there exists".
p. 113, 1st line of Thm. 5.3.3: should read "... iff C --> \eta\gamma \in P and ...".
p. 114, 1st line of proof of Lemma 5.3.4: should read "... length of a derivation of \mu ...".
p. 114, 2nd line above (5.1): should read "... derivations above from \mu_1, \gamma, and \mu_2 takes < r steps. ...". (This avoids the impression that the derivation from B must take < r steps; if the \mu's are all terminals, it would take exactly r steps.)
p. 115, line 2: should read "... a derivation ...".
p. 115, line 5: should finish "... where the displayed occurrence of B".
p. 115, line 11: the first sentence of the line isn't quite right. It doesn't seem to be needed, so it could simply be omitted. All we know is that \delta can be derived from \beta\delta_1.
p. 115, Thm. 5.3.7: should start "If G is an unambiguous CFG with no useless symbols, no \epsilon-rules, and no unit rules,". The extra restrictions appear in Thm. 5.3.8 and exc. 9, both of which are used in the proof.
p. 115, lines -3 and -2: here \alpha must be \epsilon, since there are no \epsilon-rules, so it would be clearer to omit it.
p. 116, step 3 of ADD: could add "i.e., if \alpha = \epsilon," after "If i=j". This follows from having excluded \epsilon-rules.
p. 116, step 4 of ADD: should be omitted now, as the situation can't arise without \epsilon-rules.
p. 116, ADD: another case is needed: ADD((A --> \alpha . , i), j) should just append (A --> \alpha . , i) to I_j.
p. 116, Earley's algorithm: step D should start "For all productions B --> \gamma \in P, if ...".
p. 116, last paragraph of the proof: an amortized analysis is not needed, since each action is taken at most once. However, we do need to know that BV_j is checked a bounded number of times for each member of I_j; this is true, as <= |P| entries in BV_j must be checked in each case.
p. 117, 1st sentence of Thm 5.3.8: add "and that [M_{i,j}] has already been computed and stored."
p. 117, Thm 5.3.8: this seems like a lot of trouble to take for an O(n^2) parsing method when simple memoization of the parse table would allow O(n) parsing. Memoization only affects the implied constant in the time and space complexity of the table construction.
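Several of the p. 116 items above concern Earley's algorithm (the ADD procedure, step D, and the observation that each action is taken at most once). As a point of reference, here is a minimal recognizer sketch in Python; the names (earley_recognize, add, work) are illustrative rather than the notes' notation, and, matching the restriction discussed above, the sketch assumes a grammar with no \epsilon-rules.

```python
def earley_recognize(grammar, start, tokens):
    """Earley recognizer sketch. grammar maps each variable to a list of
    production bodies (tuples of symbols); a symbol is a variable (a key
    of grammar) or a terminal. Assumes no epsilon-rules."""
    n = len(tokens)
    # I[j] is the item set for position j; items are (lhs, body, dot, origin).
    I = [set() for _ in range(n + 1)]
    work = [[] for _ in range(n + 1)]   # unprocessed items per column

    def add(item, j):
        # Each item enters a column at most once, so each is processed once.
        if item not in I[j]:
            I[j].add(item)
            work[j].append(item)

    for body in grammar[start]:
        add((start, body, 0, 0), 0)

    for j in range(n + 1):
        while work[j]:
            lhs, body, dot, origin = work[j].pop()
            if dot == len(body):                      # COMPLETE
                for l2, b2, d2, o2 in list(I[origin]):
                    if d2 < len(b2) and b2[d2] == lhs:
                        add((l2, b2, d2 + 1, o2), j)
            elif body[dot] in grammar:                # PREDICT (cf. step D)
                for b2 in grammar[body[dot]]:
                    add((body[dot], b2, 0, j), j)
            elif j < n and body[dot] == tokens[j]:    # SCAN
                add((lhs, body, dot + 1, origin), j + 1)

    return any(lhs == start and dot == len(body) and origin == 0
               for (lhs, body, dot, origin) in I[n])
```

The membership test in add is why an amortized analysis is unnecessary: each item is added to a column at most once, and hence processed at most once.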
p. 120, line 13: $ is not a terminal symbol, but it is treated as one here.
p. 120, line 14: replace "any" with "some".
p. 120, line 14: "derivable from $S$" is redundant.
p. 122, proof of Thm. 5.4.4: it is hard to follow because three things are happening at once. The theorem could be recast so as to apply to any CFG; it would then say that the parser based on the table M(\gamma,x), on input z, has accepting computation paths in 1-1 correspondence with leftmost derivations of z. This can be divided into the two directions of the correspondence. The third thing, which could be a corollary, is that if the grammar is LL(1) then the parser is deterministic.
p. 122, line -5: the two occurrences of "however" are too close to each other.
p. 123, 2nd line of 3(iii)(a): insert "... \union " in front of F(Y_j).
p. 124, COMPUTE-FOLLOW(G): it's confusing to define G( ) when the input is G; that could be remedied by calling it H( ) instead. The other problem is that rule 3(i) could add terminals to H(X) even though X may not appear in any sentential form derivable from S. That could only happen if X is useless, so it can be taken care of by adding to the input conditions that G should have no useless symbols.
p. 125, 6th line of section 5.5: "|" missing after second ellipsis.
p. 125, 2nd paragraph of section 5.5: restrict the procedure to CFGs with no \epsilon-rules. This is because if one of the \beta's is empty and one of the \alpha's starts with E, then this technique will create a new 2-step left recursion involving E, E', then E again.
p. 127, item 3: add that A' is to be a new variable.
p. 127, first line of Sect. 5.6: should read "In the previous two sections ...".
p. 127, mid-page: make it explicit that a CFG $G = (V,\Sigma,P,S)$ with no useless symbols has been given. (Useless symbols can't help, and in some cases they create special problems, as noted for p. 130. Alternatively, this hypothesis can be delayed until the statement of Thm. 5.6.3.)
p. 128, mid-page: label the three rules for $\delta$ as (1), (2), and (3), since they are referred to that way on pages 130-131; also add that $\delta(q,Y) = \emptyset$ in all other cases.
p. 129, statement of Thm. 5.6.3: if it is not already stated on p. 127, then the theorem statement needs to include the hypothesis that our CFG, $G$, has no useless symbols.
p. 130, 5 lines above (5.11): the terminability of $\beta'$ requires a suitable hypothesis, e.g., that $G$ contains no useless symbols. As noted above, this could be introduced near the start of the section (p. 127) or else in the statement of Theorem 5.6.3 (p. 129).
p. 130, equation (5.11): subscript the derivation relations with {rm}s.
p. 130, line -6: should read "... would follow that ...".
p. 130, lines -7 to -5: this argument is repeated on the next page (line 4), so a slight tightening of the proof is obtained by deleting it here.
p. 130, line -4: should read "... a rightmost derivation as in (5.11) ..."; uniqueness can't be presumed, since the grammar has not yet been assumed to be LR(0).
p. 131, just after the Def. of LR(0): say that the Knuth DFA is obtained from the Knuth $\varepsilon$-NFA by the usual subset construction. Alternatively, this could be said on p. 129, where an example of a Knuth DFA is presented in Fig. 5.6.
p. 131, line 17: should read "... where $q_0$ now denotes the initial state of" (to emphasize that $q_0$ no longer refers to the initial state of the NFA, as it did near the top of the page).
p. 131, line 17: add that $q$ is the only state of the PDA, but the stack operations described are more general than those allowed in the definition of a PDA in Section 1.5. Converting the PDA as given here to conform to the strict definition would be straightforward, but would require additional states to carry out the extended stack operations as series of simpler ones.
p. 131, line -11: should read "... A --> \alpha\bigdot ...".
p. 131, line -5: add at the end of the sentence "(after emptying the stack)". This is to accord with Thm. 5.6.6, in which the DPDA accepts by empty stack.
p. 131, lines -3, -2, -1: $x$ can be dispensed with entirely in the statement of Thm 5.6.4; if called for later, $x$ must exist because the grammar is LR(0) and therefore contains no useless symbols.
p. 132, line 1: use \alpha instead of \gamma, to follow the notation in the statement of the theorem.
p. 132, line 5: should read "... hence the end is inside ...".
p. 132, line 6: should read "(ii) the handle ends at $X_k$;", since the case B --> X_{j+1}...X_k also needs to be ruled out.
p. 132, proof of (i): cases (i) and (iii) are symmetric, but the proof of the latter is much more detailed; case (i) could be dealt with by reference to the symmetry and the proof of (iii).
p. 132, line -2: add before line -1 "..., which is defined as long as the stack is nonempty."
p. 132, line -1: at the end of the first sentence add "assuming the stack to be nonempty at step i."
p. 133, line 3: subscript the ==> with {rm}.
p. 133, lines 10 through -7: these contain eight occurrences of $\alpha$ with no subscript; replace them with, say, $\gamma$ to avoid confusion with the subscripted $\alpha$'s in this half of the proof.
p. 133, lines -6 and -5: the sentence which spans these two lines can then read "Hence $X_1X_2\cdots X_k = \gamma = \alpha_n$." (assuming the previous change).
p. 133, lines -5 and -4: should read $x$ in place of $w$.
pp. 133-134: the proof that $L(G) \subseteq L(M)$ can be made more direct. Let M' be the same as M except that variable symbols are allowed as input, and number the \alpha's from subscript 0 (for $S$, at the start of the derivation) up to $n$ (for $x$, at the end of the derivation). If M' has a choice, it is to perform a reduce rather than a shift (just to avoid having to prove that this is never actually an issue). Let |-- refer to the action of M'; we can then prove by induction on i that $(q,\alpha_i,q_0) |--* (q,\epsilon,\epsilon)$ for $i = 0, ..., n$. Since $L(M') \cap \Sigma^{*} = L(M)$ and $\alpha_n = x$, this suffices.
p. 134, lines 13, 15, 16, and 17: these contain four occurrences of $\alpha$ with no subscript; replace them with, say, $\gamma$ to avoid confusion with the subscripted $\alpha$'s in this half of the proof.
p. 134, line -2 of Section 5.6: should end "At this point we have $i=n$ and $y=\varepsilon$,".
p. 135, exc. 13: add to the hypotheses that j, i, i_1, etc., are all integers.
p. 144, 1st paragraph of Section 6.4: should read $M$ in place of $L$ in two places. Add a new sentence at the end, to the effect that # is not in $\Gamma$ or $Q$, and that as usual we assume $\Gamma$ and $Q$ are disjoint.
p. 149, definition of CSL: it should be that $L - \{\varepsilon\}$ is generated by a CSG. Here are some of the difficulties that arise later if the empty string is excluded from CSLs: Thm 7.1.7 is trivial, since the language consisting of just the empty string is recursive (indeed, regular!) but not a CSL; Thm 7.1.9 is not correct because of the empty string; on p. 158, the correspondence between CSLs and LBAs is inexact because of the empty string; on p. 165, the extended PDAs of exc. 2 accept nothing, since they can never empty the stack; on p. 166, exc. 8 is trivial, just as for Thm 7.1.7.
p. 150 ff.: "length-increasing" really means "length-nondecreasing".
p. 153, line 2: replace "are not contained" with "are tape symbols which are not contained".
p. 153, Thm 7.1.5: if the empty string is allowed in CSLs (p. 149), then "$... - \{\epsilon\}$" can be dropped here.
p. 153: early in the proof of Thm 7.1.5, $M'$ is discussed as having two "tracks", but four "tracks" would make for a more complete but still easily explained simulation. The reason is that the original language-accepting LBA must be converted to a language-generating LBA with two tapes.
Simulating the latter requires that the two tape head positions be simulated, as well as the two tape contents; two additional tracks are convenient for this purpose, since the simulator has only one head.
p. 154, line 8: should read [$pX]Z --> [$Y][qZ] (instead of having [pZ] as the rightmost symbol; here $ denotes the left endmarker).
p. 158, Section 7.2: it should be pointed out that this extends Chomsky's hierarchy by adding the classes DCFL and REC.
p. 158, Section 7.2: since this section is historical, it would add value to give Chomsky's numerical designations for the four original grammar types of his hierarchy: type 3 for regular, type 2 for CF, type 1 for CS, and type 0 for unrestricted.
p. 158, Section 7.2: the grammar classes for DCFLs are LR(k) for k >= 0; also, these are grammars for L$ rather than L, where $ is not in L's alphabet (since LR(0) languages have the prefix property).
p. 158: if the empty string is allowed in CSLs (p. 149), then the last line of Section 7.2 can be omitted, along with the last word of the previous line.
p. 159, describing $\delta$: should read "partial function" instead of "function", since it may be undefined.
p. 159, describing $Z_0$: add that $Z_0$ is in $\Gamma$.
p. 159, describing $q_0$: add that $q_0$ is in $Q$.
p. 159: swap the descriptions of $Z_0$ and $q_0$ (so that the descriptions follow the order of the members of the 9-tuple comprising the 2DPDA).
p. 159, describing $(q,h,\alpha)$: say that $h$ is an integer rather than a natural number, since the definition allows the 2DPDA to crash by moving left to position $h = -1$. Alternatively, specify the |-- relation so that the crash occurs one step earlier in that scenario.
p. 159, line -3: in the definition of $L(M)$, replace the second occurrence of $Z_0$ with $\varepsilon$, so that "knowledge" of the stack depth is not imputed to $M$. (Normally, a PDA "knows" only the top symbol on the stack, and halts/crashes if the stack is empty.) This modification accords with the operation of the example shown in Figure 7.3.
p. 160, example 7.3.3: point out that the algorithm presented is carried out by the DPDA of Figure 7.3.
p. 160, example 7.3.3: first (as item 0, perhaps) check whether $w$ is in the language $(a+b)^{*}c(a+b)^{*}$ and reject if not. For the rest of the algorithm we then know that $w = xcy$ where $x,y \in (a+b)^{*}$.
p. 160, example 7.3.3: in item 5, $M$ can halt and reject in a different way, for if the current prefix of $x$ is a proper prefix of $y$, the point will be reached when n-i+1 < |y|, causing $M$ to crash when $a$ or $b$ is compared with $Z_0$.
p. 161, table: if the suggestion for the definition of $L(M)$ (p. 159) is accepted, then the computation needs one more step to empty the stack.
p. 162: the RAM description is incomplete, as there is no program or input. Since details of the RAM are never invoked later, the description could be omitted entirely. The statement of Cook's theorem could be left as is for historical purposes, with a comment to the effect that the RAM is a model of computation in which time is proportional to the number of statements executed (so our usual worst-case analysis of the algorithm can be applied, which is what is actually done in the proof).
p. 164, Lemma 7.3.7: the proof can be one line, since $next[i]$ is assigned only when $i$ is popped, which happens at most once by Lemma 7.3.6.
p. 164, last line: top($S$) is a list, so a partial configuration will be deleted, not popped.
pp. 164-165: the amortized analysis is more complex than it need be, while ignoring some operations entirely (e.g., comparisons). Another approach would be to observe that the pseudocode is linear except for the "while" and "for each" loops. Denoting the number of possible partial configurations (given on p. 164, line -4) by p(n) (where n = |x|), we can see that the "while" loop can be executed at most 2p(n)+1 times; for by Lemma 7.3.6 the stack is pushed during at most p(n) executions and hence popped at most p(n)+1 times. The "for each" loop can execute at most p(n) times, again by Lemma 7.3.6.
p. 165, exc. 1: if the empty string is allowed in CSLs (p. 149), then substitutions need to be restricted to CSLs without the empty string, and "positive closure" can be replaced with "Kleene closure".
p. 165, exc. 2: replace "push" by "pop the stack and push".
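As a final aside, the pre-check suggested above for p. 160, example 7.3.3 (verify that $w$ is in $(a+b)^{*}c(a+b)^{*}$ before running the main algorithm, so that $w = xcy$ with $x,y \in (a+b)^{*}$) can be spelled out as follows. This sketch covers only the pre-check, not the DPDA's main algorithm, and the name split_xcy is hypothetical.

```python
def split_xcy(w):
    """Pre-check for example 7.3.3: accept only if w is in (a+b)*c(a+b)*,
    i.e., w = xcy with exactly one c and x, y over {a, b}.
    Returns (x, y) on success and None otherwise."""
    if w.count("c") != 1:
        return None                     # zero or several c's: reject
    x, y = w.split("c")
    if all(ch in "ab" for ch in x + y):
        return (x, y)                   # now w = xcy with x, y in (a+b)^*
    return None
```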