CSCI 8610 Topics in Theoretical Computer Science
Fall 2006
Errata for the Course Notes
by Jeff Shallit
________________________________________________________________________________

p. 29, line 0 (header): the header runs into the page number on the right.
p. 48, line 4: insert space before "for"; optionally, replace "in" with "$\in$".
p. 52, def. of $Q'$: braces needed around $q_0'$.
p. 52, def. of $\delta'(q_0',a)$: insert space before "for".
p. 52, def. of $\delta'(\tau_w,a)$: insert space before "for" and add "and $a \in \Sigma$" at the end.
pp. 80-81: as stated and proved, Ogden's lemma doesn't fully support all of its later uses. Some applications of it require conclusions to be reached about the CFG itself (as opposed to the language it generates), and one of these is strengthened if no requirements (such as Chomsky normal form) are imposed on the CFG. This can be fixed as follows. First, prove Lemma 4.3.1 directly for an arbitrary CFG, rather than an equivalent one in Chomsky normal form. If $G$ is a CFG with $k$ variables and no production with a body longer than $m$, then $n = 2m^k$ is a pumping length for $L(G)$, and the proof goes through with little change except for replacing most occurrences of $2$ with $m$. Second, add as a new Corollary that under the hypotheses of the lemma, there is a subtree of the derivation tree for $z$ with yield $vwx$, rooted at some variable $A$ which also occurs inside the subtree with yield $w$ (exactly as in Figure 4.1). This requires no additional proof, since the proof of Lemma 4.3.1 shows that such a configuration exists. This Corollary can be referred to as "Ogden's lemma for $G$". Here are the applications of the modified Ogden's lemma and Corollary:
p. 89, line 6: the configuration needed for the contradiction is guaranteed by Ogden's lemma for $G$, so the argument here no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the stated conclusions of the lemma);
pp. 90-91: the proof of the optimality of the triple construction requires the configurations which are guaranteed by the new Corollary, so it no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the stated conclusions); moreover, the optimality now applies to any CFG, not just those in Chomsky normal form (so the result actually proved is strengthened);
pp. 92-94: the proof of Parikh's Theorem requires the configurations which are guaranteed by the new Corollary, so again it no longer implicitly relies on the proof of Lemma 4.3.1 (as opposed to the statement of it).
p. 89, line 11: delete "and".
p. 89, 3rd paragraph of the proof: end the displayed portion at the comma.
p. 89, 3rd paragraph of the proof, next line: should start "which gives an ...".
p. 90, line -12: if the new corollary to Lemma 4.3.1 is referred to as "Ogden's lemma for $G$", replace "applied to $G$" with "for $G$".
p. 92, line -7: delete the last letter of "derivationj".
p. 92, line -7: in view of the modified Ogden's lemma and Corollary (pp. 80-81), the restriction here to Chomsky normal form may be omitted.
p. 95, line -3: just after the definition of $F'$, insert ", $q_0', d, f \notin Q, X_0 \notin \Gamma$,".
p. 97, line -5: replace "after processing $u$." with "after processing $u$ (including any $\varepsilon$-moves needed to minimize the stack height)."
p. 98, line -4: replace "linear-ness" with "linearity".
p. 100, 2nd line of exc. 4: insert space between "some" and "$n$".
p. 100, 2nd line of exc. 4: move the right brace over to just before the period.
p. 105, line 5: should read "4.7" instead of "??".
p. 105, last line: the addition of notes for Section 4.9 would be in order.
p. 111, 3rd line above Lemma 5.3.1: should read "... iff A --> \alpha\beta \in P and there exists".
p. 113, 1st line of Thm. 5.3.3: should read "... iff C --> \eta\gamma \in P and ...".
p. 114, 1st line of proof of Lemma 5.3.4: should read "... length of a derivation of \mu ...".
p. 114, 2nd line above (5.1): should read "... derivations above from \mu_1, \gamma, and \mu_2 takes < r steps. ...". (This avoids the impression that the derivation from B must take < r steps; if the \mu's are all terminals, it would take exactly r steps.)
p. 115, line 2: should read "... a derivation ...".
p. 115, line 5: should finish "... where the displayed occurrence of B".
p. 115, line 11: the first sentence of the line isn't quite right. It doesn't seem to be needed, so it could simply be omitted. All we know is that \delta can be derived from \beta\delta_1.
p. 115, Thm. 5.3.7: should start "If G is an unambiguous CFG with no useless symbols, no \epsilon-rules, and no unit rules,". The extra restrictions appear in Thm. 5.3.8 and exc. 9, both of which are used in the proof.
p. 115, lines -3 and -2: here \alpha must be \epsilon, since there are no \epsilon-rules, so it would be clearer to omit it.
p. 116, step 3 of ADD: could add "i.e., if \alpha = \epsilon," after "If i=j". This follows from having excluded \epsilon-rules.
p. 116, step 4 of ADD: should be omitted now, as the situation can't arise without \epsilon-rules.
p. 116, ADD: another case is needed: ADD((A --> \alpha . , i), j) should just append (A --> \alpha . , i) to I_j.
p. 116, Earley's algorithm: step D should start "For all productions B --> \gamma \in P, if ...".
p. 116, last paragraph of the proof: an amortized analysis is not needed, since each action is taken at most once. However, we do need to know that BV_j is checked a bounded number of times for each member of I_j; this is true, as <= |P| entries in BV_j must be checked in each case.
p. 117, 1st sentence of Thm 5.3.8: add "and that [M_{i,j}] has already been computed and stored."
p. 117, Thm 5.3.8: this seems like a lot of trouble to take for an O(n^2) parsing method when simple memoization of the parse table would allow O(n) parsing. Memoization only affects the implied constant in the time and space complexity of the table construction.
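Several of the p. 116 items above concern Earley's algorithm (the ADD procedure, step D, and the observation that each action is taken at most once). As a point of reference, here is a minimal recognizer sketch in Python; the names (earley_recognize, add, work) are illustrative rather than the notes' notation, and, matching the restriction discussed above, the sketch assumes a grammar with no \epsilon-rules.

```python
def earley_recognize(grammar, start, tokens):
    """Earley recognizer sketch. grammar maps each variable to a list of
    production bodies (tuples of symbols); a symbol is a variable (a key
    of grammar) or a terminal. Assumes no epsilon-rules."""
    n = len(tokens)
    # I[j] is the item set for position j; items are (lhs, body, dot, origin).
    I = [set() for _ in range(n + 1)]
    work = [[] for _ in range(n + 1)]   # unprocessed items per column

    def add(item, j):
        # Each item enters a column at most once, so each is processed once.
        if item not in I[j]:
            I[j].add(item)
            work[j].append(item)

    for body in grammar[start]:
        add((start, body, 0, 0), 0)

    for j in range(n + 1):
        while work[j]:
            lhs, body, dot, origin = work[j].pop()
            if dot == len(body):                      # COMPLETE
                for l2, b2, d2, o2 in list(I[origin]):
                    if d2 < len(b2) and b2[d2] == lhs:
                        add((l2, b2, d2 + 1, o2), j)
            elif body[dot] in grammar:                # PREDICT (cf. step D)
                for b2 in grammar[body[dot]]:
                    add((body[dot], b2, 0, j), j)
            elif j < n and body[dot] == tokens[j]:    # SCAN
                add((lhs, body, dot + 1, origin), j + 1)

    return any(lhs == start and dot == len(body) and origin == 0
               for (lhs, body, dot, origin) in I[n])
```

The membership test in add is why an amortized analysis is unnecessary: each item is added to a column at most once, and hence processed at most once.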
p. 120, line 13: $ is not a terminal symbol, but it is treated as one here.
p. 120, line 14: replace "any" with "some".
p. 120, line 14: "derivable from $S$" is redundant.
p. 122, proof of Thm. 5.4.4: it is hard to follow because three things are happening at once. The theorem could be recast so as to apply to any CFG; it would then say that the parser based on the table M(\gamma,x), on input z, has accepting computation paths in 1-1 correspondence with leftmost derivations of z. This can be divided into the two directions of the correspondence. The third thing, which could be a corollary, is that if the grammar is LL(1) then the parser is deterministic.
p. 122, line -5: the two occurrences of "however" are too close to each other.
p. 123, 2nd line of 3(iii)(a): insert "... \union " in front of F(Y_j).
p. 124, COMPUTE-FOLLOW(G): it's confusing to define G( ) when the input is G; that could be remedied by calling it H( ) instead. The other problem is that rule 3(i) could add terminals to H(X) even though X may not appear in any sentential form derivable from S. That could only happen if X is useless, so it can be taken care of by adding to the input conditions that G should have no useless symbols.
p. 125, 6th line of section 5.5: "|" missing after second ellipsis.
p. 125, 2nd paragraph of section 5.5: restrict the procedure to CFGs with no \epsilon-rules. This is because if one of the \beta's is empty and one of the \alpha's starts with E, then this technique will create a new 2-step left recursion involving E, E', then E again.
p. 127, item 3: add that A' is to be a new variable.
p. 127, first line of Sect. 5.6: should read "In the previous two sections ...".
p. 127, mid-page: make it explicit that a CFG $G = (V,\Sigma,P,S)$ with no useless symbols has been given. (Useless symbols can't help, and in some cases they create special problems, as noted for p. 130. Alternatively, this hypothesis can be delayed until the statement of Thm. 5.6.3.)
p. 128, mid-page: label the three rules for $\delta$ as (1), (2), and (3), since they are referred to that way on pages 130-131; also add that $\delta(q,Y) = \emptyset$ in all other cases.
p. 129, statement of Thm. 5.6.3: if it is not already stated on p. 127, then the theorem statement needs to include the hypothesis that our CFG, $G$, has no useless symbols.
p. 130, 5 lines above (5.11): the terminability of $\beta'$ requires a suitable hypothesis, e.g., that $G$ contains no useless symbols. As noted above, this could be introduced near the start of the section (p. 127) or else in the statement of Theorem 5.6.3 (p. 129).
p. 130, equation (5.11): subscript the derivation relations with {rm}s.
p. 130, line -6: should read "... would follow that ...".
p. 130, lines -7 to -5: this argument is repeated on the next page (line 4), so a slight tightening of the proof is obtained by deleting it here.
p. 130, line -4: should read "... a rightmost derivation as in (5.11) ..."; uniqueness can't be presumed, since the grammar has not yet been assumed to be LR(0).
p. 131, just after the Def. of LR(0): say that the Knuth DFA is obtained from the Knuth $\varepsilon$-NFA by the usual subset construction. Alternatively, this could be said on p. 129, where an example of a Knuth DFA is presented in Fig. 5.6.
p. 131, line 17: should read "... where $q_0$ now denotes the initial state of" (to emphasize that $q_0$ no longer refers to the initial state of the NFA, as it did near the top of the page).
p. 131, line 17: add that $q$ is the only state of the PDA, but the stack operations described are more general than those allowed in the definition of a PDA in Section 1.5. Converting the PDA as given here to conform to the strict definition would be straightforward, but would require additional states to carry out the extended stack operations as series of simpler ones.
p. 131, line -11: should read "... A --> \alpha\bigdot ...".
p. 131, line -5: add at the end of the sentence "(after emptying the stack)". This is to accord with Thm. 5.6.6, in which the DPDA accepts by empty stack.
p. 131, lines -3, -2, -1: $x$ can be dispensed with entirely in the statement of Thm 5.6.4; if called for later, $x$ must exist because the grammar is LR(0) and therefore contains no useless symbols.
p. 132, line 1: use \alpha instead of \gamma, to follow the notation in the statement of the theorem.
p. 132, line 5: should read "... hence the end is inside ...".
p. 132, line 6: should read "(ii) the handle ends at $X_k$;", since the case B --> X_{j+1}...X_k also needs to be ruled out.
p. 132, proof of (i): cases (i) and (iii) are symmetric, but the proof of the latter is much more detailed; case (i) could be dealt with by reference to the symmetry and the proof of (iii).
p. 132, line -2: add before line -1 "..., which is defined as long as the stack is nonempty."
p. 132, line -1: at the end of the first sentence add "assuming the stack to be nonempty at step i."
p. 133, line 3: subscript the ==> with {rm}.
p. 133, lines 10 through -7: these contain eight occurrences of $\alpha$ with no subscript; replace them with, say, $\gamma$ to avoid confusion with the subscripted $\alpha$'s in this half of the proof.
p. 133, lines -6 and -5: the sentence which spans these two lines can then read "Hence $X_1X_2\cdots X_k = \gamma = \alpha_n$." (assuming the previous change).
p. 133, lines -5 and -4: should read $x$ in place of $w$.
pp. 133-134: the proof that $L(G) \subseteq L(M)$ can be made more direct. Let M' be the same as M except that variable symbols are allowed as input, and number the \alpha's from subscript 0 (for $S$, at the start of the derivation) up to $n$ (for $x$, at the end of the derivation). If M' has a choice, it is to perform a reduce rather than a shift (just to avoid having to prove that this is never actually an issue). Let |-- refer to the action of M'; we can then prove by induction on i that $(q,\alpha_i,q_0) |--* (q,\epsilon,\epsilon)$ for $i = 0, ..., n$. Since $L(M') \cap \Sigma^{*} = L(M)$ and $\alpha_n = x$, this suffices.
p. 134, lines 13, 15, 16, and 17: these contain four occurrences of $\alpha$ with no subscript; replace them with, say, $\gamma$ to avoid confusion with the subscripted $\alpha$'s in this half of the proof.
p. 134, line -2 of Section 5.6: should end "At this point we have $i=n$ and $y=\varepsilon$,".
p. 135, exc. 13: add to the hypotheses that j, i, i_1, etc., are all integers.
p. 144, 1st paragraph of Section 6.4: should read $M$ in place of $L$ in two places. Add a new sentence at the end, to the effect that # is not in $\Gamma$ or $Q$, and that as usual we assume $\Gamma$ and $Q$ are disjoint.
p. 149, definition of CSL: it should be that $L - \{\varepsilon\}$ is generated by a CSG. Here are some of the difficulties that arise later if the empty string is excluded from CSLs: Thm 7.1.7 is trivial, since the language consisting of just the empty string is recursive (indeed, regular!) but not a CSL; Thm 7.1.9 is not correct because of the empty string; on p. 158, the correspondence between CSLs and LBAs is inexact because of the empty string; on p. 165, the extended PDAs of exc. 2 accept nothing, since they can never empty the stack; on p. 166, exc. 8 is trivial, just as for Thm 7.1.7.
p. 150 ff.: "length-increasing" really means "length-nondecreasing".
p. 153, line 2: replace "are not contained" with "are tape symbols which are not contained".
p. 153, Thm 7.1.5: if the empty string is allowed in CSLs (p. 149), then "$... - \{\epsilon\}$" can be dropped here.
p. 153: early in the proof of Thm 7.1.5, $M'$ is discussed as having two "tracks", but four "tracks" would make for a more complete but still easily explained simulation. The reason is that the original language-accepting LBA must be converted to a language-generating LBA with two tapes.
Simulating the latter requires that the two tape head positions be simulated, as well as the two tape contents; two additional tracks are convenient for this purpose, since the simulator has only one head.
p. 154, line 8: should read [$pX]Z --> [$Y][qZ] (instead of having [pZ] as the rightmost symbol; here $ denotes the left endmarker).
p. 158, Section 7.2: it should be pointed out that this extends Chomsky's hierarchy by adding the classes DCFL and REC.
p. 158, Section 7.2: since this section is historical, it would add value to give Chomsky's numerical designations for the four original grammar types of his hierarchy: type 3 for regular, type 2 for CF, type 1 for CS, and type 0 for unrestricted.
p. 158, Section 7.2: the grammar classes for DCFLs are LR(k) for k >= 0; also, these are grammars for L$ rather than L, where $ is not in L's alphabet (since LR(0) languages have the prefix property).
p. 158: if the empty string is allowed in CSLs (p. 149), then the last line of Section 7.2 can be omitted, along with the last word of the previous line.
p. 159, describing $\delta$: should read "partial function" instead of "function", since it may be undefined.
p. 159, describing $Z_0$: add that $Z_0$ is in $\Gamma$.
p. 159, describing $q_0$: add that $q_0$ is in $Q$.
p. 159: swap the descriptions of $Z_0$ and $q_0$ (so that the descriptions follow the order of the members of the 9-tuple comprising the 2DPDA).
p. 159, describing $(q,h,\alpha)$: say that $h$ is an integer rather than a natural number, since the definition allows the 2DPDA to crash by moving left to position $h = -1$. Alternatively, specify the |-- relation so that the crash occurs one step earlier in that scenario.
p. 159, line -3: in the definition of $L(M)$, replace the second occurrence of $Z_0$ with $\varepsilon$, so that "knowledge" of the stack depth is not imputed to $M$. (Normally, a PDA "knows" only the top symbol on the stack, and halts/crashes if the stack is empty.) This modification accords with the operation of the example shown in Figure 7.3.
p. 160, example 7.3.3: point out that the algorithm presented is carried out by the DPDA of Figure 7.3.
p. 160, example 7.3.3: first (as item 0, perhaps) check whether $w$ is in the language $(a+b)^{*}c(a+b)^{*}$ and reject if not. For the rest of the algorithm we then know that $w = xcy$ where $x,y \in (a+b)^{*}$.
p. 160, example 7.3.3: in item 5, $M$ can halt and reject in a different way, for if the current prefix of $x$ is a proper prefix of $y$, the point will be reached when n-i+1 < |y|, causing $M$ to crash when $a$ or $b$ is compared with $Z_0$.
p. 161, table: if the suggestion for the definition of $L(M)$ (p. 159) is accepted, then the computation needs one more step to empty the stack.
p. 162: the RAM description is incomplete, as there is no program or input. Since details of the RAM are never invoked later, the description could be omitted entirely. The statement of Cook's theorem could be left as is for historical purposes, with a comment to the effect that the RAM is a model of computation in which time is proportional to the number of statements executed (so our usual worst-case analysis of the algorithm can be applied, which is what is actually done in the proof).
p. 164, Lemma 7.3.7: the proof can be one line, since $next[i]$ is assigned only when $i$ is popped, which happens at most once by Lemma 7.3.6.
p. 164, last line: top($S$) is a list, so a partial configuration will be deleted, not popped.
pp. 164-165: the amortized analysis is more complex than it need be, while ignoring some operations entirely (e.g., comparisons). Another approach would be to observe that the pseudocode is linear except for the "while" and "for each" loops. Denoting the number of possible partial configurations (given on p. 164, line -4) by p(n) (where n = |x|), we can see that the "while" loop can be executed at most 2p(n)+1 times; for by Lemma 7.3.6 the stack is pushed during at most p(n) executions and hence popped at most p(n)+1 times. The "for each" loop can execute at most p(n) times, again by Lemma 7.3.6.
p. 165, exc. 1: if the empty string is allowed in CSLs (p. 149), then substitutions need to be restricted to CSLs without the empty string, and "positive closure" can be replaced with "Kleene closure".
p. 165, exc. 2: replace "push" by "pop the stack and push".
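As a final aside, the pre-check suggested above for p. 160, example 7.3.3 (verify that $w$ is in $(a+b)^{*}c(a+b)^{*}$ before running the main algorithm, so that $w = xcy$ with $x,y \in (a+b)^{*}$) can be spelled out as follows. This sketch covers only the pre-check, not the DPDA's main algorithm, and the name split_xcy is hypothetical.

```python
def split_xcy(w):
    """Pre-check for example 7.3.3: accept only if w is in (a+b)*c(a+b)*,
    i.e., w = xcy with exactly one c and x, y over {a, b}.
    Returns (x, y) on success and None otherwise."""
    if w.count("c") != 1:
        return None                     # zero or several c's: reject
    x, y = w.split("c")
    if all(ch in "ab" for ch in x + y):
        return (x, y)                   # now w = xcy with x, y in (a+b)^*
    return None
```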