– Topdown parsing
– Bottomup parsing
– Universal
• Non recursive predictive parsing
– Predictive parser can be implemented by recursivedescent parsing
(may need to manipulate the grammar, e.g eliminating left recursion and left factoring).
• Requirement: by looking at the first terminal symbol that a nonterminal symbol can derive, we should be able to choose the right production to expand the nonterminal symbol.
– If the requirement is met, the parser easily be implemented using a nonrecursive scheme by building a parsing table.
• A parsing table example
(1) E>TE’
(2) E’>+TE’
(3) E’>e
(4) T>FT’
(5) T’>*FT’
(6) T’>e
(7) F>(E)
(8) F>id id + * ( ) $
E (1) (1)
E’ (2) (3) (3)
T (4) (4)
T’ (6) (5) (6) (6)
F (8) (7)
– A stack of grammar symbols ($ on the bottom)
– A string of input tokens ($ at the end)
– A parsing table, M[NT, T] of productions
– Algorithm:
– put ‘$ Start’ on the stack ($ is the end of input string).
1) if top == input == $ then accept
2) if top == input then pop top of the stack; advance to next input symbol; goto 1;
3) If top is nonterminal if M[top, input] is a production then replace top with the production; goto 1 else error
4) else error
– Example:
(1) E>TE’
(2) E’>+TE’
(3) E’>e
(4) T>FT’
(5) T’>*FT’
(6) T’>e
(7) F>(E)
(8) F>id id + * ( ) $
E (1) (1)
E’ (2) (3) (3)
T (4) (4)
T’ (6) (5) (6) (6)
F (8) (7)
Stack input production
$E id+id*id$
$E’T id+id*id$ E>TE’
$E’T’F id+id*id$ T>FT’
$E’T’id id+id*id$ F>id
$E’T’ +id*id$
…...
This produces leftmost derivation:
E=>TE’=>FT’E’=>idT’E’=>….=>id+id*id
– First(a): Here, a is a string of symbols. The set of terminals that begin strings derived from a. If a is empty string or generates empty string, then empty string is in First(a).
– Follow(A): Here, A is a nonterminal symbol.
Follow(A) is the set of terminals that can immediately follow A in a sentential form.
– Example:
S>iEtS  iEtSeSa
E>b
First(a) = ?, First(iEtS) = ?, First(S) = ?
Follow(E) = ? Follow(S) = ?
– With first(a) and follow(A), we can build the parsing table. For each production A>a:
• Add A>a to M[A, t] for each t in First(a).
• If First(a) contains empty string
– Add A>a to M[A, t] for each t in Follow(A)
– if $ is in Follow(A), add A>a to M[A, $]
• Make each undefined entry of M error.
– See the example 4.18 (page 191).
• Compute FIRST(X)
– If X is a terminal then FIRST(X) = {X}
– If X>e, add e to FIRST(X)
– if X>Y1 Y2 … Yk and Y1 Y2 … Yi1==>e, where I<= k, add every none e in FIRST(Yi) to FIRST(X). If Y1…Yk=>e, add e to
FIRST(X).
– FIRST(Y1 Y2 … Yk): similar to the third step.
E>TE’ FIRST(E) = {(, id}
E’>+TE’e FIRST(E’)={+, e}
T>FT’ FIRST(T) = {(, id}
T’>*FT’  e FIRST(T’) = {*, e}
F>(E)  id FIRST(F) = {(, id}
• Compute Follow(A).
– If S is the start symbol, add $ to Follow(S).
– If A>aBb, add Frist(b){e} to Follow(B).
– If A>aB or A>aBb and b=>e, add Follow(A) to Follow(B).
E>TE’ First(E) = {(, id}, Follow(E)={), $}
E’>+TE’e First(E’)={+, e}, Follow(E’) = {), $}
T>FT’ First(T) = {(, id}, Follow(T) = {+, ), $}
T’>*FT’  e First(T’) = {*, e}, Follow(T’) = {+, ), $}
F>(E)  id First(F) = {(, id}, Follow(F) = {*, +, ), $}
– First L: scans input from left to right
– Second L: produces a leftmost derivation
– 1: uses one input symbol of lookahead at each step to make a parsing decision.
– A grammar whose parsing table has no multiplydefined entries is a LL(1) grammar.
– No ambiguous or leftrecursive grammar can be LL(1)
– A grammar is LL(1) iff for each set of A productions, where
– A
1

2
 ...

n
The following conditions hold:
First (
i
)
First (
j
)
{}, when 1
i
n and 1
j
n and i
j if
i
(a) no,
, the n j
e, when i
j
(b) First(
j
)
Follow(A)
{}, when i
j.
S> i E t S e S  i E t S  a
E > b