operator precendence parsing
Operator Precedence Parsing
· The grammars that have the property that no production right side is ε or has two adjacent nonterminals.
· A grammar with the later property is called an operator grammar.
· The following grammar for expressions is not an operator grammar:
E → E A E | ( E ) | - E | id
A → + | - | * | / | ^
o Because the right side EAE has two consecutive nonterminals.
· To obtain the (Grammar 2.9.1) as operator grammar, substitute for A each of its alternatives for production E.
E → E + E | E – E | E * E | E / E | E ^ E | ( E ) | - E | id
· Operator precedence parsing has number of disadvantage.
o It is hard to handle tokens like minus sign, which has two different precedence. (depending on whether it is unary or binary).
o Since the relationship between a grammar for the language being parsed and the operator-precedence parser itself is weak, it cannot always be sure that the parser accepts exactly the desired language.
o Only small class of grammars can be parsed using operator-precedence technique.
· Way to proceed the operator-precedence parsing:
o In operator-precedence parsing three disjoint precedence relations between certain pairs of terminals has been defined.
They are <., , .>
o These precedence relations guide the selection of handles and have the following meaning:
o Using operator-precedence relations:
§ The intention of the precedence relations is to delimit the handle of a right sentential form, with <. marking the left end, appearing in the interior of the handle, and .> marking the right end.
§ To be more precise, a right sentential form of an operator grammar that no adjacent nonterminals appear on the right sides of production implies that no right sentential form will have two adjacent nonterminals either.
§ Then the right sentential form is written as β0a1β1…anβn, where each βi is either ε or a single nonterminal and each ai is a single terminal.
§ Exactly one of the relations <., , .> holds between two terminals. Further use $ to mark each end of the string, and define $ <. b and b .> $ for all terminals b.
§ Example 2.9.1
· Consider the grammar
E → E + E | E * E | ( E ) | id
The corresponding operator precedence relation table is as follows:
Consider the initial right sentential form id + id * id.
· Then the string with the precedence relations inserted is:
$ <. id .> + <. id .> * <. id .> $ (2.9.1)
§ The handle can be found by following process:
· 1. Scan the string from the left end until the first .> is encountered. In above(2.9.1), this occurs between the first id and +.
· 2. The scan backwards (to the left over any ’s until a <. Is encountered. In(2.9.1), scan backwards to $ (left).
· 3. The handle contains everything to the left of the first .> and to the right of the <. Encountered in step (2), including any intervening or surrounding nonterminals. In (2.9.1), the handle is the first id (leftmost).
§ Then from the (Grammar 2.9.3), reduce id to E. i.e., the input string becomes E + id * id.
§ Similarly continue the above three step. Now (2.9.1) becomes E + E * E, after reducing the two id’s from (2.9.1).
§ Then when the nonterminals are deleted, there will be $ + * $, insert the operator relation and reduce.
o Algorithm 2.9.1 Operator-Precedence parsing algorithm.
§ Input: An input string w and a table of precedence relations.
§ Output: If well formed with a placeholder nonterminal E labeling all interior nodes; otherwise, an error indication.
§ Method: Initially, the stack contains $ and the input buffer the string w$.
(1) set ip to point to the first symbol of w$;
(2) repeat forever
(3) if $ is on top of the stack and ip points to $ then
(4) return
(5) else begin
(6) let a be the topmost terminal symbol on the stack
and let b be the symbol pointed to by ip;
(6) if a <. b or a b then begin
(7) push b onto the stack;
(8) advance ip to the next input symbol;
end;
(9) else if a .> b then /* reduce */
(10) repeat
(11) pop the stack
(12) until the top stack terminal is replaced by <.
to the terminal most recently popped
(13) else error( )
end;
· Operator-Precedence Relations from Associativity and Precedence:
o The rules are designed to select the “proper” handles to reflect a given set of associativity and precedence rules for binary operators.
§ 1. If operator θ1 has higher precedence than operator θ2, make θ1 .> θ2 and θ2 <. θ1. These relations ensure that, in an expression of the form E + E * E + E, the central E * E is the handle that will be reduced first.
§ 2. If θ1 and θ2 are operators of equal precedence ( they may in fact be the same operator), then make θ1 .> θ2 and θ2 .> θ1, if the operators are left associative, or make θ1 <. θ2 and θ2 <. θ1, if the operators are right associative. These relations ensure that E – E + E will have handle E – E selected and E ^ E ^ E will have right E ^ E selected.
§ 3. Make θ <. id and id .> θ,
θ <. ( and ( <. θ,
) .> θ and θ .> ),
θ .> $ and $ .> θ,
for all operators θ. Also
( ), $ <. (, $ <. id,
(<. (, id .> $, ) .> $,
(<. id, id .> ), ) .> )
o These rules ensure that both id and (E) will be reduced to E. Also, $ serves as both the left and right endmarker, causing handles to be found between $’s wherever possible.
o Example 2.9.2
§ Consider the grammar
E → E + E | E – E | E * E | E / E | E ^ E | ( E ) | - E | id (Grammar 2.9.2)
produce the operator precedence relations for the above grammar.
and then parse the input id * ( id ^ id) – id / id from the operator-
precedence relations
from the above operator-precedence relation table, parse the input string.
· Precedence Functions:
o Compilers using operator-precedence parsers need not store the table of precedence relations.
o In most cases, the table is encoded by two precedence functions f and g that map terminal symbols to integers.
o To select f and g so that for symbols a and b,
§ f(a) < g(b) whenever a <. b,
§ f(a) = g(b) whenever a b,
§ f(a) > g(b) whenever a .> b
o Thus the precedence relation between a and b determined by a numerical comparison betweenf(a) and g(b).
o Algorithm 2.9.2 Constructing precedence functions:
§ Input: An operator precedence matrix.
§ Output: Precedence functions representing the input matrix, or an indication that none exist.
§ Method:
1. Create symbols fa and ga for each a that is a terminal or $.
2. Partition the created symbols into as many groups as possible, in such a way that if a b, then fa and gb are in the same group.
3. Create a directed graph whose nodes are the groups found in (2). For any a and b, if a <. b, place an edge from the group of gb to the group of fa. If a .> b, place an edge from the group of fa to that of gb.
Note: that an edge or path from fa to gb means that f(a) must exceed g(b) ; a path from gb to fa means that g(b) must exceed f(a).
4. If the graph constructed in (3) has a cycle, then no precedence functions exist. If there are no cycles, let f(a) be the length of the longest path beginning at the group of fa; let g(b) be the length of the longest path from the group of gb.
o Example 2.9.3
§ For the operator-precedence relation for the (Grammar 2.9.2) the precedence function is given as:
o Exercise 2.9.1
§ Consider the grammar
E → E + E | E * E | ( E ) | id (Grammar 2.9.3)
§ The corresponding operator precedence relation table is as follows:
Construct the graph using the Algorithm 2.9.2 and derive the precedence function.
§ The graph representing the precedence function is as follows:
§ The resulting precedence functions are:
§ The path is represented as, for example, there is a path from gid to f* to g* to f+ to g+ to f$= 5, the precedence function g representing id is 5.
· Error Recovery in operator-precedence parsing:
o There are two points in the parsing process at which an operator-precedence parser can discover syntactic error:
§ If no precedence relation holds between the terminal on top of the stack and the current input.
§ If a handle has been found, but there is no production with this handle as a right side.
o The error checker does the following errors:
§ Missimg operand
§ Missing operator
§ No expression between parenthesis
§ Illegal b on line (line containing b)
§ Missing d on line (line containing c)
§ Missing E on line (line containing b)
o These eroor diagonistic issued at handling of errors during reduction.
o Dudring handling of shift/reduce errors; the diagonistics issues are:
§ Missing operand
§ Unbalanced right parenthesis
§ Missing right paraenthesis
§ Missing operator
Comments
Post a Comment