HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, an ambiguous grammar is a
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
for which there exists a string that can have more than one leftmost derivation or
parse tree A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in co ...
, while an unambiguous grammar is a context-free grammar for which every valid string has a unique leftmost derivation or parse tree. Many languages admit both ambiguous and unambiguous grammars, while some languages admit only ambiguous grammars. Any non-empty language admits an ambiguous grammar by taking an unambiguous grammar and introducing a duplicate rule or synonym (the only language without ambiguous grammars is the empty language). A language that only admits ambiguous grammars is called an inherently ambiguous language, and there are inherently ambiguous
context-free language In formal language theory, a context-free language (CFL) is a language generated by a context-free grammar (CFG). Context-free languages have many applications in programming languages, in particular, most arithmetic expressions are generated by ...
s. Deterministic context-free grammars are always unambiguous, and are an important subclass of unambiguous grammars; there are non-deterministic unambiguous grammars, however. For computer
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s, the reference grammar is often ambiguous, due to issues such as the
dangling else The dangling else is a problem in programming of parser generators in which an optional else clause in an if–then(–else) statement results in nested conditionals being ambiguous. Formally, the reference context-free grammar of the language i ...
problem. If present, these ambiguities are generally resolved by adding precedence rules or other context-sensitive parsing rules, so the overall phrase grammar is unambiguous. Some parsing algorithms (such as ( Earley or GLR parsers) can generate sets of parse trees (or "parse forests") from strings that are
syntactically ambiguous Syntactic ambiguity, also called structural ambiguity, amphiboly or amphibology, is a situation where a sentence may be interpreted in more than one way due to ambiguous sentence structure. Syntactic ambiguity arises not from the range of mean ...
.


Examples


Trivial language

The simplest example is the following ambiguous grammar (with start symbol A) for the trivial language that consists of only the empty string: :A → A , ε …meaning that the nonterminal A can be derived to either itself again, or to the empty string. Thus the empty string has leftmost derivations of length 1, 2, 3, and indeed of any length, depending on how many times the rule A → A is used. This language also has an unambiguous grammar, consisting of a single production rule: :A → ε …meaning that the unique production can produce only the empty string, which is the unique string in the language. In the same way, any grammar for a non-empty language can be made ambiguous by adding duplicates.


Unary string

The
regular language In theoretical computer science and formal language theory, a regular language (also called a rational language) is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science (as opposed to ...
of unary strings of a given character, say 'a' (the regular expression a*), has the unambiguous grammar: :A → aA , ε …but also has the ambiguous grammar: :A → aA , Aa , ε These correspond to producing a right-associative tree (for the unambiguous grammar) or allowing both left- and right- association. This is elaborated below.


Addition and subtraction

The
context free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be emp ...
:A → A + A , A − A , a is ambiguous since there are two leftmost derivations for the string a + a + a: As another example, the grammar is ambiguous since there are two
parse tree A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in co ...
s for the string a + a − a: : The language that it generates, however, is not inherently ambiguous; the following is a non-ambiguous grammar generating the same language: :A → A + a , A − a , a


Dangling else

A common example of ambiguity in computer programming languages is the
dangling else The dangling else is a problem in programming of parser generators in which an optional else clause in an if–then(–else) statement results in nested conditionals being ambiguous. Formally, the reference context-free grammar of the language i ...
problem. In many languages, the else in an If–then(–else) statement is optional, which results in nested conditionals having multiple ways of being recognized in terms of the context-free grammar. Concretely, in many languages one may write conditionals in two valid forms: the if-then form, and the if-then-else form – in effect, making the else clause optional:The following example uses
Pascal Pascal, Pascal's or PASCAL may refer to: People and fictional characters * Pascal (given name), including a list of people with the name * Pascal (surname), including a list of people and fictional characters with the name ** Blaise Pascal, Frenc ...
syntax
In a grammar containing the rules Statement → if Condition then Statement , if Condition then Statement else Statement , ... Condition → ... some ambiguous phrase structures can appear. The expression if a then if b then s else s2 can be parsed as either if a then begin if b then s end else s2 or as if a then begin if b then s else s2 end depending on whether the else is associated with the first if or second if. This is resolved in various ways in different languages. Sometimes the grammar is modified so that it is unambiguous, such as by requiring an endif statement or making else mandatory. In other cases the grammar is left ambiguous, but the ambiguity is resolved by making the overall phrase grammar context-sensitive, such as by associating an else with the nearest if. In this latter case the grammar is unambiguous, but the context-free grammar is ambiguous.


An unambiguous grammar with multiple derivations

The existence of multiple derivations of the same string does not suffice to indicate that the grammar is ambiguous; only multiple ''leftmost'' derivations (or, equivalently, multiple parse trees) indicate ambiguity. For example, the simple grammar S → A + A A → 0 , 1 is an unambiguous grammar for the language . While each of these four strings has only one leftmost derivation, it has two different derivations, for example S A + A ⇒ 0 + A ⇒ 0 + 0 and S ⇒ A + A ⇒ A + 0 ⇒ 0 + 0 Only the former derivation is a leftmost one.


Recognizing ambiguous grammars

The
decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whe ...
of whether an arbitrary grammar is ambiguous is undecidable because it can be shown that it is equivalent to the Post correspondence problem. At least, there are tools implementing some semi-decision procedure for detecting ambiguity of context-free grammars. The efficiency of parsing a context-free grammar is determined by the automaton that accepts it. Deterministic context-free grammars are accepted by deterministic pushdown automata and can be parsed in linear time, for example by an LR parser. They are a strict subset of the context-free grammars, which are accepted by
pushdown automata In the theory of computation, a branch of theoretical computer science, a pushdown automaton (PDA) is a type of automaton that employs a stack. Pushdown automata are used in theories about what can be computed by machines. They are more capab ...
and can be parsed in polynomial time, for example by the CYK algorithm. Unambiguous context-free grammars can be nondeterministic. For example, the language of even-length
palindrome A palindrome is a word, number, phrase, or other sequence of symbols that reads the same backwards as forwards, such as the words ''madam'' or ''racecar'', the date and time ''11/11/11 11:11,'' and the sentence: "A man, a plan, a canal – Pana ...
s on the alphabet of 0 and 1 has the unambiguous context-free grammar S → 0S0 , 1S1 , ε. An arbitrary string of this language cannot be parsed without reading all its symbols first, which means that a pushdown automaton has to try alternative state transitions to accommodate for the different possible lengths of a semi-parsed string. Nevertheless, removing grammar ambiguity may produce a deterministic context-free grammar and thus allow for more efficient parsing. Compiler generators such as
YACC Yacc (Yet Another Compiler-Compiler) is a computer program for the Unix operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right Rightmost Derivation (LALR) parser generator, generating a LALR parser (the part of a co ...
include features for resolving some kinds of ambiguity, such as by using the precedence and associativity constraints.


Inherently ambiguous languages

The existence of inherently ambiguous languages was proven with Parikh's theorem in 1961 by Rohit Parikh in an MIT research report. While some context-free languages (the set of strings that can be generated by a grammar) have both ambiguous and unambiguous grammars, there exist context-free languages for which no unambiguous context-free grammar can exist. An example of an inherently ambiguous language is the union of \ with \. This set is context-free, since the union of two context-free languages is always context-free. But give a proof that there is no way to unambiguously parse strings in the (non-context-free) common subset \.p.99-103, Sect.4.7


See also

*
GLR parser A GLR parser (GLR standing for "Generalized LR", where L stands for "left-to-right" and R stands for "rightmost (derivation)") is an extension of an LR parser algorithm to handle non-deterministic and ambiguous grammars. The theoretical foundat ...
, a type of parser for ambiguous and nondeterministic grammars * Chart parser, another type of parser for ambiguous grammars *
Syntactic ambiguity Syntactic ambiguity, also called structural ambiguity, amphiboly or amphibology, is a situation where a sentence may be interpreted in more than one way due to ambiguous sentence structure. Syntactic ambiguity arises not from the range of mean ...


References

* * * * *


Notes

{{reflist, group=note


External links


dk.brics.grammar
- a grammar ambiguity analyzer.

- tool for analyzing context-free grammars with respect to language universality, ambiguity, and similar properties. Formal languages Ambiguity