HOME

TheInfoList



OR:

An affix grammar is a kind of
formal grammar In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
; it is used to describe the
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituenc ...
of languages, mainly
computer language A computer language is a formal language used to communicate with a computer. Types of computer languages include: * Construction language – all forms of communication by which a human can specify an executable problem solution to a comput ...
s, using an approach based on how natural language is typically described.Koster, Cornelis HA.
Affix grammars for natural languages
" Attribute Grammars, Applications and Systems. Springer, Berlin, Heidelberg, 1991.
The grammatical rules of an affix grammar are those of a
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
, except that certain parts in the nonterminals (the
affix In linguistics, an affix is a morpheme that is attached to a word stem to form a new word or word form. Affixes may be derivational, like English ''-ness'' and ''pre-'', or inflectional, like English plural ''-s'' and past tense ''-ed''. They ...
es) are used as arguments. If the same affix occurs multiple times in a rule, its value must agree, i.e. it must be the same everywhere. In some types of affix grammar, more complex relationships between affix values are possible.


Example

We can describe an extremely simple fragment of English in the following manner: : ''Sentence'' → ''Subject'' ''Predicate'' : ''Subject'' → ''Noun'' : ''Predicate'' → ''Verb'' ''Object'' : ''Object'' → ''Noun'' : : ''Noun'' → John : ''Noun'' → Mary : ''Noun'' → children : ''Noun'' → parents : : ''Verb'' → like : ''Verb'' → likes : ''Verb'' → help : ''Verb'' → helps This
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
describes simple sentences such as : John likes children : Mary helps John : children help parents : parents like John With more nouns and verbs, and more rules to introduce other parts of speech, a large range of English sentences can be described; so this is a promising approach for describing the syntax of English. However, the given grammar also describes sentences such as : John like children : children helps parents These sentences are wrong: in English, subject and verb have a
grammatical number In linguistics, grammatical number is a grammatical category of nouns, pronouns, adjectives and verb agreement that expresses count distinctions (such as "one", "two" or "three or more"). English and other languages present number categories of ...
, which must agree. An affix grammar can express this directly: : ''Sentence'' → ''Subject'' + ''number Predicate'' + ''number'' : ''Subject'' + ''number'' → ''Noun'' + ''number'' : ''Predicate'' + ''number'' → ''Verb'' + ''number Object'' : ''Object'' → ''Noun'' + ''number'' : ''Noun'' + ''singular'' → John : ''Noun'' + ''singular'' → Mary : ''Noun'' + ''plural'' → children : ''Noun'' + ''plural'' → parents : : ''Verb'' + ''singular'' → likes : ''Verb'' + ''plural'' → like : ''Verb'' + ''singular'' → helps : ''Verb'' + ''plural'' → help This grammar only describes correct English sentences, although it could be argued that : John likes John is still incorrect and should instead read : John likes himself This, too, can be incorporated using affixes, if the means of describing the relationships between different affix values are powerful enough. As remarked above, these means depend on the type of affix grammar chosen.


Types

In the simplest type of affix grammar, affixes can only take values from a finite domain, and affix values can only be related through agreement, as in the example. Applied in this way, affixes increase compactness of grammars, but do not add expressive power. Another approach is to allow affixes to take arbitrary strings as values and allow concatenations of affixes to be used in rules. The ranges of allowable values for affixes can be described with context-free grammar rules. This produces the formalism of
two-level grammar A two-level grammar is a formal grammar that is used to generate another formal grammar, such as one with an infinite rule set. This is how a Van Wijngaarden grammar was used to specify Algol 68. A context-free grammar that defines the rules for a ...
s, also known as '' Van Wijngaarden grammars'' or ''2VW'' grammars. These have been successfully used to describe complicated languages, in particular, the syntax of the
Algol 68 ALGOL 68 (short for ''Algorithmic Language 1968'') is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously d ...
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
. However, it turns out that, even though affix values can only be manipulated with string concatenation, this formalism is
Turing complete Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical ...
; hence, even the most basic questions about the language described by an arbitrary 2VW grammar are undecidable in general. Extended Affix Grammars, developed in the 1980s, are a more restricted version of the same idea. They were mainly applied to describe the grammar of natural language, e.g. English. Another possibility is to allow the values of affixes to be computed by code written in some programming language. Two basic approaches have been used: * In attribute grammars, the affixes (called attributes) can take values from arbitrary domains (e.g. integer or real numbers, complex data structures) and arbitrary functions can be specified, written in a language of choice, to describe how affix values in rules are derived from each other. * In CDL (the
Compiler Description Language Compiler Description Language (CDL) is a programming language based on affix grammars. It is very similar to Backus–Naur form (BNF) notation. It was designed for the development of compilers. It is very limited in its capabilities and control f ...
) and its successor CDL2, developed in the 1970s, fragments of source code (usually in
assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence b ...
) can be used in rules instead of normal right-hand sides, allowing primitives for input scanning and affix value computations to be expressed directly. Designed as a basis for practical
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
construction, this approach was used to write compilers, and other software, e.g. a
text editor A text editor is a type of computer program that edits plain text. Such programs are sometimes known as "notepad" software (e.g. Windows Notepad). Text editors are provided with operating systems and software development packages, and can be ...
.


References

{{reflist Formal languages Compiler construction Syntax Grammar frameworks