In logic and computer science, a metasyntax describes the allowable structure and composition of phrases and sentences of a metalanguage, which is used to describe either a natural language or a computer programming language (Sellink, Alex, and Chris Verhoef. "Development, assessment, and reengineering of language descriptions." Software Maintenance and Reengineering, 2000. Proceedings of the Fourth European. IEEE, 2000). Some of the widely used formal metalanguages for computer languages are Backus–Naur form (BNF), extended Backus–Naur form (EBNF), Wirth syntax notation (WSN), and augmented Backus–Naur form (ABNF).

Each of these metalanguages has its own metasyntax, composed of terminal symbols, nonterminal symbols, and ''metasymbols''. A terminal symbol, such as a word or a token, is a stand-alone structure in the language being defined. A nonterminal symbol represents a syntactic category, which defines one or more valid phrase or sentence structures, each consisting of an n-element subset of symbols. Metasymbols provide syntactic information for denotational purposes in a given metasyntax.

Terminals, nonterminals, and metasymbols are not present in every metalanguage. Typically, the metalanguage for token-level languages (formally called "regular languages") does not have nonterminals, because nesting is not an issue in regular languages. English, used as a metalanguage for describing certain languages, does not contain metasymbols, because every explanation can be given in ordinary English expressions. Only certain formal metalanguages, those used to describe languages with recursive structure (formally called context-free languages), have terminals, nonterminals, and metasymbols in their metasyntax.


Elements of metasyntax

* Terminals: stand-alone syntactic structures. A terminal can be denoted by double quoting its name, e.g. "else", "if", "then", "while".
* Nonterminals: symbolic representations, each defining a set of allowable syntactic structures that is composed of a subset of elements. A nonterminal can be denoted by angle bracketing its name, e.g. <int>, <char>, <boolean>.
* Metasymbols: symbolic representations denoting syntactic information, e.g. ::=, |, {}, (), [], "".
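
The three kinds of symbol can be told apart mechanically once a quoting convention is fixed. The following is a minimal sketch in Python, assuming the BNF-style convention described above (double-quoted terminals, angle-bracketed nonterminals) and an illustrative set of metasymbols; the function name `classify` and the sample rule are hypothetical and not taken from any standard.

```python
import re

# One pattern per kind of symbol, tried in this order:
#   "..."        a double-quoted terminal
#   <...>        an angle-bracketed nonterminal
#   ::= | { } [ ] ( )   metasymbols (an assumed, illustrative set)
TOKEN = re.compile(r'"[^"]*"|<[^<>]+>|::=|[|{}\[\]()]')


def classify(rule):
    """Return (symbol, kind) pairs for every symbol found in a rule string."""
    result = []
    for match in TOKEN.finditer(rule):
        symbol = match.group(0)
        if symbol.startswith('"'):
            kind = "terminal"
        elif symbol.startswith("<"):
            kind = "nonterminal"
        else:
            kind = "metasymbol"
        result.append((symbol, kind))
    return result


if __name__ == "__main__":
    # Hypothetical rule used only for illustration.
    for symbol, kind in classify('<stmt> ::= "if" <cond> | "while" <cond>'):
        print(f"{symbol:10} {kind}")
```

The sketch silently skips anything it does not recognise (white space, unquoted words), so it is a classifier for well-formed rules rather than a validator.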


Methods of phrase termination

* Juxtaposition: e.g. A B
* Alternation: e.g. A | B
* Repetition: e.g. { A B }
* Optional phrase: e.g. [ A B ]
* Grouping: e.g. ( A | B )
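
For a grammar without recursion, each of these five constructions has a direct counterpart in regular-expression syntax. The sketch below, in Python, is one way to illustrate that correspondence; the helper names (`juxtapose`, `alternate`, `repeat`, `optional`, `group`, `terminal`) and the signed-integer example are assumptions made for the illustration, and the approach does not extend to rules that refer to themselves.

```python
import re


def terminal(text):
    """A literal terminal, escaped so it matches itself."""
    return re.escape(text)


def juxtapose(*parts):
    """Juxtaposition: A B."""
    return "".join(parts)


def alternate(*parts):
    """Alternation: A | B."""
    return "(?:" + "|".join(parts) + ")"


def repeat(part):
    """Repetition, zero or more times: { A }."""
    return "(?:" + part + ")*"


def optional(part):
    """Optional phrase: [ A ]."""
    return "(?:" + part + ")?"


def group(part):
    """Grouping: ( A )."""
    return "(?:" + part + ")"


def matches(pattern, text):
    """True if the whole text is a phrase of the pattern."""
    return re.fullmatch(pattern, text) is not None


# Hypothetical example: integer = [ "-" ] digit { digit }
digit = alternate(*(terminal(str(d)) for d in range(10)))
integer = juxtapose(optional(terminal("-")), digit, repeat(digit))

if __name__ == "__main__":
    for sample in ["42", "-42", "4-2", ""]:
        print(repr(sample), matches(integer, sample))
```

Non-capturing groups are used throughout so that nesting one construction inside another never changes what the combined pattern matches.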


Specific metasyntax conventions


The standard convention

* Backus–Naur form (BNF) denotes nonterminal symbols by angle bracketing the name of the syntactic category, and denotes terminal symbols by double quoting the terminal words. Terminals can never appear on the left-hand side of the metasymbol ::= in a derivation rule. The body of the definition on the right-hand side may consist of several alternative forms, with the alternative syntactic constructs separated by the metasymbol |. Each alternative may be built from terminals or nonterminals.
* Extended Backus–Naur form (EBNF) uses all the facilities of BNF and introduces two more metasymbols. Square bracketing a phrase marks it as optional. Curly bracketing a phrase marks it as repeated zero or more times.
* Wirth syntax notation (WSN) uses all the facilities of EBNF, except that nonterminals are not necessarily angle bracketed; each is defined by a production rule written with the metasymbol =. WSN also does not require every nonterminal to be explicitly defined: some, such as those standing for an ASCII character and for optional white space, are implicitly defined.
* Augmented Backus–Naur form (ABNF) names each nonterminal with a single word that begins with a letter; angle brackets are not required. Terminal symbols are denoted either by double-quoted words or by a numeric form: a %, followed by b, d, or x, followed by a numeric value or a ''concatenation of numeric values'' separated by periods (.). The metasymbol - placed between two numeric values denotes a ''value range''. As in BNF, terminals never occur on the left-hand side of the metasymbol = in a derivation rule. The metasymbol / denotes ''alternation''. White space separates the elements in the body of a definition. ''Repetition'' has several forms: a * preceding an element means the element is repeated zero or more times; a numeric value m, followed by *, followed by a numeric value n, preceding an element means the element is repeated at least m and at most n times; a single numeric value n preceding an element means the element is repeated exactly n times. ''Comments'' may be written after the metasymbol ;. As in EBNF, square bracketing a phrase marks it as ''optional''.
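
ABNF's numeric terminals are the least familiar of these conventions, so a small worked example may help. The sketch below, in Python, reads a numeric terminal in the forms defined by RFC 5234 (`%b`, `%d`, or `%x`, then either a `-` separated range or a `.` separated concatenation). The function name `parse_num_val` and the tuple shapes it returns are assumptions chosen for this illustration, and digit validation is delegated to Python's `int`, which is more permissive than the ABNF specification.

```python
# Base letter after '%' mapped to its radix (binary, decimal, hexadecimal).
BASES = {"b": 2, "d": 10, "x": 16}


def parse_num_val(text):
    """Parse an ABNF numeric terminal such as %x41-5A or %d13.10.

    Returns ("range", low, high) for a value range, or
    ("concat", [v1, v2, ...]) for a single value or a concatenation.
    Raises ValueError on input that does not have this shape.
    """
    if len(text) < 3 or text[0] != "%" or text[1].lower() not in BASES:
        raise ValueError(f"not an ABNF numeric terminal: {text!r}")
    base = BASES[text[1].lower()]
    body = text[2:]
    if "-" in body:
        if "." in body:
            # A terminal is either a range or a concatenation, never both.
            raise ValueError(f"range and concatenation mixed: {text!r}")
        parts = body.split("-")
        if len(parts) != 2:
            raise ValueError(f"a range needs exactly two values: {text!r}")
        return ("range", int(parts[0], base), int(parts[1], base))
    return ("concat", [int(part, base) for part in body.split(".")])


if __name__ == "__main__":
    print(parse_num_val("%x41-5A"))  # value range: upper-case ASCII letters
    print(parse_num_val("%d13.10"))  # concatenation: carriage return, line feed
```

Returning code points rather than characters keeps the sketch independent of any particular character encoding, which matches how ABNF itself specifies terminals.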


Variations

The metasyntax conventions of these formal metalanguages are not yet formalized. Many metasyntactic variations or extensions exist in the reference manuals of various computer programming languages. One variation on the standard convention for denoting nonterminals and terminals is to remove metasymbols such as angle brackets and quotation marks and apply ''font types'' to the intended words instead. In Ada, for example, syntactic categories are denoted by setting the intended words or symbols in lower case sans-serif font. All terminal words or symbols in Ada consist of characters with code positions between 16#20# and 16#7E# (inclusive). The definition of each character set refers to the International Standard ISO/IEC 10646:2003. In C and Java, syntactic categories are denoted using italic font, while terminal symbols are denoted by gothic font. J's metasyntax does not use metasymbols to describe J's syntax at all. Rather, all syntactic explanations are given in a metalanguage very similar to English called Dictionary, which is documented specifically for J.


Advantages of the extensions

The purpose of the extensions is to provide a simpler and less ambiguous metasyntax. In terms of simplicity, BNF's metanotation is hard to read because its opening and closing metasymbols appear so frequently. In terms of ambiguity, BNF's metanotation creates unnecessary complexity when quotation marks, apostrophes, less-than signs, or greater-than signs serve as terminal symbols, which they often do. The extended metasyntaxes use properties such as case, font, and character code position to reduce this complexity. Moreover, some metalanguages use fonted separator categories to incorporate metasyntactic features for layout conventions, which BNF does not formally support.


See also

* Adaptive grammar
* Comparison of parser generators
* Metapragmatics
* Metasemantics
* Metavariable (logic)

