Wirth Syntax Notation

Wirth syntax notation (WSN) is a metasyntax, that is, a formal way to describe formal languages. Originally proposed by Niklaus Wirth in 1977 as an alternative to Backus–Naur form (BNF), it has several advantages over BNF: it contains an explicit iteration construct, and it avoids the use of an explicit symbol for the empty string (such as <empty> or ε). WSN has been used in several international standards, starting with ISO 10303-21. It was also used to define the syntax of EXPRESS, the data modelling language of STEP.


WSN defined in itself

    SYNTAX     = { PRODUCTION } .
    PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
    EXPRESSION = TERM { "|" TERM } .
    TERM       = FACTOR { FACTOR } .
    FACTOR     = IDENTIFIER
               | LITERAL
               | "[" EXPRESSION "]"
               | "(" EXPRESSION ")"
               | "{" EXPRESSION "}" .
    IDENTIFIER = letter { letter } .
    LITERAL    = """" character { character } """" .

The equals sign indicates a production. The element on the left is defined to be the combination of elements on the right. A production is terminated by a full stop (period).

* Repetition is denoted by curly brackets, e.g., {a} stands for ε | a | aa | aaa | ....
* Optionality is expressed by square brackets, e.g., [a]b stands for ab | b.
* Parentheses serve for groupings, e.g., (a|b)c stands for ac | bc.

We take these concepts for granted today, but they were novel and even controversial in 1977. Wirth later incorporated some of the concepts (with a different syntax and notation) into extended Backus–Naur form.

Notice that letter and character are left undefined. This is because numeric characters (digits 0 to 9) may be included in both definitions or excluded from one, depending on the language being defined, e.g.:

    digit      = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .
    upper-case = "A" | "B" | … | "Y" | "Z" .
    lower-case = "a" | "b" | … | "y" | "z" .
    letter     = upper-case | lower-case .

If character goes on to include digit and other printable
ASCII characters, then it diverges even more from letter, which one can assume does not include the digit characters or any of the special (non-alphanumeric) characters.
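
Because WSN is defined in itself, its seven productions map almost one-to-one onto a recursive-descent parser. The following is a minimal sketch in Python, not a reference implementation: the names tokenize, Parser and the tuple-based tree format are hypothetical, and allowing digits and hyphens inside identifiers is an assumption made so that names such as upper-case are accepted.

```python
import re

# One token per match: a quoted literal ("" inside stands for one quote mark),
# an identifier, or a WSN punctuation mark. Allowing digits and hyphens in
# identifiers is an assumption; the grammar leaves letter undefined.
TOKEN = re.compile(r'\s*(?:("(?:[^"]|"")*")|([A-Za-z][A-Za-z0-9-]*)|([=.|()\[\]{}]))')
BRACKETS = {"(": (")", "group"), "[": ("]", "opt"), "{": ("}", "rep")}


def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        literal, identifier, punct = match.groups()
        if literal is not None:
            tokens.append(("lit", literal[1:-1].replace('""', '"')))
        elif identifier is not None:
            tokens.append(("id", identifier))
        else:
            tokens.append(("punct", punct))
        pos = match.end()
    return tokens


class Parser:
    """One method per WSN production; trees are (tag, payload) tuples."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def expect(self, value):
        if self.peek() != ("punct", value):
            raise SyntaxError(f"expected {value!r}, got {self.peek()[1]!r}")
        self.i += 1

    def syntax(self):  # SYNTAX = { PRODUCTION } .
        rules = {}
        while self.peek()[0] is not None:
            name, expression = self.production()
            rules[name] = expression
        return rules

    def production(self):  # PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
        kind, name = self.peek()
        if kind != "id":
            raise SyntaxError(f"expected an identifier, got {name!r}")
        self.i += 1
        self.expect("=")
        expression = self.expression()
        self.expect(".")
        return name, expression

    def expression(self):  # EXPRESSION = TERM { "|" TERM } .
        terms = [self.term()]
        while self.peek() == ("punct", "|"):
            self.i += 1
            terms.append(self.term())
        return terms[0] if len(terms) == 1 else ("alt", terms)

    def term(self):  # TERM = FACTOR { FACTOR } .
        factors = [self.factor()]
        while self._starts_factor():
            factors.append(self.factor())
        return factors[0] if len(factors) == 1 else ("seq", factors)

    def _starts_factor(self):
        kind, value = self.peek()
        return kind in ("id", "lit") or (kind == "punct" and value in BRACKETS)

    def factor(self):  # FACTOR = IDENTIFIER | LITERAL | "[" ... "]" | "(" ... ")" | "{" ... "}" .
        kind, value = self.peek()
        if kind in ("id", "lit"):
            self.i += 1
            return (kind, value)
        if kind == "punct" and value in BRACKETS:
            self.i += 1
            closer, tag = BRACKETS[value]
            inner = self.expression()
            self.expect(closer)
            return (tag, inner)
        raise SyntaxError(f"unexpected token {value!r}")


WSN_GRAMMAR = '''
SYNTAX     = { PRODUCTION } .
PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
EXPRESSION = TERM { "|" TERM } .
TERM       = FACTOR { FACTOR } .
FACTOR     = IDENTIFIER | LITERAL | "[" EXPRESSION "]"
           | "(" EXPRESSION ")" | "{" EXPRESSION "}" .
IDENTIFIER = letter { letter } .
LITERAL    = """" character { character } """" .
'''

rules = Parser(tokenize(WSN_GRAMMAR)).syntax()
print(list(rules))
```

Feeding the parser its own definition is a convenient smoke test: each rule name becomes a dictionary key, and a construct such as { PRODUCTION } becomes the tuple ("rep", ("id", "PRODUCTION")).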


Another example

The syntax of BNF can be represented with WSN as follows, based on translating the BNF example of itself:

    syntax         = rule [ syntax ] .
    rule           = opt-whitespace "<" rule-name ">" opt-whitespace "::="
                     opt-whitespace expression line-end .
    opt-whitespace = { " " } .
    expression     = list [ "|" expression ] .
    line-end       = opt-whitespace EOL | line-end line-end .
    list           = term [ opt-whitespace list ] .
    term           = literal | "<" rule-name ">" .
    literal        = """" text """" | "'" text "'" .

This definition appears overcomplicated because the concept of "optional whitespace" must be explicitly defined in BNF, but it is implicit in WSN. Even in this example, text is left undefined, but it is assumed to mean "ASCII-character { ASCII-character }". (EOL is also left undefined.) Notice how the kludge "<" rule-name ">" has been used twice because text was not explicitly defined.

One of the problems with BNF which this example illustrates is that by allowing both single-quote and double-quote characters to be used for a literal, there is an added potential for human error in attempting to create a machine-readable syntax. One of the concepts that migrated to later metasyntaxes was the idea that giving the user multiple choices made it harder to write parsers for grammars defined by the syntax, so computer languages in general have become more restrictive in how a quoted-literal is defined.
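
The translation also runs in the other direction: each WSN repetition or option can be replaced by a helper rule with an explicit empty alternative, which is what BNF has to spell out by hand. The following Python sketch illustrates this under stated assumptions. The (tag, payload) tuple format, the function names to_bnf and show, the generated helper names such as <rep-1>, and the choice of "" as the empty symbol are all hypothetical and not part of WSN or BNF.

```python
from itertools import count


def to_bnf(rules):
    """Rewrite WSN-style trees into BNF alternatives (lists of symbol lists).

    Trees are (tag, payload) tuples: ("id", name), ("lit", text),
    ("seq", [nodes]), ("alt", [nodes]), and ("rep" | "opt" | "group", node).
    An "alt" node is assumed to appear only at the top of a rule or
    directly inside a bracket, as the WSN grammar requires.
    """
    out, fresh = {}, count(1)

    def alternatives(node):
        return node[1] if node[0] == "alt" else [node]

    def symbols(node):
        kind, value = node
        if kind == "seq":
            return [s for part in value for s in symbols(part)]
        if kind == "id":
            return [f"<{value}>"]
        if kind == "lit":
            return [f'"{value}"']
        # BNF has no brackets, so every bracketed part gets a helper rule.
        helper = f"<{kind}-{next(fresh)}>"
        body = [symbols(alt) for alt in alternatives(value)]
        if kind == "rep":
            # Right recursion plus an explicit empty alternative.
            body = [alt + [helper] for alt in body] + [['""']]
        elif kind == "opt":
            body = body + [['""']]
        out[helper] = body
        return [helper]

    for name, node in rules.items():
        out[f"<{name}>"] = [symbols(alt) for alt in alternatives(node)]
    return out


def show(bnf):
    return "\n".join(
        f"{name} ::= " + " | ".join(" ".join(alt) for alt in alts)
        for name, alts in bnf.items()
    )


# Two rules from the example above, written out by hand as trees.
example = {
    "opt-whitespace": ("rep", ("lit", " ")),
    "syntax": ("seq", [("id", "rule"), ("opt", ("id", "syntax"))]),
}
print(show(to_bnf(example)))
```

For the two sample rules this prints four BNF productions:

    <rep-1> ::= " " <rep-1> | ""
    <opt-whitespace> ::= <rep-1>
    <opt-2> ::= <syntax> | ""
    <syntax> ::= <rule> <opt-2>

The helper rules make visible the recursion and the empty alternative that the curly and square brackets of WSN keep implicit.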


Syntax diagram

[Figure: syntax diagram (railroad diagram) of WSN]

