In the theory of formal languages in computer science, mathematics, and linguistics, a Dyck word is a balanced string of brackets. The set of Dyck words forms a Dyck language. The simplest, Dyck-1, uses just two matching brackets, e.g. ( and ). Dyck words and the Dyck language are named after the mathematician Walther von Dyck. They have applications in the parsing of expressions that must have a correctly nested sequence of brackets, such as arithmetic or algebraic expressions.

Formal definition

Let $\Sigma = \{ [, ] \}$ be the alphabet consisting of the symbols [ and ]. Let $\Sigma^{*}$ denote its Kleene closure. The Dyck language is defined as:

$\{ u \in \Sigma^{*} \mid \text{all prefixes of } u \text{ contain no more ]'s than ['s, and the number of ['s in } u \text{ equals the number of ]'s} \}.$
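The prefix condition in this definition can be checked in a single left-to-right pass. The following is a minimal sketch, assuming the usual reading of the definition (no prefix has more closing than opening brackets, and the totals are equal); the function name `is_dyck1` and its parameters are illustrative and not part of the source.

```python
def is_dyck1(word: str, open_b: str = "[", close_b: str = "]") -> bool:
    """Return True if `word` is a balanced string over {open_b, close_b}."""
    balance = 0  # unmatched opening brackets in the prefix read so far
    for symbol in word:
        if symbol == open_b:
            balance += 1
        elif symbol == close_b:
            balance -= 1
            if balance < 0:
                # Some prefix contains more closing than opening brackets.
                return False
        else:
            # Symbol outside the two-letter alphabet.
            return False
    # The totals must agree for the whole word.
    return balance == 0
```

For instance, `is_dyck1("[[][]]")` holds, while `is_dyck1("][")` fails on the first prefix and `is_dyck1("[[]")` fails the final count.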

Context-free grammar

In some situations it may be helpful to define the Dyck language via a context-free grammar. The Dyck language is generated by the context-free grammar with a single non-terminal $S$, and the production:

$S \rightarrow \varepsilon \mid \text{"["}\ S\ \text{"]"}\ S$

That is, ''S'' is either the empty string ($\varepsilon$) or is "[", an element of the Dyck language, the matching "]", and an element of the Dyck language. An alternative context-free grammar for the Dyck language is given by the production:

$S \rightarrow ( \text{"["}\ S\ \text{"]"} )^{*}$

That is, ''S'' is zero or more occurrences of the combination of "[", an element of the Dyck language, and a matching "]", where multiple elements of the Dyck language on the right side of the production are free to differ from each other.
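A grammar of this shape lends itself to a small recursive-descent recognizer. The sketch below assumes the bracket symbols are "[" and "]" and follows the first production (empty string, or an opening bracket, a nested element, the matching closing bracket, and a trailing element); the names `parse_s` and `in_dyck_language` are hypothetical. The trailing element is handled with a loop rather than a second recursive call, which is a design choice to keep the recursion depth equal to the nesting depth.

```python
from typing import Optional


def parse_s(word: str, i: int = 0) -> Optional[int]:
    """Derive S from word[i:] as far as possible.

    Returns the index just past the derived part, or None if an opening
    bracket has no matching closing bracket.
    """
    # Each loop iteration consumes one '"[" S "]"' group; leaving the loop
    # corresponds to choosing the empty-string alternative.
    while i < len(word) and word[i] == "[":
        j = parse_s(word, i + 1)  # nested S
        if j is None or j >= len(word) or word[j] != "]":
            return None
        i = j + 1  # continue with the trailing S
    return i


def in_dyck_language(word: str) -> bool:
    """True if the whole word is derivable from S."""
    return parse_s(word) == len(word)
```

Because the only choice point is decided by the next symbol, no backtracking is needed: an opening bracket forces the bracketed alternative, and anything else forces the empty one.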

Alternative definition

In yet other contexts it may instead be helpful to define the Dyck language by splitting $\Sigma^{*}$ into equivalence classes, as follows. For any element $u \in \Sigma^{*}$ of length $|u|$, we define partial functions $\operatorname{insert} : \Sigma^{*} \times \mathbb{N} \rightarrow \Sigma^{*}$ and $\operatorname{delete} : \Sigma^{*} \times \mathbb{N} \rightarrow \Sigma^{*}$ by

$\operatorname{insert}(u, j)$ is $u$ with "[]" inserted into the $j$th position

$\operatorname{delete}(u, j)$ is $u$ with "[]" deleted from the $j$th position

with the understanding that $\operatorname{insert}(u, j)$ is undefined for $j > |u|$ and $\operatorname{delete}(u, j)$ is undefined if $j > |u| - 2$ or if "[]" does not occur at the $j$th position. We define an equivalence relation $R$ on $\Sigma^{*}$ as follows: for elements $a, b \in \Sigma^{*}$ we have $(a, b) \in R$ if and only if there exists a sequence of zero or more applications of the $\operatorname{insert}$ and $\operatorname{delete}$ functions starting with $a$ and ending with $b$. That the sequence of zero operations is allowed accounts for the reflexivity of $R$. Symmetry follows from the observation that any finite sequence of applications of $\operatorname{insert}$ to a string can be undone with a finite sequence of applications of $\operatorname{delete}$, and vice versa. Transitivity is clear from the definition. The equivalence relation partitions the language $\Sigma^{*}$ into equivalence classes. If we take $\epsilon$ to denote the empty string, then the language corresponding to the equivalence class $\operatorname{Cl}(\epsilon)$ is called the Dyck language.
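The two partial functions and the resulting membership test can be sketched in a few lines. This is an illustration under stated assumptions: positions are taken to be 0-based (consistent with insertion being defined up to the length of the string), "undefined" is modelled by returning `None`, and the helper `equivalent_to_empty` is a hypothetical name. It relies on the fact that a string is equivalent to the empty string exactly when repeatedly deleting an adjacent "[]" pair eventually leaves nothing.

```python
from typing import Optional


def insert(u: str, j: int) -> Optional[str]:
    """u with "[]" inserted at position j; None where undefined."""
    if j < 0 or j > len(u):
        return None
    return u[:j] + "[]" + u[j:]


def delete(u: str, j: int) -> Optional[str]:
    """u with "[]" deleted from position j; None where undefined."""
    if j < 0 or j > len(u) - 2 or u[j:j + 2] != "[]":
        return None
    return u[:j] + u[j + 2:]


def equivalent_to_empty(u: str) -> bool:
    """True if u can be reduced to the empty string by deletions alone."""
    while True:
        j = u.find("[]")  # leftmost adjacent pair, or -1 if there is none
        if j == -1:
            return u == ""
        u = u[:j] + u[j + 2:]  # same result as delete(u, j)
```

This reduction is quadratic in the worst case, so it is meant to mirror the definition rather than to be an efficient recognizer.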

Generalizations

Typed Dyck language

There exist variants of the Dyck language with multiple delimiters, e.g., Dyck-2 on the alphabet "(", ")", "[", and "]". The words of such a language are the ones which are well-parenthesized for all delimiters, i.e., one can read the word from left to right, push every opening delimiter onto a stack, and, whenever a closing delimiter is reached, pop the matching opening delimiter from the top of the stack; the stack must be empty at the end. (Counting opening and closing brackets, which suffices for a single bracket pair, does not generalise.) For example, the following is a valid sentence in Dyck-3:

( [ [ ] { } ] ( ) { ( ) } ) [ ]
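The stack procedure described here can be sketched as follows. The delimiter table `PAIRS` (three bracket types, i.e. Dyck-3) and the function name `is_dyck_k` are assumptions made for illustration; any table mapping each closing delimiter to its opening partner would do.

```python
# Hypothetical delimiter table: closing delimiter -> matching opening delimiter.
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_dyck_k(word: str, pairs: dict = PAIRS) -> bool:
    """True if `word` is well-nested with respect to every delimiter pair."""
    openers = set(pairs.values())
    stack = []
    for symbol in word:
        if symbol in openers:
            stack.append(symbol)
        elif symbol in pairs:
            # The top of the stack must be the matching opening delimiter.
            if not stack or stack.pop() != pairs[symbol]:
                return False
        else:
            return False  # symbol outside the alphabet
    # Every opening delimiter must have been closed.
    return not stack
```

A word such as "([)]" shows why a per-type counter is not enough: each bracket type is balanced on its own, yet the stack check rejects it because ")" arrives while "[" is on top.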

Finite depth

A Dyck language sentence can be pictured as a descent and ascent through the levels of nested brackets. As one reads along a Dyck sentence, each opening bracket increases the nesting depth by 1, and each closing bracket decreases it by 1. The depth of a sentence is the maximal depth reached within the sentence. For example, we can annotate the following sentence with the depth at each step:

0 ( 1 [ 2 [ 3 ] 2 { 3 } 2 ] 1 ( 2 ) 1 { 2 ( 3 ) 2 } 1 ) 0 [ 1 ] 0

and the entire sentence has depth 3. We define Dyck-(k, m) as the language with k types of brackets and maximal depth m. This has applications in the formal theory of recurrent neural networks.
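Depth can be tracked with the same stack used for matching, since the stack height after each symbol is the current nesting depth. The sketch below is illustrative only: `dyck_depth` and `in_dyck_k_m` are hypothetical names, the delimiter table is an assumed Dyck-3 alphabet, and "maximal depth m" is read as "depth at most m".

```python
from typing import Optional

# Hypothetical delimiter table: closing delimiter -> matching opening delimiter.
PAIRS = {")": "(", "]": "[", "}": "{"}


def dyck_depth(word: str, pairs: dict = PAIRS) -> Optional[int]:
    """Maximal nesting depth of `word`, or None if it is not well-nested."""
    openers = set(pairs.values())
    stack = []
    deepest = 0
    for symbol in word:
        if symbol in openers:
            stack.append(symbol)
            deepest = max(deepest, len(stack))  # stack height = current depth
        elif symbol in pairs:
            if not stack or stack.pop() != pairs[symbol]:
                return None
        else:
            return None  # symbol outside the alphabet
    return deepest if not stack else None


def in_dyck_k_m(word: str, m: int, pairs: dict = PAIRS) -> bool:
    """True if `word` is well-nested and never nests deeper than m."""
    depth = dyck_depth(word, pairs)
    return depth is not None and depth <= m
```

With this reading, "()()" belongs to the depth-1 language while "(())" does not, even though both are well-nested.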
