abstract algebra In mathematics, more specifically algebra, abstract algebra or modern algebra is the study of algebraic structures, which are set (mathematics), sets with specific operation (mathematics), operations acting on their elements. Algebraic structur ...

, the free monoid on a set is the monoid whose elements are all the finite sequences (or strings) of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the

empty string In formal language theory, the empty string, or empty word, is the unique String (computer science), string of length zero. Formal theory Formally, a string is a finite, ordered sequence of character (symbol), characters such as letters, digits ...

and denoted by ε or λ, as the

identity element In mathematics, an identity element or neutral element of a binary operation is an element that leaves unchanged every element when the operation is applied. For example, 0 is an identity element of the addition of real numbers. This concept is use ...

. The free monoid on a set ''A'' is usually denoted ''A''^∗. The free semigroup on ''A'' is the sub semigroup of ''A''^∗ containing all elements except the empty string. It is usually denoted ''A''⁺.

/ref> More generally, an abstract monoid (or semigroup) ''S'' is described as free if it is isomorphic to the free monoid (or semigroup) on some set. As the name implies, free monoids and semigroups are those objects which satisfy the usual universal property defining free objects, in the respective categories of monoids and semigroups. It follows that every monoid (or semigroup) arises as a homomorphic image of a free monoid (or semigroup). The study of semigroups as images of free semigroups is called combinatorial semigroup theory. Free monoids (and monoids in general) are associative, by definition; that is, they are written without any parenthesis to show grouping or order of operation. The non-associative equivalent is the free magma.

Examples

Natural numbers

The monoid (N₀,+) of natural numbers (including zero) under addition is a free monoid on a singleton free generator, in this case, the natural number 1. According to the formal definition, this monoid consists of all sequences like "1", "1+1", "1+1+1", "1+1+1+1", and so on, including the empty sequence. Mapping each such sequence to its evaluation result and the empty sequence to zero establishes an isomorphism from the set of such sequences to N₀. This isomorphism is compatible with "+", that is, for any two sequences ''s'' and ''t'', if ''s'' is mapped (i.e. evaluated) to a number ''m'' and ''t'' to ''n'', then their concatenation ''s''+''t'' is mapped to the sum ''m''+''n''.

Kleene star

In formal language theory, usually a finite set of "symbols" A (sometimes called the

alphabet An alphabet is a standard set of letter (alphabet), letters written to represent particular sounds in a spoken language. Specifically, letters largely correspond to phonemes as the smallest sound segments that can distinguish one word from a ...

) is considered. A finite sequence of symbols is called a "word over ''A''", and the free monoid ''A''^∗ is called the "

Kleene star In mathematical logic and theoretical computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation on a Set (mathematics), set to generate a set of all finite-length strings that are composed of zero or more repe ...

of ''A''". Thus, the abstract study of formal languages can be thought of as the study of subsets of finitely generated free monoids. For example, assuming an alphabet ''A'' = , its Kleene star ''A''^∗ contains all concatenations of ''a'', ''b'', and ''c'': :. If ''A'' is any set, the ''word length'' function on ''A''^∗ is the unique monoid homomorphism from ''A''^∗ to (N₀,+) that maps each element of ''A'' to 1. A free monoid is thus a graded monoid.Sakarovitch (2009) p.382 (A graded monoid

M

is a monoid that can be written as

M=M_0\oplus M_1\oplus M_2 \cdots

. Each

M_n

is a grade; the grading here is just the length of the string. That is,

M_n

contains those strings of length

n.

The

\oplus

symbol here can be taken to mean "set union"; it is used instead of the symbol

\cup

because, in general, set unions might not be monoids, and so a distinct symbol is used. By convention, gradations are always written with the

\oplus

symbol.) There are deep connections between the theory of semigroups and that of automata. For example, every formal language has a syntactic monoid that recognizes that language. For the case of a

regular language In theoretical computer science and formal language theory, a regular language (also called a rational language) is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science (as opposed to ...

, that monoid is isomorphic to the transition monoid associated to the semiautomaton of some deterministic finite automaton that recognizes that language. The regular languages over an alphabet A are the closure of the finite subsets of A*, the free monoid over A, under union, product, and generation of submonoid. For the case of concurrent computation, that is, systems with locks, mutexes or thread joins, the computation can be described with history monoids and trace monoids. Roughly speaking, elements of the monoid can commute, (e.g. different threads can execute in any order), but only up to a lock or mutex, which prevent further commutation (e.g. serialize thread access to some object).

Conjugate words

We define a pair of words in ''A''^∗ of the form ''uv'' and ''vu'' as conjugate: the conjugates of a word are thus its circular shifts.Sakarovitch (2009) p.27 Two words are conjugate in this sense if they are conjugate in the sense of group theory as elements of the free group generated by ''A''.

Equidivisibility

A free monoid is equidivisible: if the equation ''mn'' = ''pq'' holds, then there exists an ''s'' such that either ''m'' = ''ps'', ''sn'' = ''q'' (example see image) or ''ms'' = ''p'', ''n'' = ''sq''.Sakarovitch (2009) p.26 This result is also known as Levi's lemma. A monoid is free if and only if it is graded (in the strong sense that only the identity has gradation 0) and equidivisible.

Free generators and rank

The members of a set ''A'' are called the free generators for ''A''^∗ and ''A''⁺. The superscript * is then commonly understood to be the

. More generally, if ''S'' is an abstract free monoid (semigroup), then a set of elements which maps onto the set of single-letter words under an isomorphism to a monoid ''A''^∗ (semigroup ''A⁺'') is called a ''set of free generators'' for ''S''. Each free monoid (or semigroup) ''S'' has exactly one set of free generators, the

cardinality The thumb is the first digit of the hand, next to the index finger. When a person is standing in the medical anatomical position (where the palm is facing to the front), the thumb is the outermost digit. The Medical Latin English noun for thum ...

of which is called the ''rank'' of ''S''. Two free monoids or semigroups are isomorphic if and only if they have the same rank. In fact, ''every'' set of generators for a free monoid or semigroup ''S'' contains the free generators, since a free generator has word length 1 and hence can only be generated by itself. It follows that a free semigroup or monoid is finitely generated if and only if it has finite rank. A

submonoid In abstract algebra, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being . Monoids are semigroups with identity ...

''N'' of ''A''^∗ is stable if ''u'', ''v'', ''ux'', ''xv'' in ''N'' together imply ''x'' in ''N''. A submonoid of ''A''^∗ is stable if and only if it is free. For example, using the set of bits as ''A'', the set ''N'' of all bit strings containing an even number of "1"s is a stable submonoid because if ''u'' contains an even number of "1"s, and ''ux'' as well, then ''x'' must contain an even number of "1"s, too. While ''N'' cannot be freely generated by any set of single bits, it ''can'' be freely generated by the set of bit strings – the set of strings of the form "10^''n''1" for some nonnegative integer ''n'' (along with the string "0").

Codes

A set of free generators for a free monoid ''P'' is referred to as a basis for P: a set of words ''C'' is a code if ''C''* is a free monoid and ''C'' is a basis. A set ''X'' of words in ''A''^∗ is a prefix, or has the prefix property, if it does not contain a proper (string) prefix of any of its elements. Every prefix in ''A''⁺ is a code, indeed a

prefix code A prefix code is a type of code system distinguished by its possession of the prefix property, which requires that there is no whole Code word (communication), code word in the system that is a prefix (computer science), prefix (initial segment) of ...

. A submonoid ''N'' of ''A''^∗ is right unitary if ''x'', ''xy'' in ''N'' implies ''y'' in ''N''. A submonoid is generated by a prefix if and only if it is right unitary.

Factorization

A factorization of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–Fox–Lyndon theorem states that the Lyndon words furnish a factorization. More generally, Hall words provide a factorization; the Lyndon words are a special case of the Hall words.

Free hull

The intersection of free submonoids of a free monoid ''A''^∗ is again free. If ''S'' is a subset of a free monoid ''A''* then the intersection of all free submonoids of ''A''* containing ''S'' is well-defined, since ''A''* itself is free, and contains ''S''; it is a free monoid and called the free hull of ''S''. A basis for this intersection is a code. The defect theorem states that if ''X'' is finite and ''C'' is the basis of the free hull of ''X'', then either ''X'' is a code and ''C'' = ''X'', or :, ''C'', ≤ , ''X'', − 1 .

Morphisms

monoid morphism In abstract algebra, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being . Monoids are semigroups with identi ...

''f'' from a free monoid ''B''^∗ to a monoid ''M'' is a map such that ''f''(''xy'') = ''f''(''x'')⋅''f''(''y'') for words ''x'',''y'' and ''f''(ε) = ι, where ε and ι denote the identity elements of ''B''^∗ and ''M'', respectively. The morphism ''f'' is determined by its values on the letters of ''B'' and conversely any map from ''B'' to ''M'' extends to a morphism. A morphism is non-erasing or continuous if no letter of ''B'' maps to ι and trivial if every letter of ''B'' maps to ι. A morphism ''f'' from a free monoid ''B''^∗ to a free monoid ''A''^∗ is total if every letter of ''A'' occurs in some word in the image of ''f''; cyclic or periodic if the image of ''f'' is contained in ^∗ for some word ''w'' of ''A''^∗. A morphism ''f'' is ''k''-uniform if the length , ''f''(''a''), is constant and equal to ''k'' for all ''a'' in ''A''. A 1-uniform morphism is strictly alphabetic or a coding. A morphism ''f'' from a free monoid ''B''^∗ to a free monoid ''A''^∗ is simplifiable if there is an alphabet ''C'' of cardinality less than that of ''B'' such the morphism ''f'' factors through ''C''^∗, that is, it is the composition of a morphism from ''B''^∗ to ''C''^∗ and a morphism from that to ''A''^∗; otherwise ''f'' is elementary. The morphism ''f'' is called a code if the image of the alphabet ''B'' under ''f'' is a code. Every elementary morphism is a code.

Test sets

For ''L'' a subset of ''B''^∗, a finite subset ''T'' of ''L'' is a ''test set'' for ''L'' if morphisms ''f'' and ''g'' on ''B''^∗ agree on ''L'' if and only if they agree on ''T''. The Ehrenfeucht conjecture is that any subset ''L'' has a test set: it has been proved independently by Albert and Lawrence; McNaughton; and Guba. The proofs rely on

Hilbert's basis theorem In mathematics Hilbert's basis theorem asserts that every ideal (ring theory), ideal of a polynomial ring over a field (mathematics), field has a finite generating set of an ideal, generating set (a finite ''basis'' in Hilbert's terminology). In ...

Map and fold

The computational embodiment of a monoid morphism is a map followed by a fold. In this setting, the free monoid on a set ''A'' corresponds to lists of elements from ''A'' with concatenation as the binary operation. A monoid homomorphism from the free monoid to any other monoid (''M'',•) is a function ''f'' such that * ''f''(''x''₁...''x''_''n'') = ''f''(''x''₁) • ... • ''f''(''x''_''n'') * ''f''() = ''e'' where ''e'' is the identity on ''M''. Computationally, every such homomorphism corresponds to a map operation applying ''f'' to all the elements of a list, followed by a fold operation which combines the results using the binary operator •. This computational paradigm (which can be generalized to non-associative binary operators) has inspired the MapReduce software framework.

Endomorphisms

An endomorphism of ''A''^∗ is a morphism from ''A''^∗ to itself. The

identity map Graph of the identity function on the real numbers In mathematics, an identity function, also called an identity relation, identity map or identity transformation, is a function that always returns the value that was used as its argument, unc ...

''I'' is an endomorphism of ''A''^∗, and the endomorphisms form a monoid under composition of functions. An endomorphism ''f'' is prolongable if there is a letter ''a'' such that ''f''(''a'') = ''as'' for a non-empty string ''s''.Allouche & Shallit (2003) p.10

String projection

The operation of string projection is an endomorphism. That is, given a letter ''a'' ∈ Σ and a string ''s'' ∈ Σ^∗, the string projection ''p''_a(''s'') removes every occurrence of ''a'' from ''s''; it is formally defined by :

p_a(s) = \begin 
\varepsilon & \text s=\varepsilon, \text \\
p_a(t) & \text s=ta  \\ 
p_a(t)b & \text s=tb \text b\ne a.   
\end

Note that string projection is well-defined even if the rank of the monoid is infinite, as the above recursive definition works for all strings of finite length. String projection is a morphism in the category of free monoids, so that :

p_a\left(\Sigma^*\right)= \left(\Sigma-a\right)^*

where

p_a\left(\Sigma^*\right)

is understood to be the free monoid of all finite strings that don't contain the letter ''a''. Projection commutes with the operation of string concatenation, so that

p_a(st)=p_a(s)p_a(t)

for all strings ''s'' and ''t''. There are many right inverses to string projection, and thus it is a split epimorphism. The identity morphism is

p_\varepsilon,

defined as

p_\varepsilon(s)=s

for all strings ''s'', and

p_\varepsilon(\varepsilon)=\varepsilon

. String projection is commutative, as clearly :

p_a(p_b(s))=p_b(p_a(s)).

For free monoids of finite rank, this follows from the fact that free monoids of the same rank are isomorphic, as projection reduces the rank of the monoid by one. String projection is idempotent, as :

p_a(p_a(s))=p_a(s)

for all strings ''s''. Thus, projection is an idempotent, commutative operation, and so it forms a bounded semilattice or a commutative band.

The free commutative monoid

Given a set ''A'', the free commutative monoid on ''A'' is the set of all finite multisets with elements drawn from ''A'', with the monoid operation being multiset sum and the monoid unit being the empty multiset. For example, if ''A'' = , elements of the free commutative monoid on ''A'' are of the form :. The fundamental theorem of arithmetic states that the monoid of positive integers under multiplication is a free commutative monoid on an infinite set of generators, the

prime number A prime number (or a prime) is a natural number greater than 1 that is not a Product (mathematics), product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime ...

s. The free commutative semigroup is the subset of the free commutative monoid that contains all multisets with elements drawn from ''A'' except the empty multiset. The free partially commutative monoid, or '' trace monoid'', is a generalization that encompasses both the free and free commutative monoids as instances. This generalization finds applications in

combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and as an end to obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many ...

and in the study of parallelism in

computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

Notes

References

* * * * * * * *

External links

*{{Commons category-inline Semigroup theory Formal languages Free algebraic structures Combinatorics on words