HOME





Straight-line Grammar
A straight-line grammar (sometimes abbreviated as SLG) is a formal grammar that generates exactly one string.Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013. , p. 488 Consequently, it does not branch (every non-terminal has only one associated production rule) nor loop (if non-terminal ''A'' appears in a derivation of ''B'', then ''B'' does not appear in a derivation of ''A''). Areas of usefulness Straight-line grammars are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression). SLGs are of interest in fields like Kolmogorov complexity, Lossless data compression, Structure discovery and Compressed data structures. The problem of finding a context-free grammar (equivalently: an SLG) of minimal size that generates a given string is called the smal ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Formal Grammar
A formal grammar is a set of Terminal and nonterminal symbols, symbols and the Production (computer science), production rules for rewriting some of them into every possible string of a formal language over an Alphabet (formal languages), alphabet. A grammar does not describe the semantics, meaning of the strings — only their form. In applied mathematics, formal language theory is the discipline that studies formal grammars and languages. Its applications are found in theoretical computer science, theoretical linguistics, Formal semantics (logic), formal semantics, mathematical logic, and other areas. A formal grammar is a Set_(mathematics), set of rules for rewriting strings, along with a "start symbol" from which rewriting starts. Therefore, a grammar is usually thought of as a language generator. However, it can also sometimes be used as the basis for a "recognizer"—a function in computing that determines whether a given string belongs to the language or is grammatical ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Kolmogorov Complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963 and is a generalization of classical information theory. The notion of Kolmogorov complexity can be used to state and prove impossibility results akin to Cantor's diagonal argument, Gödel's incompleteness theorem, and Turing's halting problem. In particular, no program ''P'' computing a lower bound for each text's Kolmogorov complexity can return a value essentially larger than ''P'''s own len ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Lossless Data Compression
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits Redundancy (information theory), statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved Bit rate#Bitrates in multimedia, compression rates (and therefore reduced media sizes). By operation of the pigeonhole principle, no lossless compression algorithm can shrink the size of all possible data: Some data will get longer by at least one symbol or bit. Compression algorithms are usually effective for human- and machine-readable documents and cannot shrink the size of random data that contain no Redundancy (information theory), redundancy. Different algorithms exist that are designed either with a specific type of input data in mind or with speci ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Structure Discovery
A structure is an arrangement and organization of interrelated elements in a material object or system, or the object or system so organized. Material structures include man-made objects such as buildings and machines and natural objects such as biological organisms, minerals and chemicals. Abstract structures include data structures in computer science and musical form. Types of structure include a hierarchy (a cascade of one-to-many relationships), a network featuring many-to-many links, or a lattice featuring connections between components that are neighbors in space. Load-bearing Buildings, aircraft, skeletons, anthills, beaver dams, bridges and salt domes are all examples of load-bearing structures. The results of construction are divided into buildings and non-building structures, and make up the infrastructure of a human society. Built structures are broadly divided by their varying design approaches and standards, into categories including building structures, arch ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Compressed Data Structure
The term compressed data structure arises in the computer science subfields of algorithms, data structures, and theoretical computer science. It refers to a data structure whose operations are roughly as fast as those of a conventional data structure for the problem, but whose size can be substantially smaller. The size of the compressed data structure is typically highly dependent upon the information entropy of the data being represented. Important examples of compressed data structures include the compressed suffix array and the FM-index, both of which can represent an arbitrary text of characters ''T'' for pattern matching. Given any input pattern ''P'', they support the operation of finding if and where ''P'' appears in ''T''. The search time is proportional to the sum of the length of pattern ''P'', a very slow-growing function of the length of the text ''T'', and the number of reported matches. The space they occupy is roughly equal to the size of the text ''T'' in entrop ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Smallest Grammar Problem
In data compression and the theory of formal languages, the smallest grammar problem is the problem of finding the smallest context-free grammar that generates a given string of characters (but no other string). The size of a grammar is defined by some authors as the number of symbols on the right side of the production rules. Others also add the number of rules to that. A grammar that generates only a single string, as required for the solution to this problem, is called a straight-line grammar. Every binary string of length n has a grammar of length O(n/\log n), as expressed using big O notation. For binary de Bruijn sequences, no better length is possible. The (decision version of the) smallest grammar problem is NP-complete. It can be approximated in polynomial time to within a logarithmic approximation ratio; more precisely, the ratio is O(\log\tfrac) where n is the length of the given string and g is the size of its smallest grammar. It is hard to approximate to within a cons ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Tree (data Structure)
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be connected to exactly one parent, except for the ''root'' node, which has no parent (i.e., the root node as the top-most node in the tree hierarchy). These constraints mean there are no cycles or "loops" (no node can be its own ancestor), and also that each child can be treated like the root node of its own subtree, making recursion a useful technique for tree traversal. In contrast to linear data structures, many trees cannot be represented by relationships between neighboring nodes (parent and children nodes of a node under consideration, if they exist) in a single straight line (called edge or link between two adjacent nodes). Binary trees are a commonly used type, which constrain the number of children for each parent to at most two. Whe ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Context-free Grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the form : A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empty). Regardless of which symbols surround it, the single nonterminal A on the left hand side can always be replaced by \alpha on the right hand side. This distinguishes it from a context-sensitive grammar, which can have production rules in the form \alpha A \beta \rightarrow \alpha \gamma \beta with A a nonterminal symbol and \alpha, \beta, and \gamma strings of terminal and/or nonterminal symbols. A formal grammar is essentially a set of production rules that describe all possible strings in a given formal language. Production rules are simple replacements. For example, the first rule in the picture, : \lan ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Directed Acyclic Graph
In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one vertex to another, such that following those directions will never form a closed loop. A directed graph is a DAG if and only if it can be topologically ordered, by arranging the vertices as a linear ordering that is consistent with all edge directions. DAGs have numerous scientific and computational applications, ranging from biology (evolution, family trees, epidemiology) to information science (citation networks) to computation (scheduling). Directed acyclic graphs are also called acyclic directed graphs or acyclic digraphs. Definitions A graph is formed by vertices and by edges connecting pairs of vertices, where the vertices can be any kind of object that is connected in pairs by edges. In the case of a directed graph, each edg ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Chomsky Normal Form
In formal language theory, a context-free grammar, ''G'', is said to be in Chomsky normal form (first described by Noam Chomsky) if all of its production rules are of the form: : ''A'' → ''BC'',   or : ''A'' → ''a'',   or : ''S'' → ε, where ''A'', ''B'', and ''C'' are nonterminal symbols, the letter ''a'' is a terminal symbol (a symbol that represents a constant value), ''S'' is the start symbol, and ε denotes the empty string. Also, neither ''B'' nor ''C'' may be the start symbol, and the third production rule can only appear if ε is in ''L''(''G''), the language produced by the context-free grammar ''G''. Every grammar in Chomsky normal form is context-free, and conversely, every context-free grammar can be transformed into an equivalent onethat is, one that produces the same language which is in Chomsky normal form and has a size no larger than the square of the original grammar's size. Converting a grammar to Chomsky normal form To convert a grammar ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Straight-line Program
In computer science, a straight-line program is, informally, a program that does not contain any loop or any test, and is formed by a sequence of steps that apply each an operation to previously computed elements. This article is devoted to the case where the allowed operations are the operations of a group, that is multiplication and inversion. More specifically a straight-line program (SLP) for a finite group ''G'' = ⟨''S''⟩ is a finite sequence ''L'' of elements of ''G'' such that every element of ''L'' either belongs to ''S'', is the inverse of a preceding element, or the product of two preceding elements. An SLP ''L'' is said to ''compute'' a group element ''g'' ∈ ''G'' if ''g'' ∈ ''L'', where ''g'' is encoded by a word in ''S'' and its inverses. Intuitively, an SLP computing some ''g'' ∈ ''G'' is an ''efficient'' way of storing ''g'' as a group word over ''S''; observe that if ''g'' is constructed in ''i'' steps, the word l ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]