GADDAG
   HOME

TheInfoList



OR:

A GADDAG is a
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
presented by Steven Gordon in 1994, for use in generating moves for
Scrabble ''Scrabble'' is a word game in which two to four players score points by placing tiles, each bearing a single letter, onto a Board game, game board divided into a 15×15 grid of squares. The tiles must form words that, in crossword fashion, re ...
and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a directed acyclic word graph (DAWG) such as the one used by
Maven MAVEN is a NASA spacecraft orbiting Mars to study the loss of that planet's atmospheric gases to space, providing insight into the history of the planet's climate and water. The name is an acronym for "Mars Atmosphere and Volatile Evolution" w ...
. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries. Quackle, an open-source Scrabble program, uses a GADDAG to generate moves.


Description

The name GADDAG comes from DAG for
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
, prefixed by its own reverse. A GADDAG is a specialization of a
Trie In computer science, a trie (, ), also known as a digital tree or prefix tree, is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store t ...
, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG.


Definition

For any word in a dictionary that is formed by a non-empty prefix ''x'' and a suffix ''y,'' a GADDAG contains a direct, deterministic path for any string REV(''x'')+''y'', where + is a concatenation operator. For example, for the word "''explain''," a GADDAG will contain direct paths to the strings e+xplain xe+plain pxe+lain lpxe+ain alpxe+in ialpxe+n nialpxe This setup enables searching for a word given any letter that occurs in it.


Use in move generation

Any move-generation algorithm must adhere to three types of constraints: * ''Board constraints'': one may only build by 'hooking' onto existing letters of the board, and one may only place tiles on empty squares. * ''Rack constraints'': one may only place tiles with letters on one's rack. * ''Dictionary constraint'': all words resulting from the placement of tiles exist in the game's dictionary. DAWG-based algorithms take advantage of the second and third constraint: the DAWG is built around the dictionary, and is traversed using tiles in the rack. It fails, however, to address the first constraint: supposing one want to 'hook into' the letter ''P'' in ''HARPY'', and the board has 2 spaces before the P, one must search the dictionary for all words containing letters from the rack where the third letter is ''P''. This is inefficient when searching through the DAWG, as many searches through the trie will be fruitless. This is addressed by the GADDAG's storage of prefixes: by traversing the ''P'' branch of a GADDAG, one sees all words that have a ''P'' somewhere in their composition, and can "travel up" the prefix to form the word with tiles in the rack. To use the example from the section, searching for ''P'' turns up "''pxe+lain''". The letters between ''P'' and the + can be placed above the ''P'' on the board, and the rest below it (if space on the board permits).


See also

*
Suffix Tree In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particu ...
*
Trie In computer science, a trie (, ), also known as a digital tree or prefix tree, is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store t ...
*
Scrabble ''Scrabble'' is a word game in which two to four players score points by placing tiles, each bearing a single letter, onto a Board game, game board divided into a 15×15 grid of squares. The tiles must form words that, in crossword fashion, re ...
* Prefix Hash Tree


References

{{Reflist Scrabble software Graph data structures String data structures Game artificial intelligence