Yacc
   HOME

TheInfoList



OR:

Yacc (Yet Another Compiler-Compiler) is a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. Computer programs are one component of software, which also includes software documentation, documentation and oth ...
for the
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, ...
operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right Rightmost Derivation (LALR) parser generator, generating a LALR parser (the part of a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
that tries to make syntactic sense of the
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
) based on a formal grammar, written in a notation similar to Backus–Naur Form (BNF). Yacc is supplied as a standard utility on BSD and AT&T Unix. GNU-based
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
distributions include
Bison Bison are large bovines in the genus ''Bison'' (Greek: "wild ox" (bison)) within the tribe Bovini. Two extant and numerous extinct species are recognised. Of the two surviving species, the American bison, ''B. bison'', found only in North A ...
, a forward-compatible Yacc replacement.


History

In the early 1970s, Stephen C. Johnson, a computer scientist at
Bell Labs Nokia Bell Labs, originally named Bell Telephone Laboratories (1925–1984), then AT&T Bell Laboratories (1984–1996) and Bell Labs Innovations (1996–2007), is an American industrial research and scientific development company owned by mul ...
/
AT&T AT&T Inc. is an American multinational telecommunications holding company headquartered at Whitacre Tower in Downtown Dallas, Texas. It is the world's largest telecommunications company by revenue and the third largest provider of mobile ...
, developed Yacc because he wanted to insert an exclusive or operator into a B language compiler (developed using McIlroy's TMG compiler-compiler), but it turned out to be a hard task. As a result, he was directed by his colleague at Bell Labs Al Aho to Donald Knuth's work on LR parsing, which served as the basis for Yacc. Yacc was influenced by and received its name in reference to TMG compiler-compiler. Yacc was originally written in the B programming language, but was soon rewritten in C by Alan Snyder. It appeared as part of Version 3 Unix, and a full description of Yacc was published in 1975. Johnson used Yacc to create the Portable C Compiler. Bjarne Stroustrup also attempted to use Yacc to create a formal specification of C++, but "was defeated by C's syntax". While finding it unsuitable for a formal specification of the language, Stroustrup did proceed to use Yacc to implement Cfront, the first implementation of C++. In a 2008 interview, Johnson reflected that "the contribution Yacc made to the spread of
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, ...
and C is what I'm proudest of".


Description

The input to Yacc is a grammar with snippets of C code (called "actions") attached to its rules. Its output is a shift-reduce parser in C that executes the C snippets associated with each rule as soon as the rule is recognized. Typical actions involve the construction of parse trees. Using an example from Johnson, if the call constructs a binary parse tree node with the specified and children, then the rule expr : expr '+' expr recognizes summation expressions and constructs nodes for them. The special identifiers , and refer to items on the parser's
stack Stack may refer to: Places * Stack Island, an island game reserve in Bass Strait, south-eastern Australia, in Tasmania’s Hunter Island Group * Blue Stack Mountains, in Co. Donegal, Ireland People * Stack (surname) (including a list of people ...
. Yacc produces only a parser (phrase analyzer); for full syntactic analysis this requires an external
lexical analyzer In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of ''lexical tokens'' ( strings with an assigned and thus identified ...
to perform the first tokenization stage (word analysis), which is then followed by the parsing stage proper. Lexical analyzer generators, such as Lex or
Flex Flex or FLEX may refer to: Computing * Flex (language), developed by Alan Kay * FLEX (operating system), a single-tasking operating system for the Motorola 6800 * FlexOS, an operating system developed by Digital Research * FLEX (protocol), a com ...
, are widely available. The
IEEE The Institute of Electrical and Electronics Engineers (IEEE) is a 501(c)(3) professional association for electronic engineering and electrical engineering (and associated disciplines) with its corporate office in New York City and its operati ...
POSIX The Portable Operating System Interface (POSIX) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. POSIX defines both the system- and user-level application programming in ...
P1003.2 standard defines the functionality and requirements for both Lex and Yacc. Some versions of AT&T Yacc have become
open source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized so ...
. For example,
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
is available with the standard distributions of Plan 9.


Impact

Yacc and similar programs (largely reimplementations) have been very popular. Yacc itself used to be available as the default parser generator on most Unix systems, though it has since been supplanted by more recent, largely compatible, programs such as
Berkeley Yacc Berkeley Yacc (byacc) is a Unix parser generator designed to be compatible with Yacc. It was originally written by Robert Corbett and released in 1989. Due to its liberal license and because it was faster than the AT&T Yacc, it quickly became the ...
, GNU Bison, MKS Yacc, and Abraxas PCYACC. An updated version of the original AT&T Yacc is included as part of Sun's
OpenSolaris OpenSolaris () is a discontinued open-source computer operating system based on Solaris and created by Sun Microsystems. It was also, perhaps confusingly, the name of a project initiated by Sun to build a developer and user community around t ...
project. Each offers slight improvements and additional features over the original Yacc, but the concept and basic syntax have remained the same. Among the languages that were first implemented with Yacc are AWK, C++, eqn and Pic. Yacc was also used on Unix to implement the Portable C Compiler, as well as parsers for such programming languages as FORTRAN 77, Ratfor, APL, bc, m4, etc. Yacc has also been rewritten for other languages, including OCaml, Ratfor, ML,
Ada Ada may refer to: Places Africa * Ada Foah, a town in Ghana * Ada (Ghana parliament constituency) * Ada, Osun, a town in Nigeria Asia * Ada, Urmia, a village in West Azerbaijan Province, Iran * Ada, Karaman, a village in Karaman Province, T ...
,
Pascal Pascal, Pascal's or PASCAL may refer to: People and fictional characters * Pascal (given name), including a list of people with the name * Pascal (surname), including a list of people and fictional characters with the name ** Blaise Pascal, Frenc ...
,
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mo ...
, Python,
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
, Go,
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ''ANSI INCITS 226-1994 (S20018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived fr ...
and Erlang. *
Berkeley Yacc Berkeley Yacc (byacc) is a Unix parser generator designed to be compatible with Yacc. It was originally written by Robert Corbett and released in 1989. Due to its liberal license and because it was faster than the AT&T Yacc, it quickly became the ...
: The Berkeley implementation of Yacc quickly became more popular than AT&T Yacc itself because of its performance and lack of reuse restrictions. * LALR parser: The underlying parsing algorithm in Yacc-generated parsers. *
Bison Bison are large bovines in the genus ''Bison'' (Greek: "wild ox" (bison)) within the tribe Bovini. Two extant and numerous extinct species are recognised. Of the two surviving species, the American bison, ''B. bison'', found only in North A ...
: The GNU version of Yacc. * Lex (and Flex lexical analyser), a token parser commonly used in conjunction with Yacc (and Bison). * BNF is a metasyntax used to express
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
s: that is a formal way to describe context-free languages. * PLY (Python Lex-Yacc) is an alternative implementation of Lex and Yacc in Python.


See also

*
Compiler-compiler In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine. The most common type of compiler ...
* hoc (programming language)


References


External links

* * * {{Authority control Compiling tools Parser generators Unix programming tools Unix SUS2008 utilities Plan 9 commands Inferno (operating system) commands