Lex (software)

Lex is a computer program that generates lexical analyzers ("scanners" or "lexers"). Lex is commonly used with the yacc parser generator. Lex, originally written by Mike Lesk and Eric Schmidt and described in 1975, is the standard lexical analyzer generator on many Unix systems, and an equivalent tool is specified as part of the POSIX standard. Lex reads an input stream specifying the lexical analyzer and writes source code which implements the lexical analyzer in the C programming language. In addition to C, some old versions of Lex could generate a lexer in Ratfor.


Open source

Although originally distributed as proprietary software, some versions of Lex are now open-source. Open-source versions of Lex, based on the original proprietary code, are now distributed with open-source operating systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open-source version of Lex, called flex, or the "fast lexical analyzer", is not derived from proprietary code.


Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows:

* The definitions section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
* The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
* The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file linked in at compile time.


Example of a Lex file

The following is an example Lex file for the flex version of Lex. It recognizes strings of numbers (positive integers) in the input, and simply prints them out.

    /*** Definition section ***/

    %{
    /* C code to be copied verbatim */
    #include <stdio.h>
    %}

    %%
        /*** Rules section ***/

        /* [0-9]+ matches a string of one or more digits */
    [0-9]+  {
                /* yytext is a string containing the matched text. */
                printf("Saw an integer: %s\n", yytext);
            }

    .|\n    {   /* Ignore all other characters. */   }

    %%
    /*** C Code section ***/

    int main(void)
    {
        /* Call the lexer, then quit. */
        yylex();
        return 0;
    }

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:

    abc123z.!&*2gj6

the program will print:

    Saw an integer: 123
    Saw an integer: 2
    Saw an integer: 6
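To make the behaviour of the two rules concrete, the following is a hand-written C sketch of the same matching logic. It is not the code that flex generates (flex emits a table-driven automaton); the function name scan_integers and its signature are hypothetical and chosen only for illustration. It assumes the whole input is available as a NUL-terminated string.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper, not part of Lex or flex.
 * Walks `input` and reports every maximal run of decimal digits to `out`,
 * mirroring the digit rule of the example; every other character is
 * skipped, mirroring the catch-all rule. Returns the number of runs found. */
int scan_integers(const char *input, FILE *out)
{
    int count = 0;
    const char *p = input;

    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            const char *start = p;

            /* Longest match: keep consuming while digits continue. */
            while (isdigit((unsigned char)*p)) {
                p++;
            }
            fprintf(out, "Saw an integer: %.*s\n", (int)(p - start), start);
            count++;
        } else {
            /* Ignore any character that is not a digit. */
            p++;
        }
    }
    return count;
}
```

Calling scan_integers("abc123z.!&*2gj6", stdout) reports the same three integers as the generated lexer. The inner loop reflects the longest-match behaviour of Lex patterns: 123 is reported once rather than as three separate digits.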


Using Lex with other programming tools


Using Lex with parser generators

Lex and parser generators, such as Yacc or Bison, are commonly used together. Parser generators use a formal grammar to parse an input stream, something which Lex cannot do using simple regular expressions, as Lex is limited to simple finite state automata. It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream. Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.
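The division of labour between a lexer and a parser can be sketched in C as follows. This is a hand-written illustration, not output from Lex or Yacc: the token codes, the struct token type, and the next_token function are all hypothetical names. In a real Lex and Yacc project the parser calls yylex(), and the token codes come from the header that the parser generator produces.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token codes; a generated parser would supply its own. */
enum token_kind { TOK_EOF = 0, TOK_NUMBER, TOK_PLUS, TOK_STAR, TOK_ERROR };

struct token {
    enum token_kind kind;
    long value;              /* meaningful only when kind == TOK_NUMBER */
};

/* Plays the role of yylex(): consumes characters starting at *cursor,
 * advances *cursor past them, and returns the next token. */
struct token next_token(const char **cursor)
{
    struct token tok = { TOK_EOF, 0 };
    const char *p = *cursor;

    /* Whitespace separates tokens but is never passed to the parser. */
    while (isspace((unsigned char)*p)) {
        p++;
    }

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        char *end;
        tok.kind = TOK_NUMBER;
        tok.value = strtol(p, &end, 10);   /* consumes the whole digit run */
        p = end;
    } else if (*p == '+') {
        tok.kind = TOK_PLUS;
        p++;
    } else if (*p == '*') {
        tok.kind = TOK_STAR;
        p++;
    } else {
        tok.kind = TOK_ERROR;              /* unrecognised character */
        p++;
    }

    *cursor = p;
    return tok;
}
```

A parser that calls next_token repeatedly on the text "12 + 3*x" would receive a number, a plus sign, a number, a star, an error token for the x, and finally the end-of-input token. It never sees individual characters or whitespace, which is the simplification a token-stream provides.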


Lex and make

Make is a utility that can be used to maintain programs involving Lex. Make assumes that a file that has an extension of .l is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options that make passes to Lex automatically.


See also

* Flex lexical analyser
* Yacc
* Ragel
* PLY (Python Lex-Yacc)
* Comparison of parser generators


External links


Using Flex and Bison at Macworld.com