grep is a command-line utility for searching plaintext datasets for lines that match a regular expression. Its name comes from the ed command g/re/p (global regular expression search and print), which has the same effect. grep was originally developed for the Unix operating system, but later became available for all Unix-like systems and some others such as OS-9.
History
Before it was named, grep was a private utility written by Ken Thompson to search files for certain patterns. Doug McIlroy, unaware of its existence, asked Thompson to write such a program. Responding that he would think about such a utility overnight, Thompson actually corrected bugs and made improvements for about an hour on his own program called "s" (short for "search"). The next day he presented the program to McIlroy, who said it was exactly what he wanted. Thompson's account may explain the belief that grep was written overnight.
Thompson wrote the first version in PDP-11 assembly language to help Lee E. McMahon analyze the text of ''The Federalist Papers'' to determine authorship of the individual papers. The ed text editor (also authored by Thompson) had regular expression support but could not be used to search through such a large amount of text, as it loaded the entire file into memory to enable random access editing. Thompson therefore excerpted that regexp code into a standalone tool that would instead process arbitrarily long files sequentially without buffering too much into memory.
He chose the name because in ed, the command g/re/p would print all lines featuring a specified pattern match. grep was first included in Version 4 Unix. Stating that it is "generally cited as ''the'' prototypical software tool", McIlroy credited grep with "irrevocably ingraining" Thompson's tools philosophy in Unix.
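The sequential, line-at-a-time processing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a reconstruction of Thompson's code: the function name `grep_lines` is hypothetical, and Python's `re` syntax differs from the regular expressions of ed and grep.

```python
import io
import re
from typing import Iterable, Iterator


def grep_lines(pattern: str, stream: Iterable[str]) -> Iterator[str]:
    """Yield each line of `stream` that contains a match for `pattern`."""
    regex = re.compile(pattern)
    # Iterating over a file object reads one line at a time, so memory use
    # is bounded by the longest line rather than by the size of the file.
    for line in stream:
        if regex.search(line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # A small in-memory stream stands in for an arbitrarily long file.
    sample = io.StringIO("alpha\nbeta\ngamma\n")
    for hit in grep_lines(r"mm", sample):
        print(hit)
```

Because `grep_lines` is a generator, matching lines are produced as soon as they are read, which is what lets such a tool work on input far larger than available memory.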
Implementations
A variety of grep implementations are available in many operating systems and software development environments. Early variants included egrep and fgrep, introduced in Version 7 Unix. The egrep variant supports an extended regular expression syntax added by Alfred Aho after Ken Thompson's original regular expression implementation. The "fgrep" variant searches for any of a list of ''fixed'' strings using the Aho–Corasick string matching algorithm. Binaries of these variants exist in modern systems, usually linking to grep or calling grep as a shell script with the appropriate flag added, e.g. exec grep -E "$@". Although egrep and fgrep are so commonly deployed on POSIX systems that the POSIX specification mentions their widespread existence, they are not part of POSIX.
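To illustrate how a list of fixed strings can be searched for in a single pass, the following is a minimal Python sketch of the Aho–Corasick algorithm. It is not the code of any actual fgrep; the names `build_automaton`, `find_words` and `fgrep` are hypothetical, and empty search strings are not handled.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[List[str]]]


def build_automaton(words: Iterable[str]) -> Automaton:
    """Build Aho-Corasick tables: goto transitions, failure links, outputs."""
    goto: List[Dict[str, int]] = [{}]  # goto[state][char] -> next state
    fail: List[int] = [0]              # state for the longest proper suffix
    out: List[List[str]] = [[]]        # words recognised on entering a state

    # Phase 1: insert every word into a trie.
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Phase 2: breadth-first pass that fills in the failure links.
    queue = deque(goto[0].values())    # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Words ending at the failure state also end here.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_words(line: str, automaton: Automaton) -> List[str]:
    """Return the words found in `line`, ordered by where each match ends."""
    goto, fail, out = automaton
    state = 0
    found: List[str] = []
    for ch in line:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found.extend(out[state])
    return found


def fgrep(words: Iterable[str], lines: Iterable[str]) -> List[str]:
    """Return the lines that contain at least one of the fixed strings."""
    automaton = build_automaton(words)
    return [line for line in lines if find_words(line, automaton)]
```

Each input character is examined once, so the scan time does not grow with the number of search strings in the way that trying each string separately would.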
Other commands contain the word "grep" to indicate they are search tools, typically ones that rely on regular expression matches. The pgrep utility, for instance, displays the processes whose names match a given regular expression.
In the Perl programming language, grep is a built-in function that finds elements in a list that satisfy a certain property. This higher-order function is typically named filter or where in other languages.
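As an illustration of the same operation under its other common name, the short Python sketch below uses the built-in `filter` function; the sample list and variable names are made up for the example.

```python
words = ["grep", "egrep", "fgrep", "sed", "awk"]

# Built-in filter(): keeps the elements for which the predicate is true.
grep_like = list(filter(lambda w: w.endswith("grep"), words))

# The equivalent list comprehension, a more common Python idiom.
same_result = [w for w in words if w.endswith("grep")]

print(grep_like)
```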
The pcregrep command is an implementation of grep that uses Perl regular expression syntax. Similar functionality can be invoked in the GNU version of grep with the -P flag.
Ports of grep (within Cygwin and GnuWin32, for example) also run under Microsoft Windows. Some versions of Windows feature the similar qgrep or findstr command.
A grep command is also part of ASCII's ''MSX-DOS2 Tools'' for MSX-DOS version 2.
The grep, egrep, and fgrep commands have also been ported to the IBM i operating system.
The software Adobe InDesign has GREP functions: since the CS3 version (2007) in the "GREP" tab of the ''find/change'' dialog box, and since InDesign CS4 in the "GREP styles" of ''paragraph styles''.
agrep
agrep (approximate grep) is an open-source approximate string matching program, developed by Udi Manber and Sun Wu between 1988 and 1991, for use with the Unix operating system. It was later ported to OS/2, DOS, and Windows.
''a''grep matches even when the text only ''approximately'' fits the search pattern.
The following invocation finds netmasks in file myfile, but also any other word that can be derived from it, given no more than two substitutions.
agrep -2 netmasks myfile
This example generates a list of matches with the closest, that is those with the fewest substitutions, listed first. The command flag -B means "best":
agrep -B netmasks myfile
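The idea of matching within a bounded number of substitutions, and of listing the closest matches first, can be sketched in Python as follows. This brute-force version is an assumption-laden illustration only: it is not agrep's algorithm, it counts substitutions and nothing else, it does not claim to reproduce the exact behaviour of the -B flag, and the function names are hypothetical.

```python
from typing import List, Optional


def min_substitutions(pattern: str, text: str) -> Optional[int]:
    """Fewest substituted characters over all windows of `text` that have
    the same length as `pattern`; None if `text` is shorter than `pattern`."""
    m = len(pattern)
    if len(text) < m:
        return None
    return min(
        sum(a != b for a, b in zip(pattern, text[i:i + m]))
        for i in range(len(text) - m + 1)
    )


def approx_grep(pattern: str, lines: List[str], k: int) -> List[str]:
    """Lines that contain `pattern` with at most `k` substitutions."""
    hits = []
    for line in lines:
        distance = min_substitutions(pattern, line)
        if distance is not None and distance <= k:
            hits.append(line)
    return hits


def best_first(pattern: str, lines: List[str], k: int) -> List[str]:
    """The same matches, ordered so the fewest substitutions come first."""
    return sorted(
        approx_grep(pattern, lines, k),
        key=lambda line: min_substitutions(pattern, line),
    )
```

For instance, with the pattern "netmasks" and k = 2, the line "netmarks" (one substitution) is accepted while "nitmerks" (three substitutions) is rejected.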
Usage as a verb
In December 2003, the ''Oxford English Dictionary Online'' added "grep" as both a noun and a verb.
A common verb usage is the phrase "You can't grep dead trees", meaning one can more easily search through digital media, using tools such as grep, than one could with a hard copy (i.e. one made from "dead trees", which in this context is a dysphemism for paper).
''Jargon File'', article "Documentation"
See also
* Boyer–Moore string-search algorithm
* agrep, an approximate string-matching command
* find (Windows) or Findstr, a DOS and Windows command that performs text searches, similar to a simple grep
* find (Unix), a Unix command that finds files by attribute, very different from grep
* List of Unix commands
* vgrep, or "visual grep"
* ngrep, the network grep
References
;Notes
* Hume, Andrew. ''Grep wars: The strategic search initiative.'' In Peter Collinson, editor, ''Proceedings of the EUUG Spring 88 Conference'', pages 237–245, Buntingford, UK, 1988. European UNIX User Group.
External links
* GNU Grep official website
* Implementation details from GNU grep's author
* Command Grep – 25 practical examples