History
First appearing in Version 7 Unix, sed is one of the early Unix commands built for command-line processing of data files. It evolved as the natural successor to the popular grep command. The original motivation was an analogue of grep (g/re/p) for substitution, hence "g/re/s". Foreseeing that further special-purpose programs for each command would also arise, such as g/re/d, Lee E. McMahon wrote a general-purpose line-oriented stream editor, which became sed. The syntax for sed, notably the use of / for pattern matching and s/// for substitution, originated with ed, the precursor to sed, which was in common use at the time. The regular expression syntax has influenced other languages, notably ECMAScript and Perl. Later, the more powerful language AWK was developed, and the two functioned as cousins, allowing powerful text processing.
Mode of operation
sed is a line-oriented text processing utility: it reads text, line by line, from an input stream or file, into an internal buffer called the ''pattern space''. Each line read starts a ''cycle''. To the pattern space, sed applies one or more operations which have been specified via a ''sed script''. sed implements a programming language with about 25 ''commands'' that specify the operations on the text. For each input line, after running the script, sed ordinarily outputs the pattern space (the line as modified by the script) and begins the cycle again with the next line. Other end-of-script behaviors are available through sed options and script commands, e.g. d to delete the pattern space, q to quit, N to add the next line to the pattern space immediately, and so on. Thus a sed script corresponds to the body of a loop that iterates through the lines of a stream, where the loop itself and the loop variable (the current line number) are implicit and maintained by sed.
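The implicit loop can be observed with two of the commands just mentioned. The following is a minimal sketch, assuming a POSIX shell; the sample input and the variable names first and pairs are illustrative only.

```shell
# q quits at the end of the first cycle; the pattern space is still
# printed once, so only the first input line appears.
first=$(printf 'a\nb\nc\nd\n' | sed q)

# N appends the next input line to the pattern space, so each cycle
# holds two lines, which s then joins by replacing the embedded newline.
pairs=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/+/')

printf '%s\n' "$first" "$pairs"
```

An even number of input lines is used on purpose: what N does when no next line exists differs between sed implementations.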
The sed script can either be specified on the command line (-e option) or read from a separate file (-f option). Commands in the sed script may take an optional ''address'', in terms of line numbers or regular expressions. The address determines when the command is run. For example, 2d would only run the d (delete) command on the second input line (printing all lines but the second), while /^ /d would delete all lines beginning with a space. A separate special buffer, the ''hold space'', may be used by a few sed commands to hold and accumulate text between cycles. sed's command language has only two variables (the "hold space" and the "pattern space").
Usage
Substitution command
The following example shows a typical, and the most common, use of sed: substitution. This usage was indeed the original motivation for sed:
sed 's/regexp/replacement/g' inputFileName > outputFileName
In some versions of sed, the expression must be preceded by -e to indicate that an expression follows. The s stands for substitute, while the g stands for global, which means that all matching occurrences in the line would be replaced. The regular expression (i.e. pattern) to be searched is placed after the first delimiting symbol (slash here) and the replacement follows the second symbol. Slash (/) is the conventional symbol, originating in the character for "search" in ed, but any other could be used to make syntax more readable if it does not occur in the pattern or replacement; this is useful to avoid "leaning toothpick syndrome".
The regexp provides both pattern matching and saving text via sub-expressions, while the replacement can be either literal text, or a format string containing the character & for "entire match" or the special escape sequences \1 through \9 for the ''n''th saved sub-expression. For example, sed -r "s/(cat|dog)s?/\1s/g" replaces all occurrences of "cat" or "dog" with "cats" or "dogs", without duplicating an existing "s": (cat|dog) is the 1st (and only) saved sub-expression in the regexp, and \1 in the format string substitutes this into the output.
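The pieces of the substitution command can be tried on small inputs. The following is a sketch, assuming a POSIX shell and a sed that accepts -E for extended regular expressions (GNU sed treats it as a synonym of -r); the sample strings and variable names are made up for illustration.

```shell
# Saved sub-expression: \1 re-inserts whichever alternative matched.
plural=$(printf 'cat dogs\n' | sed -E 's/(cat|dog)s?/\1s/g')

# & stands for the entire match.
wrapped=$(printf 'hello world\n' | sed 's/hello/[&]/')

# A comma as delimiter avoids escaping the slashes in the paths.
moved=$(printf '/usr/bin/tool\n' | sed 's,/usr/bin,/usr/local/bin,')

printf '%s\n' "$plural" "$wrapped" "$moved"
```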
Other sed commands
Besides substitution, other forms of simple processing are possible, using some 25 sed commands. For example, the following uses the ''d'' command to filter out lines that only contain spaces, or only contain the end of line character:
sed '/^ *$/d' inputFileName
This example uses some of the following regular expression metacharacters:
* The caret (^) matches the beginning of the line.
* The dollar sign ($) matches the end of the line.
* The asterisk (*) matches zero or more occurrences of the previous character.
* The plus (+) matches one or more occurrence(s) of the previous character.
* The question mark (?) matches zero or one occurrence of the previous character.
* The dot (.) matches exactly one character.
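The d command behaves differently depending on the address in front of it. The following is a small sketch, assuming a POSIX shell; the sample text and variable names are hypothetical.

```shell
sample=$(printf 'first\n   \n\n indented\nlast\n')

# Regular-expression address: drop lines that are empty or only spaces.
no_blank=$(printf '%s\n' "$sample" | sed '/^ *$/d')

# Line-number address: drop only the second input line.
no_second=$(printf 'one\ntwo\nthree\n' | sed 2d)

# Drop every line that begins with a space.
no_indent=$(printf '%s\n' "$sample" | sed '/^ /d')

printf '%s\n' "$no_blank" "$no_second" "$no_indent"
```

Only the caret, the dollar sign and the asterisk are used here, since these behave the same in basic and extended regular expressions.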
Complex sed constructs are possible, allowing it to serve as a simple, but highly specialized, programming language. Flow of control, for example, can be managed by the use of a label (a colon followed by a string) and the branch instruction b, as well as the conditional branch t. An instruction b followed by a valid label name will move processing to the command following that label. The t instruction will only do so if there was a successful substitution since the most recent input line was read or the previous t was executed. Additionally, the { instruction starts a sequence of commands (up to the matching }); in most cases, it will be conditioned by an address pattern.
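Both constructs can be sketched briefly. The example assumes a POSIX shell; the label name again and the sample inputs are arbitrary choices for illustration.

```shell
# Loop: keep removing one "()" pair until the s command no longer matches.
# The label and the branch are given as separate -e expressions.
stripped=$(printf '((()))x\n' | sed -e ':again' -e 's/()//' -e 't again')

# Block: the commands between { and } run only on lines matching /^#/.
grouped=$(printf '# foo\nfoo\n' | sed '/^#/{
s/foo/bar/
}')

printf '%s\n' "$stripped" "$grouped"
```

A single s/()//g would remove only the innermost pair, because the pairs exposed by that removal are not rescanned; the t loop repeats the substitution until nothing is left to remove.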
sed used as a filter
Under Unix, sed is often used as a filter in a pipeline:
generateData | sed 's/x/y/g'
That is, a program such as "generateData" generates data, and sed then replaces x with y. Quotes around the script are not required when the shell would already treat it as a single word: for the script s/x/y/g there is no ambiguity, so generateData | sed s/x/y/g works correctly. However, quotes are usually included for clarity, and are often necessary, notably for whitespace (e.g., 's/x x/y y/'). Most often single quotes are used, to avoid having the shell interpret $ as a shell variable. Double quotes are used, such as "s/$1/$2/g", to allow the shell to substitute for a command line argument or other shell variable.
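The difference between the two quoting styles can be seen in a short sketch, assuming a POSIX shell; the variables old and new stand in for command line arguments and are hypothetical.

```shell
# Single quotes: the shell leaves $ alone, so sed sees it as "end of line".
anchored=$(printf 'x ax\n' | sed 's/x$/y/')

# Double quotes: the shell expands the variables before sed runs.
old='x'
new='y'
expanded=$(printf 'xyz xyz\n' | sed "s/$old/$new/g")

printf '%s\n' "$anchored" "$expanded"
```

This works because the values are plain letters; values containing a slash or regular expression metacharacters would need escaping before being spliced into the script.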
File-based sed scripts
It is often useful to put several sed commands, one command per line, into a script file such as subst.sed, and then use the -f option to run the commands (such as s/x/y/g) from the file:
sed -f subst.sed inputFileName > outputFileName
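A script file run with -f can be sketched as follows, assuming a POSIX shell with mktemp available. The second substitution, the input text and the temporary directory are assumptions added for illustration; only s/x/y/g comes from the text above.

```shell
tmpdir=$(mktemp -d)

# One command per line; lines starting with # are comments.
cat > "$tmpdir/subst.sed" <<'EOF'
# replace every x with y, then every a with b
s/x/y/g
s/a/b/g
EOF

printf 'xax\n' > "$tmpdir/input.txt"
out=$(sed -f "$tmpdir/subst.sed" "$tmpdir/input.txt")
printf '%s\n' "$out"

rm -r "$tmpdir"
```

Because the commands live in a file, they need no shell quoting at all.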
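Such a script file can also be made directly executable: subst.sed can be created with a first line that names the sed interpreter and its -f option (a "shebang" line), followed by the commands, and then given execute permission with the chmod command.
In-place editing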
The -i option, introduced in GNU sed, allows in-place editing of files (actually, a temporary output file is created in the background, and then the original file is replaced by the temporary file).
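A minimal sketch of in-place editing, assuming a POSIX shell with mktemp. The suffix form -i.bak is used on the assumption that it is accepted by both GNU and BSD sed; it also keeps a backup copy of the original. The file contents are made up.

```shell
tmpfile=$(mktemp)
printf 'abc abc\n' > "$tmpfile"

# Edit the file itself; the original is kept as "$tmpfile.bak".
sed -i.bak 's/abc/def/' "$tmpfile"

result=$(cat "$tmpfile")
backup=$(cat "$tmpfile.bak")
printf '%s\n' "$result" "$backup"

rm -f "$tmpfile" "$tmpfile.bak"
```

Without the g flag, only the first match on each line is replaced.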
Examples
Hello, world! example
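This "Hello, world!" script consists of a comment line, the substitution s/.*/Hello, world!/ and the q command. It is stored in a file (e.g., script.txt) and invoked with sed -f script.txt inputFileName, where "inputFileName" is the input text file. The script changes line #1 of "inputFileName" to "Hello, world!" and then quits, printing the result before sed exits. Any input lines past line #1 are not read, and not printed. So the sole output is "Hello, world!".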
The example emphasizes many key characteristics of sed:
* Typical sed programs are rather short and simple.
* sed scripts can have comments (the line starting with the # symbol).
* The s (substitute) command is the most important sed command.
* sed allows simple programming, with commands such as q (quit).
* sed uses regular expressions, such as .* (zero or more of any character).
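The script can be run end to end as follows. This is a sketch assuming a POSIX shell with mktemp; the wording of the comment line and the two input lines are assumptions, while the s and q commands are the ones described above.

```shell
tmpdir=$(mktemp -d)

cat > "$tmpdir/script.txt" <<'EOF'
# convert input text stream to "Hello, world!"
s/.*/Hello, world!/
q
EOF

printf 'first line\nsecond line\n' > "$tmpdir/inputFileName"
out=$(sed -f "$tmpdir/script.txt" "$tmpdir/inputFileName")
printf '%s\n' "$out"

rm -r "$tmpdir"
```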
Other simple examples
Below follow various sed scripts; these can be executed by passing them as an argument to sed, or put in a separate file and executed via -f or by making the script itself executable.
To replace any instance of a certain word in a file with "REDACTED", such as an IRC password, and save the result, a global substitution can be combined with the -i option.
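A minimal sketch of such a redaction, assuming a POSIX shell with mktemp. The word hunter2 is a placeholder for the sensitive word, the log contents are invented, and -i.bak is used on the assumption that both GNU and BSD sed accept it.

```shell
logfile=$(mktemp)
printf 'login with hunter2\nhunter2 again, hunter2\n' > "$logfile"

# g replaces every occurrence on a line, not just the first.
sed -i.bak 's/hunter2/REDACTED/g' "$logfile"

redacted=$(cat "$logfile")
printf '%s\n' "$redacted"

# The backup still contains the word, so it is removed as well.
rm -f "$logfile" "$logfile.bak"
```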
Multiline processing example
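In the next example, sed, which usually only works on one line, removes newlines from sentences where a continuation line starts with one space. Consider the following text (shown indented by four spaces; the continuation lines of the first and third sentences begin with one additional space):
    This is my dog,
     whose
     name is Frank.
    This is my fish,
    whose
    name is George.
    This is my goat,
     whose
     name is Adam.
The sed script below will turn the text above into the following text. Note that the script affects only the input lines that start with a space:
    This is my dog, whose name is Frank.
    This is my fish,
    whose
    name is George.
    This is my goat, whose name is Adam.
The script is:
    :a
    $!N
    s/\n / /
    ta
    P
    D
This is explained as:
* (:a) define a label named a;
* ($!N) unless the current line is the last one, add the next line to the pattern space;
* (s/\n / /) find a new line followed by a space, replace with one space;
* (ta) if the substitution succeeded, branch back to a to pick up a further continuation line;
* (P) print the top line of the pattern space;
* (D) delete the top line from the pattern space and run the script again.
This can be expressed on a single line via semicolons (GNU sed accepts a semicolon after a label; other implementations may need one -e option per command):
sed ':a;$!N;s/\n / /;ta;P;D' inputFileName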
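This kind of line joining can be checked with a runnable sketch, assuming a POSIX shell. The script is passed as separate -e expressions so that the label also parses on seds that do not accept a semicolon after it, and it loops with t so that a sentence with several continuation lines is joined completely. The label name a and the variable names are arbitrary.

```shell
input='This is my dog,
 whose
 name is Frank.
This is my fish,
whose
name is George.
This is my goat,
 whose
 name is Adam.'

# $!N appends the next line (except on the last line), s joins it if it
# starts with a space, and t loops back while joins keep succeeding.
# P and D then emit the finished first line and restart with the rest.
joined=$(printf '%s\n' "$input" |
  sed -e ':a' -e '$!N' -e 's/\n / /' -e 'ta' -e 'P' -e 'D')

printf '%s\n' "$joined"
```

Without the t loop, each pass would join at most one continuation line, leaving the third line of each sentence on its own.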
Limitations and alternatives
While simple and limited, sed is sufficiently powerful for a large number of purposes. For more sophisticated processing, more powerful languages such as AWK or Perl are used instead. These are particularly used when transforming a line in a way more complicated than regex extraction and template replacement, though arbitrarily complicated transforms are in principle possible by using the hold space. Conversely, for simpler operations, specialized Unix utilities such as grep (print lines matching a pattern) are often used instead.
See also
* List of Unix commands
* Turing tarpit