A filter is a computer program or subroutine that processes a stream and produces another stream. Although a single filter can be used on its own, filters are frequently strung together to form a pipeline.
Some operating systems, such as Unix, are rich with filter programs. Windows 7 and later are also rich with filters, as they include Windows PowerShell. In comparison, few filters are built into cmd.exe (the original command-line interface of Windows), although most of those have significant enhancements relative to the similar filter commands that were available in MS-DOS. OS X includes filters from its underlying Unix base and also has Automator, which allows filters (known as "Actions") to be strung together to form a pipeline.
Unix
In Unix and Unix-like operating systems, a filter is a program that gets most of its data from its standard input (the main input stream) and writes its main results to its standard output (the main output stream). Auxiliary input may come from command-line flags or configuration files, while auxiliary output may go to standard error. The command syntax for reading data from a device or file other than standard input is the input operator (<). Similarly, the output operator (>) sends data to a device or file other than standard output, and the append operator (>>) appends lines to an existing output file. Filters may be strung together into a pipeline with the pipe operator ("|"), which passes the main output of the command on its left as the main input of the command on its right.
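As a minimal sketch of these conventions, the shell function below behaves as a filter: it reads lines from standard input, writes numbered lines to standard output, and reports a line count on standard error. The function name number_lines and the file names are hypothetical, chosen only for this illustration, and mktemp is assumed to be available.

```shell
#!/bin/sh
# A tiny filter: main data flows stdin -> stdout, diagnostics go to stderr.
# (number_lines is a hypothetical name used only for this illustration.)
number_lines() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done
    printf 'number_lines: read %d lines\n' "$n" >&2
}

workdir=$(mktemp -d)                         # scratch directory for the demo
printf 'alpha\nbeta\n' > "$workdir/in.txt"   # > creates or overwrites a file

# < takes input from a file and > sends output to a file.
number_lines < "$workdir/in.txt" > "$workdir/out.txt"

# | feeds the left command's output to the filter; >> appends to the file.
printf 'gamma\n' | number_lines >> "$workdir/out.txt"

cat "$workdir/out.txt"
```

Because the line count goes to standard error, it appears on the terminal but never ends up in out.txt or in the input of a later pipeline stage.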
The Unix philosophy encourages combining small, discrete tools to accomplish larger tasks. The classic filter in Unix is Ken Thompson's grep, which Doug McIlroy cites as what "ingrained the tools outlook irrevocably" in the operating system, with later tools imitating it. At its simplest, grep prints any lines containing a given character string to its output. The following is an example:
cut -d : -f 1 /etc/passwd | grep foo
This finds all registered users that have "foo" as part of their username: the cut command takes the first field (the username) of each line of the Unix system password file and passes the results as input to grep, which searches its input for lines containing the character string "foo" and prints them on its output.
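The same pipeline can be tried without reading the real password file. The sketch below runs it on a few made-up lines in the passwd format; the accounts are invented for illustration.

```shell
#!/bin/sh
# Sample data in /etc/passwd format; these accounts are invented.
sample='root:x:0:0:root:/root:/bin/sh
foobar:x:1000:1000:Foo Bar:/home/foobar:/bin/sh
alice:x:1001:1001:Alice:/home/foo:/bin/sh
snafoo:x:1002:1002:Sna Foo:/home/snafoo:/bin/sh'

# cut keeps field 1 (the username); grep keeps usernames containing "foo".
matches=$(printf '%s\n' "$sample" | cut -d : -f 1 | grep foo)
printf '%s\n' "$matches"
```

The alice line contains "foo" only in its home directory field, so it is not reported: cut has already discarded every field except the username before grep sees the data.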
Common Unix filter programs are cat, cut, grep, head, sort, tail, and uniq.
Programs like awk and sed can be used to build quite complex filters because they are fully programmable. Unix filters can also be used by data scientists to get a quick overview of a file-based dataset.
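A short sketch of both ideas follows, assuming a small comma-separated dataset invented for this example: a pipeline of standard filters gives a per-city record count, and a one-line awk program sums a numeric column.

```shell
#!/bin/sh
# Invented sample data: one "name,city,amount" record per line.
data='ann,Oslo,10
bob,Rome,5
cy,Oslo,7
dee,Rome,3
eve,Oslo,1'

# Quick overview: number of records per city, most frequent first.
# sed trims the padding that uniq -c prints before each count.
per_city=$(printf '%s\n' "$data" | cut -d , -f 2 | sort | uniq -c | sort -rn |
    sed 's/^ *//')

# A programmable filter: awk sums the third field over all records.
total=$(printf '%s\n' "$data" | awk -F , '{ sum += $3 } END { print sum }')

printf '%s\n' "$per_city"
printf 'total: %s\n' "$total"
```

Each stage does one small job, so the overview can be changed by swapping a single filter, for example replacing the field number given to cut.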
List of Unix filter programs
* awk
* cat
* comm
* compress
* cut
* expand
* fold
* grep
* head
* nl
* paste
* perl
* pr
* sed
* sh
* sort
* split
* strings
* tac
* tail
* tee
* tr
* uniq
* wc
* zcat
DOS
Two standard filters from the early days of DOS-based computers are find and sort.
Examples:
find "keyword" < inputfilename > outputfilename
sort < inputfilename > outputfilename
find /v "keyword" < inputfilename | sort > outputfilename
Such filters may be used in batch files (*.bat, *.cmd, etc.).
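For comparison, a rough Unix shell counterpart of the last DOS example uses grep -v where DOS uses find /v. The file names and contents below are placeholders invented for the sketch.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # scratch directory for the demo
printf 'pear\nkeyword here\napple\n' > "$workdir/input.txt"

# grep -v drops lines containing "keyword", as find /v does in DOS;
# sort then orders what is left and > writes it to the output file.
grep -v "keyword" < "$workdir/input.txt" | sort > "$workdir/output.txt"

cat "$workdir/output.txt"
```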
For use in the same command shell environment, many more filters are available than those built into Windows. Some of these are freeware, some are shareware, and some are commercial programs. A number of these mimic the function and features of the filters in Unix. Some filtering programs have a graphical user interface (GUI) that lets users design a customized filter to suit their particular data processing or data mining requirements.
Windows
Windows Command Prompt inherited the MS-DOS commands, improved some, and added a few. For example, Windows Server 2003 features six command-line filters for modifying Active Directory that can be chained by piping: DSAdd, DSGet, DSMod, DSMove, DSRm and DSQuery.
Windows PowerShell adds a host of filters known as "cmdlets", which can be chained together with a pipe, except for a few simple ones such as Clear-Host. The following example gets a list of files in the C:\Windows folder, gets the size of each, and sorts the sizes in ascending order. It shows how three filters (Get-ChildItem, ForEach-Object and Sort-Object) are chained with pipes.
Get-ChildItem C:\Windows | ForEach-Object { $_.Length } | Sort-Object
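For comparison with the Unix filters described earlier, a rough shell analogue of this pipeline is sketched below. It is not a translation of the PowerShell cmdlets; the helper name sizes_sorted and the demo files are invented for illustration.

```shell
#!/bin/sh
# Print the size in bytes of each regular file in a directory, smallest first.
# (sizes_sorted is a hypothetical helper name.)
sizes_sorted() {
    for f in "$1"/*; do
        [ -f "$f" ] && wc -c < "$f"      # one size per line, files only
    done | tr -d ' ' | sort -n           # strip padding, then sort numerically
}

demo=$(mktemp -d)                        # scratch directory for the demo
printf 'abcdef' > "$demo/big"
printf 'ab' > "$demo/small"
: > "$demo/empty"

sizes_sorted "$demo"
```

The loop plays the role of the first two stages (listing the files and extracting each size), and sort -n plays the role of the final sorting stage.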
External links
* http://www.webopedia.com/TERM/f/filter.html