file is a shell command for reporting the type of data contained in a file. It is commonly supported in Unix and Unix-like operating systems.
As the command uses relatively quick-running heuristics to determine file type, it can report misleading information. The command can be fooled, for example, by including a magic number in the content even if the rest of the content does not match what the magic number indicates. The command's report cannot be taken as completely trustworthy.
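The following minimal sketch illustrates how a test that looks only at a magic number can be misled. It does not call the file command; the helper names `looks_like_gzip` and `is_valid_gzip` are hypothetical and exist only for this illustration. The two leading bytes used are the well-known gzip signature.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # leading two bytes of a gzip stream


def looks_like_gzip(data: bytes) -> bool:
    """Naive position-sensitive test: inspect only the leading magic number."""
    return data.startswith(GZIP_MAGIC)


def is_valid_gzip(data: bytes) -> bool:
    """Stricter check: try to decompress the whole content."""
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        return False


genuine = gzip.compress(b"hello, world\n")
forged = GZIP_MAGIC + b"this is not compressed data at all"

print(looks_like_gzip(genuine), is_valid_gzip(genuine))  # True True
print(looks_like_gzip(forged), is_valid_gzip(forged))    # True False
```

The forged content passes the quick check but fails real decompression, which is the kind of mismatch a heuristic tool may not notice.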
The Single UNIX Specification (SUS) requires the command to exhibit the following behavior with respect to the file specified via the command-line:
# If the file cannot be read, or its Unix file type is undetermined, the command will report that the file was processed but its type was undetermined
# The command must be able to determine the types directory, FIFO, socket, block special file, and character special file
# A zero-length file is reported as such
# An initial part of the file is considered and the command is to use position-sensitive tests
# The entire file is considered and the command is to use context-sensitive tests
# Otherwise, the file is reported as a data file
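The ordering of the steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not an implementation of the command: the function name `classify`, the returned labels, the single gzip rule and the ASCII check are all hypothetical stand-ins for the real position-sensitive and context-sensitive tests.

```python
import os
import stat
import tempfile


def classify(path: str) -> str:
    """Rough sketch of the SUS test ordering; labels are illustrative only."""
    try:
        st = os.stat(path)  # follows symbolic links
    except OSError:
        return "cannot open"
    mode = st.st_mode
    # File types that are decided from metadata alone.
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    if st.st_size == 0:
        return "empty"
    try:
        with open(path, "rb") as fh:
            data = fh.read()  # whole file, for simplicity of the sketch
    except OSError:
        return "cannot open"
    # Position-sensitive test (hypothetical single rule: gzip signature).
    if data[:2] == b"\x1f\x8b":
        return "gzip compressed data"
    # Context-sensitive test (hypothetical): is the entire content ASCII?
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        return "data"  # fallback when no test matched


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty")
    open(empty, "wb").close()
    text = os.path.join(tmp, "notes.txt")
    with open(text, "wb") as fh:
        fh.write(b"plain text\n")
    blob = os.path.join(tmp, "blob.bin")
    with open(blob, "wb") as fh:
        fh.write(b"\x00\xff\xfe\x01")
    for p in (tmp, empty, text, blob, os.path.join(tmp, "missing")):
        print(os.path.basename(p), "->", classify(p))
```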
Position-sensitive tests are normally implemented by matching various locations within the file against a textual database of magic numbers (see the Usage section). This differs from other simpler methods such as file extensions and schemes like MIME.
In the System V implementation, the Ian Darwin implementation, and the OpenBSD implementation, the command uses a database to drive the probing of the lead bytes. That database is stored as a file that is located in /etc/magic, /usr/share/file/magic or similar.
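A database-driven probe of this kind can be sketched with a small in-memory table. The `MagicRule` type, the `RULES` list and the `identify` function below are hypothetical; a real magic file uses a richer text format that is not reproduced here. The byte signatures themselves are the widely documented ones for each format.

```python
from typing import List, NamedTuple, Optional


class MagicRule(NamedTuple):
    offset: int       # byte position at which to look
    pattern: bytes    # bytes expected at that position
    description: str  # text to report on a match


# Hypothetical in-memory rules standing in for a magic database.
RULES: List[MagicRule] = [
    MagicRule(0, b"\x7fELF", "ELF"),
    MagicRule(0, b"\x1f\x8b", "gzip compressed data"),
    MagicRule(0, b"!<arch>\n", "current ar archive"),
    MagicRule(257, b"ustar", "tar archive"),
]


def identify(data: bytes) -> Optional[str]:
    """Return the description of the first rule that matches, else None."""
    for rule in RULES:
        end = rule.offset + len(rule.pattern)
        if data[rule.offset:end] == rule.pattern:
            return rule.description
    return None


print(identify(b"\x7fELF\x01\x01\x01"))      # ELF
print(identify(b"\x00" * 257 + b"ustar\x00"))  # tar archive
print(identify(b"no known signature"))        # None
```

Keeping the rules in data rather than in code mirrors the design choice described above: new file types can be recognized by editing the table without changing the matching logic.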
History
The file command originated in Unix Research Version 4 in 1973. System V brought a major update with several important changes, most notably moving the file type information into an external text file rather than compiling it into the binary itself.
Most major BSD and Linux distributions include a free, open-source implementation that was written from scratch by Ian Darwin in 1986–87. It keeps file type information in a text file with a format based on that of the System V version. It was expanded by Geoff Collyer in 1989 and since then has had input from many others, including Guy Harris, Chris Lowth and Eric Fischer. From late 1993 onward, its maintenance has been organized by Christos Zoulas. The OpenBSD system has its own subset implementation written from scratch, but still uses the Darwin/Zoulas collection of magic file formatted information.
The command was ported to the IBM i operating system.
As of version 4.00 of the Ian Darwin/Christos Zoulas implementation of file, the functionality of the command is implemented in and exposed by a libmagic library that is accessible to consuming code via C (and compatible) linking.
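As one possible illustration of C-compatible linking, the sketch below loads libmagic through Python's ctypes module and calls its `magic_open`, `magic_load`, `magic_file` and `magic_close` functions. It assumes that a shared libmagic library and its default database are installed; the wrapper name `describe` is hypothetical, and the function simply returns None when the library cannot be used.

```python
import ctypes
import ctypes.util
import os
from typing import Optional

MAGIC_NONE = 0  # flag value from magic.h: default textual description


def describe(path: str) -> Optional[str]:
    """Return libmagic's description of path, or None if unavailable."""
    libname = ctypes.util.find_library("magic")
    if libname is None:
        return None  # libmagic is not installed on this system
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None

    # Declare the C signatures so that pointers are passed correctly.
    lib.magic_open.restype = ctypes.c_void_p
    lib.magic_open.argtypes = [ctypes.c_int]
    lib.magic_load.restype = ctypes.c_int
    lib.magic_load.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.magic_file.restype = ctypes.c_char_p
    lib.magic_file.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.magic_close.restype = None
    lib.magic_close.argtypes = [ctypes.c_void_p]

    cookie = lib.magic_open(MAGIC_NONE)
    if not cookie:
        return None
    try:
        # Passing NULL asks libmagic to load its default database.
        if lib.magic_load(cookie, None) != 0:
            return None
        result = lib.magic_file(cookie, os.fsencode(path))
        if result is None:
            return None
        return result.decode(errors="replace")
    finally:
        lib.magic_close(cookie)


print(describe(os.curdir))
```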
Usage
The SUS mandates the following command-line options:
* -M ''file'', prevents the default position-sensitive and context-sensitive tests in favor of the tests specified in a specially formatted file
* -m ''file'', same as for -M, but with tests in addition to the default
* -d, selects default position-sensitive and context-sensitive tests; this is the default behavior unless -M or -m is specified
* -h, do not dereference symbolic links that point to an existing file or directory
* -L, dereference the symbolic link that points to an existing file or directory
* -i, do not classify the file further than to report as: nonexistent, a block special file, a character special file, a directory, a FIFO, a socket, a symbolic link, or a regular file; the Ian Darwin and OpenBSD versions behave differently with this option and instead output an Internet media type ("MIME type") identifying the recognized file format
Implementations may add extra options. Ian Darwin's implementation adds -s 'special files', -k 'keep-going' and -r 'raw', among many others.
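A script can invoke the command as a subprocess, as in the sketch below. It assumes that the Ian Darwin implementation is installed: the -b (brief) and --mime-type options used here belong to that implementation and are not among the SUS options listed above. The wrapper name `mime_type` is hypothetical, and it returns None when the command is missing or exits with an error.

```python
import os
import shutil
import subprocess
from typing import Optional


def mime_type(path: str) -> Optional[str]:
    """Ask an installed file command for the MIME type of path."""
    exe = shutil.which("file")
    if exe is None:
        return None  # no file command on PATH
    proc = subprocess.run(
        [exe, "-b", "--mime-type", path],  # assumed Ian Darwin options
        capture_output=True,
        text=True,
        check=False,
    )
    if proc.returncode != 0:
        return None  # for example, the options are not supported
    return proc.stdout.strip()


print(mime_type(os.curdir))
```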
Examples
For a C source code file, the command reports:
main.c: C program text
For a compiled executable, the command reports information like:
program: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped
For a block device /dev/hda1, the command reports:
/dev/hda1: block special (0/0)
By default, file does not try to read a device file due to potential undesirable effects. But with the non-standard option -s (available in the Ian Darwin branch), which requests reading device files to identify their content, the command reports details such as:
/dev/hda1: Linux/i386 ext2 filesystem
Via Ian Darwin's non-standard option -k, the command does not stop after the first hit found, but looks for other matching patterns. The -r option, which is available in some versions, causes the newline character to be displayed in its raw form rather than in its octal representation. On Linux, the command reports information like:
libmagic-dev_5.35-4_armhf.deb: Debian binary package (format 2.0)
- current ar archive
- data
For a compressed file, the command reports information like:
compressed.gz: gzip compressed data, deflated, original filename, `compressed', last modified: Thu Jan 26 14:08:23 2006, os: Unix
For a compressed file, with the -i option in the Ian Darwin implementation, the command reports the MIME type, like:
compressed.gz: application/x-gzip; charset=binary
For a PPM file, the command reports:
data.ppm: Netpbm PPM "rawbits" image data
For a Mach-O universal binary, the command reports information like:
/bin/cat: Mach-O universal binary with 2 architectures
/bin/cat (for architecture ppc7400): Mach-O executable ppc
/bin/cat (for architecture i386): Mach-O executable i386
For a symbolic link, the command reports:
/usr/bin/vi: symbolic link to vim
Identifying a symbolic link is not available on all platforms, and the link will be dereferenced if -L is passed or POSIXLY_CORRECT is set.
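The difference between reporting a link and dereferencing it can be sketched with the lstat and stat system calls. This is only an illustration of the -h and -L distinction, not how any implementation of the command is written; the function name `link_report` and its output labels are hypothetical, and broken links are not handled.

```python
import os
import stat
import tempfile


def link_report(path: str, dereference: bool = False) -> str:
    """Report on path, either as a link itself or as the link's target."""
    # os.lstat examines the link itself; os.stat follows it to the target.
    st = os.stat(path) if dereference else os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        return f"{path}: symbolic link to {os.readlink(path)}"
    if stat.S_ISDIR(st.st_mode):
        return f"{path}: directory"
    return f"{path}: regular file or other type"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "vim")
    open(target, "w").close()
    link = os.path.join(tmp, "vi")
    os.symlink("vim", link)  # relative link, like /usr/bin/vi -> vim
    print(link_report(link))                    # reports the link
    print(link_report(link, dereference=True))  # reports the target
```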
External links
* – a non-Ian Darwin implementation
* – a non-Ian Darwin, non-SUS implementation
* Fine Free File Command, homepage for Ian Darwin's version of file used in major BSD and Linux distributions
* mailing list
* releases
* binwalk, a firmware analysis tool that carves files based on libmagic signatures
* an alternative providing ranked answers (instead of just one) based on statistics
* Magika, an ML-based tool, by Google Research