filename extension
   HOME

TheInfoList



OR:

A filename extension, file name extension or file extension is a suffix to the
name A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A person ...
of a
computer file A computer file is a System resource, resource for recording Data (computing), data on a Computer data storage, computer storage device, primarily identified by its filename. Just as words can be written on paper, so too can data be written to a ...
(for example, .txt, .mp3, .exe) that indicates a characteristic of the file contents or its intended use. A filename extension is typically delimited from the rest of the filename with a
full stop The full stop ( Commonwealth English), period (North American English), or full point is a punctuation mark used for several purposes, most often to mark the end of a declarative sentence (as distinguished from a question or exclamation). A ...
(period), but in some systems it is separated with spaces. Some file systems, such as the FAT file system used in DOS, implement filename extensions as a feature of the file system itself and may limit the length and format of the extension, while others, such as Unix file systems, the VFAT file system, and NTFS, treat filename extensions as part of the filename without special distinction.


Operating system and file system support

The Multics file system stores the file name as a single string, not split into base name and extension components, allowing the "." to be just another character allowed in file names. It allows for variable-length filenames, permitting more than one dot, and hence multiple suffixes, as well as no dot, and hence no suffix. Some components of Multics, and applications running on it, use suffixes to indicate file types, but not all files are required to have a suffix — for example, executables and ordinary text files usually have no suffixes in their names. File systems for
UNIX-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...
operating systems also store the file name as a single string, with "." as just another character in the file name. A file with more than one suffix is sometimes said to have more than one extension, although terminology varies in this regard, and most authors define ''extension'' in a way that does not allow more than one in the same file name. More than one extension usually represents nested transformations, such as files.tar.gz (the .tar indicates that the file is a tar archive of one or more files, and the .gz indicates that the tar archive file is compressed with gzip). Programs transforming or creating files may add the appropriate extension to names inferred from input file names (unless explicitly given an output file name), but programs reading files usually ignore the information; it is mostly intended for the human user. It is more common, especially in binary files, for the file to contain internal or external metadata describing its contents. This model generally requires the full filename to be provided in commands, whereas the metadata approach often allows the extension to be omitted. CTSS was an early operating system in which the filename and file type were separately stored. Continuing this practice, and also using a dot as a separator for display and input purposes (while not storing the dot), were various DEC operating systems (such as
RT-11 RT-11 (Real-time 11) is a discontinued small, low-end, single-user real-time operating system for the full line of Digital Equipment Corporation PDP-11 16-bit computers. RT-11 was first implemented in 1970. It was widely used for real-time compu ...
), followed by CP/M and subsequently DOS. In DOS and 16-bit Windows, file names have a maximum of 8 characters, a period, and an extension of up to three letters. The FAT file system for DOS and Windows stores file names as an 8-character name and a three-character extension. The period character is not stored. The High Performance File System (HPFS), used in Microsoft and
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
's
OS/2 OS/2 is a Proprietary software, proprietary computer operating system for x86 and PowerPC based personal computers. It was created and initially developed jointly by IBM and Microsoft, under the leadership of IBM software designer Ed Iacobucci, ...
stores the file name as a single string, with the "." character as just another character in the file name. The convention of using suffixes continued, even though HPFS supports extended attributes for files, allowing a file's type to be stored in the file as an extended attribute. Microsoft's Windows NT's native file system, NTFS, and the later ReFS, also store the file name as a single string; again, the convention of using suffixes to simulate extensions continued, for compatibility with existing versions of Windows. In Windows NT 3.5, a variant of the FAT file system, called VFAT appeared; it supports longer file names, with the file name being treated as a single string. Windows 95, with VFAT, introduced support for long file names, and removed the 8.3 name/extension split in file names from non-NT Windows. The
classic Mac OS Mac OS (originally System Software; retronym: Classic Mac OS) is the series of operating systems developed for the Mac (computer), Macintosh family of personal computers by Apple Computer, Inc. from 1984 to 2001, starting with System 1 and end ...
disposed of filename-based extension metadata entirely; it used, instead, a distinct file type code to identify the file format. Additionally, a creator code was specified to determine which application would be launched when the file's icon was double-clicked.
macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
, however, uses filename suffixes as a consequence of being derived from the UNIX-like NeXTSTEP operating system, in addition to using type and creator codes. In Commodore systems, files can only have four extensions: PRG, SEQ, USR, REL. However, these are used to separate data types used by a program and are irrelevant for identifying their contents. With the advent of
graphical user interface A graphical user interface, or GUI, is a form of user interface that allows user (computing), users to human–computer interaction, interact with electronic devices through Graphics, graphical icon (computing), icons and visual indicators such ...
s, the issue of file management and interface behavior arose. Microsoft Windows allowed multiple applications to be associated with a given extension, and different actions were available for selecting the required application, such as a context menu offering a choice between viewing, editing or printing the file. The assumption was still that any extension represented a single file type; there was an unambiguous mapping between extension and icon. When the
Internet The Internet (or internet) is the Global network, global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a internetworking, network of networks ...
age first arrived, those using Windows systems that were still restricted to 8.3 filename formats had to create web pages with names ending in .HTM, while those using
Macintosh Mac is a brand of personal computers designed and marketed by Apple Inc., Apple since 1984. The name is short for Macintosh (its official name until 1999), a reference to the McIntosh (apple), McIntosh apple. The current product lineup inclu ...
or UNIX computers could use the recommended .html filename extension. This also became a problem for programmers experimenting with the
Java programming language Java is a high-level, general-purpose, memory-safe, object-oriented programming language. It is intended to let programmers ''write once, run anywhere'' ( WORA), meaning that compiled Java code can run on all platforms that support Jav ...
, since it ''requires'' the four-letter suffix .java for
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
files and the five-letter suffix .class for Java
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
object code output files.


Content type

Filename extensions may be considered a type of
metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
. They are commonly used to imply information about the way data might be stored in the file. The exact definition, giving the criteria for deciding what part of the file name is its extension, belongs to the rules of the specific file system used; usually the extension is the substring which follows the last occurrence, if any, of the dot character (''example:'' txt is the extension of the filename readme.txt, and html the extension of index.html). On file systems of some mainframe systems such as CMS in VM, VMS, and of PC systems such as CP/M and derivative systems such as MS-DOS, the extension is a separate namespace from the filename. Under Microsoft's DOS and Windows, extensions such as EXE, COM or BAT indicate that a file is a program
executable In computer science, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instruction (computer science), in ...
. In OS/360 and successors, the part of the dataset name following the last period, called the low level qualifier, is treated as an extension by some software, e.g., TSO EDIT, but it has no special significance to the operating system itself; the same applies to Unix files in MVS. The filename extension was originally used to determine the file's generic type. The need to condense a file's type into three characters frequently led to abbreviated extensions. Examples include using .GFX for graphics files, .TXT for
plain text In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects ( floating-point numbers, images, etc.). It may also include a lim ...
, and .MUS for music. However, because many different software programs have been made that all handle these data types (and others) in a variety of ways, filename extensions started to become closely associated with certain products—even specific product versions. For example, early WordStar files used .WS or .WS''n'', where ''n'' was the program's version number. Also, conflicting uses of some filename extensions developed. One example is .rpm, used for both RPM Package Manager packages and RealPlayer Media files;. Others are .qif, shared by DESQview fonts, Quicken financial ledgers, and QuickTime pictures; .gba, shared by GrabIt scripts and Game Boy Advance ROM images; .sb, used for SmallBasic and Scratch; and .dts, being used for Dynamix Three Space and DTS.


Compared to MIME type

In many
Internet The Internet (or internet) is the Global network, global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a internetworking, network of networks ...
protocols, such as
HTTP HTTP (Hypertext Transfer Protocol) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, wher ...
and MIME email, the type of a bitstream is stated as the media type, or MIME type, of the stream, rather than a filename extension. This is given in a line of text preceding the stream, such as ''Content-type: text/plain''. There is no standard mapping between filename extensions and media types, resulting in possible mismatches in interpretation between authors, web servers, and client software when transferring files over the Internet. For instance, a content author may specify the extension ''svgz'' for a compressed Scalable Vector Graphics file, but a web server that does not recognize this extension may not send the proper content type ''application/svg+xml'' and its required compression header, leaving web browsers unable to correctly interpret and display the image. BeOS, whose BFS file system supports extended attributes, would tag a file with its media type as an extended attribute. Some
desktop environments A desktop traditionally refers to: * The surface of a desk (often to distinguish office appliances that fit on a desk, such as photocopiers and printers, from larger equipment covering its own area on the floor) Desktop may refer to various compu ...
, such as KDE Plasma and GNOME, associate a media type with a file by examining both the filename suffix and the contents of the file, in the fashion of the file command, as a heuristic. They choose the application to launch when a file is opened based on that media type, reducing the dependency on filename extensions.
macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
uses both filename extensions and media types, as well as file type codes, to select a Uniform Type Identifier by which to identify the file type internally.


Executable programs

The use of a filename extension in a command name appears occasionally, usually as a side effect of the command having been implemented as a script, e.g., for the Bourne shell or for Python, and the interpreter name being suffixed to the command name, a practice common on systems that rely on associations between filename extension and interpreter, but sharply deprecated in
Unix-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...
systems, such as
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
, Oracle Solaris, BSD-based systems, and Apple's
macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
, where the interpreter is normally specified as a header in the script (" shebang"). On association-based systems, the filename extension is generally mapped to a single, system-wide selection of interpreter for that extension (such as ".py" meaning to use Python), and the command itself is runnable from the command line even if the extension is omitted (assuming appropriate setup is done). If the implementation language is changed, the command name extension is changed as well, and the OS provides a consistent API by allowing the same extensionless version of the command to be used in both cases. This method suffers somewhat from the essentially global nature of the association mapping, as well as from developers' incomplete avoidance of extensions when calling programs, and that developers can not force that avoidance. Windows is the only remaining widespread employer of this mechanism. On systems with interpreter directives, including virtually all versions of Unix, command name extensions have no special significance, and are by standard practice not used, since the primary method to set interpreters for scripts is to start them with a single line specifying the interpreter to use. In these environments, including the extension in a command name unnecessarily exposes an implementation detail which puts all references to the commands from other programs at future risk if the implementation changes. For example, it would be perfectly normal for a shell script to be reimplemented in Python or Ruby, and later in C or C++, all of which would change the name of the command were extensions used. Without extensions, a program always has the same extension-less name, with only the interpreter directive or magic number changing, and references to the program from other programs remain valid.


Security issues

File extensions alone are not a reliable indicator of a file's type, as the extension can be modified without changing the file's contents, such as to disguise malicious content. Therefore, especially in the context of cybersecurity, a file's true nature should be examined for its signature, which is a distinctive sequence of bytes affixed to a file's header. This is accomplished using file identification software or a hex editor, which provides a hex dump of a file's contents. For example, on
UNIX-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...
systems, it is not uncommon to find files with no extensions at all, as commands such as file are meant to be used instead, and will read the file's header to determine its content. Malware such as
Trojan horse In Greek mythology, the Trojan Horse () was a wooden horse said to have been used by the Greeks during the Trojan War to enter the city of Troy and win the war. The Trojan Horse is not mentioned in Homer, Homer's ''Iliad'', with the poem ending ...
s typically takes the form of an
executable In computer science, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instruction (computer science), in ...
, but any file type that performs
input/output In computing, input/output (I/O, i/o, or informally io or IO) is the communication between an information processing system, such as a computer, and the outside world, such as another computer system, peripherals, or a human operator. Inputs a ...
operations may contain malicious code. A few data file types such as PDFs have been found to be vulnerable to exploits that cause buffer overflows. There have been instances of malware crafted to exploit such vulnerabilities in some Windows applications when opening a file with an overly long, unhandled filename extension.
File manager A file manager or file browser is a computer program that provides a user interface to manage computer files, files and folder (computing), folders. The most common Computer file#Operations, operations performed on files or groups of files incl ...
s may have an option to hide filenames extensions. This is the case for File Explorer, the file browser provided with
Microsoft Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
, which by default does not display extensions. Malicious users have tried to spread computer viruses and computer worms by using file names formed like LOVE-LETTER-FOR-YOU.TXT.vbs. The idea is that this will appear as LOVE-LETTER-FOR-YOU.TXT, a harmless text file, without alerting the user to the fact that it is a harmful computer program, in this case, written in VBScript. The default behavior for
ReactOS ReactOS is a Free and open-source software, free and open-source operating system for i586/amd64 personal computers that is intended to be binary-code compatibility, binary-compatible with computer programs and device drivers developed for Wind ...
is to display filename extensions in ReactOS Explorer. Later Windows versions (starting with Windows XP Service Pack 2 and Windows Server 2003) included customizable lists of filename extensions that should be considered "dangerous" in certain "zones" of operation, such as when
download In computer networks, download means to ''receive'' data from a remote system, typically a server such as a web server, an FTP server, an email server, or other similar systems. This contrasts with uploading, where data is ''sent to'' a remote ...
ed from the web or received as an e-mail attachment. Modern antivirus software systems also help to defend users against such attempted attacks where possible. A virus may couple itself with an executable without actually modifying the executable. These viruses, known as ''companion viruses'', attach themselves in such a way that they are executed when the original file is requested. One way such a virus does this involves giving the virus the same name as the target file, but with a different extension to which the operating system gives priority, and often assigning the former a "hidden" attribute to conceal the malware's existence. The efficacy of this approach depends on whether the user attempts to open the intended file by entering a command and whether the user includes the extension. Later versions of DOS and Windows check for and attempt to run .COM files first by default, followed by .EXE and finally .BAT files. In this case, the infected file is the one with the .COM extension, which the user unwittingly executes. Some viruses take advantage of the similarity between the " .com"
top-level domain A top-level domain (TLD) is one of the domain name, domains at the highest level in the hierarchical Domain Name System of the Internet after the root domain. The top-level domain names are installed in the DNS root zone, root zone of the nam ...
and the .COM filename extension by emailing malicious, executable command-file attachments under names superficially similar to URLs (''e.g.'', "myparty.yahoo.com"), with the effect that unaware users click on email-embedded links that they think lead to websites but actually download and execute the malicious attachments.


See also

* file (command) * List of file formats * List of filename extensions *
Metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
* .properties


References


External links

*
Database of filename extensions
at FileInfo.com Metadata Computer files Names