WAV

Waveform Audio File Format (WAVE, or WAV due to its filename extension; pronounced "wave") is an audio file format standard, developed by IBM and Microsoft, for storing an audio bitstream on PCs. It is the main format used on Microsoft Windows systems for uncompressed audio. The usual bitstream encoding is the linear pulse-code modulation (LPCM) format. WAV is an application of the Resource Interchange File Format (RIFF) bitstream format, a method for storing data in "chunks", and is thus similar to the 8SVX and AIFF formats used on Amiga and Macintosh computers, respectively.


Description

The WAV file is an instance of the Resource Interchange File Format (RIFF) defined by IBM and Microsoft. The RIFF format acts as a "wrapper" for various audio coding formats. Though a WAV file can contain compressed audio, the most common WAV audio format is uncompressed audio in the linear pulse-code modulation (LPCM) format. LPCM is also the standard audio coding format for audio CDs, which store two-channel LPCM audio sampled at 44,100 Hz with 16 bits per sample. Since LPCM is uncompressed and retains all of the samples of an audio track, professional users or audio experts may use the WAV format with LPCM audio for maximum audio quality. WAV files can also be edited and manipulated with relative ease using software.

The WAV format supports compressed audio using, on Microsoft Windows, the Audio Compression Manager (ACM). Any ACM codec can be used to compress a WAV file. The user interface (UI) for Audio Compression Manager may be accessed through various programs that use it, including Sound Recorder in some versions of Windows.

Beginning with Windows 2000, a WAVE_FORMAT_EXTENSIBLE header was defined which specifies multiple audio channel data along with speaker positions, eliminates ambiguity regarding sample types and container sizes in the standard WAV format and supports defining custom extensions to the format chunk.

There are some inconsistencies in the WAV format: for example, 8-bit data is unsigned while 16-bit data is signed, and many chunks duplicate information found in other chunks.
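The signedness inconsistency noted above matters whenever samples are decoded. The following is a minimal sketch, assuming LPCM data stored little-endian and covering only the 8-bit and 16-bit cases; the function name `pcm_to_float` is hypothetical and not part of any specification.

```python
import struct

def pcm_to_float(raw: bytes, bits_per_sample: int) -> list[float]:
    """Convert raw LPCM bytes to floats in the range [-1.0, 1.0)."""
    if bits_per_sample == 8:
        # 8-bit WAV samples are unsigned: 0..255, with 128 as the midpoint.
        return [(b - 128) / 128.0 for b in raw]
    if bits_per_sample == 16:
        # 16-bit WAV samples are signed little-endian: -32768..32767.
        count = len(raw) // 2
        samples = struct.unpack(f"<{count}h", raw[: count * 2])
        return [s / 32768.0 for s in samples]
    raise ValueError("this sketch handles only 8- and 16-bit LPCM")
```

Mapping both widths onto the same floating-point range lets later processing ignore how the samples were stored.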


Specification


RIFF

A RIFF file is a tagged file format. It has a specific container format (a "chunk") that includes a four-character tag (FourCC) and the size (number of bytes) of the chunk. The tag specifies how the data within the chunk should be interpreted, and there are several standard FourCC tags. Tags consisting of all capital letters are reserved tags. The outermost chunk of a RIFF file has a RIFF form tag; the first four bytes of chunk data are a FourCC that specifies the form type and are followed by a sequence of subchunks. In the case of a WAV file, those four bytes are the FourCC WAVE. The remainder of the RIFF data is a sequence of chunks describing the audio information.

The advantage of a tagged file format is that the format can be extended later without confusing existing file readers. The rule for a RIFF (or WAV) reader is that it should ignore any tagged chunk that it does not recognize. The reader won't be able to use the new information, but the reader should not be confused.

The specification for RIFF files includes the definition of an INFO chunk. The chunk may include information such as the title of the work, the author, the creation date, and copyright information. Although the INFO chunk was defined in version 1.0, the chunk was not referenced in the formal specification of a WAV file. If the chunk were present in the file, then a reader should know how to interpret it, but many readers had trouble. Some readers would abort when they encountered the chunk, some readers would process the chunk if it were the first chunk in the RIFF form, and other readers would process it if it followed all of the expected waveform data. Consequently, the safest thing to do from an interchange standpoint was to omit the INFO chunk and other extensions and send a lowest-common-denominator file. There are other INFO chunk placement problems.

RIFF files were expected to be used in international environments, so there is a CSET chunk to specify the country code, language, dialect, and code page for the strings in a RIFF file. For example, specifying an appropriate CSET chunk should allow the strings in an INFO chunk (and other chunks throughout the RIFF file) to be interpreted as Cyrillic or Japanese characters.

RIFF also defines a JUNK chunk whose contents are uninteresting. The chunk allows a chunk to be deleted by just changing its FourCC. The chunk could also be used to reserve some space for future edits so the file could be modified without being rewritten. A later definition of RIFF introduced a similar PAD chunk.
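The chunk layout and the "ignore what you do not recognize" rule can be sketched as follows. This is an illustrative reader, not a reference implementation: the helper names `make_chunk` and `read_riff`, the made-up `xtra` chunk and the `KNOWN_TAGS` set are all assumptions introduced for the example. It relies on two RIFF conventions: sizes are 32-bit little-endian integers, and chunk data of odd length is followed by one pad byte that is not counted in the size.

```python
import io
import struct

def make_chunk(tag: bytes, payload: bytes) -> bytes:
    """Serialise one chunk: FourCC, little-endian size, data, optional pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return tag + struct.pack("<I", len(payload)) + payload + pad

def read_riff(stream):
    """Return (form_type, [(tag, data), ...]) for the top-level chunks."""
    riff, size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF":
        raise ValueError("not a RIFF file")
    chunks = []
    remaining = size - 4  # the form type is counted in the RIFF size
    while remaining >= 8:
        tag, length = struct.unpack("<4sI", stream.read(8))
        data = stream.read(length)
        stream.read(length % 2)  # skip the pad byte after odd-sized data
        remaining -= 8 + length + length % 2
        chunks.append((tag, data))
    return form, chunks

KNOWN_TAGS = {b"fmt ", b"data"}  # what this toy reader understands

body = (make_chunk(b"fmt ", bytes(16))
        + make_chunk(b"xtra", b"abc")  # made-up chunk that a reader should skip
        + make_chunk(b"data", b"\x01\x02\x03\x04"))
blob = make_chunk(b"RIFF", b"WAVE" + body)

form, chunks = read_riff(io.BytesIO(blob))
understood = [tag for tag, _ in chunks if tag in KNOWN_TAGS]
```

Because every chunk announces its own size, the reader can step over `xtra` without knowing anything about its contents, which is the extensibility property described above.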


RIFF WAVE

The top-level definition of a WAV file is:
 <WAVE-form> → RIFF('WAVE'
                    <fmt-ck>            // Format
                    [<fact-ck>]         // Fact chunk
                    [<cue-ck>]          // Cue points
                    [<playlist-ck>]     // Playlist
                    [<assoc-data-list>] // Associated data list
                    <wave-data> )       // Wave data
The definition shows a top-level RIFF form with the WAVE tag. It is followed by a mandatory format chunk that describes the format of the sample data that follows. The format chunk includes information such as the sample encoding, the number of bits per sample, the number of channels and the sample rate.

The WAV specification includes some optional features. The optional fact chunk reports the number of samples for some compressed coding schemes. The cue point (cue ) chunk identifies some significant sample numbers in the wave file. The playlist chunk allows the samples to be played out of order or repeated rather than just from beginning to end. The associated data list allows labels and notes (labl and note) to be attached to cue points; text annotation (ltxt) may be given for a group of samples (e.g., caption information). Finally, the mandatory wave data chunk contains the actual samples (in the specified format).

Note that the WAV file definition does not show where an INFO chunk should be placed. It is also silent about the placement of a CSET chunk (which specifies the character set used).

The RIFF specification attempts to be a formal specification, but its formalism lacks the precision seen in other tagged formats. For example, the RIFF specification does not clearly distinguish between a set of subchunks and an ordered sequence of subchunks. The RIFF form chunk suggests it should be a sequence container. The specification suggests a LIST chunk is also a sequence: "A LIST chunk contains a list, or ordered sequence, of subchunks." However, the specification does not give a formal specification of the INFO chunk; an example INFO LIST chunk ignores the chunk sequence implied in the INFO description. The LIST chunk definition for <wave-data> does use the LIST chunk as a sequence container with good formal semantics.

The WAV specification allows for not only a single, contiguous array of audio samples, but also discrete blocks of samples and silence that are played in order.
Most WAV files use a single array of data. The specification for the sample data is confused:
The <wave-data> contains the waveform data. It is defined as follows:
 <wave-data>  → { <data-ck> | <data-list> }
 <data-ck>    → data( <wave-data> )
 <wave-list>  → LIST( 'wavl' { <data-ck> |         // Wave samples
                               <silence-ck> }... ) // Silence
 <silence-ck> → slnt( <dwSamples:DWORD> )          // Count of silent samples
These productions are confused. Apparently <data-list> (undefined) and <wave-list> (defined but not referenced) should be identical. Even if that problem is fixed, the productions then allow a <data-ck> to contain a recursive <wave-data> (which implies data interpretation problems). The specification should have been something like:
 <data-ck>    → data( <bSampleData:BYTE> ... )
 <wave-list>  → LIST( 'wavl' { <data-ck> |         // Wave samples
                               <silence-ck> }... ) // Silence
 <silence-ck> → slnt( <dwSamples:DWORD> )          // Count of silent samples
to avoid the recursion. WAV files can contain embedded IFF "lists", which can contain several "sub-chunks".
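As an illustration of the mandatory format and data chunks, the sketch below writes a small LPCM file with Python's standard `wave` module and then reads the basic fields of the `fmt ` chunk back with `struct`. The function name `describe_wav` and the dictionary keys are hypothetical; the field order (format tag, channels, sample rate, average bytes per second, block alignment, bits per sample) is that of the common 16-byte PCM format chunk.

```python
import io
import struct
import wave

def describe_wav(blob: bytes) -> dict:
    """Return the basic 'fmt ' fields plus the size of the 'data' chunk."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info = {}
    pos = 12  # first chunk starts after 'RIFF', the size and 'WAVE'
    while pos + 8 <= len(blob):
        tag, size = struct.unpack_from("<4sI", blob, pos)
        body = pos + 8
        if tag == b"fmt ":
            (info["format_tag"], info["channels"], info["sample_rate"],
             info["byte_rate"], info["block_align"],
             info["bits_per_sample"]) = struct.unpack_from("<HHIIHH", blob, body)
        elif tag == b"data":
            info["data_bytes"] = size
        pos = body + size + size % 2  # unknown chunks are simply skipped
    return info

# Write 100 frames of silent 16-bit stereo audio into memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(bytes(2 * 2 * 100))

info = describe_wav(buf.getvalue())
```

For this file the byte rate is 44,100 × 2 channels × 2 bytes = 176,400 bytes per second and the block alignment is 4 bytes per frame.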


Metadata

As a derivative of RIFF, WAV files can be tagged with metadata in the INFO chunk. In addition, WAV files can embed any kind of metadata, including but not limited to Extensible Metadata Platform (XMP) data or ID3 tags in extra chunks. Applications may not handle this extra information or may expect to see it in a particular place. Although the RIFF specification requires that applications ignore chunks they do not recognize, some applications are confused by additional chunks.
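A minimal sketch of how INFO metadata might be serialised is shown below. It assumes the usual layout of a LIST chunk of type INFO whose subchunks hold NUL-terminated strings; `INAM` (title) and `IART` (artist) are standard INFO identifiers, while the helper names `make_chunk` and `make_info_list` are invented for this example. Where such a chunk should sit in the file is left open, since the WAV definition itself does not say.

```python
import struct

def make_chunk(tag: bytes, payload: bytes) -> bytes:
    """FourCC, little-endian size, data, plus a pad byte if the data is odd-sized."""
    pad = b"\x00" if len(payload) % 2 else b""
    return tag + struct.pack("<I", len(payload)) + payload + pad

def make_info_list(fields: dict[str, str]) -> bytes:
    """Build a LIST chunk of type INFO from {FourCC: text} pairs."""
    body = b"INFO"
    for tag, text in fields.items():
        # INFO strings are stored with a terminating NUL byte.
        body += make_chunk(tag.encode("ascii"), text.encode("ascii") + b"\x00")
    return make_chunk(b"LIST", body)

info_chunk = make_info_list({"INAM": "Demo", "IART": "Nobody"})
```

The sketch encodes text as ASCII only; other character sets would depend on how a reader treats the CSET chunk.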


Popularity

Uncompressed WAV files are large, so file sharing of WAV files over the Internet is uncommon except among video, music and audio professionals, for whom the uncompressed form has become the most popular of all audio formats and for most of whom high-speed, high-bandwidth connections are commonplace. Many audio and music software manufacturers now favour it as their default file format, though others are often supported. The high resolution of the format makes it suitable for retaining first-generation archived files of high quality, for use on a system where disk space is not a constraint, or in applications such as audio editing where the time involved in compressing and uncompressing data, and the losses in quality of such conversions, are a concern.


Use by broadcasters

In spite of their large size, uncompressed WAV files are used by most radio broadcasters, especially those that have adopted a tapeless system.
* BBC Radio in the UK uses 48 kHz 16-bit two-channel WAV audio as standard in their SCISYS dira audio editing and playout system.
* The UK commercial radio company Global Radio uses 44.1 kHz 16-bit two-channel WAV files in the Genesys playout system, and throughout their broadcast chain.
* The ABC "D-Cart" system, which was developed by the Australian broadcaster, uses 48 kHz 16-bit two-channel WAV files, the same format as that of Digital Audio Tape.
* The Digital Radio Mondiale consortium uses WAV files as an informal standard for transmitter simulation and receiver testing.


Limitations

The WAV format is limited to files that are less than 4 GiB, because of its use of a 32-bit unsigned integer to record the file size in the header. Although this is equivalent to about 6.8 hours of CD-quality audio (44.1 kHz, 16-bit stereo), it is sometimes necessary to exceed this limit, especially when greater sampling rates, bit resolutions or channel counts are required. The W64 format was therefore created for use in Sound Forge. Its 64-bit header allows for much longer recording times. The RF64 format specified by the European Broadcasting Union has also been created to solve this problem.
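The duration figure follows directly from the size limit. The short calculation below is a sketch that ignores the few dozen bytes of header overhead and assumes uncompressed LPCM; the function name `max_duration_hours` is hypothetical.

```python
def max_duration_hours(sample_rate: int, bits_per_sample: int, channels: int,
                       limit_bytes: int = 2**32) -> float:
    """Hours of uncompressed LPCM audio that fit under a byte limit."""
    bytes_per_second = sample_rate * (bits_per_sample // 8) * channels
    return limit_bytes / bytes_per_second / 3600

cd_quality = max_duration_hours(44100, 16, 2)  # 176,400 bytes per second
hi_res = max_duration_hours(96000, 24, 2)      # 576,000 bytes per second
```

CD-quality audio comes out at roughly 6.76 hours, in line with the figure quoted above, while 96 kHz 24-bit stereo reaches the limit after only about 2.07 hours.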


Non-audio data

Since the sampling rate of a WAV file can vary from 1 Hz to 4.3 GHz, and the number of channels can be as high as 65535, .wav files have also been used for non-audio data. LTspice, for instance, can store multiple circuit trace waveforms in separate channels, at any appropriate sampling rate, with the full-scale range representing ±1 V or A rather than a sound pressure.


Audio CDs

Audio CDs do not use the WAV file format, using instead Red Book audio. The commonality is that audio CDs are encoded as uncompressed PCM, which is one of the formats supported by WAV. WAV is a file format for computer use and cannot be understood by most CD players directly. To record WAV files to an audio CD, the file headers must be stripped, the contents must be transcoded if not already stored as PCM, and the PCM data written directly to the disc as individual tracks, with zero-padding added to match the CD's sector size. To be burned to a CD, PCM audio should be in the 44,100 Hz, 16-bit stereo format.
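The zero-padding step can be sketched as follows. The sector size is not stated in the text above; the example assumes the CD audio sector of 2,352 bytes, which holds 1/75 of a second of 44.1 kHz 16-bit stereo audio (176,400 / 75). The function name `pad_to_cd_sectors` is hypothetical, and the input is assumed to be raw PCM with the WAV headers already stripped.

```python
CD_SECTOR_BYTES = 2352  # assumed: 1/75 s of 44.1 kHz 16-bit stereo audio

def pad_to_cd_sectors(pcm: bytes) -> bytes:
    """Append zero bytes so the track length is a whole number of sectors."""
    remainder = len(pcm) % CD_SECTOR_BYTES
    if remainder:
        pcm += bytes(CD_SECTOR_BYTES - remainder)
    return pcm
```

Zero bytes decode as silence in signed 16-bit PCM, so the padding adds at most one seventy-fifth of a second of silence to the end of a track.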


Comparison of coding schemes

Audio in WAV files can be encoded in a variety of audio coding formats, such as GSM or MP3, to reduce the file size. This is a reference to compare the monophonic (not stereophonic) audio quality and compression bitrates of audio coding formats available for WAV files, including PCM, ADPCM, Microsoft GSM 06.10, CELP, SBC, Truespeech and MPEG Layer-3. These are the default ACM codecs that come with Windows. All of these are WAV files; even those that use MP3 compression have the .wav extension.
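Because the extension is the same for every encoding, a reader has to identify the codec from the format tag stored at the start of the format chunk. The lookup below is a partial sketch: the numeric values are the format tags registered in Microsoft's `mmreg.h` header, but the table covers only a few of the codecs named above, and the names `FORMAT_TAGS` and `codec_name` are invented for the example.

```python
# A few registered format tags; the full registry is much longer.
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0002: "Microsoft ADPCM",
    0x0022: "DSP Group TrueSpeech",
    0x0031: "GSM 06.10",
    0x0055: "MPEG Layer-3",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}

def codec_name(format_tag: int) -> str:
    """Map a format tag to a readable codec name."""
    return FORMAT_TAGS.get(format_tag, f"unknown (0x{format_tag:04X})")
```

A file tagged 0x0055 therefore carries MP3 data even though it is named with the .wav extension.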


See also

* Audio Compression Manager
* Broadcast Wave Format (BWF)
* Comparison of audio coding formats
* RF64, an extended file format for audio (multichannel file format enabling file sizes to exceed 4 gigabytes)
* Windows Media Audio



External links


* WAVE file format specifications, from McGill University (last update: 2011-01-03)
* Extensible Wave-Format Descriptors, from Microsoft (updated October 26, 2017)

* WAV & BWF Metadata Guide
* Exif tags (see, for example, page 128)