Tabulator-separated Values

Tab-separated values (TSV) is a simple, text-based file format for storing tabular data. Records are separated by newlines, and values within a record are separated by tab characters. The TSV format is thus a delimiter-separated values format, similar to comma-separated values.

TSV is a simple file format that is widely supported, so it is often used in data exchange to move tabular data between different computer programs that support the format. For example, a TSV file might be used to transfer information from a database to a spreadsheet.
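As a minimal sketch of this structure, the following Python splits TSV text into records on newlines and into values on tabs, and joins them back again. The function names `parse_tsv` and `format_tsv` and the sample data are illustrative assumptions, not part of any specification, and the sketch assumes fields contain no tabs or newlines.

```python
def parse_tsv(text):
    """Split TSV text into a list of records, each a list of string values."""
    lines = text.split("\n")
    # A final newline leaves one empty trailing element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return [line.split("\t") for line in lines]


def format_tsv(rows):
    """Join values with tabs and terminate each record with a newline."""
    return "".join("\t".join(row) + "\n" for row in rows)


# Hypothetical sample data: a header record followed by two data records.
sample = "name\tcity\nAda\tLondon\nLinus\tHelsinki\n"
rows = parse_tsv(sample)
print(rows)
```

Because both directions use only string splitting and joining, `format_tsv(parse_tsv(sample))` reproduces the input as long as the assumption about field contents holds.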


Example

The head of the Iris flower data set can be stored as TSV plain text, with the values of each record separated by tabs (note that an HTML rendering may convert tabs to spaces).
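As an illustrative sketch, the first five records of the Iris data set, as it is commonly distributed, can be written as TSV with Python's standard `csv` module by setting the delimiter to a tab. The column names and the species label format used here are assumptions made for the example.

```python
import csv
import io

# Column names are an assumption; the values are kept as strings so that
# trailing zeros such as "3.0" are preserved exactly.
header = ["Sepal length", "Sepal width", "Petal length", "Petal width", "Species"]
rows = [
    ["5.1", "3.5", "1.4", "0.2", "I. setosa"],
    ["4.9", "3.0", "1.4", "0.2", "I. setosa"],
    ["4.7", "3.2", "1.3", "0.2", "I. setosa"],
    ["4.6", "3.1", "1.5", "0.2", "I. setosa"],
    ["5.0", "3.6", "1.4", "0.2", "I. setosa"],
]

buffer = io.StringIO()
# lineterminator is set explicitly because csv.writer defaults to "\r\n".
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)

tsv_text = buffer.getvalue()
print(tsv_text)
```

Each printed line holds one record, and the gaps between values are single tab characters rather than runs of spaces.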


Character escaping

The IANA media type standard for TSV achieves simplicity by disallowing tabs within fields. Since values in the TSV format cannot contain literal tab or newline characters, a convention is necessary for lossless conversion of text values that contain them. A common convention is to replace these characters with escape sequences. Another common convention is to use the CSV convention of enclosing values that contain tabs or newlines in double quotes. This can lead to ambiguities, because a reader cannot tell whether a leading double quote is literal data or the start of a quoted value.
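One way to picture an escaping convention is the sketch below. It assumes backslash-style sequences (`\t`, `\n`, `\r`, and `\\` for the backslash itself); that choice and the function names `escape_field` and `unescape_field` are assumptions for illustration, and a given program may use a different scheme.

```python
# Assumed mapping: each special character becomes a two-character sequence.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
# Reverse mapping, keyed by the character that follows the backslash.
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r"}


def escape_field(value):
    """Replace backslashes, tabs and line breaks with escape sequences."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def unescape_field(value):
    """Reverse escape_field; unknown sequences are kept verbatim."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            following = next(chars, "")
            out.append(_UNESCAPES.get(following, "\\" + following))
        else:
            out.append(ch)
    return "".join(out)


original = "a\tb\nc\\d"
escaped = escape_field(original)
print(escaped)
```

Escaping the backslash itself is what makes the round trip lossless: without it, a value that already contained the two characters `\` and `t` could not be told apart from an escaped tab.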


Line endings

Records are typically separated by a line feed, as on Unix platforms, or by a carriage return followed by a line feed, as on Microsoft platforms. Some programs may expect the latter. The de facto specification states that records are separated by an end-of-line marker, but does not prescribe a specific newline sequence.
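Since either line ending may appear in practice, a reader can accept both. The sketch below normalises a carriage return plus line feed to a single line feed before splitting; the function name `split_records` and the sample strings are hypothetical.

```python
def split_records(text):
    """Split TSV text into record lines, accepting LF or CRLF endings."""
    # Convert CRLF to LF first so that no stray "\r" is left in the last field.
    normalized = text.replace("\r\n", "\n")
    lines = normalized.split("\n")
    # A final line ending leaves one empty trailing element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


unix_text = "a\tb\nc\td\n"
windows_text = "a\tb\r\nc\td\r\n"
print(split_records(unix_text) == split_records(windows_text))
```

Normalising before splitting keeps the rest of the parser independent of the platform that produced the file.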


See also

* Comma-separated values
* Delimiter collision



Further reading

* Welinder, Morten (2012-12-19). "§14.2.3: Text File Formats". The Gnumeric Manual (v1.12). https://help.gnome.org/users/gnumeric/stable/gnumeric.html#file-format-tab. Accessed 2023-05-23.