comma-separated values
   HOME

TheInfoList



OR:

Comma-separated values (CSV) is a
text file A text file (sometimes spelled textfile; an old alternative name is flat file) is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. In ope ...
format that uses
comma The comma is a punctuation mark that appears in several variants in different languages. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical; others give it the appearance of a miniature fille ...
s to separate values, and
newline A newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or ...
s to separate records. A CSV file stores tabular data (numbers and text) in
plain text In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects ( floating-point numbers, images, etc.). It may also include a lim ...
, where each line of the file typically represents one data record. Each record consists of the same number of fields, and these are separated by commas in the CSV file. If the field delimiter itself may appear within a field, fields can be surrounded with quotation marks. The CSV file format is one type of delimiter-separated file format. Delimiters frequently used include the comma, tab, space, and semicolon. Delimiter-separated files are often given a ".csv" extension even when the field separator is not a comma. Many applications or libraries that consume or produce CSV files have options to specify an alternative delimiter. The lack of adherence to the CSV standard RFC 4180 necessitates the support for a variety of CSV formats in data input software. Despite this drawback, CSV remains widespread in data applications and is widely supported by a variety of software, including common spreadsheet applications such as
Microsoft Excel Microsoft Excel is a spreadsheet editor developed by Microsoft for Microsoft Windows, Windows, macOS, Android (operating system), Android, iOS and iPadOS. It features calculation or computation capabilities, graphing tools, pivot tables, and a ...
. Benefits cited in favor of CSV include human readability and the simplicity of the format.


Applications

CSV is a common
data exchange Data exchange is the process of taking data structured under a ''source'' schema and transforming it into a ''target'' schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between ...
format that is widely supported by consumer, business, and scientific applications. Among its most common uses is moving tabular data between programs that natively operate on incompatible (often proprietary or undocumented) formats. For example, a user may need to transfer information from a database program that stores data in a proprietary format, to a
spreadsheet A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in c ...
that uses a completely different format. Most database programs can export data as CSV. Most spreadsheet programs can read CSV data, allowing CSV to be used as an intermediate format when transferring data from a database to a spreadsheet. Every major ecommerce platform provides support for exporting data as a CSV file. CSV is also used for storing data. Common data science tools such as Pandas include the option to export data to CSV for long-term storage. Benefits of CSV for data storage include the simplicity of CSV makes parsing and creating CSV files easy to implement and fast compared to other data formats, human readability making editing or fixing data simpler, and high compressibility leading to smaller data files. Alternatively, CSV does not support more complex data relations and makes no distinction between null and empty values, and in applications where these features are needed other formats are preferred. More than 200 local, regional, and national data portals, such as those of the
UK government His Majesty's Government, abbreviated to HM Government or otherwise UK Government, is the central government, central executive authority of the United Kingdom of Great Britain and Northern Ireland.
and the
European Commission The European Commission (EC) is the primary Executive (government), executive arm of the European Union (EU). It operates as a cabinet government, with a number of European Commissioner, members of the Commission (directorial system, informall ...
, use CSV files with standardized data catalogs.


Specification

proposes a
specification A specification often refers to a set of documented requirements to be satisfied by a material, design, product, or service. A specification is often a type of technical standard. There are different types of technical or engineering specificati ...
for the CSV format; however, actual practice often does not follow the RFC and the term "CSV" might refer to any file that: # is
plain text In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects ( floating-point numbers, images, etc.). It may also include a lim ...
using a character encoding such as
ASCII ASCII ( ), an acronym for American Standard Code for Information Interchange, is a character encoding standard for representing a particular set of 95 (English language focused) printable character, printable and 33 control character, control c ...
, various
Unicode Unicode or ''The Unicode Standard'' or TUS is a character encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 defines 154,998 Char ...
character encodings (e.g.
UTF-8 UTF-8 is a character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode Transformation Format 8-bit''. Almost every webpage is transmitted as UTF-8. UTF-8 supports all 1,112,0 ...
),
EBCDIC Extended Binary Coded Decimal Interchange Code (EBCDIC; ) is an eight- bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems. It descended from the code used with punched cards and the corresponding si ...
, or
Shift JIS Shift JIS (also SJIS, MIME name Shift_JIS, known as PCK in Solaris contexts) is a character encoding for the Japanese language, originally developed by the Japanese company ASCII Corporation in conjunction with Microsoft and standardized as JIS ...
, # consists of records (typically one record per line), # with the records divided into fields separated by a comma, # where every record has the same sequence of fields. Within these general constraints, many variations are in use. Therefore, without additional information (such as whether RFC 4180 is honored), a file claimed simply to be in "CSV" format is not fully specified.


History

Comma-separated values is a data format that predates
personal computer A personal computer, commonly referred to as PC or computer, is a computer designed for individual use. It is typically used for tasks such as Word processor, word processing, web browser, internet browsing, email, multimedia playback, and PC ...
s by more than a decade: the
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
Fortran (level H extended) compiler under
OS/360 OS/360, officially known as IBM System/360 Operating System, is a discontinued batch processing operating system developed by IBM for their then-new System/360 mainframe computer, announced in 1964; it was influenced by the earlier IBSYS/IBJOB a ...
supported CSV in 1972. List-directed ("free form") input/output was defined in FORTRAN 77, approved in 1978. List-directed input used commas or spaces for delimiters, so unquoted character strings could not contain commas or spaces. The term "comma-separated value" and the "CSV" abbreviation were in use by 1983. The manual for the Osborne Executive computer, which bundled the
SuperCalc SuperCalc is a spreadsheet published by Sorcim in 1980. History VisiCalc was the first spreadsheet program, but at first was not available for the CP/M operating system. SuperCalc was created to serve that market. Alongside WordStar, it wa ...
spreadsheet, documents the CSV quoting convention that allows strings to contain embedded commas, but the manual does not specify a convention for embedding quotation marks within quoted strings. Comma-separated value lists are easier to type (for example into
punched card A punched card (also punch card or punched-card) is a stiff paper-based medium used to store digital information via the presence or absence of holes in predefined positions. Developed over the 18th to 20th centuries, punched cards were widel ...
s) than fixed-column-aligned data, and they were less prone to producing incorrect results if a value was punched one column off from its intended location. Comma separated files are used for the interchange of database information between machines of two different architectures. The plain-text character of CSV files largely avoids incompatibilities such as byte-order and word size. The files are largely human-readable, so it is easier to deal with them in the absence of perfect documentation or communication. The main standardization initiative—transforming "'' de facto'' fuzzy definition" into a more precise and ''
de jure In law and government, ''de jure'' (; ; ) describes practices that are officially recognized by laws or other formal norms, regardless of whether the practice exists in reality. The phrase is often used in contrast with '' de facto'' ('from fa ...
'' one—was in 2005, with , defining CSV as a MIME Content Type. Later, in 2013, some of RFC 4180's deficiencies were tackled by a W3C recommendation. In 2014
IETF The Internet Engineering Task Force (IETF) is a standards organization for the Internet standard, Internet and is responsible for the technical standards that make up the Internet protocol suite (TCP/IP). It has no formal membership roster ...
published describing the application of URI fragments to CSV documents. RFC 7111 specifies how row, column, and cell ranges can be selected from a CSV document using position indexes. In 2015
W3C The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working together in ...
, in an attempt to enhance CSV with formal semantics, publicized the first ''drafts of recommendations'' for CSV metadata standards, which began as ''recommendations'' in December of the same year.


General functionality

CSV formats are best used to represent sets or sequences of records in which each record has an identical list of fields. This corresponds to a single relation in a
relational database A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured for ...
, or to data (though not calculations) in a typical spreadsheet. The format dates back to the early days of business computing and is widely used to pass data between computers with different internal word sizes, data formatting needs, and so forth. For this reason, CSV files are common on all computer platforms. CSV is a delimited text file that uses a
comma The comma is a punctuation mark that appears in several variants in different languages. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical; others give it the appearance of a miniature fille ...
to separate values (many implementations of CSV import/export tools allow other separators to be used; for example, the use of a "Sep=^" row as the first row in the *.csv file will cause Excel to open the file expecting
caret Caret () is the name used familiarly for the character provided on most QWERTY keyboards by typing . The symbol has a variety of uses in programming and mathematics. The name "caret" arose from its visual similarity to the original proofre ...
"^" to be the separator instead of comma ","). Simple CSV implementations may prohibit field values that contain a comma or other special characters such as newlines. More sophisticated CSV implementations permit them, often by requiring " ( double quote) characters around values that contain reserved characters (such as commas, double quotes, or less commonly, newlines). Embedded double quote characters may then be represented by a pair of consecutive double quotes, or by prefixing a double quote with an
escape character In computing and telecommunications, an escape character is a character that invokes an alternative interpretation on the following characters in a character sequence. An escape character is a particular case of metacharacters. Generally, the ...
such as a
backslash The backslash is a mark used mainly in computing and mathematics. It is the mirror image of the common slash (punctuation), slash . It is a relatively recent mark, first documented in the 1930s. It is sometimes called a hack, whack, Escape c ...
(for example in
Sybase Sybase, Inc. was an enterprise software and services company. The company produced software relating to relational databases, with facilities located in California and Massachusetts. Sybase was acquired by SAP in 2010; SAP ceased using the Syba ...
Central). CSV formats are not limited to a particular
character set Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using computers. The numerical values that make up a c ...
. They work just as well with
Unicode Unicode or ''The Unicode Standard'' or TUS is a character encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 defines 154,998 Char ...
character sets (such as
UTF-8 UTF-8 is a character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode Transformation Format 8-bit''. Almost every webpage is transmitted as UTF-8. UTF-8 supports all 1,112,0 ...
or
UTF-16 UTF-16 (16-bit Unicode Transformation Format) is a character encoding that supports all 1,112,064 valid code points of Unicode. The encoding is variable-length as code points are encoded with one or two ''code units''. UTF-16 arose from an earli ...
) as with ASCII (although particular programs that support CSV may have their own limitations). CSV files normally will even survive naïve translation from one character set to another (unlike nearly all proprietary data formats). CSV does not, however, provide any way to indicate what character set is in use, so that must be communicated separately, or determined at the receiving end (if possible). Databases that include multiple relations cannot be exported as a single CSV file. Similarly, CSV cannot naturally represent
hierarchical A hierarchy (from Greek: , from , 'president of sacred rites') is an arrangement of items (objects, names, values, categories, etc.) that are represented as being "above", "below", or "at the same level as" one another. Hierarchy is an importan ...
or
object-oriented Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impleme ...
data. This is because every CSV record is expected to have the same structure. CSV is therefore rarely appropriate for
documents A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content. The word originates from the Latin ', which denotes a "teaching" or "lesson": ...
created with
HTML Hypertext Markup Language (HTML) is the standard markup language for documents designed to be displayed in a web browser. It defines the content and structure of web content. It is often assisted by technologies such as Cascading Style Sheets ( ...
,
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding electronic document, documents in a format that is both human-readable and Machine-r ...
, or other markup or word-processing technologies.
Statistical database A statistical database is a database used for statistical analysis purposes. It is an OLAP (online analytical processing), instead of OLTP (online transaction processing) system. Modern decision, and classical statistical databases are often clos ...
s in various fields often have a generally relation-like structure, but with some repeatable groups of fields. For example, health databases such as the Demographic and Health Survey typically repeat some questions for each child of a given parent (perhaps up to a fixed maximum number of children).
Statistical analysis Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers properties of ...
systems often include utilities that can "rotate" such data; for example, a "parent" record that includes information about five children can be split into five separate records, each containing (a) the information on one child, and (b) a copy of all the non-child-specific information. CSV can represent either the "vertical" or "horizontal" form of such data. In a relational database, similar issues are readily handled by creating a separate relation for each such group, and connecting "child" records to the related "parent" records using a
foreign key A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples ...
(such as an ID number or name for the parent). In markup languages such as XML, such groups are typically enclosed within a parent element and repeated as necessary (for example, multiple <child> nodes within a single <parent> node). With CSV there is no widely accepted single-file solution.


Standardization

The name "CSV" indicates the use of the comma to separate data fields. Nevertheless, the term "CSV" is widely used to refer to a large family of formats that differ in many ways. Some implementations allow or require single or double quotation marks around some or all fields; and some reserve the first record as a header containing a list of field names. The character set being used is undefined: some applications require a Unicode byte order mark (BOM) to enforce Unicode interpretation (sometimes even a UTF-8 BOM). Files that use the tab character instead of comma can be more precisely referred to as "TSV" for tab-separated values. Other implementation differences include the handling of more commonplace field separators (such as space or semicolon) and newline characters inside text fields. One more subtlety is the interpretation of a blank line: it can equally be the result of writing a record of zero fields, or a record of one field of zero length; thus decoding it is ambiguous.


RFC 4180 and MIME standards

The 2005 technical standard RFC 4180 formalizes the CSV file format and defines the MIME type "text/csv" for the handling of text-based fields. However, the interpretation of the text of each field is still application-specific. Files that follow the RFC 4180 standard can simplify CSV exchange and should be widely portable. Among its requirements: * MS-DOS-style lines that end with (CR/LF) characters (optional for the last line). * An optional header record (there is no sure way to detect whether it is present, so care is required when importing). * Each record ''should'' contain the same number of comma-separated fields. * Any field ''may'' be quoted (with double quotes). * Fields containing a line-break, double-quote or commas ''should'' be quoted. (If they are not, the file will likely be impossible to process correctly.) * ''If'' double-quotes are used to enclose fields, then a double-quote in a field ''must'' be represented by two double-quote characters. The format can be processed by most programs that claim to read CSV files. The exceptions are ''(a)'' programs may not support line-breaks within quoted fields, ''(b)'' programs may confuse the optional header with data or interpret the first data line as an optional header, and ''(c)'' double-quotes in a field may not be parsed correctly automatically.


OKF frictionless tabular data package

In 2011
Open Knowledge Foundation Open Knowledge Foundation (OKF) is a global, non-profit network that promotes and shares information at no charge, including both content and data. It was founded by Rufus Pollock on 20 May 2004 in Cambridge, England. It is incorporated in Engla ...
(OKF) and various partners created a data protocols working group, which later evolved into the Frictionless Data initiative. One of the main formats they released was the Tabular Data Package. Tabular Data package was heavily based on CSV, using it as the main data transport format and adding basic type and schema metadata (CSV lacks any type information to distinguish the string "1" from the number 1). The Frictionless Data Initiative has also provided a standard CSV Dialect Description Format for describing different dialects of CSV, for example specifying the field separator or quoting rules.


W3C tabular data standard

In 2013 the
W3C The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working together in ...
"CSV on the Web" working group began to specify technologies providing higher interoperability for web applications using CSV or similar formats. The working group completed its work in February 2016 and is officially closed in March 2016 with the release of a set of documents and W3C recommendations for modeling "Tabular Data", and enhancing CSV with
metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
and
semantics Semantics is the study of linguistic Meaning (philosophy), meaning. It examines what meaning is, how words get their meaning, and how the meaning of a complex expression depends on its parts. Part of this process involves the distinction betwee ...
. While the well-formedness of CSV data can readily checked, testing validity and canonical form is less well developed, relative to more precise data formats, such as
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding electronic document, documents in a format that is both human-readable and Machine-r ...
and
SQL Structured Query Language (SQL) (pronounced ''S-Q-L''; or alternatively as "sequel") is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling s ...
, which offer richer types and rules-based validation.


Basic rules

Many informal documents exist that describe "CSV" formats.
IETF The Internet Engineering Task Force (IETF) is a standards organization for the Internet standard, Internet and is responsible for the technical standards that make up the Internet protocol suite (TCP/IP). It has no formal membership roster ...
RFC 4180 (summarized above) defines the format for the "text/csv" MIME type registered with the
IANA The Internet Assigned Numbers Authority (IANA) is a standards organization that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System (DNS), media types, and other Internet P ...
. Rules typical of these and other "CSV" specifications and implementations are as follows:


Example

The above table of data may be represented in CSV format as follows: Year,Make,Model,Description,Price 1997,Ford,E350,"ac, abs, moon",3000.00 1999,Chevy,"Venture ""Extended Edition""","",4900.00 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00 1996,Jeep,Grand Cherokee,"MUST SELL! air, moon roof, loaded",4799.00 Example of a USA/UK CSV file (where the decimal separator is a period/full stop and the value separator is a comma): Year,Make,Model,Length 1997,Ford,E350,2.35 2000,Mercury,Cougar,2.38 Example of an analogous European CSV/ DSV file (where the decimal separator is a comma and the value separator is a semicolon): Year;Make;Model;Length 1997;Ford;E350;2,35 2000;Mercury;Cougar;2,38 The latter format is not RFC 4180 compliant. Compliance could be achieved by the use of a comma instead of a semicolon as a separator and by quoting all numbers that have a decimal mark.


Application support

Some applications use CSV as a data interchange format to enhance its
interoperability Interoperability is a characteristic of a product or system to work with other products or systems. While the term was initially defined for information technology or systems engineering services to allow for information exchange, a broader de ...
, exporting and importing CSV. Others use CSV as an ''internal format''. As a data interchange format: the CSV file format is supported by almost all spreadsheets and database management systems, *
Spreadsheet A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in c ...
s including Apple
Numbers A number is a mathematical object used to count, measure, and label. The most basic examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual numbers can ...
,
LibreOffice Calc LibreOffice Calc is the spreadsheet component of the LibreOffice suite. After forking from OpenOffice.org in 2010, LibreOffice Calc underwent a massive re-work of external reference handling to fix many defects in formula calculations involvi ...
, and
Apache OpenOffice Apache OpenOffice (AOO) is an open-source software, open-source office suite, office productivity software suite. It is one of the successor projects of OpenOffice.org and the designated successor of IBM Lotus Symphony. It was a close cousin of ...
Calc.
Microsoft Excel Microsoft Excel is a spreadsheet editor developed by Microsoft for Microsoft Windows, Windows, macOS, Android (operating system), Android, iOS and iPadOS. It features calculation or computation capabilities, graphing tools, pivot tables, and a ...
also supports a dialect of CSV with restrictions in comparison to other spreadsheet software (e.g., Excel still cannot export CSV files in the commonly used UTF-8 character encoding, and separator is not enforced to be the comma).
LibreOffice Calc LibreOffice Calc is the spreadsheet component of the LibreOffice suite. After forking from OpenOffice.org in 2010, LibreOffice Calc underwent a massive re-work of external reference handling to fix many defects in formula calculations involvi ...
CSV importer is actually a more generic delimited text importer, supporting multiple separators at the same time as well as field trimming. * Various
Relational databases A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured form ...
support saving query results to a CSV file.
PostgreSQL PostgreSQL ( ) also known as Postgres, is a free and open-source software, free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transaction processing, transactions ...
provides the COPY command, which allows for both saving and loading data to and from a file. saves the content of a table articles to a file called /home/wikipedia/file.csv. * Many utility programs on
Unix Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
-style systems (such as cut, paste, join, sort, uniq, awk) can split files on a comma delimiter, and can therefore process simple CSV files. However, this method does not correctly handle commas or new lines within quoted strings, hence it is better to use tools like csvkit or Miller. As (main or optional) internal representation. Can be native or foreign, but differ from interchange format ("export/import only") because it is not necessary to create a copy in another format: * Some
Spreadsheet A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in c ...
s including
LibreOffice Calc LibreOffice Calc is the spreadsheet component of the LibreOffice suite. After forking from OpenOffice.org in 2010, LibreOffice Calc underwent a massive re-work of external reference handling to fix many defects in formula calculations involvi ...
offers this option, without enforcing user to adopt another format. * Some relational databases, when using standard SQL, offer ''foreign-data wrapper'' (FDW). For example, PostgreSQL offers the and commands to configure any variant of CSV. * Databases like
Apache Hive Apache Hive is a data warehouse software project. It is built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like Interface (computing), interface to query data stored in various databases and file systems that i ...
offer the option to express CSV or .csv.gz as an internal table format. * The
emacs Emacs (), originally named EMACS (an acronym for "Editor Macros"), is a family of text editors that are characterized by their extensibility. The manual for the most widely used variant, GNU Emacs, describes it as "the extensible, customizable, s ...
editor can operate on CSV files using csv-nav mode. CSV format is supported by libraries available for many
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s. Most provide some way to specify the field delimiter,
decimal separator FIle:Decimal separators.svg, alt=Four types of separating decimals: a) 1,234.56. b) 1.234,56. c) 1'234,56. d) ١٬٢٣٤٫٥٦., Both a comma and a full stop (or period) are generally accepted decimal separators for international use. The apost ...
, character encoding, quoting conventions, date format, etc.


Software and row limits

Programs that work with CSV may have limits on the maximum number of rows CSV files can have. Below is a list of common software and its limitations: * Microsoft Excel: 1,048,576 row limit; * Microsoft PowerShell, no row or cell limit. (Memory Limited) * Apple Numbers: 1,000,000 row limit; * Google Sheets: 10,000,000 cell limit (the product of columns and rows); * OpenOffice and LibreOffice: 1,048,576 row limit; * Sourcetable:large data spreadsheet
Sourcetable Inc., 2024. Retrieved 2024-11-14.
no row limit. (Spreadsheet-database hybrid); * Text Editors (such as WordPad,
TextEdit TextEdit is an open-source software, open-source word processor and text editor, first featured in NeXT's NeXTSTEP and OPENSTEP. It is now distributed with macOS since Apple Inc.'s acquisition of NeXT, and available as a GNUstep application fo ...
, Vim, etc.): no row or cell limit; * Databases (COPY command and FDW): no row or cell limit.


See also

* Tab-separated values * Comparison of data-serialization formats *
Delimiter-separated values Formats that use delimiter-separated values (also DSV)DSV stands for ''Delimiter Separated Values'' store two-dimensional arrays of data by separating the values in each row with specific delimiter character (computing), characters. Most database ...
*
Delimiter collision A delimiter is a sequence of one or more characters for specifying the boundary between separate, independent regions in plain text, mathematical expressions or other data streams. An example of a delimiter is the comma character, which acts ...
*
Flat-file database A flat-file database is a database stored in a file called a flat file. Records follow a uniform format, and there are no structures for indexing or recognizing relationships between records. The file is simple. A flat file can be a plain t ...
* Simple Data Format * Substitute character,
Null character The null character is a control character with the value zero. Many character sets include a code point for a null character including Unicode (Universal Coded Character Set), ASCII (ISO/IEC 646), Baudot, ITA2 codes, the C0 control code, and EB ...
, invisible comma U+2063


References


Further reading

* (Has file descriptions of delimited ASCII (.DEL) (including comma- and semicolon-separated) and non-delimited ASCII (.ASC) files for data transfer.) {{Data Exchange Delimiter-separated format Open formats Spreadsheet file formats