In computer science, a record (also called a structure, struct, or compound data type) is a composite data structure: a collection of fields, possibly of different data types, typically fixed in number and sequence.
For example, a date could be stored as a record containing a numeric year field, a month field represented as a string, and a numeric day-of-month field. A circle record might contain a numeric radius and a center that is a point record containing x and y coordinates.
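The two examples above can be sketched in Python using the standard `dataclasses` module. This is only an illustration; the class names, field names, and sample values are assumptions chosen for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Date:
    year: int    # numeric year field
    month: str   # month represented as a string
    day: int     # numeric day-of-month field

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Circle:
    radius: float
    center: Point   # a field that is itself a record

# Hypothetical sample values
example_date = Date(year=2024, month="March", day=15)
unit_circle = Circle(radius=1.0, center=Point(x=0.0, y=0.0))
```

Fields are accessed by name, including through nested records, as in `unit_circle.center.x`.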
Notable applications include the ''record type'' of programming languages and row-based storage: data organized as a sequence of records, such as a database table, spreadsheet, or comma-separated values (CSV) file. In general, a record type value is stored in memory, whereas row-based storage resides in mass storage.
A ''record type'' is a data type that describes such values and variables. Most modern programming languages allow the programmer to define new record types. The definition includes specifying the data type of each field and an identifier (name or label) by which it can be accessed. In type theory, product types (with no field names) are generally preferred due to their simplicity, but proper record types are studied in languages such as System F-sub. Since type-theoretical records may contain first-class function-typed fields in addition to data, they can express many features of object-oriented programming.
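As a rough illustration of the last point, a record with a function-typed field can imitate a simple object. The sketch below is an assumption-laden toy in Python, not a rendering of any particular type theory; `Counter` and `make_counter` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Counter:
    count: int                          # data field
    increment: Callable[[], "Counter"]  # function-typed field acting as a "method"

def make_counter(count: int = 0) -> Counter:
    # The closure captures `count`, which plays the role of the object's state.
    return Counter(count=count, increment=lambda: make_counter(count + 1))

c = make_counter().increment().increment()
```

Here `c.count` is 2: each call to the `increment` field returns a new record rather than mutating the old one.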
Terminology
In the context of storage, such as in a database or spreadsheet, a record is often called a ''row'' and each field is called a ''column''.
In object-oriented programming, an object is a record that contains state and method fields.
A record is similar to a mathematical tuple, although a tuple may or may not be considered a record, and vice versa, depending on conventions and the programming language. In the same vein, a record type can be viewed as the computer language analog of the Cartesian product of two or more mathematical sets, or the implementation of an abstract product type in a specific language.
A record differs from an array in that a record's elements (fields) are determined by the definition of the record and may be heterogeneous, whereas an array is a collection of elements with the same type.
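The relationship between records, tuples, and arrays can be sketched in Python, where a named tuple happens to be both a tuple and a record. The names and values below are hypothetical.

```python
from array import array
from typing import NamedTuple

class Employee(NamedTuple):
    name: str       # fields may have different types...
    salary: float   # ...and their number and order are fixed by the definition

e = Employee("Ada", 5200.0)
by_name = e.salary       # record-style access by field name
by_position = e[1]       # tuple-style access by position

# An array, by contrast, holds elements of a single type, and its length
# is not fixed by its type.
readings = array("d", [1.5, 2.5])
readings.append(3.5)
```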
The parameters of a function can be viewed collectively as the fields of a record, and passing arguments to the function can be viewed as assigning the input parameters to the record fields. At a low level, a function call includes an ''activation record'' or ''call frame'' that contains the parameters as well as other fields such as local variables and the return address.
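The view of parameters as record fields can be made concrete in Python, where a record can be unpacked into keyword arguments. This is a small sketch; `area` and `AreaArgs` are hypothetical names.

```python
from dataclasses import dataclass, asdict

def area(width: float, height: float) -> float:
    return width * height

@dataclass
class AreaArgs:
    # One field per parameter of area()
    width: float
    height: float

args = AreaArgs(width=3.0, height=4.0)
# Each record field is bound to the parameter of the same name.
result = area(**asdict(args))
```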
History

The concept of a record can be traced to various types of tables and ledgers used in accounting since remote times. The modern notion of records in computer science, with fields of well-defined type and size, was already implicit in 19th century mechanical calculators, such as Babbage's Analytical Engine.
The original machine-readable medium used for data (as opposed to control) was the punch card used for records in the 1890 United States census: each punch card was a single record. Compare the journal entry from 1880 and the punch card from 1895. Records were well established in the first half of the 20th century, when most data processing was done using punched cards. Typically, each record of a data file would be recorded on one punched card, with specific columns assigned to specific fields. Generally, a record was the smallest unit that could be read from external storage (e.g., card reader, tape, or disk). The contents of punched-card-style records were originally called "unit records" because punched cards had pre-determined document lengths.
When storage systems became more advanced with the use of hard drives and magnetic tape, variable-length records became the standard. A variable-length record is a record in which the size of the record in bytes is approximately equal to the sum of the sizes of its fields. This was not possible before more advanced storage hardware was invented, because all punched cards had to conform to pre-determined document lengths that the computer could read, since at the time the cards had to be physically fed into a machine.
Most machine language implementations and early assembly languages did not have special syntax for records, but the concept was available (and extensively used) through the use of index registers, indirect addressing, and self-modifying code. Some early computers, such as the IBM 1620, had hardware support for delimiting records and fields, and special instructions for copying such records.
The concept of records and fields was central in some early file sorting and tabulating utilities, such as IBM's Report Program Generator (RPG).
COBOL was the first widespread programming language to support record types, and its record definition facilities were quite sophisticated at the time. The language allows for the definition of nested records with alphanumeric, integer, and fractional fields of arbitrary size and precision, and fields that automatically format any value assigned to them (e.g., insertion of currency signs, decimal points, and digit group separators). Each file is associated with a record variable into which data is read or from which it is written. COBOL also provides a MOVE CORRESPONDING statement that assigns corresponding fields of two records according to their names.
The early languages developed for numeric computing, such as FORTRAN (up to FORTRAN IV) and ALGOL 60, did not support record types; but later versions of those languages, such as Fortran 90 and ALGOL 68, did add them. The original Lisp programming language also lacked records (except for the built-in cons cell), but its S-expressions provided an adequate surrogate. The Pascal programming language was one of the first languages to fully integrate record types with other basic types into a logically consistent type system. The PL/I language provided for COBOL-style records. The C language provides the record concept using structs. Most languages designed after Pascal (such as Ada, Modula, and Java) also supported records.
Although records are not often used in their original context anymore (i.e. being used solely for the purpose of containing data), records influenced newer object-oriented programming languages and relational database management systems. Since records provided more modularity in the way data was stored and handled, they are better suited to representing complex, real-world concepts than the primitive data types provided by default in languages. This influenced later languages such as C++, Python, JavaScript, and Objective-C, which address the same modularity needs of programming.
Objects in these languages are essentially records with the addition of methods and inheritance, which allow programmers to manipulate the way data behaves instead of only the contents of a record. Many programmers regard records as obsolete now since object-oriented languages have features that far surpass what records are capable of. On the other hand, many programmers argue that the low overhead and ability to use records in assembly language make records still relevant when programming with low levels of abstraction. Today, the most popular languages on the TIOBE index, an indicator of the popularity of programming languages, have been influenced in some way by records because they are object-oriented.
Query languages such as SQL and Object Query Language were also influenced by the concept of records. These languages allow the programmer to store sets of data, which are essentially records, in tables. This data can then be retrieved using a primary key. The tables themselves are also records which may have a foreign key: a key that references data in another table.
Record type
Operations
Operations for a record type include:
* Declaration of a record type, including the position, type, and (possibly) name of each field
* Declaration of a record; a variable typed as a record type
* Construction of a record value; possibly with field value initialization
* Reading and writing of a record field value
* Comparison of two records for equality
* Computation of a standard hash value for the record
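The operations listed above can be sketched with a Python dataclass. This is one possible mapping, not the only one: the record is declared immutable (`frozen=True`) so that it is hashable, which means the "write" operation is shown as copying with one field changed. The `Point` type is a hypothetical example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)        # declaration of a record type: field names, types, order
class Point:
    x: int
    y: int

p: Point = Point(1, 2)         # declaration of a record variable and construction
q = Point(x=1, y=2)            # construction with named field initialization

x_value = p.x                  # reading a field
moved = replace(p, x=5)        # "writing" a field on an immutable record: copy with a change

records_equal = (p == q)                  # field-by-field equality
hashes_equal = (hash(p) == hash(q))       # equal records yield equal hash values
```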
Some languages provide facilities that enumerate the fields of a record. This facility is needed to implement certain services such as debugging, garbage collection, and serialization. It requires some degree of type polymorphism.
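In Python, for instance, `dataclasses.fields` enumerates the declared fields of a dataclass, which is enough to write a serializer that works for any such record. The sketch below assumes flat records whose field values are JSON-compatible; `to_json` and `Date` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Date:
    year: int
    month: str
    day: int

def to_json(record) -> str:
    # fields() enumerates the declared fields of any dataclass instance,
    # so this function does not depend on a specific record type.
    return json.dumps({f.name: getattr(record, f.name) for f in fields(record)})

field_names = [f.name for f in fields(Date)]
encoded = to_json(Date(2024, "March", 15))
```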
In contexts that support record subtyping, operations include adding and removing fields of a record. A specific record type implies that a specific set of fields are present, but values of that type may contain additional fields. A record with fields ''x'', ''y'', and ''z'' would thus belong to the type of records with fields ''x'' and ''y'', as would a record with fields ''x'', ''y'', and ''r''. The rationale is that passing an (''x'',''y'',''z'') record to a function that expects an (''x'',''y'') record as argument should work, since that function will find all the fields it requires within the record. Many ways of practically implementing records in programming languages would have trouble with allowing such variability, but the matter is a central characteristic of record types in more theoretical contexts.
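This kind of subtyping can be approximated in Python with a structural `Protocol`: a static type checker accepts any record that has at least the required fields. The sketch below is only an approximation of record subtyping as studied in type theory; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol

class HasXY(Protocol):
    # The type of records that have at least fields x and y
    x: float
    y: float

@dataclass
class XYZ:
    x: float
    y: float
    z: float

@dataclass
class XYR:
    x: float
    y: float
    r: float

def norm_squared(p: HasXY) -> float:
    # Uses only x and y, so any record with those fields is acceptable.
    return p.x * p.x + p.y * p.y

a = norm_squared(XYZ(3.0, 4.0, 12.0))
b = norm_squared(XYR(3.0, 4.0, 5.0))
```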
Assignment and comparison
Most languages allow assignment between records that have exactly the same record type (including same field types and names, in the same order). Depending on the language, however, two record data types defined separately may be regarded as distinct types even if they have exactly the same fields.
Some languages may also allow assignment between records whose fields have different names, matching each field value with the corresponding field variable by their positions within the record; so that, for example, a complex number with fields called real and imag can be assigned to a 2D point record variable with fields X and Y. In this alternative, the two operands are still required to have the same sequence of field types. Some languages may also require that corresponding types have the same size and encoding as well, so that the whole record can be assigned as an uninterpreted bit string. Other languages may be more flexible in this regard, and require only that each value field can be legally assigned to the corresponding variable field; so that, for example, a short integer field can be assigned to a long integer field, or vice versa.
Other languages (such as COBOL) may match fields and values by their names, rather than positions.
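The two matching strategies can be contrasted in a short Python sketch. Python itself provides neither form of record assignment; the helper functions `assign_by_position` and `assign_by_name`, and the record types, are hypothetical and written only to illustrate the idea.

```python
from dataclasses import dataclass, fields

@dataclass
class Complex:
    real: float
    imag: float

@dataclass
class Point2D:
    X: float
    Y: float

@dataclass
class Labeled:
    imag: float
    real: float
    label: str

def assign_by_position(target, source) -> None:
    # Pair fields in declaration order, ignoring their names.
    for t, s in zip(fields(target), fields(source)):
        setattr(target, t.name, getattr(source, s.name))

def assign_by_name(target, source) -> None:
    # Copy only fields whose names occur in both records,
    # in the spirit of COBOL's MOVE CORRESPONDING.
    source_names = {f.name for f in fields(source)}
    for t in fields(target):
        if t.name in source_names:
            setattr(target, t.name, getattr(source, t.name))

z = Complex(1.0, 2.0)

pt = Point2D(0.0, 0.0)
assign_by_position(pt, z)      # real -> X, imag -> Y

lab = Labeled(imag=0.0, real=0.0, label="w")
assign_by_name(lab, z)         # real -> real, imag -> imag; label is untouched
```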
These same possibilities apply to the comparison of two record values for equality. Some languages may also allow order comparisons ('<' and '>'), using the lexicographic order based on the comparison of individual fields.
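Python dataclasses offer this behavior when declared with `order=True`: instances are compared as if they were tuples of their fields, in declaration order. The `Version` record below is a hypothetical example.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int

first_field_decides = Version(1, 9) < Version(2, 0)    # 1 < 2, minor is not consulted
second_field_decides = Version(2, 0) < Version(2, 1)   # majors tie, so minor decides
ordered = sorted([Version(2, 1), Version(1, 9), Version(2, 0)])
```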
PL/I allows both of the preceding types of assignment, and also allows ''structure expressions'', such as a = a+1; where "a" is a record, or structure in PL/I terminology.
Algol 68's distributive field selection
In Algol 68, if Pts was an array of records, each with integer fields X and Y, one could write Y of Pts to obtain an array of integers, consisting of the Y fields of all the elements of Pts. As a result, the statements Y of Pts[3] := 7 and (Y of Pts)[3] := 7 would have the same effect.
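A loose Python analogue of distributive field selection is a comprehension over the array. Note the difference: the sketch below builds a copy of the field values, so assigning into `ys` would not change the records, unlike the Algol 68 construct described above. The `Pt` type and values are hypothetical, and Python lists are indexed from 0.

```python
from dataclasses import dataclass

@dataclass
class Pt:
    x: int
    y: int

pts = [Pt(1, 2), Pt(3, 4), Pt(5, 6)]

ys = [p.y for p in pts]   # roughly "Y of Pts": the y fields of all elements (a copy)
pts[2].y = 7              # select the third element, then assign its y field
```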
Pascal's "with" statement
In Pascal, the command with R do S would execute the command sequence S as if all the fields of record R had been declared as variables. Similarly to entering a different namespace in an object-oriented language like C#, it is no longer necessary to use the record name as a prefix to access the fields. So, instead of writing Pt.X := 5; Pt.Y := Pt.X + 3 one could write with Pt do begin X := 5; Y := X + 3 end.
Representation in memory
The representation of a record in memory varies depending on the programming language. Often, fields are stored in consecutive memory locations, in the same order as they are declared in the record type. This may result in two or more fields stored in the same word of memory; indeed, this feature is often used in systems programming to access specific bits of a word. On the other hand, most compilers will add padding fields, mostly invisible to the programmer, in order to comply with alignment constraints imposed by the machine, for example that a floating point field must occupy a single word.
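The effect of padding can be observed from Python with the standard `struct` module, which can compute the size of a C-style record layout either without alignment or with the platform's native alignment. The layout chosen here (one byte followed by a double) is an arbitrary example, and the native size depends on the machine.

```python
import struct

# A record with a 1-byte integer field followed by an 8-byte float field.
packed_size = struct.calcsize("=bd")   # "=": standard sizes, no padding (1 + 8 bytes)
native_size = struct.calcsize("@bd")   # "@": native sizes and alignment, may include padding

padding = native_size - packed_size    # bytes inserted to align the float field
```

On many platforms `padding` is nonzero because the float field is aligned to a multiple of its own size, but the exact value is platform-dependent.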
Some languages may implement a record as an array of addresses pointing to the fields (and, possibly, to their names and/or types). Objects in object-oriented languages are often implemented in rather complicated ways, especially in languages that allow multiple class inheritance.
Self-defining records
A ''self-defining record'' is a type of record which contains information to identify the record type and to locate information within the record. It may contain the offsets of elements; the elements can therefore be stored in any order or may be omitted. The information stored in a self-defining record can be interpreted as metadata for the record, which is similar to what one would expect to find in the UNIX metadata regarding a file, containing information such as the record's creation time and the size of the record in bytes. Alternatively, various elements of the record, each including an element identifier, can simply follow one another in any order.
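The second variant, in which each element carries its own identifier, can be sketched as a simple tag-length-value encoding. The byte layout below (1-byte identifier, 2-byte big-endian length, then the value) and the identifiers `NAME` and `DEPT` are assumptions made for this illustration, not a standard format.

```python
import struct

def encode(elements: dict) -> bytes:
    # Each element: 1-byte identifier, 2-byte big-endian length, then the value bytes.
    out = b""
    for tag, value in elements.items():
        out += struct.pack(">BH", tag, len(value)) + value
    return out

def decode(data: bytes) -> dict:
    elements, pos = {}, 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BH", data, pos)
        pos += 3                                  # skip the 3-byte element header
        elements[tag] = data[pos:pos + length]
        pos += length
    return elements

NAME, DEPT = 1, 2    # hypothetical element identifiers

a = encode({NAME: b"Ada", DEPT: b"R&D"})
b = encode({DEPT: b"R&D", NAME: b"Ada"})   # same elements, stored in a different order
only_name = encode({NAME: b"Ada"})         # an element may be omitted entirely
```

Because every element identifies itself, `decode(a)` and `decode(b)` yield the same content even though the byte strings differ.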
Key field
A record, especially in the context of row-based storage, may include key fields that allow indexing the records of a collection. A primary key is unique throughout all stored records: no two records may share the same primary key value. For example, an employee file might contain employee number, name, department, and salary. The employee number will be unique in the organization and will be the primary key. Depending on the storage medium and file organization, the employee number might be ''indexed'', that is, also stored in a separate file to make lookup faster. The department code is not necessarily unique; it may also be indexed, in which case it would be considered a ''secondary key'', or ''alternate key''. If it is not indexed, the entire employee file would have to be scanned to produce a listing of all employees in a specific department. Keys are usually chosen to minimize the chance that one key value maps to multiple records. For example, the salary field would not normally be considered usable as a key since many employees will likely have the same salary.
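The employee example can be sketched with in-memory indexes in Python. The records and names are invented for illustration; a real file organization would keep the indexes in storage rather than in dictionaries.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:
    number: int       # primary key: unique for each record
    name: str
    department: str   # secondary key: may repeat
    salary: float

# Hypothetical sample data
employees = [
    Employee(1, "Ada", "R&D", 5200.0),
    Employee(2, "Grace", "R&D", 5200.0),
    Employee(3, "Alan", "Sales", 4800.0),
]

# Primary index: each key value maps to exactly one record.
by_number = {e.number: e for e in employees}

# Secondary index: each key value maps to the primary keys of all matching records.
by_department = defaultdict(list)
for e in employees:
    by_department[e.department].append(e.number)

# Without the secondary index, the whole collection must be scanned.
scanned = [e.number for e in employees if e.department == "R&D"]
```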