In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a data structure is a
data
Data ( , ) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted for ...
organization and storage format that is usually chosen for
efficient access
Access may refer to:
Companies and organizations
* ACCESS (Australia), an Australian youth network
* Access (credit card), a former credit card in the United Kingdom
* Access Co., a Japanese software company
* Access International Advisors, a hed ...
to data. More precisely, a data structure is a collection of data values, the relationships among them, and the
functions or
operations that can be applied to the data, i.e., it is an
algebraic structure
In mathematics, an algebraic structure or algebraic system consists of a nonempty set ''A'' (called the underlying set, carrier set or domain), a collection of operations on ''A'' (typically binary operations such as addition and multiplicatio ...
about
data
Data ( , ) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted for ...
.
Usage
Data structures serve as the basis for
abstract data types (ADT). The ADT defines the logical form of the data type. The data structure implements the physical form of the
data type
In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
.
Different types of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example,
relational databases commonly use
B-tree indexes for data retrieval, while
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
implementations usually use
hash tables to look up
identifiers.
Data structures provide a means to manage large amounts of data efficiently for uses such as large
database
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s and internet indexing services. Usually, efficient data structures are key to designing efficient
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
s. Some formal design methods and
programming language
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s emphasize data structures, rather than algorithms, as the key organizing factor in software design. Data structures can be used to organize the storage and retrieval of information stored in both
main memory and
secondary memory.
Implementation
Data structures can be implemented using a variety of programming languages and techniques, but they all share the common goal of efficiently organizing and storing data. Data structures are generally based on the ability of a
computer
A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
to fetch and store data at any place in its memory, specified by a
pointer—a
bit string, representing a
memory address, that can be itself stored in memory and manipulated by the program. Thus, the
array and
record data structures are based on computing the addresses of data items with
arithmetic operations
Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and Division (mathematics), division. In a wider sense, it also includes exponentiation, extraction of nth root, ...
, while the
linked data structures are based on storing addresses of data items within the structure itself. This approach to data structuring has profound implications for the efficiency and scalability of algorithms. For instance, the contiguous memory allocation in arrays facilitates rapid access and modification operations, leading to optimized performance in sequential data processing scenarios.
The implementation of a data structure usually requires writing a set of
procedures that create and manipulate instances of that structure. The efficiency of a data structure cannot be analyzed separately from those operations. This observation motivates the theoretical concept of an
abstract data type, a data structure that is defined indirectly by the operations that may be performed on it, and the mathematical properties of those operations (including their space and time cost).
Examples

There are numerous types of data structures, generally built upon simpler
primitive data types. Well known examples are:
* An
array is a number of elements in a specific order, typically all of the same type (depending on the language, individual elements may either all be forced to be the same type, or may be of almost any type). Elements are accessed using an integer index to specify which element is required. Typical implementations allocate contiguous memory words for the elements of arrays (but this is not always a necessity). Arrays may be fixed-length or resizable.
* A
linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
(also just called ''list'') is a linear collection of data elements of any type, called nodes, where each node has itself a value, and points to the next node in the linked list. The principal advantage of a linked list over an array is that values can always be efficiently inserted and removed without relocating the rest of the list. Certain other operations, such as
random access to a certain element, are however slower on lists than on arrays.
* A
record (also called ''tuple'' or ''struct'') is an
aggregate data structure. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called ''fields'' or ''members''. In the context of
object-oriented programming
Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impl ...
, records are known as
plain old data structures to distinguish them from objects.
*
Hash tables, also known as hash maps, are data structures that provide fast retrieval of values based on keys. They use a hashing function to map keys to indexes in an array, allowing for constant-time access in the average case. Hash tables are commonly used in dictionaries, caches, and database indexing. However, hash collisions can occur, which can impact their performance. Techniques like chaining and open addressing are employed to handle collisions.
*
Graphs are collections of nodes connected by edges, representing relationships between entities. Graphs can be used to model social networks, computer networks, and transportation networks, among other things. They consist of vertices (nodes) and edges (connections between nodes). Graphs can be directed or undirected, and they can have cycles or be acyclic. Graph traversal algorithms include breadth-first search and depth-first search.
*
Stacks and
queues are abstract data types that can be implemented using arrays or linked lists. A stack has two primary operations: push (adds an element to the top of the stack) and pop (removes the topmost element from the stack), that follow the Last In, First Out (LIFO) principle. Queues have two main operations: enqueue (adds an element to the rear of the queue) and dequeue (removes an element from the front of the queue) that follow the First In, First Out (FIFO) principle.
*
Trees represent a hierarchical organization of elements. A tree consists of nodes connected by edges, with one node being the root and all other nodes forming subtrees. Trees are widely used in various algorithms and data storage scenarios.
Binary tree
In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the ''left child'' and the ''right child''. That is, it is a ''k''-ary tree with . A recursive definition using set theor ...
s (particularly
heaps),
AVL trees, and
B-trees are some popular types of trees. They enable efficient and optimal searching, sorting, and hierarchical representation of data.
A
trie, or prefix tree, is a special type of tree used to efficiently retrieve strings. In a trie, each node represents a character of a string, and the edges between nodes represent the characters that connect them. This structure is especially useful for tasks like autocomplete, spell-checking, and creating dictionaries. Tries allow for quick searches and operations based on string prefixes.
Language support
Most
assembly language
In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
s and some
low-level languages, such as
BCPL (Basic Combined Programming Language), lack built-in support for data structures. On the other hand, many
high-level programming language
A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
s and some higher-level assembly languages, such as
MASM, have special syntax or other built-in support for certain data structures, such as records and arrays. For example, the
C (a direct descendant of BCPL) and
Pascal languages support
structs and records, respectively, in addition to vectors (one-dimensional
arrays
An array is a systematic arrangement of similar objects, usually in rows and columns.
Things called an array include:
{{TOC right
Music
* In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
) and multi-dimensional arrays.
Most programming languages feature some sort of
library
A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
mechanism that allows data structure implementations to be reused by different programs. Modern languages usually come with standard libraries that implement the most common data structures. Examples are the
C++ Standard Template Library, the
Java Collections Framework, and the
Microsoft
Microsoft Corporation is an American multinational corporation and technology company, technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became influential in the History of personal computers#The ear ...
.NET Framework.
Modern languages also generally support
modular programming
Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect or "concern" of the d ...
, the separation between the
interface of a library module and its implementation. Some provide
opaque data types that allow clients to hide implementation details.
Object-oriented programming languages, such as
C++,
Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, and
Smalltalk
Smalltalk is a purely object oriented programming language (OOP) that was originally created in the 1970s for educational use, specifically for constructionist learning, but later found use in business. It was created at Xerox PARC by Learni ...
, typically use
classes for this purpose.
Many known data structures have
concurrent versions which allow multiple computing threads to access a single concrete instance of a data structure simultaneously.
See also
*
Abstract data type
*
Concurrent data structure
*
Data model
*
Dynamization
*
Linked data structure
*
List of data structures
*
Persistent data structure
*
Plain old data structure
*
Queap
*
Succinct data structure
*
Tree (data structure)
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be co ...
References
Bibliography
* Peter Brass, ''Advanced Data Structures'',
Cambridge University Press
Cambridge University Press was the university press of the University of Cambridge. Granted a letters patent by King Henry VIII in 1534, it was the oldest university press in the world. Cambridge University Press merged with Cambridge Assessme ...
, 2008,
*
Donald Knuth, ''
The Art of Computer Programming'', vol. 1.
Addison-Wesley, 3rd edition, 1997,
* Dinesh Mehta and
Sartaj Sahni, ''Handbook of Data Structures and Applications'',
Chapman and Hall/
CRC Press
The CRC Press, LLC is an American publishing group that specializes in producing technical books. Many of their books relate to engineering, science and mathematics. Their scope also includes books on business, forensics and information technol ...
, 2004,
*
Niklaus Wirth
Niklaus Emil Wirth ( IPA: ) (15 February 1934 – 1 January 2024) was a Swiss computer scientist. He designed several programming languages, including Pascal, and pioneered several classic topics in software engineering. In 1984, he won the Tu ...
, ''Algorithms and Data Structures'',
Prentice Hall
Prentice Hall was a major American publishing#Textbook_publishing, educational publisher. It published print and digital content for the 6–12 and higher-education market. It was an independent company throughout the bulk of the twentieth cen ...
, 1985,
Further reading
Open Data Structures by Pat Morin*
G. H. Gonnet and
R. Baeza-Yates,
Handbook of Algorithms and Data Structures - in Pascal and C', second edition, Addison-Wesley, 1991,
*
Ellis Horowitz and Sartaj Sahni, ''Fundamentals of Data Structures in Pascal'',
Computer Science Press, 1984,
External links
Descriptionsfrom the
Dictionary of Algorithms and Data Structures
Data structures courseAn Examination of Data Structures from .NET perspectiveSchaffer, C. ''Data Structures and Algorithm Analysis''
{{DEFAULTSORT:Data Structure