
In computer science, a heap is a specialized tree-based data structure which is essentially an almost complete tree that satisfies the heap property: in a ''max heap'', for any given node C, if P is a parent node of C, then the ''key'' (the ''value'') of P is greater than or equal to the key of C. In a ''min heap'', the key of P is less than or equal to the key of C. The node at the "top" of the heap (with no parents) is called the ''root'' node.
The heap is one maximally efficient implementation of an abstract data type called a priority queue, and in fact, priority queues are often referred to as "heaps", regardless of how they may be implemented. In a heap, the highest (or lowest) priority element is always stored at the root. However, a heap is not a sorted structure; it can be regarded as being partially ordered. A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority, or when insertions need to be interspersed with removals of the root node.
A common implementation of a heap is the binary heap, in which the tree is a binary tree (see figure). The heap data structure, specifically the binary heap, was introduced by J. W. J. Williams in 1964, as a data structure for the heapsort sorting algorithm. Heaps are also crucial in several efficient graph algorithms such as Dijkstra's algorithm. When a heap is a complete binary tree, it has the smallest possible height: a heap with ''N'' nodes and ''a'' branches for each node always has log<sub>''a''</sub> ''N'' height.
Note that, as shown in the graphic, there is no implied ordering between siblings or cousins and no implied sequence for an in-order traversal (as there would be in, e.g., a binary search tree). The heap relation mentioned above applies only between nodes and their parents, grandparents, etc. The maximum number of children each node can have depends on the type of heap.
Operations
The common operations involving heaps are:
;Basic
* ''find-max'' (or ''find-min''): find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. ''peek'')
* ''insert'': adding a new key to the heap (a.k.a., ''push'')
* ''extract-max'' (or ''extract-min''): returns the node of maximum value from a max heap (or minimum value from a min heap) after removing it from the heap (a.k.a., ''pop'')
* ''delete-max'' (or ''delete-min''): removing the root node of a max heap (or min heap), respectively
* ''replace'': pop root and push a new key. This is more efficient than a pop followed by a push, since the heap only needs to be rebalanced once, not twice, and it is appropriate for fixed-size heaps.
;Creation
* ''create-heap'': create an empty heap
* ''heapify'': create a heap out of given array of elements
* ''merge'' (''union''): joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps.
* ''meld'': joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.
;Inspection
* ''size'': return the number of items in the heap.
* ''is-empty'': return true if the heap is empty, false otherwise.
;Internal
* ''increase-key'' or ''decrease-key'': updating a key within a max- or min-heap, respectively
* ''delete'': delete an arbitrary node (followed by moving last node and sifting to maintain heap)
* ''sift-up'': move a node up in the tree, as long as needed; used to restore heap condition after insertion. Called "sift" because node moves up the tree until it reaches the correct level, as in a sieve.
* ''sift-down'': move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.
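As a rough illustration, several of the operations listed above map onto functions in Python's standard `heapq` module, which works on a plain list treated as a min-heap. The variable names are chosen for this sketch only, and operations such as ''merge'' or ''decrease-key'' are not covered here.

```python
import heapq

heap = [5, 1, 8, 3]
heapq.heapify(heap)                    # heapify: rearrange the list into a min-heap in place

heapq.heappush(heap, 2)                # insert (push)
smallest = heap[0]                     # find-min (peek): the root is always at index 0
popped = heapq.heappop(heap)           # extract-min (pop): remove and return the root
old_root = heapq.heapreplace(heap, 7)  # replace: pop the root and push 7 in one step

size = len(heap)                       # size
is_empty = not heap                    # is-empty
```

Because `heapq` only provides min-heap behaviour, a max-heap is usually simulated by storing negated keys or by wrapping items in a type with reversed comparison.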
Implementation
Heaps are usually implemented with an array, as follows:
* Each element in the array represents a node of the heap, and
* The parent / child relationship is defined implicitly by the elements' indices in the array.

For a binary heap, in the array, the first index contains the root element. The next two indices of the array contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at zero-based index ''i'', its children are at indices 2''i'' + 1 and 2''i'' + 2, and its parent is at index ⌊(''i'' − 1)/2⌋. This simple indexing scheme makes it efficient to move "up" or "down" the tree.
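The implicit layout can be sketched with two small helpers. This assumes zero-based indexing with the root at index 0; the function names `parent` and `children` are illustrative and not part of any library.

```python
def parent(i):
    """Index of the parent of node i (valid for i > 0), zero-based layout."""
    return (i - 1) // 2


def children(i):
    """Indices of the left and right child of node i, zero-based layout.

    The caller must check the indices against the heap size before using them.
    """
    return 2 * i + 1, 2 * i + 2


# A min-heap stored level by level: 1 is the root, 3 and 2 are its children.
heap = [1, 3, 2, 7, 4, 5]
left, right = children(1)   # children of the node holding 3
```

With a one-based layout (root at index 1) the same relationships become 2''i'', 2''i'' + 1 and ⌊''i''/2⌋, which is why some implementations leave index 0 unused.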
Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.
After an element is inserted into or deleted from a heap, the heap property may be violated, and the heap must be re-balanced by swapping elements within the array.
Although different types of heaps implement the operations differently, the most common way is as follows:
* Insertion: Add the new element at the end of the heap, in the first available free space. If this will violate the heap property, sift up the new element (''swim'' operation) until the heap property has been reestablished.
* Extraction: Remove the root and insert the last element of the heap in the root. If this will violate the heap property, sift down the new root (''sink'' operation) to reestablish the heap property.
* Replacement: Remove the root and put the ''new'' element in the root and sift down. When compared to extraction followed by insertion, this avoids a sift up step.
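The three steps above can be sketched as a small array-backed binary min-heap. This is a minimal illustration under the assumption of zero-based indexing; the class name `MinHeap` and its method names are hypothetical and not taken from any particular library.

```python
class MinHeap:
    """Array-backed binary min-heap (illustrative sketch)."""

    def __init__(self):
        self._a = []

    def __len__(self):
        return len(self._a)

    def _sift_up(self, i):
        # Swap the node with its parent while it is smaller than the parent.
        a = self._a
        while i > 0:
            p = (i - 1) // 2
            if a[p] <= a[i]:
                break
            a[p], a[i] = a[i], a[p]
            i = p

    def _sift_down(self, i):
        # Swap the node with its smaller child while a child is smaller.
        a = self._a
        n = len(a)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and a[c] < a[smallest]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

    def insert(self, key):
        # Insertion: append at the end, then sift up.
        self._a.append(key)
        self._sift_up(len(self._a) - 1)

    def extract_min(self):
        # Extraction: move the last element to the root, then sift down.
        a = self._a
        if not a:
            raise IndexError("extract from empty heap")
        root = a[0]
        last = a.pop()
        if a:
            a[0] = last
            self._sift_down(0)
        return root

    def replace(self, key):
        # Replacement: overwrite the root with the new key, then sift down once.
        a = self._a
        if not a:
            raise IndexError("replace on empty heap")
        root, a[0] = a[0], key
        self._sift_down(0)
        return root
```

A max-heap follows the same pattern with the comparisons reversed.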
Construction of a binary (or ''d''-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2''N'' − 2''s''<sub>2</sub>(''N'') − ''e''<sub>2</sub>(''N'') (for a binary heap), where ''s''<sub>2</sub>(''N'') is the sum of all digits of the binary representation of ''N'' and ''e''<sub>2</sub>(''N'') is the exponent of 2 in the prime factorization of ''N''. This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear.
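A minimal sketch of bottom-up heap construction, the approach usually attributed to Floyd, is shown below for a binary min-heap with zero-based indexing. The function names are illustrative, and the sketch makes no attempt to match the exact comparison count quoted above.

```python
def sift_down(a, i, n):
    """Move a[i] down within a[0:n] until neither child is smaller."""
    while True:
        smallest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and a[c] < a[smallest]:
                smallest = c
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def build_min_heap(a):
    """Rearrange list a into a min-heap in place.

    Leaves are already valid one-element heaps, so only the internal nodes
    are sifted down, starting from the last one and working back to the root.
    """
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


def is_min_heap(a):
    """Check the heap property between every node and its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


data = [9, 4, 7, 1, 8, 2, 6, 3, 5]
build_min_heap(data)
```

The linear bound comes from the fact that most nodes sit near the bottom of the tree and can only move down a few levels, whereas repeated insertion may sift each new element up the full height.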
Variants
* 2–3 heap
* B-heap
* Beap
* Binary heap
* Binomial heap
* Brodal queue
* ''d''-ary heap
* Fibonacci heap
* K-D Heap
* Leaf heap
* Leftist heap
* Pairing heap
* Radix heap
* Randomized meldable heap
* Skew heap
* Soft heap
* Ternary heap
* Treap
* Weak heap
Comparison of theoretic bounds for variants
Applications
The heap data structure has many applications.
* Heapsort: One of the best sorting methods, being in-place and with no quadratic worst-case scenarios.
* Selection algorithms: A heap allows access to the min or max element in constant time, and other selections (such as median or kth-element) can be done in sub-linear time on data that is in a heap.
* Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by polynomial order. Examples of such problems are Prim's minimal-spanning-tree algorithm and Dijkstra's shortest-path algorithm.
* Priority queue: A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods.
* K-way merge: A heap data structure is useful to merge many already-sorted input streams into a single sorted output stream. Examples of the need for merging include external sorting and streaming results from distributed data such as a log structured merge tree. The inner loop obtains the min element, replaces it with the next element from the corresponding input stream, and then does a sift-down heap operation (alternatively, it uses the replace function). Using the extract-max and insert functions of a priority queue is much less efficient.
* Order statistics: The heap data structure can be used to efficiently find the kth smallest (or largest) element in an array.
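Two of the applications above, k-way merging and finding the kth smallest element, can be sketched with Python's standard `heapq` module. The function names `k_way_merge` and `kth_smallest` and the `_DONE` sentinel are hypothetical helpers written for this illustration, and `kth_smallest` assumes numeric items because it negates keys to simulate a max-heap.

```python
import heapq

_DONE = object()  # sentinel marking an exhausted input stream


def k_way_merge(streams):
    """Merge already-sorted iterables into one sorted list."""
    iterators = [iter(s) for s in streams]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, idx))
    heapq.heapify(heap)

    out = []
    while heap:
        value, idx = heap[0]                     # current minimum across all streams
        out.append(value)
        nxt = next(iterators[idx], _DONE)
        if nxt is _DONE:
            heapq.heappop(heap)                  # stream exhausted: shrink the heap
        else:
            heapq.heapreplace(heap, (nxt, idx))  # replace the root, one sift-down
    return out


def kth_smallest(items, k):
    """Return the kth smallest item (k is 1-based) using a heap of size k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    heap = []  # negated keys, so the root is the largest of the k smallest seen
    for x in items:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif -x > heap[0]:
            heapq.heapreplace(heap, -x)
    if len(heap) < k:
        raise ValueError("fewer than k items")
    return -heap[0]
```

The merge keeps one entry per input stream, so the heap never grows beyond the number of streams; the standard library also offers `heapq.merge` and `heapq.nsmallest` for the same tasks.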
Programming language implementations
* The C++ Standard Library provides algorithms for heaps (usually implemented as binary heaps), which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion. It also provides a container adaptor, which wraps these facilities in a container-like class. However, there is no standard support for the replace, sift-up/sift-down, or decrease/increase-key operations.
* The Boost C++ libraries include a heaps library. Unlike the STL, it supports decrease and increase operations, and supports additional types of heap: specifically, ''d''-ary, binomial, Fibonacci, pairing and skew heaps.
* There is a generic heap implementation for C and C++ with ''d''-ary heap and B-heap support. It provides an STL-like API.
* The standard library of the D programming language includes a heap that is implemented in terms of D's ranges. It allows iteration with D's built-in statements and integrates with D's range-based API.
* For Haskell there is a heap module.
* The Java platform (since version 1.5) provides a binary heap implementation as a class in the Java Collections Framework. This class implements a min-heap by default; to implement a max-heap, the programmer should write a custom comparator. There is no support for the replace, sift-up/sift-down, or decrease/increase-key operations.
* Python has a module that implements a priority queue using a binary heap. The library exposes a heapreplace function to support k-way merging.
* PHP has both max-heap and min-heap classes as of version 5.3 in the Standard PHP Library.
* Perl has implementations of binary, binomial, and Fibonacci heaps in a distribution available on CPAN.
* The Go language contains a package with heap algorithms that operate on an arbitrary type that satisfies a given interface. That package does not support the replace, sift-up/sift-down, or decrease/increase-key operations.
* Apple's Core Foundation library contains a heap structure.
* Pharo has an implementation of a heap in the Collections-Sequenceable package along with a set of test cases. A heap is used in the implementation of the timer event loop.
* The Rust programming language has a binary max-heap implementation in its standard library.
* .NET has a PriorityQueue class which uses a quaternary (''d''-ary) min-heap implementation. It is available from .NET 6.
See also
* Sorting algorithm
* Search data structure
* Stack (abstract data type)
* Queue (abstract data type)
* Tree (data structure)
* Treap, a form of binary search tree based on heap-ordered trees
External links
* Heap at Wolfram MathWorld
* Explanation of how the basic heap algorithms work