In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. An algorithm must be analyzed to determine its resource usage, and the efficiency of an algorithm can be measured based on the usage of different resources. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.
For maximum efficiency it is desirable to minimize resource usage. However, different resources such as time and space complexity cannot be compared directly, so which of two algorithms is considered to be more efficient often depends on which measure of efficiency is considered most important.
For example, bubble sort and timsort are both algorithms to sort a list of items from smallest to largest. Bubble sort sorts the list in time proportional to the number of elements squared (O(n²), see Big O notation), but only requires a small amount of extra memory which is constant with respect to the length of the list (O(1)). Timsort sorts the list in time linearithmic (proportional to a quantity times its logarithm) in the list's length (O(n log n)), but has a space requirement linear in the length of the list (O(n)). If large lists must be sorted at high speed for a given application, timsort is a better choice; however, if minimizing the memory footprint of the sorting is more important, bubble sort is a better choice.
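To make the trade-off concrete, here is a minimal sketch in Python (whose built-in sort is Timsort, as noted above); the bubble sort shown works in place with O(1) auxiliary memory, while sorted() builds a new list:

def bubble_sort(items):
    """Sort a list in place: O(n^2) comparisons, O(1) extra memory."""
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # no swaps in a full pass: list is already sorted
            break

data = [5, 2, 9, 1, 3]
bubble_sort(data)               # sorts in place, constant extra space
print(data)                     # [1, 2, 3, 5, 9]
print(sorted([5, 2, 9, 1, 3]))  # Timsort: O(n log n) time, O(n) extra space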
Background
The importance of efficiency with respect to time was emphasised by Ada Lovelace in 1843 as applied to Charles Babbage's mechanical analytical engine:
"In almost every computation a great variety of arrangements for the succession of the processes is possible, and various considerations must influence the selections amongst them for the purposes of a calculating engine. One essential object is to choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the calculation"
Early electronic computers had both limited speed and limited random access memory. Therefore, a space–time trade-off occurred. A task could use a fast algorithm using a lot of memory, or it could use a slow algorithm using little memory. The engineering trade-off was then to use the fastest algorithm that could fit in the available memory.
Modern computers are significantly faster than the early computers, and have a much larger amount of memory available (gigabytes instead of kilobytes). Nevertheless, Donald Knuth emphasised that efficiency is still an important consideration:
"In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal and I believe the same viewpoint should prevail in software engineering"
Overview
An algorithm is considered efficient if its resource consumption, also known as computational cost, is at or below some acceptable level. Roughly speaking, 'acceptable' means: it will run in a reasonable amount of time or space on an available computer, typically as a function of the size of the input. Since the 1950s computers have seen dramatic increases in both the available computational power and in the available amount of memory, so current acceptable levels would have been unacceptable even 10 years ago. In fact, thanks to the approximate doubling of computer power every 2 years, tasks that are acceptably efficient on modern smartphones and embedded systems may have been unacceptably inefficient for industrial servers 10 years ago.
Computer manufacturers frequently bring out new models, often with higher performance. Software costs can be quite high, so in some cases the simplest and cheapest way of getting higher performance might be to just buy a faster computer, provided it is compatible with an existing computer.
There are many ways in which the resources used by an algorithm can be measured: the two most common measures are speed and memory usage; other measures could include transmission speed, temporary disk usage, long-term disk usage, power consumption, total cost of ownership, response time to external stimuli, etc. Many of these measures depend on the size of the input to the algorithm, i.e. the amount of data to be processed. They might also depend on the way in which the data is arranged; for example, some sorting algorithms perform poorly on data which is already sorted, or which is sorted in reverse order.
In practice, there are other factors which can affect the efficiency of an algorithm, such as requirements for accuracy and/or reliability. As detailed below, the way in which an algorithm is implemented can also have a significant effect on actual efficiency, though many aspects of this relate to optimization issues.
Theoretical analysis
In the theoretical analysis of algorithms, the normal practice is to estimate their complexity in the asymptotic sense. The most commonly used notation to describe resource consumption or "complexity" is Donald Knuth's Big O notation, representing the complexity of an algorithm as a function of the size of the input n. Big O notation is an asymptotic measure of function complexity, where O(f(n)) roughly means the time requirement for an algorithm is proportional to f(n), omitting lower-order terms that contribute less than f(n) to the growth of the function as n grows arbitrarily large. This estimate may be misleading when n is small, but is generally sufficiently accurate when n is large as the notation is asymptotic. For example, bubble sort may be faster than merge sort when only a few items are to be sorted; however either implementation is likely to meet performance requirements for a small list. Typically, programmers are interested in algorithms that scale efficiently to large input sizes, and merge sort is preferred over bubble sort for lists of length encountered in most data-intensive programs.
Some examples of Big O notation applied to algorithms' asymptotic time complexity include:
* O(1): constant time, e.g. looking up an element of an array by its index
* O(log n): logarithmic time, e.g. binary search of a sorted array
* O(n): linear time, e.g. finding an item in an unsorted list
* O(n log n): linearithmic time, e.g. merge sort or heapsort
* O(n²): quadratic time, e.g. bubble sort or insertion sort
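As a rough illustration of why these growth rates matter, the sketch below (illustrative, not from the article) counts the comparisons a simple bubble sort performs and sets them against the n log n bound:

import math
import random

def bubble_sort_comparisons(items):
    """Return the number of comparisons a plain bubble sort performs."""
    items = list(items)
    count = 0
    for i in range(len(items) - 1):
        for j in range(len(items) - 1 - i):
            count += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return count

for n in (100, 1_000, 5_000):   # the largest case takes a few seconds in pure Python
    data = [random.random() for _ in range(n)]
    quad = bubble_sort_comparisons(data)   # grows like n^2
    linearithmic = n * math.log2(n)        # grows like n log n
    print(f"n={n:>5}: bubble sort {quad:>12,} comparisons vs n log n {linearithmic:>10,.0f}")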
Benchmarking: measuring performance
For new versions of software or to provide comparisons with competitive systems, benchmarks are sometimes used, which assist with gauging an algorithm's relative performance. If a new sort algorithm is produced, for example, it can be compared with its predecessors to ensure that at least it is as efficient as before with known data, taking into consideration any functional improvements. Benchmarks can be used by customers when comparing various products from alternative suppliers to estimate which product will best suit their specific requirements in terms of functionality and performance. For example, in the mainframe world certain proprietary sort products from independent software companies such as Syncsort compete with products from the major suppliers such as IBM for speed.
Some benchmarks provide opportunities for producing an analysis comparing the relative speed of various compiled and interpreted languages; for example, The Computer Language Benchmarks Game compares the performance of implementations of typical programming problems in several programming languages.
Even creating "
do it yourself
"Do it yourself" ("DIY") is the method of building, modifying, or repairing things by oneself without the direct aid of professionals or certified experts. Academic research has described DIY as behaviors where "individuals use raw and se ...
" benchmarks can demonstrate the relative performance of different programming languages, using a variety of user specified criteria. This is quite simple, as a "Nine language performance roundup" by Christopher W. Cowell-Shah demonstrates by example.
Implementation concerns
Implementation issues can also have an effect on efficiency, such as the choice of programming language, or the way in which the algorithm is actually coded, or the choice of a compiler for a particular language, or the compilation options used, or even the operating system being used. In many cases a language implemented by an interpreter may be much slower than a language implemented by a compiler. See the articles on just-in-time compilation and interpreted languages.
There are other factors which may affect time or space issues, but which may be outside of a programmer's control; these include data alignment, data granularity, cache locality, cache coherency, garbage collection, instruction-level parallelism, multi-threading (at either a hardware or software level), simultaneous multitasking, and subroutine calls. [Guy Lewis Steele, Jr., "Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO", MIT AI Lab, AI Lab Memo AIM-443, October 1977.]
Some processors have capabilities for vector processing, which allow a single instruction to operate on multiple operands; it may or may not be easy for a programmer or compiler to use these capabilities. Algorithms designed for sequential processing may need to be completely redesigned to make use of parallel processing, or they could be easily reconfigured. As parallel and distributed computing grow in importance in the late 2010s, more investments are being made into efficient high-level APIs for parallel and distributed computing systems such as CUDA, TensorFlow, Hadoop, OpenMP and MPI.
Another problem which can arise in programming is that processors compatible with the same instruction set (such as x86-64 or ARM) may implement an instruction in different ways, so that instructions which are relatively fast on some models may be relatively slow on other models. This often presents challenges to optimizing compilers, which must have a great amount of knowledge of the specific CPU and other hardware available on the compilation target to best optimize a program for performance. In the extreme case, a compiler may be forced to emulate instructions not supported on a compilation target platform, forcing it to generate code or link an external library call to produce a result that is otherwise incomputable on that platform, even if it is natively supported and more efficient in hardware on other platforms. This is often the case in embedded systems with respect to floating-point arithmetic, where small and low-power microcontrollers often lack hardware support for floating-point arithmetic and thus require computationally expensive software routines to produce floating point calculations.
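One common software alternative on such hardware is fixed-point arithmetic, which represents fractional values using only integer operations. A minimal illustrative sketch (the Q16.16 format and helper names are assumptions, not from the article):

# Q16.16 fixed point: value = raw / 2**16, arithmetic uses only integers
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    return int(round(x * ONE))

def fixed_mul(a: int, b: int) -> int:
    # integer multiply, then shift right to drop the extra fractional bits
    return (a * b) >> FRAC_BITS

a, b = to_fixed(3.25), to_fixed(0.5)
print(fixed_mul(a, b) / ONE)   # 1.625, computed without a hardware float multiply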
Measures of resource usage
Measures are normally expressed as a function of the size of the input n.
The two most common measures are:
* ''Time'': how long does the algorithm take to complete?
* ''Space'': how much working memory (typically RAM) is needed by the algorithm? This has two aspects: the amount of memory needed by the code (auxiliary space usage), and the amount of memory needed for the data on which the code operates (intrinsic space usage).
For computers whose power is supplied by a battery (e.g. laptops and smartphones), or for very long/large calculations (e.g. supercomputers), other measures of interest are:
* ''Direct power consumption'': power needed directly to operate the computer.
* ''Indirect power consumption'': power needed for cooling, lighting, etc.
Power consumption is growing in importance as a metric for computational tasks of all types and at all scales ranging from embedded Internet of things devices to system-on-chip devices to server farms. This trend is often referred to as green computing.
Less common measures of computational efficiency may also be relevant in some cases:
*''Transmission size'': bandwidth could be a limiting factor. Data compression can be used to reduce the amount of data to be transmitted (see the sketch after this list). Displaying a picture or image (e.g. the Google logo) can result in transmitting tens of thousands of bytes (48K in this case) compared with transmitting six bytes for the text "Google". This is important for I/O bound computing tasks.
*''External space'': space needed on a disk or other external memory device; this could be for temporary storage while the algorithm is being carried out, or it could be long-term storage needed to be carried forward for future reference.
*''Response time'' (latency): this is particularly relevant in a real-time application when the computer system must respond quickly to some external event.
*''Total cost of ownership'': particularly if a computer is dedicated to one particular algorithm.
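As an illustration of the transmission-size measure above, a small sketch using Python's standard zlib module; the sample text is artificial, and real compression ratios depend on the data:

import zlib

text = ("algorithmic efficiency " * 200).encode("utf-8")
compressed = zlib.compress(text, level=9)   # maximum compression effort
print(len(text), "bytes raw ->", len(compressed), "bytes compressed")
# repetitive data compresses well, reducing transmission size and time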
Time
Theory
Analyze the algorithm, typically using time complexity analysis to get an estimate of the running time as a function of the size of the input data. The result is normally expressed using Big O notation. This is useful for comparing algorithms, especially when a large amount of data is to be processed. More detailed estimates are needed to compare algorithm performance when the amount of data is small, although this is likely to be of less importance. Algorithms which include parallel processing may be more difficult to analyze.
Practice
Use a benchmark to time the use of an algorithm. Many programming languages have an available function which provides CPU time usage. For long-running algorithms the elapsed time could also be of interest. Results should generally be averaged over several tests.
Run-based profiling can be very sensitive to hardware configuration and the possibility of other programs or tasks running at the same time in a multi-processing and multi-programming environment.
This sort of test also depends heavily on the selection of a particular programming language, compiler, and compiler options, so algorithms being compared must all be implemented under the same conditions.
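In Python, for example, the standard library exposes both CPU time and elapsed (wall-clock) time; a minimal sketch measuring the two side by side (the loop is placeholder busy work):

import time

cpu_start, wall_start = time.process_time(), time.perf_counter()
total = 0
for i in range(5_000_000):          # placeholder busy work
    total += i * i
cpu = time.process_time() - cpu_start      # CPU time consumed by this process
wall = time.perf_counter() - wall_start    # elapsed wall-clock time
print(f"CPU {cpu:.3f} s, elapsed {wall:.3f} s")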
Space
This section is concerned with use of memory resources (registers, cache, RAM, virtual memory, secondary memory) while the algorithm is being executed. As for time analysis above, analyze the algorithm, typically using space complexity analysis to get an estimate of the run-time memory needed as a function of the size of the input data. The result is normally expressed using Big O notation.
There are up to four aspects of memory usage to consider:
* The amount of memory needed to hold the code for the algorithm.
* The amount of memory needed for the input data.
* The amount of memory needed for any output data.
**Some algorithms, such as sorting, often rearrange the input data and don't need any additional space for output data. This property is referred to as "in-place" operation (see the sketch after this list).
* The amount of memory needed as working space during the calculation.
**This includes local variables and any stack space needed by routines called during a calculation; this stack space can be significant for algorithms which use recursive techniques.
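These working-space aspects can be observed directly. A minimal Python sketch using the standard tracemalloc module, contrasting an in-place sort with one that allocates a new output list (exact numbers vary by interpreter version):

import random
import tracemalloc

def peak_bytes(fn):
    """Run fn() and return the peak memory traced during the call."""
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

data = [random.random() for _ in range(100_000)]
print("sorted() copy:", peak_bytes(lambda: sorted(data)), "bytes")  # allocates a new list
print("list.sort()  :", peak_bytes(lambda: data.sort()), "bytes")   # in place, smaller buffer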
Early electronic computers, and early home computers, had relatively small amounts of working memory. For example, the 1949 Electronic Delay Storage Automatic Calculator (EDSAC) had a maximum working memory of 1024 17-bit words, while the 1980 Sinclair ZX80 came initially with 1024 8-bit bytes of working memory. In the late 2010s, it is typical for personal computers to have between 4 and 32 GB of RAM, an increase of over 300 million times as much memory.
Caching and memory hierarchy
Current computers can have relatively large amounts of memory (possibly gigabytes), so having to squeeze an algorithm into a confined amount of memory is much less of a problem than it used to be. But the presence of four different categories of memory can be significant:
* Processor registers, the fastest of computer memory technologies with the least amount of storage space. Most direct computation on modern computers occurs with source and destination operands in registers before being updated to the cache, main memory and virtual memory if needed. On a processor core, there are typically on the order of hundreds of bytes or fewer of register availability, although a register file may contain more physical registers than architectural registers defined in the instruction set architecture.
* Cache memory is the second fastest and second smallest memory available in the memory hierarchy. Caches are present in CPUs, GPUs, hard disk drives and external peripherals, and are typically implemented in static RAM. Memory caches are multi-leveled; lower levels are larger, slower and typically shared between processor cores in multi-core processors. In order to process operands in cache memory, a processing unit must fetch the data from the cache, perform the operation in registers and write the data back to the cache. This operates at speeds comparable (about 2-10 times slower) with the CPU or GPU's arithmetic logic unit or floating-point unit if in the L1 cache. It is about 10 times slower if there is an L1 cache miss and it must be retrieved from and written to the L2 cache, and a further 10 times slower if there is an L2 cache miss and it must be retrieved from an L3 cache, if present.
* Main physical memory is most often implemented in dynamic RAM (DRAM). The main memory is much larger (typically gigabytes compared to ≈8 megabytes) than an L3 CPU cache, with read and write latencies typically 10-100 times slower. RAM is increasingly implemented on-chip with processors, as CPU or GPU memory.
* Virtual memory is most often implemented in terms of secondary storage such as a hard disk, and is an extension to the memory hierarchy that has much larger storage space but much larger latency, typically around 1000 times slower than a cache miss for a value in RAM. While originally motivated to create the impression of higher amounts of memory being available than were truly available, virtual memory is more important in contemporary usage for its time-space tradeoff and enabling the usage of virtual machines. Cache misses from main memory are called page faults, and incur huge performance penalties on programs.
An algorithm whose memory needs will fit in cache memory will be much faster than an algorithm which fits in main memory, which in turn will be very much faster than an algorithm which has to resort to virtual memory. Because of this, cache replacement policies are extremely important to high-performance computing, as are cache-aware programming and data alignment. To further complicate the issue, some systems have up to three levels of cache memory, with varying effective speeds. Different systems will have different amounts of these various types of memory, so the effect of algorithm memory needs can vary greatly from one system to another.
In the early days of electronic computing, if an algorithm and its data wouldn't fit in main memory then the algorithm couldn't be used. Nowadays the use of virtual memory appears to provide much memory, but at the cost of performance. If an algorithm and its data will fit in cache memory, then very high speed can be obtained; in this case minimizing space will also help minimize time. This is called the principle of locality, and can be subdivided into locality of reference, spatial locality and temporal locality. An algorithm which will not fit completely in cache memory but which exhibits locality of reference may perform reasonably well.
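Locality effects can be glimpsed even from a high-level language. A rough Python sketch comparing a sequential (row-major) sweep of a contiguous buffer with a strided (column-major) sweep; interpreter overhead shrinks the gap relative to compiled code, but the strided sweep is still measurably slower:

import array
import time

ROWS = COLS = 3000
grid = array.array("d", [1.0]) * (ROWS * COLS)   # one contiguous buffer of doubles

def sweep(row_major):
    start = time.perf_counter()
    s = 0.0
    for a in range(ROWS):
        for b in range(COLS):
            # row-major steps through adjacent elements; column-major jumps COLS apart
            idx = a * COLS + b if row_major else b * COLS + a
            s += grid[idx]
    return time.perf_counter() - start

print(f"row-major    {sweep(True):.2f} s  (sequential, cache-friendly)")
print(f"column-major {sweep(False):.2f} s  (strided, poor spatial locality)")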
Criticism of the current state of programming
* David May FRS, a British computer scientist and currently Professor of Computer Science at University of Bristol and founder and CTO of XMOS Semiconductor, believes one of the problems is that there is a reliance on Moore's law to solve inefficiencies. He has advanced an 'alternative' to Moore's law (May's law) stated as follows:
Software efficiency halves every 18 months, compensating Moore's Law
May goes on to state:
In ubiquitous systems, halving the instructions executed can double the battery life and big data sets bring big opportunities for better software and algorithms: Reducing the number of operations from N × N to N × log(N) has a dramatic effect when N is large ... for N = 30 billion, this change is as good as 50 years of technology improvements.
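A rough check of that last claim, under the assumptions that "technology improvements" means Moore's-law doubling every two years and that the logarithm is base 2:

import math

N = 30e9
speedup = (N * N) / (N * math.log2(N))   # operation-count ratio, N^2 vs N log N
moore_50_years = 2 ** (50 / 2)           # 25 doublings in 50 years
print(f"algorithmic speedup ~{speedup:.2e}, 50 years of Moore's law ~{moore_50_years:.2e}")
# roughly 8.6e8 versus 3.4e7: the algorithmic change comes out even larger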
* Software author Adam N. Rosenburg in his blog "''The failure of the Digital computer''" has described the current state of programming as nearing the "Software event horizon" (alluding to the fictitious "''shoe event horizon''" described by Douglas Adams in his ''Hitchhiker's Guide to the Galaxy'' book). He estimates there has been a 70 dB factor loss of productivity, or "99.99999 percent, of its ability to deliver the goods", since the 1980s: "When Arthur C. Clarke compared the reality of computing in 2001 to the computer HAL 9000 in his book ''2001: A Space Odyssey'', he pointed out how wonderfully small and powerful computers were but how disappointing computer programming had become."
Competitions for the best algorithms
The following competitions invite entries for the best algorithms based on some arbitrary criteria decided by the judges:
* ''Wired'' magazine
See also
* Analysis of algorithms
In computer science, the analysis of algorithms is the process of finding the computational complexity of algorithms—the amount of time, storage, or other resources needed to execute them. Usually, this involves determining a function that r ...
—how to determine the resources needed by an algorithm
* Arithmetic coding
Arithmetic coding (AC) is a form of entropy encoding used in lossless data compression. Normally, a string of characters is represented using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic ...
—a form of variable-length entropy encoding
In information theory, an entropy coding (or entropy encoding) is any lossless data compression method that attempts to approach the lower bound declared by Shannon's source coding theorem, which states that any lossless data compression method ...
for efficient data compression
* Associative array
In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathematical terms an ...
—a data structure that can be made more efficient using Patricia tree
In computer science, a radix tree (also radix trie or compact prefix tree or compressed trie) is a data structure that represents a space-optimized trie (prefix tree) in which each node that is the only child is merged with its parent. The res ...
s or Judy array
Judy is a short form of the name Judith.
Judy may refer to:
Places
* Judy, Kentucky, village in Montgomery County, United States
* Judy Woods, woodlands in Bradford, West Yorkshire, England, United Kingdom
Animals
* Judy (dog) (1936–1950), ...
s
* Benchmark
Benchmark may refer to:
Business and economics
* Benchmarking, evaluating performance within organizations
* Benchmark price
* Benchmark (crude oil), oil-specific practices
Science and technology
* Benchmark (surveying), a point of known elevati ...
—a method for measuring comparative execution times in defined cases
* Best, worst and average case
In computer science, best, worst, and average cases of a given algorithm express what the resource usage is ''at least'', ''at most'' and ''on average'', respectively. Usually the resource being considered is running time, i.e. time complexity, b ...
—considerations for estimating execution times in three scenarios
* Binary search algorithm
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the ...
—a simple and efficient technique for searching sorted arrays
* Branch table—a technique for reducing instruction path-length and the size of machine code (and often also memory); a dispatch-table sketch follows this list
* Comparison of programming paradigms—paradigm-specific performance considerations
* Compiler optimization—compiler-derived optimization
* Computational complexity of mathematical operations
* Computational complexity theory
* Computer performance—computer hardware metrics
* Data compression—reducing transmission bandwidth and disk storage
* Database index—a data structure that improves the speed of data retrieval operations on a database table
* Entropy encoding—encoding data efficiently using frequency of occurrence of strings as a criterion for substitution
* Garbage collection—automatic freeing of memory after use
* Green computing—a move to implement 'greener' technologies, consuming fewer resources
* Huffman algorithm—an algorithm for efficient data encoding
* Locality of reference—for avoidance of caching delays caused by non-local memory access (see the traversal-order sketch after this list)
* Loop optimization
* Memory management
* Optimization (computer science)
* Performance analysis—methods of measuring actual performance of an algorithm at run-time
* Real-time computing—further examples of time-critical applications
* Run-time analysis—estimation of expected run-times and an algorithm's scalability
* Simultaneous multithreading
* Speculative execution or Eager execution
* Branch prediction
* Super-threading
** Hyper-threading
* Threaded code—similar to virtual method table or branch table
* Virtual method table—branch table with dynamically assigned pointers for dispatching
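To make the benchmark entry concrete, here is a minimal timing sketch in Python using the standard library's timeit module. The two functions being compared are arbitrary stand-ins for any pair of candidate implementations; nothing about them comes from this article.

    import timeit

    def loop_version():
        # Build a list of squares with an explicit loop.
        result = []
        for i in range(1000):
            result.append(i * i)
        return result

    def comprehension_version():
        # Same computation as a list comprehension.
        return [i * i for i in range(1000)]

    # timeit accepts a zero-argument callable: it calls it `number` times
    # and returns total elapsed seconds, damping out timer resolution and
    # transient system noise.
    print("loop:         ", timeit.timeit(loop_version, number=1000))
    print("comprehension:", timeit.timeit(comprehension_version, number=1000))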
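The best, worst and average case entry can be grounded with the simplest possible example, linear search, where the three scenarios are easy to read off. The function name and data below are purely illustrative.

    def linear_search(arr, target):
        """Return the index of target in arr, or -1 if absent."""
        for i, x in enumerate(arr):
            if x == target:       # best case: found at index 0, one comparison
                return i
        return -1                 # worst case: absent, len(arr) comparisons

    data = [7, 3, 9, 1]
    print(linear_search(data, 7))   # best case  -> 0
    print(linear_search(data, 4))   # worst case -> -1
    # Average case: a present target at a uniformly random position costs
    # about len(arr) / 2 comparisons.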
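For the binary search entry, a minimal sketch of the half-interval technique follows. Python's standard library already provides this as bisect.bisect_left; the hand-written version is for illustration only.

    def binary_search(arr, target):
        """Return the index of target in the sorted list arr, or -1 if absent."""
        lo, hi = 0, len(arr) - 1
        while lo <= hi:
            mid = (lo + hi) // 2      # halve the interval each step: O(log n)
            if arr[mid] == target:
                return mid
            elif arr[mid] < target:
                lo = mid + 1          # target can only lie in the upper half
            else:
                hi = mid - 1          # target can only lie in the lower half
        return -1

    # The input must already be sorted.
    print(binary_search([2, 3, 5, 7, 11, 13], 11))   # -> 4
    print(binary_search([2, 3, 5, 7, 11, 13], 6))    # -> -1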
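The branch table and virtual method table entries both describe constant-time dispatch through a table of code addresses. The closest high-level analogue is a table of function references, sketched below; this is an analogy rather than machine-level code, and every name in it is invented for the example.

    # A dict of function references plays the role of a branch/jump table:
    # one indexed lookup replaces a chain of if/elif comparisons, so the
    # cost of dispatch stays constant as the number of cases grows.
    def handle_add(a, b):
        return a + b

    def handle_sub(a, b):
        return a - b

    def handle_mul(a, b):
        return a * b

    DISPATCH = {"add": handle_add, "sub": handle_sub, "mul": handle_mul}

    def execute(opcode, a, b):
        return DISPATCH[opcode](a, b)   # O(1) regardless of table size

    print(execute("mul", 6, 7))   # -> 42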
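For the locality of reference entry, traversal order over a two-dimensional array illustrates the idea: row-major order visits neighboring elements together, while column-major order jumps to a different row on every step. The sketch below is only suggestive in CPython, where rows are separate heap objects and interpreter overhead dominates, so the measured gap is far smaller than in a compiled language; the sizes are arbitrary.

    import time

    N = 1500
    grid = [[1] * N for _ in range(N)]

    def sum_row_major():
        # Inner loop walks consecutive elements of a single row.
        return sum(grid[i][j] for i in range(N) for j in range(N))

    def sum_col_major():
        # Inner loop hops to a different row on every step: poor locality.
        return sum(grid[i][j] for j in range(N) for i in range(N))

    for fn in (sum_row_major, sum_col_major):
        start = time.perf_counter()
        fn()
        print(fn.__name__, round(time.perf_counter() - start, 3), "seconds")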