In
computer programming
Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...
, a function (also procedure, method, subroutine, routine, or subprogram) is a callable unit of
software logic that has a well-defined
interface and
behavior
Behavior (American English) or behaviour (British English) is the range of actions of Individual, individuals, organisms, systems or Artificial intelligence, artificial entities in some environment. These systems can include other systems or or ...
and can be invoked multiple times.
Callable units provide a powerful programming tool.
The primary purpose is to allow for the decomposition of a large and/or complicated problem into chunks that have relatively low
cognitive load and to assign the chunks meaningful names (unless they are anonymous). Judicious application can reduce the cost of developing and maintaining software, while increasing its quality and reliability.
Callable units are present at multiple levels of
abstraction
Abstraction is a process where general rules and concepts are derived from the use and classifying of specific examples, literal (reality, real or Abstract and concrete, concrete) signifiers, first principles, or other methods.
"An abstraction" ...
in the programming environment. For example, a
programmer
A programmer, computer programmer or coder is an author of computer source code someone with skill in computer programming.
The professional titles Software development, ''software developer'' and Software engineering, ''software engineer' ...
may write a function in
source code
In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer.
Since a computer, at base, only ...
that is compiled to
machine code that implements similar
semantics
Semantics is the study of linguistic Meaning (philosophy), meaning. It examines what meaning is, how words get their meaning, and how the meaning of a complex expression depends on its parts. Part of this process involves the distinction betwee ...
. There is a callable unit in the source code and an associated one in the machine code, but they are different kinds of callable units with different implications and features.
Terminology
Some
programming languages, such as
COBOL and
BASIC
Basic or BASIC may refer to:
Science and technology
* BASIC, a computer programming language
* Basic (chemistry), having the properties of a base
* Basic access authentication, in HTTP
Entertainment
* Basic (film), ''Basic'' (film), a 2003 film
...
, make a distinction between functions that return a value (typically called "functions") and those that do not (typically called "subprogram", "subroutine", or "procedure"). Other programming languages, such as
C,
C++, and
Rust, only use the term "function" irrespective of whether they return a value or not. Some
object-oriented languages, such as
Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
and
C#, refer to functions inside
classes as "
methods".
History
The idea of a callable unit was initially conceived by
John Mauchly
John William Mauchly ( ; August 30, 1907 – January 8, 1980) was an American physicist who, along with J. Presper Eckert, designed ENIAC, the first general-purpose electronic digital computer, as well as EDVAC, BINAC and UNIVAC I, the f ...
and
Kathleen Antonelli during their work on
ENIAC
ENIAC (; Electronic Numerical Integrator and Computer) was the first Computer programming, programmable, Electronics, electronic, general-purpose digital computer, completed in 1945. Other computers had some of these features, but ENIAC was ...
and recorded in a January 1947 Harvard symposium on "Preparation of Problems for
EDVAC-type Machines."
Maurice Wilkes,
David Wheeler, and
Stanley Gill are generally credited with the formal invention of this concept, which they termed a ''closed sub-routine'', contrasted with an ''open subroutine'' or
macro. However,
Alan Turing
Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist. He was highly influential in the development of theoretical computer ...
had discussed subroutines in a paper of 1945 on design proposals for the NPL
ACE, going so far as to invent the concept of a
return address stack.
The idea of a subroutine was worked out after computing machines had already existed for some time. The arithmetic and conditional jump instructions were planned ahead of time and have changed relatively little, but the special instructions used for procedure calls have changed greatly over the years. The earliest computers and microprocessors, such as the
Manchester Baby and the
RCA 1802, did not have a single subroutine call instruction. Subroutines could be implemented, but they required programmers to use the call sequence—a series of instructions—at each
call site.
Subroutines were implemented in
Konrad Zuse's
Z4 in 1945.
In 1945,
Alan M. Turing used the terms "bury" and "unbury" as a means of calling and returning from subroutines.
[ (NB. Presented on 1946-03-19 before the Executive Committee of the National Physical Laboratory (Great Britain).)][ (11 pages)]
In January 1947 John Mauchly presented general notes at 'A Symposium of Large Scale Digital Calculating Machinery'
under the joint sponsorship of Harvard University and the Bureau of Ordnance, United States Navy. Here he discusses serial and parallel operation suggesting
Kay McNulty had worked closely with John Mauchly on the
ENIAC
ENIAC (; Electronic Numerical Integrator and Computer) was the first Computer programming, programmable, Electronics, electronic, general-purpose digital computer, completed in 1945. Other computers had some of these features, but ENIAC was ...
team and developed an idea for subroutines for the
ENIAC
ENIAC (; Electronic Numerical Integrator and Computer) was the first Computer programming, programmable, Electronics, electronic, general-purpose digital computer, completed in 1945. Other computers had some of these features, but ENIAC was ...
computer she was programming during World War II.
She and the other ENIAC programmers used the subroutines to help calculate missile trajectories.
Goldstine and
von Neumann wrote a paper dated 16 August 1948 discussing the use of subroutines.
Some very early computers and microprocessors, such as the
IBM 1620
The IBM 1620 was a model of scientific minicomputer produced by IBM. It was announced on October 21, 1959, and was then marketed as an inexpensive scientific computer. After a total production of about two thousand machines, it was withdrawn on N ...
, the
Intel 4004 and
Intel 8008, and the
PIC microcontrollers, have a single-instruction subroutine call that uses a dedicated hardware stack to store return addresses—such hardware supports only a few levels of subroutine nesting, but can support recursive subroutines. Machines before the mid-1960s—such as the
UNIVAC I
The UNIVAC I (Universal Automatic Computer I) was the first general-purpose electronic digital computer design for business application produced in the United States. It was designed principally by J. Presper Eckert and John Mauchly, the invento ...
, the
PDP-1, and the
IBM 1130—typically use a
calling convention which saved the instruction counter in the first memory location of the called subroutine. This allows arbitrarily deep levels of subroutine nesting but does not support recursive subroutines. The
IBM System/360 had a subroutine call instruction that placed the saved instruction counter value into a general-purpose register; this can be used to support arbitrarily deep subroutine nesting and recursive subroutines. The Burroughs
B5000[
] (1961) is one of the first computers to store subroutine return data on a stack.
The DEC
PDP-6 (1964) is one of the first accumulator-based machines to have a subroutine call instruction that saved the return address in a stack addressed by an accumulator or index register. The later
PDP-10
Digital Equipment Corporation (DEC)'s PDP-10, later marketed as the DECsystem-10, is a mainframe computer family manufactured beginning in 1966 and discontinued in 1983. 1970s models and beyond were marketed under the DECsystem-10 name, especi ...
(1966),
PDP-11 (1970) and
VAX-11
The VAX-11 is a discontinued family of 32-bit superminicomputers, running the Virtual Address eXtension (VAX) instruction set architecture (ISA), developed and manufactured by Digital Equipment Corporation (DEC). Development began in 1976. In ...
(1976) lines followed suit; this feature also supports both arbitrarily deep subroutine nesting and recursive subroutines.
[
Guy Lewis Steele Jr.
AI Memo 443]
'Debunking the "Expensive Procedure Call" Myth; or, Procedure call implementations considered harmful"
Section "C. Why Procedure Calls Have a Bad Reputation".
Language support
In the very early assemblers, subroutine support was limited. Subroutines were not explicitly separated from each other or from the main program, and indeed the source code of a subroutine could be interspersed with that of other subprograms. Some assemblers would offer predefined
macros to generate the call and return sequences. By the 1960s, assemblers usually had much more sophisticated support for both inline and separately assembled subroutines that could be linked together.
One of the first programming languages to support user-written subroutines and functions was
FORTRAN II. The IBM FORTRAN II compiler was released in 1958.
ALGOL 58 and other early programming languages also supported procedural programming.
Libraries
Even with this cumbersome approach, subroutines proved very useful. They allowed the use of the same code in many different programs. Memory was a very scarce resource on early computers, and subroutines allowed significant savings in the size of programs.
Many early computers loaded the program instructions into memory from a
punched paper tape. Each subroutine could then be provided by a separate piece of tape, loaded or spliced before or after the main program (or "mainline"); and the same subroutine tape could then be used by many different programs. A similar approach was used in computers that loaded program instructions from
punched card
A punched card (also punch card or punched-card) is a stiff paper-based medium used to store digital information via the presence or absence of holes in predefined positions. Developed over the 18th to 20th centuries, punched cards were widel ...
s. The name ''subroutine library'' originally meant a library, in the literal sense, which kept indexed collections of tapes or decks of cards for collective use.
Return by indirect jump
To remove the need for
self-modifying code
In computer science, self-modifying code (SMC or SMoC) is source code, code that alters its own instruction (computer science), instructions while it is execution (computing), executing – usually to reduce the instruction path length and imp ...
, computer designers eventually provided an ''
indirect jump'' instruction, whose operand, instead of being the
return address
In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient (and sometimes authorized intermediaries) with a means to determine how to respond to the sender of the message ...
itself, was the location of a variable or
processor register containing the return address.
On those computers, instead of modifying the function's return jump, the calling program would store the return address in a variable so that when the function completed, it would execute an indirect jump that would direct execution to the location given by the predefined variable.
Jump to subroutine
Another advance was the ''jump to subroutine'' instruction, which combined the saving of the return address with the calling jump, thereby minimizing
overhead significantly.
In the IBM
System/360, for example, the branch instructions BAL or BALR, designed for procedure calling, would save the return address in a processor register specified in the instruction, by convention register 14. To return, the subroutine had only to execute an indirect branch instruction (BR) through that register. If the subroutine needed that register for some other purpose (such as calling another subroutine), it would save the register's contents to a private memory location or a register
stack.
In systems such as the
HP 2100
The HP 2100 is a series of 16-bit minicomputers that were produced by Hewlett-Packard (HP) from the mid-1960s to early 1990s. Tens of thousands of machines in the series were sold over its 25-year lifetime, making HP the fourth-largest minicomp ...
, the JSB instruction would perform a similar task, except that the return address was stored in the memory location that was the target of the branch. Execution of the procedure would actually begin at the next memory location. In the HP 2100 assembly language, one would write, for example
...
JSB MYSUB (Calls subroutine MYSUB.)
BB ... (Will return here after MYSUB is done.)
to call a subroutine called MYSUB from the main program. The subroutine would be coded as
MYSUB NOP (Storage for MYSUB's return address.)
AA ... (Start of MYSUB's body.)
...
JMP MYSUB,I (Returns to the calling program.)
The JSB instruction placed the address of the NEXT instruction (namely, BB) into the location specified as its operand (namely, MYSUB), and then branched to the NEXT location after that (namely, AA = MYSUB + 1). The subroutine could then return to the main program by executing the indirect jump JMP MYSUB, I which branched to the location stored at location MYSUB.
Compilers for Fortran and other languages could easily make use of these instructions when available. This approach supported multiple levels of calls; however, since the return address, parameters, and return values of a subroutine were assigned fixed memory locations, it did not allow for recursive calls.
Incidentally, a similar method was used by
Lotus 1-2-3, in the early 1980s, to discover the recalculation dependencies in a spreadsheet. Namely, a location was reserved in each cell to store the ''return'' address. Since
circular references are not allowed for natural recalculation order, this allows a tree walk without reserving space for a stack in memory, which was very limited on small computers such as the
IBM PC
The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the List of IBM Personal Computer models, IBM PC model line and the basis for the IBM PC compatible ''de facto'' standard. Released on ...
.
Call stack
Most modern implementations of a function call use a
call stack
In computer science, a call stack is a Stack (abstract data type), stack data structure that stores information about the active subroutines and block (programming), inline blocks of a computer program. This type of stack is also known as an exe ...
, a special case of the
stack data structure, to implement function calls and returns. Each procedure call creates a new entry, called a ''stack frame'', at the top of the stack; when the procedure returns, its stack frame is deleted from the stack, and its space may be used for other procedure calls. Each stack frame contains the ''private data'' of the corresponding call, which typically includes the procedure's parameters and internal variables, and the return address.
The call sequence can be implemented by a sequence of ordinary instructions (an approach still used in
reduced instruction set computing (RISC) and
very long instruction word (VLIW) architectures), but many traditional machines designed since the late 1960s have included special instructions for that purpose.
The call stack is usually implemented as a contiguous area of memory. It is an arbitrary design choice whether the bottom of the stack is the lowest or highest address within this area, so that the stack may grow forwards or backwards in memory; however, many architectures chose the latter.
Some designs, notably some
Forth implementations, used two separate stacks, one mainly for control information (like return addresses and loop counters) and the other for data. The former was, or worked like, a call stack and was only indirectly accessible to the programmer through other language constructs while the latter was more directly accessible.
When stack-based procedure calls were first introduced, an important motivation was to save precious memory. With this scheme, the compiler does not have to reserve separate space in memory for the private data (parameters, return address, and local variables) of each procedure. At any moment, the stack contains only the private data of the calls that are currently ''active'' (namely, which have been called but haven't returned yet). Because of the ways in which programs were usually assembled from libraries, it was (and still is) not uncommon to find programs that include thousands of functions, of which only a handful are active at any given moment. For such programs, the call stack mechanism could save significant amounts of memory. Indeed, the call stack mechanism can be viewed as the earliest and simplest method for
automatic memory management.
However, another advantage of the call stack method is that it allows
recursive function calls, since each nested call to the same procedure gets a separate instance of its private data.
In a
multi-threaded environment, there is generally more than one stack.
An environment that fully supports
coroutines or
lazy evaluation
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an Expression (computer science), expression until its value is needed (non-strict evaluation) and which avoids repeated eva ...
may use data structures other than stacks to store their activation records.
Delayed stacking
One disadvantage of the call stack mechanism is the increased cost of a procedure call and its matching return. The extra cost includes incrementing and decrementing the stack pointer (and, in some architectures, checking for
stack overflow
In software, a stack overflow occurs if the call stack pointer exceeds the stack bound. The call stack may consist of a limited amount of address space, often determined at the start of the program. The size of the call stack depends on many fa ...
), and accessing the local variables and parameters by frame-relative addresses, instead of absolute addresses. The cost may be realized in increased execution time, or increased processor complexity, or both.
This overhead is most obvious and objectionable in ''leaf procedures'' or ''leaf functions'', which return without making any procedure calls themselves. To reduce that overhead, many modern compilers try to delay the use of a call stack until it is really needed. For example, the call of a procedure ''P'' may store the return address and parameters of the called procedure in certain processor registers, and transfer control to the procedure's body by a simple jump. If the procedure ''P'' returns without making any other call, the call stack is not used at all. If ''P'' needs to call another procedure ''Q'', it will then use the call stack to save the contents of any registers (such as the return address) that will be needed after ''Q'' returns.
Features
In general, a callable unit is a list of instructions that, starting at the first instruction, executes sequentially except as directed via its internal logic. It can be invoked (called) many times during the
execution
Capital punishment, also known as the death penalty and formerly called judicial homicide, is the state-sanctioned killing of a person as punishment for actual or supposed misconduct. The sentence ordering that an offender be punished in ...
of a program. Execution continues at the next instruction after the call instruction when it
returns control.
Implementations
The features of
implementation
Implementation is the realization of an application, execution of a plan, idea, scientific modelling, model, design, specification, Standardization, standard, algorithm, policy, or the Management, administration or management of a process or Goal ...
s of callable units evolved over time and varies by context.
This section describes features of the various common implementations.
General characteristics
Most modern programming languages provide features to define and call functions, including
syntax
In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituenc ...
for accessing such features, including:
* Delimit the implementation of a function from the rest of the program
* Assign an
identifier, name, to a function
* Define
formal parameters with a name and
data type
In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
for each
* Assign a
data type
In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
to the return value, if any
* Specify a return value in the function body
* Call a function
* Provide
actual parameters that correspond to a called function's formal parameters
*
Return control to the caller at the point of call
* Consume the return value in the caller
* Dispose of the values returned by a call
* Provide a private
naming scope for variables
* Identify variables outside the function that are accessible within it
* Propagate an
exceptional condition out of a function and to handle it in the calling context
* Package functions into a container such as
module,
library
A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
,
object, or
class
Class, Classes, or The Class may refer to:
Common uses not otherwise categorized
* Class (biology), a taxonomic rank
* Class (knowledge representation), a collection of individuals or objects
* Class (philosophy), an analytical concept used d ...
Naming
Some languages, such as
Pascal,
Fortran,
Ada and many
dialects of
BASIC
Basic or BASIC may refer to:
Science and technology
* BASIC, a computer programming language
* Basic (chemistry), having the properties of a base
* Basic access authentication, in HTTP
Entertainment
* Basic (film), ''Basic'' (film), a 2003 film
...
, use a different name for a callable unit that returns a value (''function'' or ''subprogram'') vs. one that does not (''subroutine'' or ''procedure'').
Other languages, such as
C,
C++,
C# and
Lisp
Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation.
Originally specified in the late 1950s, ...
, use only one name for a callable unit, ''function''. The C-family languages use the keyword
void
to indicate no return value.
Call syntax
If declared to return a value, a call can be embedded in an
expression in order to consume the return value. For example, a square root callable unit might be called like
y = sqrt(x)
.
A callable unit that does not return a value is called as a stand-alone
statement like
print("hello")
. This syntax can also be used for a callable unit that returns a value, but the return value will be ignored.
Some older languages require a keyword for calls that do not consume a return value, like
CALL print("hello")
.
Parameters
Most implementations, especially in modern languages, support
parameters which the callable declares as ''formal parameters''. A caller passes ''actual parameters'', a.k.a. ''arguments'', to match. Different programming languages provide different conventions for passing arguments.
Return value
In some languages, such as BASIC, a callable has different syntax (i.e. keyword) for a callable that returns a value vs. one that does not.
In other languages, the syntax is the same regardless.
In some of these languages an extra keyword is used to declare no return value; for example
void
in C, C++ and C#.
In some languages, such as Python, the difference is whether the body contains a return statement with a value, and a particular callable may return with or without a value based on control flow.
Side effects
In many contexts, a callable may have
side effect
In medicine, a side effect is an effect of the use of a medicinal drug or other treatment, usually adverse but sometimes beneficial, that is unintended. Herbal and traditional medicines also have side effects.
A drug or procedure usually use ...
behavior such as modifying passed or global data, reading from or writing to a
peripheral device, accessing a
file, halting the program or the machine, or temporarily pausing program execution.
Side effects are considered undesireble by
Robert C. Martin, who is known for promoting design principles. Martin argues that side effects can result in
temporal coupling or order dependencies.
In strictly
functional programming
In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...
languages such as
Haskell
Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
, a function can have no
side effects
In medicine, a side effect is an effect of the use of a medicinal drug or other treatment, usually adverse but sometimes beneficial, that is unintended. Herbal and traditional medicines also have side effects.
A drug or procedure usually used ...
, which means it cannot change the state of the program. Functions always return the same result for the same input. Such languages typically only support functions that return a value, since there is no value in a function that has neither return value nor side effect.
Local variables
Most contexts support ''local variables''
memory
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
owned by a callable to hold intermediate values. These variables are typically stored in the call's ''activation record'' on the
call stack
In computer science, a call stack is a Stack (abstract data type), stack data structure that stores information about the active subroutines and block (programming), inline blocks of a computer program. This type of stack is also known as an exe ...
along with other information such as the
return address
In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient (and sometimes authorized intermediaries) with a means to determine how to respond to the sender of the message ...
.
Nested call recursion
If supported by the language, a callable may call itself, causing its execution to suspend while another ''nested'' execution of the same callable executes.
Recursion
Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
is a useful means to simplify some complex algorithms and break down complex problems. Recursive languages provide a new copy of local variables on each call. If the programmer desires the recursive callable to use the same variables instead of using locals, they typically declare them in a shared context such static or global.
Languages going back to
ALGOL,
PL/I and
C and modern languages, almost invariably use a call stack, usually supported by the instruction sets to provide an activation record for each call. That way, a nested call can modify its local variables without affecting any of the suspended calls variables.
Recursion allows direct implementation of functionality defined by
mathematical induction
Mathematical induction is a method for mathematical proof, proving that a statement P(n) is true for every natural number n, that is, that the infinitely many cases P(0), P(1), P(2), P(3), \dots all hold. This is done by first proving a ...
and recursive
divide and conquer algorithms. Here is an example of a recursive function in C/C++ to find
Fibonacci
Leonardo Bonacci ( – ), commonly known as Fibonacci, was an Italians, Italian mathematician from the Republic of Pisa, considered to be "the most talented Western mathematician of the Middle Ages".
The name he is commonly called, ''Fibonacci ...
numbers:
int Fib(int n)
Early languages like
Fortran did not initially support recursion because only one set of variables and return address were allocated for each callable.
Early computer instruction sets made storing return addresses and variables on a stack difficult. Machines with
index registers or
general-purpose registers, e.g.,
CDC 6000 series
The CDC 6000 series is a discontinued family of mainframe computers manufactured by Control Data Corporation in the 1960s. It consisted of the CDC 6200, CDC 6300, #Versions, CDC 6400, #Versions, CDC 6500, CDC 6600 and #Versions, CDC 6700 computers, ...
,
PDP-6,
GE 635,
System/360,
UNIVAC 1100 series, could use one of those registers as a
stack pointer.
Nested scope
Some languages, e.g.,
Ada,
Pascal,
PL/I,
Python, support declaring and defining a function inside, e.g., a function body, such that the name of the inner is only visible within the body of the outer.
Reentrancy
If a callable can be executed properly even when another execution of the same callable is already in progress, that callable is said to be ''
reentrant''. A reentrant callable is also useful in
multi-threaded situations since multiple threads can call the same callable without fear of interfering with each other. In the
IBM
International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
CICS transaction processing system, ''quasi-reentrant'' was a slightly less restrictive, but similar, requirement for application programs that were shared by many threads.
Overloading
Some languages support ''overloading'' allow multiple callables with the same name in the same scope, but operating on different types of input. Consider the square root function applied to real number, complex number and matrix input. The algorithm for each type of input is different, and the return value may have a different type. By writing three separate callables with the same name. i.e. ''sqrt'', the resulting code may be easier to write and to maintain since each one has a name that is relatively easy to understand and to remember instead of giving longer and more complicated names like ''sqrt_real'', ''sqrt_complex'', ''qrt_matrix''.
Overloading is supported in many languages that support
strong typing. Often the compiler selects the overload to call based on the type of the input arguments or it fails if the input arguments do not select an overload. Older and weakly-typed languages generally do not support overloading.
Here is an example of overloading in
C++, two functions
Area
that accept different types:
// returns the area of a rectangle defined by height and width
double Area(double h, double w)
// returns the area of a circle defined by radius
double Area(double r)
int main()
PL/I has the
GENERIC
attribute to define a generic name for a set of entry references called with different types of arguments. Example:
DECLARE gen_name GENERIC(
name WHEN(FIXED BINARY),
flame WHEN(FLOAT),
pathname OTHERWISE);
Multiple argument definitions may be specified for each entry. A call to "gen_name" will result in a call to "name" when the argument is FIXED BINARY, "flame" when FLOAT", etc. If the argument matches none of the choices "pathname" will be called.
Closure
A ''
closure'' is a callable plus values of some of its variables captured from the environment in which it was created. Closures were a notable feature of the Lisp programming language, introduced by
John McCarthy. Depending on the implementation, closures can serve as a mechanism for side-effects.
Exception reporting
Besides its
happy path behavior, a callable may need to inform the caller about an
exceptional condition that occurred during its execution.
Most modern languages support exceptions which allows for exceptional control flow that pops the call stack until an exception handler is found to handle the condition.
Languages that do not support exceptions can use the return value to indicate success or failure of a call. Another approach is to use a well-known location like a global variable for success indication. A callable writes the value and the caller reads it after a call.
In the
IBM System/360, where return code was expected from a subroutine, the return value was often designed to be a multiple of 4—so that it could be used as a direct
branch table index into a branch table often located immediately after the call instruction to avoid extra conditional tests, further improving efficiency. In the
System/360 assembly language
In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
, one would write, for example:
BAL 14, SUBRTN01 go to a subroutine, storing return address in R14
B TABLE(15) use returned value in reg 15 to index the branch table,
* branching to the appropriate branch instr.
TABLE B OK return code =00 GOOD }
B BAD return code =04 Invalid input } Branch table
B ERROR return code =08 Unexpected condition }
Call overhead
A call has runtime
overhead, which may include but is not limited to:
* Allocating and reclaiming call stack storage
* Saving and restoring processor registers
* Copying input variables
* Copying values after the call into the caller's context
* Automatic testing of the return code
* Handling of exceptions
* Dispatching such as for a virtual method in an object-oriented language
Various techniques are employed to minimize the runtime cost of calls.
Compiler optimization
Some optimizations for minimizing call overhead may seem straight forward, but cannot be used if the callable has side effects. For example, in the expression
(f(x)-1)/(f(x)+1)
, the function
f
cannot be called only once with its value used two times since the two calls may return different results. Moreover, in the few languages which define the order of evaluation of the division operator's operands, the value of
x
must be fetched again before the second call, since the first call may have changed it. Determining whether a callable has a side effect is difficult indeed,
undecidable by virtue of
Rice's theorem. So, while this optimization is safe in a purely functional programming language, a compiler for a language not limited to functional typically assumes the worst case, that every callable may have side effects.
Inlining
Inlining eliminates calls for particular callables. The compiler replaces each call with the compiled code of the callable. Not only does this avoid the call overhead, but it also allows the
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
to
optimize code of the caller more effectively by taking into account the context and arguments at that call. Inlining, however, usually increases the compiled code size, except when only called once or the body is very short, like one line.
Sharing
Callables can be defined within a program, or separately in a
library
A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
that can be used by multiple programs.
Inter-operability
A
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
translates call and return statements into machine instructions according to a well-defined
calling convention. For code compiled by the same or a compatible compiler, functions can be compiled separately from the programs that call them. The instruction sequences corresponding to call and return statements are called the procedure's
prologue and epilogue.
Built-in functions
A ''built-in function'', or ''builtin function'', or ''intrinsic function'', is a function for which the compiler generates code at
compile time or provides in a way other than for other functions. A built-in function does not need to be defined like other functions since it is ''built in'' to the programming language.
Programming
Trade-offs
Advantages
Advantages of breaking a program into functions include:
*
Decomposing a complex programming task into simpler steps: this is one of the two main tools of
structured programming Structured programming is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making specific disciplined use of the structured control flow constructs of selection ( if/then/else) and repet ...
, along with
data structure
In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
s
* Reducing
duplicate code within a program
* Enabling
reuse of code across multiple programs
* Dividing a large programming task among various programmers or various stages of a project
*
Hiding implementation details from users of the function
* Improving readability of code by replacing a block of code with a function call where
descriptivefunction name serves to describe the block of code. This makes the calling code concise and readable even if the function is not meant to be reused.
* Improving
traceability (i.e. most languages offer ways to obtain the call trace which includes the names of the involved functions and perhaps even more information such as file names and line numbers); by not decomposing the code into functions, debugging would be severely impaired
Disadvantages
Compared to using in-line code, invoking a function imposes some
computational overhead in the call mechanism.
A function typically requires standard
housekeeping
Housekeeping is the management and routine support activities of running and maintaining an organized physical institution occupied or used by people, like a house, ship, hospital or factory, such as cleaning, tidying/organizing, cooking, shopp ...
code – both at the entry to, and exit from, the function (
function prologue and epilogue – usually saving
general purpose registers and return address as a minimum).
Conventions
Many programming conventions have been developed regarding callables.
With respect to naming, many developers name a callable with a phrase starting with a
verb
A verb is a word that generally conveys an action (''bring'', ''read'', ''walk'', ''run'', ''learn''), an occurrence (''happen'', ''become''), or a state of being (''be'', ''exist'', ''stand''). In the usual description of English, the basic f ...
when it does a certain task, with an
adjective
An adjective (abbreviations, abbreviated ) is a word that describes or defines a noun or noun phrase. Its semantic role is to change information given by the noun.
Traditionally, adjectives are considered one of the main part of speech, parts of ...
when it makes an inquiry, and with a
noun
In grammar, a noun is a word that represents a concrete or abstract thing, like living creatures, places, actions, qualities, states of existence, and ideas. A noun may serve as an Object (grammar), object or Subject (grammar), subject within a p ...
when it is used to substitute variables.
Some programmers suggest that a callable should perform exactly one task, and if it performs more than one task, it should be split up into multiple callables. They argue that callables are key components in
software maintenance
Software maintenance is the modification of software after delivery.
Software maintenance is often considered lower skilled and less rewarding than new development. As such, it is a common target for outsourcing or offshoring. Usually, the tea ...
, and their roles in the program must remain distinct.
Proponents of
modular programming
Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect or "concern" of the d ...
advocate that each callable should have minimal dependency on the rest of the
codebase. For example, the use of
global variables is generally deemed unwise, because it adds coupling between all callables that use the global variables. If such coupling is not necessary, they advise to
refactor callables to accept passed
parameters instead.
Examples
Early BASIC
Early BASIC variants require each line to have a unique number (
line number) that orders the lines for execution, provides no separation of the code that is callable, no mechanism for passing arguments or to return a value and all variables are global. It provides the command
GOSUB
where ''sub'' is short for ''sub procedure'', ''subprocedure'' or ''subroutine''. Control jumps to the specified line number and then continues on the next line on return.
10 REM A BASIC PROGRAM
20 GOSUB 100
30 GOTO 20
100 INPUT “GIVE ME A NUMBER”; N
110 PRINT “THE SQUARE ROOT OF”; N;
120 PRINT “IS”; SQRT(N)
130 RETURN
This code repeatedly asks the user to enter a number and reports the square root of the value. Lines 100-130 are the callable.
Small Basic
In
Microsoft Small Basic, targeted to the student first learning how to program in a text-based language, a callable unit is called a ''subroutine''.
The
Sub
keyword denotes the start of a subroutine and is followed by a name identifier. Subsequent lines are the body which ends with the
EndSub
keyword.
Sub SayHello
TextWindow.WriteLine("Hello!")
EndSub
This can be called as
SayHello()
.
Visual Basic
In later versions of
Visual Basic Visual Basic is a name for a family of programming languages from Microsoft. It may refer to:
* Visual Basic (.NET), the current version of Visual Basic launched in 2002 which runs on .NET
* Visual Basic (classic), the original Visual Basic suppo ...
(VB), including the
latest product line and
VB6, the term ''procedure'' is used for the callable unit concept. The keyword
Sub
is used to return no value and
Function
to return a value. When used in the context of a class, a procedure is a method.
Each parameter has a
data type
In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
that can be specified, but if not, defaults to
Object
for later versions based on
.NET
The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
and
variant for
VB6.
VB supports parameter passing conventions
by value and
by reference via the keywords
ByVal
and
ByRef
, respectively.
Unless
ByRef
is specified, an argument is passed
ByVal
. Therefore,
ByVal
is rarely explicitly specified.
For a simple type like a number these conventions are relatively clear. Passing
ByRef
allows the procedure to modify the passed variable whereas passing
ByVal
does not. For an object, semantics can confuse programmers since an object is always treated as a reference. Passing an object
ByVal
copies the reference; not the state of the object. The called procedure can modify the state of the object via its methods yet cannot modify the object reference of the actual parameter.
Sub DoSomething()
' Some Code Here
End Sub
The does not return a value and has to be called stand-alone, like
DoSomething
Function GiveMeFive() as Integer
GiveMeFive= 5
End Function
This returns the value 5, and a call can be part of an expression like
y = x + GiveMeFive()
Sub AddTwo(ByRef intValue as Integer)
intValue = intValue + 2
End Sub
This has a side-effect modifies the variable passed by reference and could be called for variable
v
like
AddTwo(v)
. Giving v is 5 before the call, it will be 7 after.
C and C++
In
C and
C++, a callable unit is called a ''function''.
A function definition starts with the name of the type of value that it returns or
void
to indicate that it does not return a value. This is followed by the function name, formal arguments in parentheses, and body lines in braces.
In
C++, a function declared in a
class
Class, Classes, or The Class may refer to:
Common uses not otherwise categorized
* Class (biology), a taxonomic rank
* Class (knowledge representation), a collection of individuals or objects
* Class (philosophy), an analytical concept used d ...
(as non-static) is called a ''member function'' or ''method''. A function outside of a class can be called a ''free function'' to distinguish it from a member function.
void doSomething()
This function does not return a value and is always called stand-alone, like
doSomething()
int giveMeFive()
This function returns the integer value 5. The call can be stand-alone or in an expression like
y = x + giveMeFive()
void addTwo(int *pi)
This function has a side-effect modifies the value passed by address to the input value plus 2. It could be called for variable
v
as
addTwo(&v)
where the ampersand (&) tells the compiler to pass the address of a variable. Giving v is 5 before the call, it will be 7 after.
void addTwo(int& i)
This function requires C++ would not compile as C. It has the same behavior as the preceding example but passes the actual parameter by reference rather than passing its address. A call such as
addTwo(v)
does not include an ampersand since the compiler handles passing by reference without syntax in the call.
PL/I
In
PL/I a called procedure may be passed a ''
descriptor'' providing information about the argument, such as string lengths and array bounds. This allows the procedure to be more general and eliminates the need for the programmer to pass such information. By default PL/I passes arguments by reference. A (trivial) function to change the sign of each element of a two-dimensional array might look like:
change_sign: procedure(array);
declare array(*,*) float;
array = -array;
end change_sign;
This could be called with various arrays as follows:
/* first array bounds from -5 to +10 and 3 to 9 */
declare array1 (-5:10, 3:9)float;
/* second array bounds from 1 to 16 and 1 to 16 */
declare array2 (16,16) float;
call change_sign(array1);
call change_sign(array2);
Python
In
Python, the keyword
def
denotes the start of a function definition. The statements of the function body follow as indented on subsequent lines and end at the line that is indented the same as the first line or end of file.
def format_greeting(name):
return "Welcome " + name
def greet_martin():
print(format_greeting("Martin"))
The first function returns greeting text that includes the name passed by the caller. The second function calls the first and is called like
greet_martin()
to write "Welcome Martin" to the console.
Prolog
In the procedural interpretation of
logic programs, logical implications behave as goal-reduction procedures. A
rule (or
clause
In language, a clause is a Constituent (linguistics), constituent or Phrase (grammar), phrase that comprises a semantic predicand (expressed or not) and a semantic Predicate (grammar), predicate. A typical clause consists of a subject (grammar), ...
) of the form:
:
A :- B
which has the logical reading:
:
A if B
behaves as a procedure that reduces goals that
unify with
A
to subgoals that are instances of
B
.
Consider, for example, the Prolog program:
mother_child(elizabeth, charles).
father_child(charles, william).
father_child(charles, harry).
parent_child(X, Y) :- mother_child(X, Y).
parent_child(X, Y) :- father_child(X, Y).
Notice that the motherhood function,
X = mother(Y) is represented by a relation, as in a
relational database
A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970.
A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured for ...
. However, ''relations'' in Prolog ''function'' as callable units.
For example, the procedure call
?- parent_child(X, charles) produces the output
X = elizabeth. But the same procedure can be called with other input-output patterns. For example:
?- parent_child(elizabeth, Y).
Y = charles.
?- parent_child(X, Y).
X = elizabeth,
Y = charles.
X = charles,
Y = harry.
X = charles,
Y = william.
?- parent_child(william, harry).
no.
?- parent_child(elizabeth, charles).
yes.
See also
*
Asynchronous procedure call, a subprogram that is called after its parameters are set by other activities
*
Command–query separation (CQS)
*
Compound operation
*
Coroutines, subprograms that call each other as if both were the main programs
*
Evaluation strategy
In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a ''parameter-passing strategy'' that defines the kind of value that is passed to the ...
*
Event handler, a subprogram that is called in response to an input event or
interrupt
In digital computers, an interrupt (sometimes referred to as a trap) is a request for the processor to ''interrupt'' currently executing code (when permitted), so that the event can be processed in a timely manner. If the request is accepted ...
*
Function (mathematics)
In mathematics, a function from a set (mathematics), set to a set assigns to each element of exactly one element of .; the words ''map'', ''mapping'', ''transformation'', ''correspondence'', and ''operator'' are sometimes used synonymously. ...
*
Functional programming
In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...
*
Fused operation
*
Intrinsic function
In computer software, in compiler theory, an intrinsic function, also called built-in function or builtin function, is a function ( subroutine) available for use in a given programming language whose implementation is handled specially by the com ...
*
Lambda function (computer programming), a function that is not bound to an identifier
*
Logic programming
*
Modular programming
Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect or "concern" of the d ...
*
Operator overloading
*
Protected procedure
*
Transclusion
References
{{Authority control
Source code
Articles with example C code
Holism
Programming constructs