In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have ''some'' value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.
Example of the C language
A common assumption made by novice programmers is that all variables are set to a known value, such as zero, when they are declared. While this is true for many languages, it is not true for all of them, and so the potential for error is there. Languages such as C use stack space for variables, and the collection of variables allocated for a subroutine is known as a stack frame. While the computer will set aside the appropriate amount of space for the stack frame, it usually does so simply by adjusting the value of the stack pointer, and does not set the memory itself to any new state (typically out of efficiency concerns). Therefore, whatever that memory happens to contain at the time will appear as the initial values of the variables that occupy those addresses.
Here's a simple example in C:
void count(void)
The final value of k is undefined. The answer that it must be 10 assumes that it started at zero, which may or may not be true. Note that in the example, the variable i is initialized to zero by the first clause of the for statement.
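A minimal sketch consistent with the description above may help. The loop bound of ten, the printf call, and the corrected count_fixed variant are assumptions added for illustration, not part of the original listing.

```c
#include <stdio.h>

/* Buggy form: k is read before it is ever assigned.
 * Shown for illustration only; calling it has undefined behaviour. */
void count(void)
{
    int k, i;

    for (i = 0; i < 10; i++) {
        k = k + 1;      /* the first iteration reads an indeterminate k */
    }

    printf("%d\n", k);
}

/* Corrected form (hypothetical name): k starts from a known value. */
int count_fixed(void)
{
    int k = 0;
    int i;

    for (i = 0; i < 10; i++) {
        k = k + 1;
    }

    return k;
}
```

Only the initializer on k differs between the two functions. That single missing assignment is what makes the result of the first one unpredictable.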
Another example arises when dealing with structs. In the code snippet below, a struct student contains some variables describing a student. The function register_student leaks memory contents because it fails to fully initialize the members of struct student new_student. On closer inspection, age, semester and student_number are initialized at the beginning, but the initialization of the first_name and last_name members is incorrect. If the strings copied into the first_name and last_name character arrays are shorter than 16 bytes, strcpy does not initialize the entire 16 bytes of memory reserved for each of these members. Hence, after the resulting struct is copied to output with memcpy(), some stack memory is leaked to the caller.
struct student ;
int register_student(struct student *output, int age, char *first_name, char *last_name)
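One possible way to close this leak is to zero the whole struct before filling it in, sketched below. The member types and order, the value assigned to semester, the student_number parameter, and the name register_student_zeroed are all assumptions made for illustration. Only the 16-byte name arrays and the strcpy/memcpy sequence come from the description above.

```c
#include <string.h>

#define NAME_LEN 16   /* the 16-byte member size mentioned in the text */

/* Assumed layout: member types and order are illustrative. */
struct student {
    unsigned int age;
    unsigned int semester;
    char first_name[NAME_LEN];
    char last_name[NAME_LEN];
    unsigned int student_number;
};

/* Hypothetical variant of register_student that clears the struct first. */
int register_student_zeroed(struct student *output, unsigned int age,
                            const char *first_name, const char *last_name,
                            unsigned int student_number)
{
    struct student new_student;

    if (!output || !first_name || !last_name)
        return -1;

    /* Reject names that would not fit together with the terminating null. */
    if (strlen(first_name) >= NAME_LEN || strlen(last_name) >= NAME_LEN)
        return -1;

    /* Zero every byte, including unused name bytes and any padding. */
    memset(&new_student, 0, sizeof new_student);

    new_student.age = age;
    new_student.semester = 1;               /* assumed starting semester */
    new_student.student_number = student_number;
    strcpy(new_student.first_name, first_name);
    strcpy(new_student.last_name, last_name);

    memcpy(output, &new_student, sizeof *output);
    return 0;
}
```

memset is used here instead of an initializer such as `= {0}` because an initializer is not guaranteed to clear padding bytes between members. memcpy would copy those padding bytes to the caller as well.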
In any case, even when a variable is ''implicitly'' initialized to a ''default'' value like 0, this is typically not the ''correct'' value. Initialized does not mean correct if the value is a default one. (However, default initialization to 0 is good practice for pointers and arrays of pointers, since it makes them invalid before they are actually initialized to their correct value.) In C, variables with static storage duration that are not initialized explicitly are initialized to zero (or null, for pointers).
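The rule for static storage duration can be illustrated with a short sketch. All names here (total, label, next_id, label_is_unset) are hypothetical and chosen only for the example.

```c
#include <stddef.h>

int total;                  /* file scope: static storage duration, starts at 0 */
static const char *label;   /* static pointer: starts as a null pointer */

int next_id(void)
{
    static int id;          /* block-scope static: zero-initialized before first use */
    return ++id;
}

int label_is_unset(void)
{
    return label == NULL;
}
```

By contrast, an automatic (ordinary local) variable declared without an initializer gets no such guarantee.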
Not only are uninitialized variables a frequent cause of bugs, but this kind of bug is particularly serious because it may not be reproducible: for instance, a variable may remain uninitialized only in some branch of the program. In some cases, programs with uninitialized variables may even pass software tests.
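The following sketch shows how such a branch-dependent bug can slip past tests. The function names and status codes are hypothetical. A test suite that exercises only status 1 and status 2 never reaches the path on which delay is left unset.

```c
/* Hypothetical helper: maps a status code to a retry delay in seconds. */
int retry_delay_buggy(int status)
{
    int delay;              /* no initial value */

    if (status == 1)
        delay = 5;
    else if (status == 2)
        delay = 30;
    /* any other status reaches the return with delay still indeterminate */

    return delay;
}

/* Corrected form: every path yields a defined value. */
int retry_delay_fixed(int status)
{
    int delay = 0;          /* assumed default: no delay */

    if (status == 1)
        delay = 5;
    else if (status == 2)
        delay = 30;

    return delay;
}
```

Both functions return the same results for status 1 and 2, so tests limited to those inputs cannot tell them apart.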
Impacts
Uninitialized variables are powerful bugs since they can be exploited to leak arbitrary memory, to achieve arbitrary memory overwrite, or to gain code execution, depending on the case. When exploiting software that uses address space layout randomization (ASLR), it is often necessary to know the base address of the software in memory. Exploiting an uninitialized variable so as to force the software to leak a pointer from its address space can be used to bypass ASLR.
Use in languages
Uninitialized variables are a particular problem in languages such as assembly language, C, and C++, which were designed for systems programming. The development of these languages involved a design philosophy in which conflicts between performance and safety were generally resolved in favor of performance. The programmer was given the burden of being aware of dangerous issues such as uninitialized variables.
In other languages, variables are often initialized to known values when created. Examples include:
* VHDL initializes all standard variables to a special 'U' value. It is used in simulation, for debugging, to let the user know when don't care initial values, through the multi-valued logic, affect the output.
* Java does not have uninitialized variables. Fields of classes and objects that do not have an explicit initializer and elements of arrays are automatically initialized with the default value for their type (false for boolean, 0 for all numerical types, null for all reference types). Local variables in Java must be definitely assigned before they are accessed, or it is a compile error.
* Python initializes local variables to NULL (distinct from None) and raises an UnboundLocalError when such a variable is accessed before being (re)initialized to a valid value.
* D initializes all variables unless explicitly specified by the programmer not to.
Even in languages where uninitialized variables are allowed, many compilers will attempt to identify the use of uninitialized variables and report them as compile-time errors or warnings. Some languages assist this task by offering constructs to handle the initializedness of variables; for example, C# has a special flavour of call-by-reference parameters to subroutines (specified as out instead of the usual ref), asserting that the variable is allowed to be uninitialized on entry but will be initialized afterwards.
See also
* Initialization (programming)
* Null pointer
* Don't care
* Undefined behaviour
Further reading
* CWE-457 Use of Uninitialized Variable. http://cwe.mitre.org/data/definitions/457.html