
In
computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...
, source code, or simply code or source, is a
plain text
In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects ( floating-point numbers, images, etc.). It may also include a lim ...
computer program
A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
written in a
programming language
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
. A
programmer
A programmer, computer programmer or coder is an author of computer source code someone with skill in computer programming.
The professional titles Software development, ''software developer'' and Software engineering, ''software engineer' ...
writes the
human readable source code to control the behavior of a
computer
A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
.
Since a computer, at base, only understands
machine code
In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
, source code must be
translated before a computer can
execute it. The translation process can be implemented three ways. Source code can be converted into machine code by a
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
or an
assembler. The resulting
executable
In computer science, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instruction (computer science), in ...
is machine code ready for the computer. Alternatively, source code can be executed without conversion via an
interpreter. An interpreter loads the source code into memory. It simultaneously translates and executes each
statement. A method that combines compilation and interpretation is to first produce
bytecode
Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normal ...
. Bytecode is an intermediate representation of source code that is quickly interpreted.
Background
The first programmable computers, which appeared at the end of the 1940s, were programmed in
machine language (simple instructions that could be directly executed by the processor). Machine language was difficult to debug and was not
portable between different computer systems. Initially, hardware resources were scarce and expensive, while
human resources
Human resources (HR) is the set of people who make up the workforce of an organization, business sector, industry, or economy. A narrower concept is human capital, the knowledge and skills which the individuals command. Similar terms include ' ...
were cheaper. As programs grew more complex,
programmer productivity became a bottleneck. This led to the introduction of
high-level programming language
A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
s such as
Fortran in the mid-1950s. These languages
abstracted away the details of the hardware, instead being designed to express algorithms that could be understood more easily by humans. As instructions distinct from the underlying
computer hardware
Computer hardware includes the physical parts of a computer, such as the central processing unit (CPU), random-access memory (RAM), motherboard, computer data storage, graphics card, sound card, and computer case. It includes external devices ...
, software is therefore relatively recent, dating to these early high-level
programming languages such as
Fortran,
Lisp
Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation.
Originally specified in the late 1950s, ...
, and
Cobol
COBOL (; an acronym for "common business-oriented language") is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language. COBOL is primarily ...
. The invention of high-level programming languages was simultaneous with the
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
s needed to translate the source code automatically into machine code that can be directly executed on the
computer hardware
Computer hardware includes the physical parts of a computer, such as the central processing unit (CPU), random-access memory (RAM), motherboard, computer data storage, graphics card, sound card, and computer case. It includes external devices ...
.
Source code is the form of code that is modified directly by humans, typically in a high-level programming language.
Object code
In computing, object code or object module is the product of an assembler or compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...
can be directly executed by the machine and is generated automatically from the source code, often via an intermediate step,
assembly language
In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
. While object code will only work on a specific platform, source code can be ported to a different machine and recompiled there. For the same source code, object code can vary significantly—not only based on the machine for which it is compiled, but also based on performance optimization from the compiler.
Organization
Most programs do not contain all the resources needed to run them and rely on external
libraries
A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
. Part of the compiler's function is to link these files in such a way that the program can be executed by the hardware.
Software developers often use
configuration management
Configuration management (CM) is a management process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life. ...
to track changes to source code files (
version control
Version control (also known as revision control, source control, and source code management) is the software engineering practice of controlling, organizing, and tracking different versions in history of computer files; primarily source code t ...
). The configuration management system also keeps track of which object code file corresponds to which version of the source code file.
Purposes
Estimation
The number of lines of source code is often used as a metric when evaluating the productivity of computer programmers, the economic value of a code base,
effort estimation for projects in development, and the ongoing cost of
software maintenance
Software maintenance is the modification of software after delivery.
Software maintenance is often considered lower skilled and less rewarding than new development. As such, it is a common target for outsourcing or offshoring. Usually, the tea ...
after release.
Communication
Source code is also used to communicate
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
s between people e.g.,
code snippets online or in books.
[Spinellis, D: ''Code Reading: The Open Source Perspective''. Addison-Wesley Professional, 2003. ]
Computer programmers may find it helpful to review existing source code to learn about programming techniques.
[ The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills.][ Some people consider source code an expressive ]artistic medium
Media, or mediums, are the core types of material (or related other tools) used by an artist, composer, designer, etc. to create a work of art. For example, a visual artist may broadly use the media of painting or sculpting, which themselves have ...
.
Source code often contains comments—blocks of text marked for the compiler to ignore. This content is not part of the program logic, but is instead intended to help readers understand the program.
Companies often keep the source code confidential in order to hide algorithms considered a trade secret
A trade secret is a form of intellectual property (IP) comprising confidential information that is not generally known or readily ascertainable, derives economic value from its secrecy, and is protected by reasonable efforts to maintain its conf ...
. Proprietary, secret source code and algorithms are widely used for sensitive government applications such as criminal justice
Criminal justice is the delivery of justice to those who have been accused of committing crimes. The criminal justice system is a series of government agencies and institutions. Goals include the rehabilitation of offenders, preventing other ...
, which results in black box behavior with a lack of transparency into the algorithm's methodology. The result is avoidance of public scrutiny of issues such as bias.
Modification
Access to the source code (not just the object code
In computing, object code or object module is the product of an assembler or compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...
) is essential to modifying it. Understanding existing code is necessary to understand how it works and before modifying it. The rate of understanding depends both on the code base as well as the skill of the programmer. Experienced programmers have an easier time understanding what the code does at a high level. Software visualization is sometimes used to speed up this process.
Many software programmers use an integrated development environment
An integrated development environment (IDE) is a Application software, software application that provides comprehensive facilities for software development. An IDE normally consists of at least a source-code editor, build automation tools, an ...
(IDE) to improve their productivity. IDEs typically have several features built in, including a source-code editor
A source-code editor is a text editor program designed specifically for editing source code of computer programs. It may be a standalone application or it may be built into an integrated development environment (IDE).
Features
Source-code editor ...
that can alert the programmer to common errors. Modification often includes code refactoring
In computer programming and software design, code refactoring is the process of restructuring existing source code—changing the '' factoring''—without changing its external behavior. Refactoring is intended to improve the design, structure, ...
(improving the structure without changing functionality) and restructuring (improving structure and functionality at the same time). Nearly every change to code will introduce new bugs or unexpected ripple effects, which require another round of fixes.[
]Code review
Code review (sometimes referred to as peer review) is a software quality assurance activity in which one or more people examine the source code of a computer program, either after implementation or during the development process. The persons perf ...
s by other developers are often used to scrutinize new code added to a project. The purpose of this phase is often to verify that the code meets style and maintainability
Maintainability is the ease of maintaining or providing maintenance for a functioning product or service. Depending on the field, it can have slightly different meanings.
Usage in different fields Engineering
In engineering, maintainability ...
standards and that it is a correct implementation of the software design
Software design is the process of conceptualizing how a software system will work before it is implemented or modified.
Software design also refers to the direct result of the design process the concepts of how the software will work which co ...
. According to some estimates, code review dramatically reduce the number of bugs persisting after software testing
Software testing is the act of checking whether software satisfies expectations.
Software testing can provide objective, independent information about the Quality (business), quality of software and the risk of its failure to a User (computin ...
is complete. Along with software testing that works by executing the code, static program analysis uses automated tools to detect problems with the source code. Many IDEs support code analysis tools, which might provide metrics on the clarity and maintainability of the code. Debuggers are tools that often enable programmers to step through execution while keeping track of which source code corresponds to each change of state.
Compilation and execution
Source code files in a high-level programming language must go through a stage of preprocessing into machine code
In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
before the instructions can be carried out. After being compiled, the program can be saved as an object file
An object file is a file that contains machine code or bytecode, as well as other data and metadata, generated by a compiler or assembler from source code during the compilation or assembly process. The machine code that is generated is kno ...
and the loader (part of the operating system) can take this saved file and execute it as a process
A process is a series or set of activities that interact to produce a result; it may occur once-only or be recurrent or periodic.
Things called a process include:
Business and management
* Business process, activities that produce a specific s ...
on the computer hardware. Some programming languages use an interpreter instead of a compiler. An interpreter converts the program into machine code at run time, which makes them 10 to 100 times slower than compiled programming languages.
Quality
Software quality
In the context of software engineering, software quality refers to two related but distinct notions:
* Software's functional quality reflects how well it complies with or conforms to a given design, based on functional requirements or specificat ...
is an overarching term that can refer to a code's correct and efficient behavior, its reusability and portability, or the ease of modification. It is usually more cost-effective to build quality into the product from the beginning rather than try to add it later in the development process. Higher quality code will reduce lifetime cost to both suppliers and customers as it is more reliable and easier to maintain.
Maintainability is the quality of software enabling it to be easily modified without breaking existing functionality. Following coding conventions such as using clear function and variable names that correspond to their purpose makes maintenance easier. Use of conditional loop statements only if the code could execute more than once, and eliminating code that will never execute can also increase understandability. Many software development organizations neglect maintainability during the development phase, even though it will increase long-term costs. Technical debt is incurred when programmers, often out of laziness or urgency to meet a deadline, choose quick and dirty solutions rather than build maintainability into their code. A common cause is underestimates in software development effort estimation, leading to insufficient resources allocated to development. A challenge with maintainability is that many software engineering
Software engineering is a branch of both computer science and engineering focused on designing, developing, testing, and maintaining Application software, software applications. It involves applying engineering design process, engineering principl ...
courses do not emphasize it. Development engineers who know that they will not be responsible for maintaining the software do not have an incentive to build in maintainability.[
]
Copyright and licensing
The situation varies worldwide, but in the United States before 1974, software and its source code was not copyright
A copyright is a type of intellectual property that gives its owner the exclusive legal right to copy, distribute, adapt, display, and perform a creative work, usually for a limited time. The creative work may be in a literary, artistic, ...
able and therefore always public domain software
Public-domain software is software that has been placed in the public domain, in other words, software for which there is absolutely no ownership such as copyright, trademark, or patent. Software in the public domain can be modified, distributed, ...
. In 1974, the US Commission on New Technological Uses of Copyrighted Works ( CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright".Apple Computer, Inc. v. Franklin Computer Corporation Puts the Byte Back into Copyright Protection for Computer Programs
in Golden Gate University Law Review Volume 14, Issue 2, Article 3 by Jan L. Nussbaum (January 1984)[Lemley, Menell, Merges and Samuelson. ''Software and Internet Law'', p. 34.]
Proprietary software
Proprietary software is computer software, software that grants its creator, publisher, or other rightsholder or rightsholder partner a legal monopoly by modern copyright and intellectual property law to exclude the recipient from freely sharing t ...
is rarely distributed as source code. Although the term open-source software
Open-source software (OSS) is Software, computer software that is released under a Open-source license, license in which the copyright holder grants users the rights to use, study, change, and Software distribution, distribute the software an ...
literally refers to public access to the source code, open-source software has additional requirements: free redistribution, permission to modify the source code and release derivative works under the same license, and nondiscrimination between different uses—including commercial use. The free reusability
In computer programming, reusability describes the quality of a software asset that affects its ability to be used in a software system for which it was ''not'' specifically designed. An asset that is easy to reuse and provides utility is conside ...
of open-source software can speed up development.
See also
* Bytecode
Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normal ...
* Code as data
* Coding conventions
Coding conventions are a set of guidelines for a specific programming language that recommend programming style, practices, and methods for each aspect of a program written in that language. These conventions usually cover file organization, i ...
* Free software
Free software, libre software, libreware sometimes known as freedom-respecting software is computer software distributed open-source license, under terms that allow users to run the software for any purpose as well as to study, change, distribut ...
* Legacy code
* Machine code
In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
* Markup language
A markup language is a Encoding, text-encoding system which specifies the structure and formatting of a document and potentially the relationships among its parts. Markup can control the display of a document or enrich its content to facilitate au ...
* Obfuscated code
* Object code
In computing, object code or object module is the product of an assembler or compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...
* Open-source software
Open-source software (OSS) is Software, computer software that is released under a Open-source license, license in which the copyright holder grants users the rights to use, study, change, and Software distribution, distribute the software an ...
* Package (package management system)
A package manager or package management system is a collection of programming tool, software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer in a consistent manner.
A packag ...
* Programming language
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
* Source code repository
* Syntax highlighting
* Visual programming language
References
Sources
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
External links
{{DEFAULTSORT:Source Code