In computing, source code, or simply code, is any collection of instructions, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source code. The source code is often transformed by an assembler or compiler into binary machine code that can be executed by the computer. The machine code is then available for execution at a later time. Most application software is distributed in a form that includes only executable files. If the source code were included, it would be useful to a user, programmer, or system administrator, any of whom might wish to study or modify the program. Alternatively, depending on the technology being used, source code may be interpreted and executed directly.
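
As a rough illustration of the path from source text to an executable form, the sketch below uses Python, whose built-in compile function turns a source string into a code object that exec can run later. The source string, the "<example>" file name and the function name area are made up for this illustration, and toolchains that target machine code involve more stages than this.

```python
# Source code: plain, human-readable text (a hypothetical example program).
source = """
def area(width, height):
    # Comments are part of the source but have no effect on execution.
    return width * height

result = area(3, 4)
"""

# Translate the text into a code object (Python bytecode); nothing runs yet.
code_object = compile(source, "<example>", "exec")

# Execute the translated form later, in its own namespace.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])  # 12
```

The translated object can be kept and executed any number of times without re-reading the text, which loosely mirrors the separation between producing machine code and executing it at a later time.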


Definitions

Richard Stallman's definition, formulated in his seminal 1989 license, treats source code as whatever form of the software is used to modify it:
The “source code” for a work means the preferred form of the work for making modifications to it.
Some classical sources define source code as the text form of programming languages, for example:
Source code (also referred to as source or code) is the version of software as it is originally written (i.e., typed into a computer) by a human in plain text (i.e., human readable alphanumeric characters).
This reflects the fact that, when program translation first appeared, software was produced in textual programming languages; source code was therefore text, while machine code was the target code. However, as programming pipelines began to incorporate more intermediate forms, some in languages such as JavaScript that can serve as either source or target, text stopped being synonymous with source code. Stallman's definition thus accommodates the source-target ambivalence of JavaScript and HTML, as well as possible future forms of software production, such as visual programming languages or datasets in machine learning. Other, broader interpretations consider source code to include the machine code along with all the high-level languages that produce it. This definition undoes the original machine/text distinction by treating every step in the program translation as source code:
For the purpose of clarity "source code" is taken to mean any fully executable description of a software system. It is therefore so construed as to include machine code, very high level languages and executable graphical representations of systems.
This view allows a much more flexible approach to system analysis, since it does not require the designer to cooperate by publishing a form that is convenient to understand and modify. It can also be applied to scenarios where no designer is needed, as with DNA. However, this form of analysis does not account for machine-to-machine code analysis being costlier than human-to-machine code analysis.


History

The earliest programs for stored-program computers were entered in binary through the front panel switches of the computer. This first-generation programming language had no distinction between source code and machine code.

When IBM first offered software to work with its machines, the source code was provided at no additional charge. At that time, the cost of developing and supporting software was included in the price of the hardware. For decades, until 1983, IBM distributed source code with its software product licenses.

Most early computer magazines published source code as type-in programs. Occasionally the entire source code to a large program is published as a hardback book, such as ''Computers and Typesetting'', vol. B: ''TeX, The Program'' by Donald Knuth, ''PGP Source Code and Internals'' by Philip Zimmermann, ''PC SpeedScript'' by Randy Thompson, and ''µC/OS, The Real-Time Kernel'' by Jean Labrosse.


Organization

The source code that constitutes a program is usually held in one or more text files stored on a computer's hard disk; usually, these files are carefully arranged into a directory tree, known as a source tree. Source code can also be stored in a database (as is common for stored procedures) or elsewhere.

The source code for a particular piece of software may be contained in a single file or many files. Though the practice is uncommon, a program's source code can be written in different programming languages. For example, a program written primarily in the C programming language might have portions written in assembly language for optimization purposes. It is also possible for some components of a piece of software to be written and compiled separately, in an arbitrary programming language, and later integrated into the software using a technique called library linking. In some languages, such as Java, this can be done at runtime (each class is compiled into a separate file that is linked by the interpreter at runtime).

Yet another method is to make the main program an interpreter for a programming language, either designed specifically for the application in question or general-purpose, and then write the bulk of the actual user functionality as macros or other forms of add-ins in this language, an approach taken for example by the GNU Emacs text editor.

The code base of a computer programming project is the larger collection of all the source code of all the computer programs that make up the project. It has become common practice to maintain code bases in version control systems. Moderately complex software customarily requires the compilation or assembly of several, sometimes dozens or even hundreds, of different source code files. In these cases, instructions for compilation, such as a Makefile, are included with the source code. These describe the relationships among the source code files and contain information about how they are to be compiled.
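
The kind of information such build instructions carry can be sketched in a few lines of Python. The example below is only an illustration, not how Make itself is implemented: the rule table, the file names (main.c, util.c, util.h, app) and the helpers needs_rebuild and touch are hypothetical, and the check mirrors the timestamp comparison that Make-style tools are commonly described as performing.

```python
import os
import tempfile

# Hypothetical build rules: each target lists the files it is derived from.
RULES = {
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
    "app": ["main.o", "util.o"],
}


def needs_rebuild(target, rules, root):
    """Return True if target is missing or older than any of its inputs."""
    target_path = os.path.join(root, target)
    if not os.path.exists(target_path):
        return True
    target_time = os.path.getmtime(target_path)
    for dep in rules.get(target, []):
        dep_path = os.path.join(root, dep)
        # A missing or newer dependency makes the target out of date.
        if not os.path.exists(dep_path) or os.path.getmtime(dep_path) > target_time:
            return True
    return False


def touch(path, mtime):
    """Create (or truncate) a file and give it an explicit modification time."""
    with open(path, "w"):
        pass
    os.utime(path, (mtime, mtime))


with tempfile.TemporaryDirectory() as root:
    touch(os.path.join(root, "main.c"), 100)
    touch(os.path.join(root, "util.h"), 100)
    touch(os.path.join(root, "main.o"), 200)  # built after its inputs
    up_to_date = not needs_rebuild("main.o", RULES, root)
    touch(os.path.join(root, "util.h"), 300)  # header edited afterwards
    stale = needs_rebuild("main.o", RULES, root)
    print(up_to_date, stale)  # True True
```

Recording the relationships as data, rather than as a fixed sequence of compiler invocations, is what allows a build tool to recompile only the files affected by a change.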


Purposes

Source code is primarily used as input to the process that produces an executable program (i.e., it is compiled or interpreted). It is also used as a method of communicating algorithms between people, e.g., through code snippets in books (Spinellis, D., ''Code Reading: The Open Source Perspective'', Addison-Wesley Professional, 2003).

Computer programmers often find it helpful to review existing source code to learn about programming techniques. The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills. Some people consider source code an expressive artistic medium.

Porting software to other computer platforms is usually prohibitively difficult without source code. The remaining options, which include binary translation and emulation of the original platform, are generally computationally expensive.

Decompilation of an executable program can be used to generate source code, either in assembly code or in a high-level language.

Programmers frequently adapt source code from one piece of software to use in other projects, a concept known as software reusability.


Legal aspects

The situation varies worldwide, but in the United States before 1974, software and its source code were not copyrightable and were therefore always public domain software.

In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright" (Lemley, Menell, Merges and Samuelson, ''Software and Internet Law'', p. 34). In 1983, in the United States court case ''Apple v. Franklin'', it was ruled that the same applied to object code, and that the Copyright Act gave computer programs the copyright status of literary works.

In 1999, in the United States court case ''Bernstein v. United States'', it was further ruled that source code could be considered a constitutionally protected form of free speech. Proponents of free speech argued that because source code conveys information to programmers, is written in a language, and can be used to share humor and other artistic pursuits, it is a protected form of communication (Alison Dame-Boyle, "EFF at 25: Remembering the Case that established Code as Speech", EFF.org, 16 April 2015).


Licensing

An author of a non-trivial work such as software has several exclusive rights, among them the copyright for the source code and object code. The author has the right and the ability to grant customers and users of the software some of these exclusive rights in the form of a software license. Software, and its accompanying source code, can be associated with several licensing paradigms; the most important distinction is free software versus proprietary software. The licensing terms are declared by including a copyright notice. If no notice is found, then the default of ''All rights reserved'' is implied.

Generally speaking, software is ''free software'' if its users are free to use it for any purpose, study and change its source code, give or sell exact copies of it, and give or sell modified copies of it. Software is ''proprietary'' if it is distributed while the source code is kept secret, or is privately owned and restricted. One of the first software licenses to be published and to explicitly grant these freedoms was the GNU General Public License in 1989; the BSD license is another early example from 1990.

For proprietary software, the provisions of the various copyright laws, trade secrecy and patents are used to keep the source code closed. Additionally, many pieces of retail software come with an end-user license agreement (EULA), which typically prohibits decompilation, reverse engineering, analysis, modification, or circumvention of copy protection. Types of source code protection beyond traditional compilation to object code include code encryption, code obfuscation and code morphing.


Quality

The way a program is written can have important consequences for its maintainers. Coding conventions, which stress readability along with some language-specific conventions, are aimed at the maintenance of the software source code, which involves debugging and updating. Other priorities, such as the speed of the program's execution or the ability to compile the program for multiple architectures, often make code readability a less important consideration, since code ''quality'' generally depends on its ''purpose''.
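
The effect of such conventions can be made concrete with a small, hypothetical example in Python. The two functions below compute the same result, but the second follows common conventions (descriptive names, a docstring, one step per line) that tend to make later debugging and updating easier. The function names and the leap-year task are chosen only for illustration.

```python
def f(y):
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)


def is_leap_year(year):
    """Return True if year is a leap year in the Gregorian calendar."""
    divisible_by_4 = year % 4 == 0
    century_year = year % 100 == 0
    divisible_by_400 = year % 400 == 0
    # Century years are leap years only when divisible by 400.
    return divisible_by_4 and (not century_year or divisible_by_400)


# Both versions agree on every input checked; only the readability differs.
assert all(f(y) == is_leap_year(y) for y in range(1, 3000))
```

Neither version is faster or more portable in any meaningful way here; the difference lies only in how easily a maintainer can confirm what the code is meant to do.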


See also

* Bytecode
* Code as data
* Coding conventions
* Computer code
* Free software
* Legacy code
* Machine code
* Markup language
* Obfuscated code
* Object code
* Open-source software
* Package (package management system)
* Programming language
* Source code repository
* Syntax highlighting
* Visual programming language


Sources

* (VEW04) "Using a Decompiler for Real-World Source Recovery", M. Van Emmerik and T. Waddington, the ''Working Conference on Reverse Engineering'', Delft, Netherlands, 9–12 November 2004

