Data As Code
   HOME

TheInfoList



OR:

In computer science, the expression code as data refers to the idea that
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
written in a
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
can be manipulated as data, such as a sequence of characters or an
abstract syntax tree An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
(AST), and it has an
execution Capital punishment, also known as the death penalty and formerly called judicial homicide, is the state-sanctioned killing of a person as punishment for actual or supposed misconduct. The sentence ordering that an offender be punished in ...
semantics only in the context of a given
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
or
interpreter Interpreting is translation from a spoken or signed language into another language, usually in real time to facilitate live communication. It is distinguished from the translation of a written text, which can be more deliberative and make use o ...
. The notion is often used in the context of
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
-like languages that use S-expressions as their main syntax, as writing programs using nested lists of symbols makes the interpretation of the program as an AST quite transparent (a property known as
homoiconicity In computer programming, homoiconicity (from the Greek words ''homo-'' meaning "the same" and ''icon'' meaning "representation") is an informal property of some programming languages. A language is homoiconic if a program written in it can be mani ...
). These ideas are generally used in the context of what is called
metaprogramming Metaprogramming is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modi ...
, writing programs that treat other programs as their data. For example, code-as-data allows the
serialization In computing, serialization (or serialisation, also referred to as pickling in Python (programming language), Python) is the process of translating a data structure or object (computer science), object state into a format that can be stored (e. ...
of
first-class functions In computer science, a programming language is said to have first-class functions if it treats function (programming), functions as first-class citizens. This means the language supports passing functions as arguments to other functions, returning ...
in a portable manner. Another use case is storing a program in a string, which is then processed by a compiler to produce an executable. More often there is a reflection API that exposes the structure of a program as an object within the language, reducing the possibility of creating a malformed program. In
computational theory In theoretical computer science and mathematics, the theory of computation is the branch that deals with what problems can be solved on a model of computation, using an algorithm, how efficiently they can be solved or to what degree (e.g., app ...
,
Kleene's second recursion theorem In computability theory, Kleene's recursion theorems are a pair of fundamental results about the application of computable functions to their own descriptions. The theorems were first proved by Stephen Kleene in 1938 and appear in his 1952 ...
provides a form of code-is-data, by proving that a program can have access to its own source code. Code-as-data is also a principle of the
Von Neumann architecture The von Neumann architecture—also known as the von Neumann model or Princeton architecture—is a computer architecture based on the '' First Draft of a Report on the EDVAC'', written by John von Neumann in 1945, describing designs discus ...
, since
stored programs A stored-program computer is a computer that stores program instructions in electronically, electromagnetically, or optically accessible memory. This contrasts with systems that stored the program instructions with plugboards or similar mechanis ...
and data are both represented as bits in the same memory device. This architecture offers the ability to write
self-modifying code In computer science, self-modifying code (SMC or SMoC) is source code, code that alters its own instruction (computer science), instructions while it is execution (computing), executing – usually to reduce the instruction path length and imp ...
. It also opens the security risk of disguising a malicious program as user data and then using an exploit to direct execution to the malicious program.


Data as Code

In
declarative programming In computer science, declarative programming is a programming paradigm—a style of building the structure and elements of computer programs—that expresses the logic of a computation without describing its control flow. Many languages that ap ...
, the Data as Code (DaC) principle refers to the idea that an arbitrary data structure can be exposed using a specialized language semantics or API. For example, a list of integers or a string is data, but in languages such as Lisp and Perl, they can be directly entered and evaluated as code.
Configuration scripts A configuration file, a.k.a. config file, is a file that stores data used to configure a software system such as an application, a server or an operating system. Some applications provide a tool to create, modify, and verify the syntax of the ...
,
domain-specific language A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains. There are a wide variety of DSLs, ranging ...
s and
markup language A markup language is a Encoding, text-encoding system which specifies the structure and formatting of a document and potentially the relationships among its parts. Markup can control the display of a document or enrich its content to facilitate au ...
s are cases where program execution is controlled by data elements that are not clearly sequences of commands.


References

Programming language topics {{Compu-prog-stub