In
computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...
, a meta-circular evaluator (MCE) or meta-circular interpreter (MCI) is an
interpreter
Interpreting is translation from a spoken or signed language into another language, usually in real time to facilitate live communication. It is distinguished from the translation of a written text, which can be more deliberative and make use o ...
which defines each feature of the interpreted language using a similar facility of the interpreter's host language. For example, interpreting a lambda application may be implemented using function application.
Meta-circular evaluation is most prominent in the context of
Lisp
Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation.
Originally specified in the late 1950s, ...
.
[
] A self-interpreter is a meta-circular interpreter where the interpreted language is nearly identical to the host language; the two terms are often used synonymously.[
]
History
The dissertation of Corrado Böhm
Corrado Böhm (17 January 1923 – 23 October 2017) was an Italian computer scientist and Professor Emeritus at the Sapienza University of Rome, University of Rome "La Sapienza", known especially for his contributions to the theory of structured ...
[
]
describes the design of a self-hosting compiler. Due to the difficulty of compiling higher-order function In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:
* takes one or more functions as arguments (i.e. a procedural parameter, which is a parameter of a procedure that is itself ...
s, many languages were instead defined via interpreters, most prominently Lisp.[ The term itself was coined by ]John C. Reynolds
John Charles Reynolds (June 1, 1935 – April 28, 2013) was an American computer scientist.
Education and affiliations
John Reynolds studied at Purdue University and then earned a Doctor of Philosophy (Ph.D.) in theoretical physics from Harvard U ...
,[ and popularized through its use in the book '' Structure and Interpretation of Computer Programs''.]
Self-interpreters
A self-interpreter is a meta-circular interpreter where the host language is also the language being interpreted. A self-interpreter displays a universal function for the language in question, and can be helpful in learning certain aspects of the language. A self-interpreter will provide a circular, vacuous definition of most language constructs and thus provides little insight into the interpreted language's semantics, for example evaluation strategy
In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a ''parameter-passing strategy'' that defines the kind of value that is passed to the ...
. Addressing these issues produces the more general notion of a "definitional interpreter".[
]
From self-interpreter to abstract machine
This part is based on Section 3.2.4 of Danvy's thesis.
Here is the core of a self-evaluator for the calculus. The abstract syntax of the calculus is implemented as follows in OCaml
OCaml ( , formerly Objective Caml) is a General-purpose programming language, general-purpose, High-level programming language, high-level, Comparison of multi-paradigm programming languages, multi-paradigm programming language which extends the ...
, representing variables with their de Bruijn index
In mathematical logic, the de Bruijn index is a tool invented by the Dutch mathematician Nicolaas Govert de Bruijn for representing terms of lambda calculus without naming the bound variables. Terms written using these indices are invariant with ...
, i.e., with their lexical offset (starting from 0):
type term = IND of int (* de Bruijn index *)
, ABS of term
, APP of term * term
The evaluator uses an environment:
type value = FUN of (value -> value)
let rec eval (t : term) (e : value list) : value =
match t with
IND n ->
List.nth e n
, ABS t' ->
FUN (fun v -> eval t' (v :: e))
, APP (t0, t1) ->
apply (eval t0 e) (eval t1 e)
and apply (FUN f : value) (a : value) =
f a
let main (t : term) : value =
eval t []
Values (of type value
) conflate expressible values (the result of evaluating an expression in an environment) and denotable values (the values denoted by variables in the environment), a terminology that is due to Christopher Strachey
Christopher S. Strachey (; 16 November 1916 – 18 May 1975) was a British computer scientist. He was one of the founders of denotational semantics, and a pioneer in programming language design and computer time-sharing.F. J. Corbató, et al., T ...
.[
][
]
Environments are represented as lists of denotable values.
The core evaluator has three clauses:
* It maps a variable (represented with a de Bruijn index) into the value in the current environment at this index.
* It maps a syntactic function into a semantic function. (Applying a semantic function to an argument reduces to evaluating the body of the corresponding syntactic function in its lexical environment, extended with the argument.)
* It maps a syntactic application into a semantic application.
This evaluator is compositional in that
each of its recursive calls is made over a proper sub-part of the given
term. It is also higher order since the domain
of values is a function space.
In "Definitional Interpreters", Reynolds answered the question as to whether such a self-interpreter is well defined.
He answered in the negative because the evaluation strategy
In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a ''parameter-passing strategy'' that defines the kind of value that is passed to the ...
of the defined language (the source language) is determined by the evaluation strategy of the defining language (the meta-language).
If the meta-language follows call by value (as OCaml does), the source language follows call by value.
If the meta-language follows call by name (as Algol 60
ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a ...
does), the source language follows call by name.
And if the meta-language follows call by need (as Haskell
Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
does), the source language follows call by need.
In "Definitional Interpreters", Reynolds made a self-interpreter well defined by making it independent of the evaluation strategy of its defining language.
He fixed the evaluation strategy by transforming the self-interpreter into continuation-passing style, which is evaluation-strategy independent, as later captured in Gordon Plotkin
Gordon David Plotkin (born 9 September 1946) is a theoretical computer scientist in the School of Informatics at the University of Edinburgh. Plotkin is probably best known for his introduction of structural operational semantics (SOS) and his ...
's Independence Theorems.[
]
Furthermore, because logical relations had yet to be discovered, Reynolds made the resulting continuation-passing evaluator first order by (1) closure-converting it and (2) defunctionalizing the continuation.
He pointed out the "machine-like quality" of the resulting interpreter, which is the origin of the CEK machine since Reynolds's CPS transformation was for call by value.[
]
For call by name, these transformations map the self-interpreter to an early instance of the Krivine machine.[
]
The SECD machine and many other abstract machines can be inter-derived this way.[
][
]
It is remarkable that the three most famous abstract machines for the calculus functionally correspond to the same self-interpreter.
Self-interpretation in total programming languages
Total functional programming
Total functional programming (also known as strong functional programming, to be contrasted with ordinary, or ''weak'' functional programming) is a programming paradigm that restricts the range of programs to those that are provably terminating.
...
languages that are strongly normalizing cannot be Turing complete
Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist. He was highly influential in the development of theoretical comput ...
, otherwise one could solve the halting problem by seeing if the program type-checks. That means that there are computable functions that cannot be defined in the total language. In particular it is impossible to define a self-interpreter in a total programming language, for example in any of the typed lambda calculi such as the simply typed lambda calculus
The simply typed lambda calculus (), a form
of type theory, is a typed interpretation of the lambda calculus with only one type constructor () that builds function types. It is the canonical and simplest example of a typed lambda calculus. The ...
, Jean-Yves Girard
Jean-Yves Girard (; born 1947) is a French logician working in proof theory. He is a research director (emeritus) at the mathematical institute of University of Aix-Marseille, at Luminy.
Biography
Jean-Yves Girard is an alumnus of the Éc ...
's System F
System F (also polymorphic lambda calculus or second-order lambda calculus) is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorph ...
, or Thierry Coquand's calculus of constructions
In mathematical logic and computer science, the calculus of constructions (CoC) is a type theory created by Thierry Coquand. It can serve as both a typed programming language and as constructive foundation for mathematics. For this second reaso ...
. Here, by "self-interpreter" we mean a program that takes a source term representation in some plain format (such as a string of characters) and returns a representation of the corresponding normalized term. This impossibility result does not hold for other definitions of "self-interpreter".
For example, some authors have referred to functions of type as self-interpreters, where is the type of representations of -typed terms. To avoid confusion, we will refer to these functions as ''self-recognizers''.
Brown and Palsberg showed that self-recognizers could be defined in several strongly-normalizing languages, including System F
System F (also polymorphic lambda calculus or second-order lambda calculus) is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorph ...
and System Fω.
This turned out to be possible because the types of encoded terms being reflected in the types of their representations prevents constructing a diagonal argument Diagonal argument can refer to:
* Diagonal argument (proof technique), proof techniques used in mathematics.
A diagonal argument, in mathematics, is a technique employed in the proofs of the following theorems:
*Cantor's diagonal argument (the ea ...
.
In their paper, Brown and Palsberg claim to disprove the "conventional wisdom" that self-interpretation is impossible (and they refer to Wikipedia as an example of the conventional wisdom), but what they actually disprove is the impossibility of self-recognizers, a distinct concept. In their follow-up work, they switch to the more specific "self-recognizer" terminology used here, notably distinguishing these from "self-evaluators", of type .
They also recognize that implementing self-evaluation seems harder than self-recognition, and leave the implementation of the former in a strongly-normalizing language as an open problem.
Uses
In combination with an existing language implementation, meta-circular interpreters provide a baseline system from which to extend a language, either upwards by adding more features or downwards by compiling away features rather than interpreting them. They are also useful for writing tools that are tightly integrated with the programming language, such as sophisticated debuggers. A language designed with a meta-circular implementation in mind is often more suited for building languages in general, even ones completely different from the host language.
Examples
Many languages have one or more meta-circular implementations. Here below is a partial list.
Some languages with a meta-circular implementation designed from the bottom up, in grouped chronological order:
* Lisp
Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation.
Originally specified in the late 1950s, ...
, 1958
** Scheme, 1975
*** Pico, 1997Meta-circular implementation of the Pico programming language
/ref>
*** ActorScript, 2009?
** Clojure
Clojure (, like ''closure'') is a dynamic programming language, dynamic and functional programming, functional dialect (computing), dialect of the programming language Lisp (programming language), Lisp on the Java (software platform), Java platfo ...
, 2007
* Forth, 1968
** PostScript
PostScript (PS) is a page description language and dynamically typed, stack-based programming language. It is most commonly used in the electronic publishing and desktop publishing realm, but as a Turing complete programming language, it c ...
, 1982
* Prolog
Prolog is a logic programming language that has its origins in artificial intelligence, automated theorem proving, and computational linguistics.
Prolog has its roots in first-order logic, a formal logic. Unlike many other programming language ...
, 1972
* TeX
Tex, TeX, TEX, may refer to:
People and fictional characters
* Tex (nickname), a list of people and fictional characters with the nickname
* Tex Earnhardt (1930–2020), U.S. businessman
* Joe Tex (1933–1982), stage name of American soul singer ...
, based on virgin TeX, 1978
* Smalltalk
Smalltalk is a purely object oriented programming language (OOP) that was originally created in the 1970s for educational use, specifically for constructionist learning, but later found use in business. It was created at Xerox PARC by Learni ...
, 1980
* Rebol, 1997
** Red, 2011
* Factor, 2003
Some languages with a meta-circular implementation via third parties:
* Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
via Jikes RVM, Squawk, Maxine o
GraalVM's Espresso
* Scala vi
Metascala
* JavaScript
JavaScript (), often abbreviated as JS, is a programming language and core technology of the World Wide Web, alongside HTML and CSS. Ninety-nine percent of websites use JavaScript on the client side for webpage behavior.
Web browsers have ...
vi
Narcissus
o
JS-Interpreter
* Oz via Glinda
* Python via PyPy
* Ruby
Ruby is a pinkish-red-to-blood-red-colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapph ...
via Rubinius
* Lua via Metalua
See also
* M-expression
* Homoiconicity
In computer programming, homoiconicity (from the Greek words ''homo-'' meaning "the same" and ''icon'' meaning "representation") is an informal property of some programming languages. A language is homoiconic if a program written in it can be mani ...
* Self-hosting compiler
References
{{reflist
External links
Structure and Interpretation of Computer Programs (SICP)
online version of full book, accessed 2009-01-18.
Metascala
Programming language implementation