In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
CDR coding is a
compressed data representation for
Lisp
Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation.
Originally specified in the late 1950s, ...
linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
s. It was developed and patented by the
MIT Artificial Intelligence Laboratory, and implemented in
computer
A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
hardware in a number of
Lisp machines derived from the MIT
CADR.
CDR coding is in fact a fairly general idea; whenever a data object ''A'' ends in a
reference
A reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a ''nam ...
to another data structure ''B'', we can instead place the structure ''B'' itself there, overlapping and running off the end of ''A''. By doing this we free the space required by the reference, which can add up if done many times, and also improve
locality of reference
In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
, enhancing performance on modern machines. The transformation is especially effective for the
cons
In computer programming, ( or ) is a fundamental function in most dialects of the Lisp programming language. ''constructs'' memory objects which hold two values or pointers to two values. These objects are referred to as (cons) cells, conses, ...
-based lists it was created for; we free about half of the space for each node we perform this transformation on.
It is not always possible to perform this substitution, because there might not be a large enough chunk of free space beyond the end of A. Thus, some objects will end in a real reference, and some with the referenced object, and the machine must be able to tell by reading the final cell which one it is. This can be accomplished with some inefficiency in software by the use of
tagged pointers, which allow a pointer in a final position to be specifically tagged as such, but is best done in hardware.
In the presence of
mutable objects, CDR coding becomes more complex. If a reference is updated to point to another object, but currently has an object stored in that field, the object must be relocated, along with any other pointers to it. Not only are such moves typically expensive or impossible, but over time they cause
fragmentation of the store. This problem is typically avoided by using CDR coding only on
immutable data structures.
External links
*
* L. Peter Deutsch: A LISP Machine with Very Compact Programs. IJCAI 1973, Pages 697 - 703
* Greenblatt, R., LISP Machine Progress Report, memo 444, A.I.Lab., M.I.T., Cambridge, Mass., Aug. 1977.
* L. Peter Deutsch: Experience with a microprogrammed Interlisp system. MICRO 11: Proceedings of the 11th annual workshop on Microprogramming, November 1978, Pages 128–129
* Pages 399-401
Lisp (programming language)
Data compression
{{compu-prog-stub