In communications and computing, a machine-readable medium (or computer-readable medium) is a medium capable of storing data in a format easily readable by a digital computer or a sensor. It contrasts with ''human-readable'' medium and data.
The result is called machine-readable data or computer-readable data, and the data itself can be described as having machine-readability.
Data
Machine-readable data must be structured data.
Attempts to create machine-readable data occurred as early as the 1960s. At the same time that seminal developments in machine reading and natural-language processing were being released (such as Weizenbaum's ELIZA), people were anticipating the success of machine-readable functionality and attempting to create machine-readable documents. One such example was musicologist Nancy B. Reich's creation of a machine-readable catalog of composer William Jay Sydeman's works in 1966.
In the United States, the OPEN Government Data Act of 14 January 2019 defines machine-readable data as "data in a format that can be easily processed by a computer without human intervention while ensuring no semantic meaning is lost." The law directs U.S. federal agencies to publish public data in such a manner, ensuring that "any public data asset of the agency is machine-readable".
Machine-readable data may be classified into two groups: human-readable data that is marked up so that it can also be read by machines (e.g. microformats, RDFa, HTML), and data file formats intended principally for processing by machines (CSV, RDF, XML, JSON). These formats are only machine-readable if the data contained within them is formally structured; exporting a CSV file from a badly structured spreadsheet does not meet the definition.
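The difference between a formally structured CSV export and a badly structured one can be illustrated with a minimal Python sketch. The sample data, the field names, and the helper `looks_structured` are all hypothetical and chosen only for illustration; the check shown (one header row, every field filled in every row) is one simple assumption about what "formally structured" might mean, not a standard definition.

```python
import csv
import io
import json

# Hypothetical export of a well-structured table:
# one header row, then one complete record per row.
STRUCTURED_CSV = """station,date,temperature_c
Oslo,2024-01-01,-3.5
Lima,2024-01-01,22.1
"""

# Hypothetical export of a badly structured spreadsheet: a title row,
# a blank row, and a free-text note are mixed in with the data.
UNSTRUCTURED_CSV = """Weather report (draft),,
,,
station,date,temperature_c
Oslo,2024-01-01,-3.5
Note: Lima reading missing,,
"""

FIELDS = ["station", "date", "temperature_c"]


def read_records(text):
    """Parse CSV text into a list of dicts keyed by the first row."""
    return list(csv.DictReader(io.StringIO(text)))


def looks_structured(text, required_fields):
    """Return True if the header matches and every row fills every field."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != list(required_fields):
        return False
    return all(all(row[name] for name in required_fields) for row in reader)


if __name__ == "__main__":
    print(looks_structured(STRUCTURED_CSV, FIELDS))
    print(looks_structured(UNSTRUCTURED_CSV, FIELDS))
    # Structured records convert directly to another machine-oriented format.
    print(json.dumps(read_records(STRUCTURED_CSV), indent=2))
```

Both files are valid CSV and both can be opened by a person in a spreadsheet program, but only the first yields consistent records that a program can process without manual cleanup.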
''Machine readable'' is not synonymous with ''digitally accessible''. A digitally accessible document may be online, making it easier for humans to access via computers, but its content is much harder to extract, transform, and process via computer programming logic if it is not machine-readable.
Extensible Markup Language (XML) is designed to be both human- and machine-readable, and Extensible Stylesheet Language Transformations (XSLT) is used to improve the presentation of the data for human readability. For example, XSLT can be used to automatically render XML in Portable Document Format (PDF). Machine-readable data can be automatically transformed for human readability but, generally speaking, the reverse is not true.
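As a rough illustration of transforming machine-readable data for human readability, the following Python sketch renders a small XML catalog as plain text. It uses the standard library's `xml.etree.ElementTree` rather than XSLT, so it is a stand-in for the idea and not an XSLT example; the element names, attribute names, and catalog entries are invented for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the structure and entries are made up for illustration.
CATALOG_XML = """<catalog>
  <work year="1966"><title>Example Piece A</title><composer>Composer One</composer></work>
  <work year="1970"><title>Example Piece B</title><composer>Composer Two</composer></work>
</catalog>"""


def render_catalog(xml_text):
    """Turn each <work> element into one human-readable line of text."""
    root = ET.fromstring(xml_text)
    lines = []
    for work in root.findall("work"):
        title = work.findtext("title", default="(untitled)")
        composer = work.findtext("composer", default="(unknown)")
        year = work.get("year", "n.d.")
        lines.append(f"{title}, by {composer} ({year})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_catalog(CATALOG_XML))
```

Going the other way, from the rendered lines back to the tagged structure, would require guessing where the title ends and the composer begins, which is why the reverse transformation is generally not automatic.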
For purposes of implementation of the Government Performance and Results Act (GPRA) Modernization Act, the Office of Management and Budget (OMB) defines "machine readable format" as follows: "Format in a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. As HTML is a structural markup language, discreetly labeling parts of the document, computers are able to gather document components to assemble tables of contents, outlines, literature search bibliographies, etc. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements." (OMB Circular A-11, Part 6, Preparation, Submission, and Execution of the Budget)
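The point about HTML's structural markup can be sketched in Python: because headings are explicitly labeled, a program can collect them into an outline without interpreting the prose. The class `HeadingCollector`, the function `table_of_contents`, and the sample document are hypothetical names and data for this sketch; it relies only on the standard library's `html.parser.HTMLParser` and assumes simple, non-nested headings.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1 to h6 elements."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def table_of_contents(html_text):
    """Return an indented outline built from the document's headings."""
    parser = HeadingCollector()
    parser.feed(html_text)
    parser.close()
    return "\n".join("  " * (level - 1) + text for level, text in parser.headings)


# Hypothetical sample document.
SAMPLE_HTML = """<h1>Machine-readable data</h1>
<p>Introductory text that the outline ignores.</p>
<h2>Data</h2>
<h2>Media</h2>
<h3>Examples</h3>"""

if __name__ == "__main__":
    print(table_of_contents(SAMPLE_HTML))
```

The same text saved as a word processing document with headings formatted only by font size would give a program nothing equivalent to the `h1` to `h6` labels to work from.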
Media
Examples of machine-readable media include magnetic media such as magnetic disks, cards, tapes, and drums; punched cards and paper tapes; optical discs; barcodes; and magnetic ink characters.
Common machine-readable technologies include magnetic recording, processing waveforms, and barcodes. Optical character recognition (OCR) can be used to enable machines to read information available to humans. Any information retrievable by any form of energy can be machine-readable.
Examples include:
* Acoustics
* Chemical
** Photochemical
* Electrical
** Semiconductor used in volatile RAM microchips
** Floating-gate transistor used in non-volatile memory cards
** Radio transmission
* Magnetic storage
* Mechanical
** Pins and holes
*** Punched card
*** Paper tape
**** Music roll
*** Music box cylinder or disk
** Grooves ''(See also: Audio Data)''
*** Phonograph cylinder
*** Gramophone record
*** Dictabelt (groove on plastic belt)
*** Capacitance Electronic Disc
* Optics
** Optical storage
* Thermodynamic
Applications
Documents
Catalogs
Dictionaries
Passports
See also
* Paper data storage
* Symmetric Phase Recording
* Open data
* Linked data
* Human-readable medium and data
* Semantic Web
* Machine-readable postal marking