XProc is an XML transformation language for processing documents in pipelines: chaining conversions and other steps together to achieve the desired results. It can handle documents in XML, HTML, JSON, text and binary formats.
The current (stable) version is 3.0. While XProc 1.0 is a W3C Recommendation, XProc 3.0 is a standard developed by the W3C XProc Next Community Group.
Its main characteristics are:
* XProc is a programming language, expressed in XML, in which you can write pipelines.
* An XProc pipeline takes data as its input (often XML) and passes this through specialized steps to produce end results.
* Steps range from simple ones, like adding attributes, to more complex ones like splitting, combining and pruning documents, transformations with XSLT and XQuery, validations against schemas, etc.
* Within a pipeline you can work with variables, branch, loop, catch errors, etc. Everything is based on the data flowing through.
* XProc pipelines are not limited to a linear succession of steps. They can fork and merge.
* XProc allows you to create custom steps by combining other steps. These custom steps can be used just like any other. Therefore, pipelines and steps are interchangeable concepts in XProc.
* Custom steps can be collected into libraries.
* XProc aids in the housekeeping surrounding the processing, like inspecting directories, reading documents from zip files, writing things to disk, etc.
* There is software that can execute these pipelines, the so-called XProc processors.
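Some of these characteristics (a variable, branching, and a custom step built from a standard step) can be sketched in XProc 3.0 as shown here. This is only an illustrative sketch: the ex namespace, the ex:stamp step, the item element and the checked attribute are hypothetical names chosen for the example.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/steps"
                version="3.0">

  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical custom step, built by combining a standard step -->
  <p:declare-step type="ex:stamp">
    <p:input port="source"/>
    <p:output port="result"/>
    <p:add-attribute attribute-name="checked" attribute-value="true"/>
  </p:declare-step>

  <!-- Variable computed from the document flowing in (item is a hypothetical element) -->
  <p:variable name="count" select="count(//item)"/>

  <!-- Branching: use the custom step only when there are items -->
  <p:choose>
    <p:when test="$count gt 0">
      <ex:stamp/>
    </p:when>
    <p:otherwise>
      <!-- Pass the document through unchanged -->
      <p:identity/>
    </p:otherwise>
  </p:choose>

</p:declare-step>
```

Once declared, the custom ex:stamp step is invoked in the same way as a built-in step such as p:identity.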
Example
The following is a (very) simple XProc pipeline:
* It declares two ''ports'':
** An input port called source. This is where the original document flows in.
** An output port called result. This is where the resulting document flows out.
* The document that comes in through the source port automatically flows into the first step of the pipeline. This p:add-attribute step adds an attribute called timestamp with the current date and time.
* The result of this flows through the p:delete step that removes all attributes called data.
* Since p:delete is the last step, the resulting document flows out through the output result port.
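A pipeline matching this description could be written roughly as follows. This is a sketch assuming XProc 3.0 syntax and the standard attribute-name, attribute-value and match options of the two steps; the exact option values are an assumption based on the description above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <!-- The two ports: documents flow in through source and out through result -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Adds a timestamp attribute to the root element (assumed value expression) -->
  <p:add-attribute attribute-name="timestamp"
                   attribute-value="{current-dateTime()}"/>

  <!-- Removes all attributes called data -->
  <p:delete match="@data"/>

</p:declare-step>
```

No explicit connections are written between the steps: each step reads from the output of the step before it, which is the implicit flow the description relies on.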
So if you supply the following XML document to this pipeline:
- Some data...
It comes out as:
- Some data...
The exact date and time recorded in the timestamp attribute is of course dependent on the date and time the pipeline is executed.
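As a hedged illustration of this behaviour, assume a hypothetical input document in which the root element carries a data attribute (the element name doc and the attribute value are invented for this example):

```xml
<!-- Hypothetical input document -->
<doc data="to-be-removed">Some data...</doc>
```

A pipeline that adds a timestamp attribute to the root element and deletes all data attributes would then produce something like the following, where the timestamp value is illustrative only:

```xml
<!-- Illustrative output; the timestamp value varies per run -->
<doc timestamp="2024-05-01T10:15:30+02:00">Some data...</doc>
```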
Understanding and learning XProc
The learning page of the XProc website contains links to all the learning and reference
materials the XProc community group is aware of. There is a special 101 section with introductory learning materials.
History
Ideas for a programming language for processing XML were there right from the beginnings of XML, at the end of the twentieth century. But it was not until the end of 2005 that the W3C started a working group called the ''XML Processing Model Working Group''. This resulted in the recommendation for XProc 1.0, dated May 11, 2010.
There were various attempts to create working XProc 1.0 processors. The only two currently available as open source products that implement the full 1.0 standard are XML Calabash and MorganaXProc.
After the release of version 1.0, the XProc working group continued debating a next version. Ideas were raised for version 2.0. This was based on a non-XML syntax, which did not gain much support from the community. Engagement in the working group waned and in 2016 it ceased to exist.
In June 2017 the ''XProc Next Community Group'' was founded and started working on a new version, now completely XML-based. Because this was a completely different approach from the 2.0 initiative, the version number was increased to 3.0. A stable version was released on 12 September 2022.
In 2024 the community group started work on a minor update to 3.1.
Implementations
The following processors support the XProc 3.0 standard:
* MorganaXProc-IIIse, maintained by Achim Berndzen. Implements all required and most of the non-required parts of the XProc standard.
* XML Calabash 3, maintained by Norman Walsh. As of 2024, this is under development.
Older versions
The following processors support the XProc 1.0 standard. There were several other XProc 1.0 implementations, but these are either incomplete or no longer maintained.
* XML Calabash, maintained by Norman Walsh. This processor is also integrated in the Oxygen XML Editor product.
* MorganaXProc 1.0, maintained by Achim Berndzen.
Logo
The XProc logo was created by Bethan Tovey-Walsh. The fish in the logo is called ''Kanava'', which is Finnish for pipeline.