XML Information Set (XML Infoset) is a W3C specification that defines an abstract data model of an XML document in terms of a set of information items. The XML Infoset provides a standardized way to refer to the components of XML documents, serving as a foundation for XML-related standards and tools. The XML Infoset identifies eleven different types of information items, including the document, elements, attributes, processing instructions, characters, and namespaces. Each information item has a set of named properties, which represent specific aspects of the XML document being modeled. For example, an element information item has properties such as the element's namespace name, local name, children, and attributes.

An XML document has an information set if it is well-formed and satisfies the namespace constraints. There is no requirement for an XML document to be valid according to a DTD or XML Schema in order to have an information set.

XML was initially developed without a formal definition of its infoset. This conceptual foundation was only formalized by later work beginning in 1999, first published as a separate W3C Working Draft at the end of December that year. The Infoset Recommendation Second Edition was adopted on February 4, 2004. The XML Information Set specification has become a cornerstone of the XML technology stack, enabling higher-level specifications such as XPath, XSLT, DOM, XQuery, and many others to describe their functionality in terms of the XML Infoset rather than the concrete XML syntax. This abstraction allows these technologies to operate on XML content regardless of its specific serialization format. If a 2.0 version of the XML standard is ever published, it is likely that this would absorb the Infoset recommendation as an integral part of that standard.
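The element properties named above can be illustrated with a minimal Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser as a stand-in for an infoset-aware processor; the helper `element_properties`, the sample documents, and the `urn:example:books` namespace are hypothetical. ElementTree exposes only part of the infoset (it drops prefixes, for instance, and this helper lists only element children, not character items), so the comparison covers just the properties shown.

```python
import xml.etree.ElementTree as ET


def element_properties(elem):
    """Return a few infoset-style properties of an element as a dict."""
    # ElementTree stores a qualified name as "{namespace name}local name".
    if elem.tag.startswith("{"):
        namespace_name, local_name = elem.tag[1:].split("}", 1)
    else:
        namespace_name, local_name = None, elem.tag
    return {
        "namespace name": namespace_name,
        "local name": local_name,
        # Attributes form an unordered set, so a dict is a fair model.
        "attributes": dict(elem.attrib),
        # Only element children are listed here (a simplification).
        "children": [child.tag for child in elem],
    }


# Two serializations that differ only in concrete syntax: quote style,
# attribute order, whitespace inside tags, and empty-element tags.
doc_a = '<p:book xmlns:p="urn:example:books" id="b1"><p:title/></p:book>'
doc_b = "<p:book id = 'b1' xmlns:p='urn:example:books' ><p:title></p:title></p:book>"

props_a = element_properties(ET.fromstring(doc_a))
props_b = element_properties(ET.fromstring(doc_b))

print(props_a)
print(props_a == props_b)  # True
```

Both documents yield the same values for the properties inspected, which is the sense in which tools defined over the infoset can ignore purely syntactic differences.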


Information items

An information set can contain up to eleven different types of information items:
1. The Document Information Item (always present)
2. Element Information Items
3. Attribute Information Items
4. Processing Instruction Information Items
5. Unexpanded Entity Reference Information Items
6. Character Information Items
7. Comment Information Items
8. The Document Type Declaration Information Item
9. Unparsed Entity Information Items
10. Notation Information Items
11. Namespace Information Items
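As a rough illustration of how some of these item types appear in a small document, the following Python sketch walks a tree built by the standard-library `xml.dom.minidom` and tallies nodes by kind. The mapping from DOM node types to information item types is an approximation assumed for this example: the DOM groups adjacent characters into text nodes, whereas the infoset has one information item per character, and the sample document deliberately avoids namespaces, entities, and a document type declaration. The names `ITEM_KIND` and `count_items` are hypothetical.

```python
from collections import Counter
from xml.dom import Node, minidom

# Assumed, approximate mapping from DOM node types to infoset item kinds.
ITEM_KIND = {
    Node.DOCUMENT_NODE: "document",
    Node.DOCUMENT_TYPE_NODE: "document type declaration",
    Node.ELEMENT_NODE: "element",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.COMMENT_NODE: "comment",
}


def count_items(node, counts=None):
    """Recursively tally information items reachable from a DOM node."""
    counts = Counter() if counts is None else counts
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        # One character information item per character of text.
        counts["character"] += len(node.data)
    elif node.nodeType in ITEM_KIND:
        counts[ITEM_KIND[node.nodeType]] += 1
    if node.nodeType == Node.ELEMENT_NODE:
        # No namespace declarations occur in the sample, so every DOM
        # attribute corresponds to an attribute information item.
        counts["attribute"] += len(node.attributes)
    for child in node.childNodes:
        count_items(child, counts)
    return counts


sample = (
    "<!-- note --><?render fast?>"
    '<list kind="todo"><item>ab</item><item>c</item></list>'
)
counts = count_items(minidom.parseString(sample))
for kind, number in sorted(counts.items()):
    print(kind, number)
```

For this sample the tally is one document item, one comment, one processing instruction, three elements, one attribute, and three characters.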


Infoset augmentation

Infoset augmentation or infoset modification refers to the process of modifying the infoset during schema validation, for example by adding default attributes. The augmented infoset is called the post-schema-validation infoset, or PSVI. Infoset augmentation is somewhat controversial, with claims that it is a violation of modularity and tends to cause interoperability problems, since applications get different information depending on whether or not validation has been performed. Infoset augmentation is supported by XML Schema but not RELAX NG.
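The interoperability concern can be made concrete with a toy Python sketch. It is not a schema validator and does not produce a real PSVI (which also carries information such as type and validity); it only imitates the "adding default attributes" step with a hypothetical table, `ATTRIBUTE_DEFAULTS`, standing in for defaults a schema might declare. The function name `augment` and the sample document are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for attribute defaults a schema might declare:
# element name -> {attribute name: default value}.
ATTRIBUTE_DEFAULTS = {"item": {"priority": "normal"}}


def augment(root, defaults):
    """Add missing default attributes in place; return how many were added."""
    added = 0
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
                added += 1
    return added


source = '<list><item/><item priority="high"/></list>'

# One application reads the document without the augmentation step...
plain = ET.fromstring(source)
# ...another reads the same document with it.
augmented = ET.fromstring(source)
added = augment(augmented, ATTRIBUTE_DEFAULTS)

before = [item.get("priority") for item in plain.iter("item")]
after = [item.get("priority") for item in augmented.iter("item")]
print(before)  # [None, 'high']
print(after)   # ['normal', 'high']
```

The two applications see different attribute sets for the same source text, which is the kind of divergence the criticism describes.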


Serialization

Typically, an XML Information Set is serialized as XML. There are also serialization formats for Binary XML, CSV, and JSON (see, for example, Apache CXF JSON Support).
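A minimal Python sketch of a non-XML serialization follows. The JSON layout used here (an object with `name`, `attributes`, and an ordered `children` list mixing strings and nested objects) is an ad hoc convention assumed for this example; it is not the mapping used by Apache CXF or by any published convention, and it covers only elements, attributes, and character data. The function name `to_jsonable` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def to_jsonable(elem):
    """Convert an element subtree to plain dicts, lists, and strings."""
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        children.append(to_jsonable(child))
        # Text following a child belongs to the parent's content, so it is
        # kept in order to preserve mixed content.
        if child.tail:
            children.append(child.tail)
    return {
        "name": elem.tag,
        "attributes": dict(elem.attrib),
        "children": children,
    }


root = ET.fromstring('<note lang="en">Hi <b>there</b>!</note>')
as_json = json.dumps(to_jsonable(root), indent=2)
print(as_json)
```

Because the output is built from the parsed tree rather than from the source text, any XML serialization with the same elements, attributes, and character data would produce the same JSON under this convention.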


See also

XML Information Set instances:
* Document Object Model
* XPath data model
* SXML

