PROV (Provenance)

PROV defines a data model, serializations, and definitions to support the interchange of provenance information on the Web. Here provenance includes all "information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness". PROV is a set of recommended standards of the World Wide Web Consortium (W3C). These include its data model, an XML schema for that model, an OWL2 ontology mapping that model to RDF, and a mapping from that ontology to Dublin Core. It also includes a notation for provenance that is easy for humans to read, methods for accessing and querying provenance, and a few other subspecifications.


PROV model overview

The core concepts defined by the PROV model are Entity, Activity and Agent. The remaining concepts are either relationships between these (e.g. Derivation, Usage, Generation) or specializations of them (e.g. Person, Collection, Plan). An Entity captures a thing in the world in a particular state. An entity "was derived from" some other entity, and "was generated by" an Activity that "used" other entities. An Agent (e.g. a person or a software execution) "was associated with" the activity, and the entity that "was generated by" the activity "was attributed to" that agent.
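The relationships described above can be sketched in a few lines of Python. This is only an illustrative model, not part of any PROV specification or library: the class `ProvGraph`, its method names, and the identifiers `ex:raw-data`, `ex:chart`, `ex:plotting` and `ex:alice` are all hypothetical, chosen to mirror the concepts named in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    id: str  # a thing in the world, in a particular state


@dataclass(frozen=True)
class Activity:
    id: str  # something that uses and generates entities


@dataclass(frozen=True)
class Agent:
    id: str  # e.g. a person or a software execution


@dataclass
class ProvGraph:
    """Hypothetical container storing relations as (subject, relation, object)."""
    relations: list = field(default_factory=list)

    def used(self, activity: Activity, entity: Entity) -> None:
        self.relations.append((activity.id, "used", entity.id))

    def was_generated_by(self, entity: Entity, activity: Activity) -> None:
        self.relations.append((entity.id, "wasGeneratedBy", activity.id))

    def was_derived_from(self, derived: Entity, source: Entity) -> None:
        self.relations.append((derived.id, "wasDerivedFrom", source.id))

    def was_associated_with(self, activity: Activity, agent: Agent) -> None:
        self.relations.append((activity.id, "wasAssociatedWith", agent.id))

    def was_attributed_to(self, entity: Entity, agent: Agent) -> None:
        self.relations.append((entity.id, "wasAttributedTo", agent.id))

    def about(self, node_id: str) -> list:
        """Return every relation whose subject is node_id, in insertion order."""
        return [r for r in self.relations if r[0] == node_id]


# Illustrative scenario: an agent runs an activity that turns raw data into a chart.
raw = Entity("ex:raw-data")
chart = Entity("ex:chart")
plotting = Activity("ex:plotting")
alice = Agent("ex:alice")

g = ProvGraph()
g.used(plotting, raw)
g.was_generated_by(chart, plotting)
g.was_derived_from(chart, raw)
g.was_associated_with(plotting, alice)
g.was_attributed_to(chart, alice)

for subject, relation, obj in g.about("ex:chart"):
    print(subject, relation, obj)
```

Storing each relation as a plain triple keeps the sketch close to how the model is later mapped to RDF, but a real implementation would also carry attributes such as times and roles.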


PROV serializations

Provenance statements can be serialized in different PROV formats while expressing the same PROV model. Some PROV types and relationship names vary slightly from the PROV model concepts so that they are idiomatic to the format. For example, PROV-N is a textual format that maps directly to the PROV model:

    document
      prefix ex
      entity(ex:e1)
      activity(ex:a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
      wasGeneratedBy(ex:e1, ex:a2, -)
    endDocument

The same statements can be expressed as XML using the PROV-XML schema, in which the activity's start and end times are written as 2011-11-16T16:00:00.000Z and 2011-11-16T16:00:01.000Z. They can also be expressed using the PROV-O mapping to the OWL2 ontology language, which in turn can be serialized in the RDF format Turtle:

    @prefix prov: .
    @prefix xsd: .
    @prefix ex: .

    ex:e1 a prov:Entity .

    ex:a2 a prov:Activity ;
        prov:startedAtTime "2011-11-16T16:00:00.000Z"^^xsd:dateTime ;
        prov:endedAtTime "2011-11-16T16:00:01.000Z"^^xsd:dateTime .

    ex:e1 prov:wasGeneratedBy ex:a2 .
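To make the idea of "one model, several serializations" concrete, the following minimal sketch renders a small in-memory list of statements as PROV-N text. It is a hand-rolled illustration using only the Python standard library, not the API of any PROV tool: the function `render_provn`, the tuple layout of `statements`, and the namespace IRI `http://example.com/` are assumptions introduced here.

```python
def render_provn(prefixes, statements):
    """Render statements as a PROV-N document string.

    prefixes:   dict mapping a prefix name to a namespace IRI.
    statements: list of tuples (kind, arg1, arg2, ...); a None argument
                is written as the PROV-N placeholder "-".
    """
    lines = ["document"]
    for name, iri in prefixes.items():
        lines.append(f"  prefix {name} <{iri}>")
    for kind, *args in statements:
        rendered = ", ".join("-" if a is None else str(a) for a in args)
        lines.append(f"  {kind}({rendered})")
    lines.append("endDocument")
    return "\n".join(lines)


# The same three statements as in the PROV-N example above.
statements = [
    ("entity", "ex:e1"),
    ("activity", "ex:a2", "2011-11-16T16:00:00", "2011-11-16T16:00:01"),
    ("wasGeneratedBy", "ex:e1", "ex:a2", None),
]

# The IRI below is a hypothetical placeholder namespace for illustration only.
provn = render_provn({"ex": "http://example.com/"}, statements)
print(provn)
```

A second function taking the same `statements` list could emit Turtle or XML instead, which is the sense in which the formats share one underlying model.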


Tooling

Software tools have been developed to convert between PROV formats and to generate and parse PROV documents in different programming languages:
PROV Translator - web service
PROV Toolbox - Java API and command line tool
PROV Python library - Python API

