SPARQL (pronounced "sparkle", a recursive acronym for SPARQL Protocol and RDF Query Language) is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format. It was made a standard by the RDF Data Access Working Group (DAWG) of the World Wide Web Consortium (W3C), and is recognized as one of the key technologies of the semantic web. SPARQL 1.0 became an official W3C Recommendation on 15 January 2008, and SPARQL 1.1 followed in March 2013.

SPARQL allows a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns. Implementations exist for multiple programming languages. Tools such as ViziQuer can connect to a SPARQL endpoint and semi-automatically construct a SPARQL query for it. Other tools translate SPARQL queries into other query languages, for example SQL and XQuery.


Advantages

SPARQL allows users to write queries against what can loosely be called "key-value" data or, more specifically, data that follow the RDF specification of the W3C. The entire database is thus a set of "subject-predicate-object" triples. This is analogous to the "document-key-value" terminology used by some NoSQL databases, such as MongoDB.

In SQL relational database terms, RDF data can also be considered a table with three columns: the subject column, the predicate column, and the object column. The subject in RDF is analogous to an entity in a SQL database, where the data elements (or fields) for a given business object are placed in multiple columns, sometimes spread across more than one table, and identified by a unique key. In RDF, those fields are instead represented as separate predicate/object rows sharing the same subject, often the same unique key, with the predicate being analogous to the column name and the object to the actual data. Unlike relational databases, the object column is heterogeneous: the per-cell data type is usually implied (or specified in the ontology) by the predicate value. Also unlike SQL, RDF can have multiple entries per predicate; for instance, one could have multiple "child" entries for a single "person", and a query can return collections of such objects, like "children".

Thus, SPARQL provides a full set of analytic query operations such as JOIN, SORT, and AGGREGATE for data whose schema is intrinsically part of the data rather than requiring a separate schema definition. However, schema information (the ontology) is often provided externally, to allow different datasets to be joined unambiguously. In addition, SPARQL provides specific graph traversal syntax for data that can be thought of as a graph.

The example below demonstrates a simple query that leverages the ontology definition foaf
("friend of a friend"). Specifically, the following query returns names and emails of every person in the dataset:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
      ?person  a          foaf:Person .
      ?person  foaf:name  ?name .
      ?person  foaf:mbox  ?email .
    }

This query joins together all of the triples with a matching subject, where the type predicate, "a", is a person (foaf:Person), and the person has one or more names (foaf:name) and mailboxes (foaf:mbox).

The author of this query chose to reference the subject using the variable name "?person" for readability. Since the first element of the triple is always the subject, the author could just as easily have used any variable name, such as "?subj" or "?x". Whatever name is chosen, it must be the same on each line of the query to signify that the query engine is to join triples with the same subject.

The result of the join is a set of rows: ?person, ?name, ?email. This query returns only ?name and ?email because ?person is often a complex URI rather than a human-friendly string. Note that any ?person may have multiple mailboxes, so in the returned set, a ?name row may appear multiple times, once for each mailbox.

This query can be distributed to multiple SPARQL endpoints (services that accept SPARQL queries and return results), computed, and the results gathered, a procedure known as federated query.

Whether in a federated manner or locally, additional triple patterns in the query could allow joins to different subject types, such as automobiles, to allow simple queries, for example, to return a list of names and emails for people who drive automobiles with a high fuel efficiency.
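The join behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real SPARQL engine is implemented: the sample people, the example.org URIs, and the helper names (match_pattern, solve) are all assumptions made up for this sketch. It treats the data as a three-column table and joins three triple patterns on a shared ?person variable.

```python
# Namespace constants; "a" in SPARQL abbreviates the rdf:type predicate.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample data: (subject, predicate, object) rows.
triples = [
    ("http://example.org/alice", RDF_TYPE, FOAF + "Person"),
    ("http://example.org/alice", FOAF + "name", "Alice"),
    ("http://example.org/alice", FOAF + "mbox", "mailto:alice@example.org"),
    ("http://example.org/alice", FOAF + "mbox", "mailto:alice@work.example.org"),
    ("http://example.org/bob", RDF_TYPE, FOAF + "Person"),
    ("http://example.org/bob", FOAF + "name", "Bob"),
    ("http://example.org/bob", FOAF + "mbox", "mailto:bob@example.org"),
]


def match_pattern(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solve(patterns, data):
    """Join all patterns: keep only bindings consistent with every pattern."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# The three patterns share ?person, which is what makes this a join.
patterns = [
    ("?person", RDF_TYPE, FOAF + "Person"),
    ("?person", FOAF + "name", "?name"),
    ("?person", FOAF + "mbox", "?email"),
]

# Project only ?name and ?email, as the SELECT clause would.
rows = [(b["?name"], b["?email"]) for b in solve(patterns, triples)]
for name, email in rows:
    print(name, email)
```

Because the hypothetical "Alice" has two mailboxes, her name appears in two result rows, once per mailbox, which mirrors the multiple-rows behaviour noted above.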


Query forms

For queries that read data from the database, the SPARQL language specifies four query forms for different purposes:

* SELECT query: Extracts raw values from a SPARQL endpoint; the results are returned in a table format.
* CONSTRUCT query: Extracts information from the SPARQL endpoint and transforms the results into valid RDF.
* ASK query: Provides a simple true/false result for a query on a SPARQL endpoint.
* DESCRIBE query: Extracts an RDF graph from the SPARQL endpoint, the content of which is left to the endpoint to decide, based on what the maintainer deems useful information.

Each of these query forms takes a WHERE block to restrict the query, although in the case of the DESCRIBE query the WHERE is optional.

SPARQL 1.1 specifies a language for updating the database with several new query forms.
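To make the difference in result shapes concrete, here is a rough Python sketch of what SELECT, ASK, and CONSTRUCT return for the same WHERE block. It is a toy model under stated assumptions: the ex:knows data, the ex:knownBy template, and the solutions helper are invented for illustration, and DESCRIBE is omitted because its output is left to the endpoint.

```python
# Hypothetical data set: (subject, predicate, object) triples.
triples = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
}


def solutions(patterns, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p, t in zip(first, triple):
            if p.startswith("?"):
                ok = candidate.setdefault(p, t) == t  # variable: bind or check
            else:
                ok = p == t  # constant: must match exactly
            if not ok:
                break
        if ok:
            yield from solutions(rest, candidate)


where_block = [("?a", "ex:knows", "?b")]

# SELECT: a table of variable bindings.
select_result = sorted((b["?a"], b["?b"]) for b in solutions(where_block))

# ASK: only whether at least one solution exists.
ask_result = any(True for _ in solutions(where_block))

# CONSTRUCT: new triples built by filling a template with each solution.
template = ("?b", "ex:knownBy", "?a")
construct_result = {
    tuple(b.get(term, term) for term in template)
    for b in solutions(where_block)
}

print(select_result)
print(ask_result)
print(sorted(construct_result))
```

The same WHERE block drives all three forms; only the way the solutions are packaged changes.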


Example

Another SPARQL query example models the question "What are all the country capitals in Africa?":

    PREFIX ex: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x  ex:cityname       ?capital ;
          ex:isCapitalOf    ?y .
      ?y  ex:countryname    ?country ;
          ex:isInContinent  ex:Africa .
    }

Variables are indicated by a ? or $ prefix. Bindings for ?capital and ?country will be returned. When a triple ends with a semicolon, the subject from this triple implicitly completes the following pair to an entire triple. So, for example, ex:isCapitalOf ?y is short for ?x ex:isCapitalOf ?y.

The SPARQL query processor will search for sets of triples that match these four triple patterns, binding the variables in the query to the corresponding parts of each triple. Important to note here is the "property orientation" (class matches can be conducted solely through class-attributes or properties; see
duck typing). To make queries concise, SPARQL allows the definition of prefixes and base URIs in a fashion similar to Turtle. In this query, the prefix "ex" stands for "http://example.com/exampleOntology#".
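The two shorthands discussed above, the semicolon that carries a subject forward and the prefix that abbreviates a URI, can be unpacked mechanically. The Python sketch below shows one way to picture that expansion. The sample predicates and the helper names (expand_term, expand_block) are illustrative assumptions, not part of any SPARQL library; only the "ex" namespace URI is taken from the text.

```python
# Prefix map; the URI for "ex" is the one stated in the text.
PREFIXES = {"ex": "http://example.com/exampleOntology#"}


def expand_term(term):
    """Expand a prefixed name such as ex:Africa; leave variables untouched."""
    if term.startswith(("?", "$")):
        return term
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


def expand_block(subject, predicate_object_pairs):
    """Undo the semicolon shorthand: repeat the subject for every pair."""
    return [
        (expand_term(subject), expand_term(pred), expand_term(obj))
        for pred, obj in predicate_object_pairs
    ]


# Two semicolon-separated groups, each sharing one subject (illustrative names).
patterns = (
    expand_block("?x", [("ex:cityname", "?capital"), ("ex:isCapitalOf", "?y")])
    + expand_block("?y", [("ex:countryname", "?country"), ("ex:isInContinent", "ex:Africa")])
)

for pattern in patterns:
    print(pattern)
```

The two groups expand to four full triple patterns, and every prefixed name becomes an absolute URI, which is the form a query processor would actually match against the data.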


Extensions

GeoSPARQL defines filter functions for geographic information system (GIS) queries using well-understood OGC standards (GML, WKT, etc.).

SPARUL is another extension to SPARQL. It enables the RDF store to be updated with this declarative query language, by adding INSERT and DELETE methods.

XSPARQL is an integrated query language combining XQuery with SPARQL to query both XML and RDF data sources at once.


Implementations

Open source, reference SPARQL implementations:

* RDF4J, formerly Sesame, from the Eclipse Foundation
* Jena, from the Apache Software Foundation
* OpenLink Virtuoso

See List of SPARQL implementations for more comprehensive coverage, including triplestores, APIs, and other storages that have implemented the SPARQL standard.


See also

* Semantic integration
* SPARQL Query Results XML Format
* SPARQL Syntax Expressions
* Wikidata


External links


* W3C Data Activity Blog
* W3C SPARQL 1.1 Working Group (closed; was RDF Data Access Working Group), mailing lists and archives
* SPARQL 1.1 Recommendation
* SPARQL 1.0 Query language (legacy)
* SPARQL 1.0 Protocol (legacy)
* SPARQL 1.0 Query XML Results Format (legacy)
* Mappings between OWL-RDF/S & XML Schemas, and XML Schema to OWL Transformation
* SPARQL Syntax Expressions translations of the DAWG test suite
* Wikidata
* Wikidata Query Service Tutorial
* DBpedia