Jaql (pronounced "jackal") is a functional data processing and query language most commonly used for JSON query processing on big data.
It started as an open source project at Google, but the latest release was on 2010-07-12. IBM took it over as the primary data processing language for their Hadoop software package BigInsights.
Although it was developed for JSON, it supports a variety of other data sources such as CSV, TSV, and XML.
A comparison to other big data query languages such as Pig Latin and Hive QL illustrates performance and usability aspects of these technologies.
Jaql supports lazy evaluation, so expressions are only materialized when needed.
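The effect of lazy evaluation can be illustrated outside Jaql. The following Python sketch is only an analogy, not Jaql code: generators stand in for lazily evaluated expressions, and the names `read_records` and `keep_managers` are hypothetical.

```python
from itertools import islice

evaluated = []  # records which items were actually produced


def read_records():
    """Hypothetical source that yields records one at a time."""
    for i in range(1_000_000):
        evaluated.append(i)
        yield {"id": i, "manager": i % 2 == 0}


def keep_managers(records):
    """Lazily keep only the records whose manager flag is set."""
    return (r for r in records if r["manager"])


# Building the pipeline reads nothing; it is only a description of the work.
pipeline = keep_managers(read_records())

# Materialize just the first three results.
first_three = list(islice(pipeline, 3))
```

Only the records needed to produce three results are read from the source; the remaining items are never generated.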
Syntax
The basic concept of Jaql is
source -> operator(parameter) -> sink ;
where a sink can be a source for a downstream operator. A Jaql program therefore typically has the following structure, expressing a data processing graph:
source -> operator1(parameter) -> operator2(parameter) -> operator2(parameter) -> operator3(parameter) -> operator4(parameter) -> sink ;
For readability, Jaql programs are most commonly broken onto a new line at each arrow, as is also a common idiom in Twitter's Scalding:
source -> operator1(parameter)
-> operator2(parameter)
-> operator2(parameter)
-> operator3(parameter)
-> operator4(parameter)
-> sink ;
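The chaining idea can be mimicked in a general-purpose language. The Python sketch below is an analogy rather than Jaql: `pipe`, `double`, and `keep_large` are hypothetical names, and each operator is assumed to take a list and return a new list.

```python
from functools import reduce


def pipe(source, *operators):
    """Feed source through each operator in order, like source -> op1 -> op2."""
    return reduce(lambda data, op: op(data), operators, source)


# Hypothetical operators; each takes a list and returns a new list.
def double(values):
    return [v * 2 for v in values]


def keep_large(values):
    return [v for v in values if v > 4]


# The output of each stage becomes the input of the next one.
result = pipe([1, 2, 3, 4], double, keep_large, sorted)
```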
Core operators
Expand
Use the EXPAND expression to flatten nested arrays. This expression takes as input an array of nested arrays [ [ T ] ] and produces an output array [ T ] by promoting the elements of each nested array to the top-level output array.
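As a rough illustration of this flattening, here is a Python sketch (not Jaql syntax). The function name `expand` is chosen for the example, and it is assumed that only one level of nesting is removed.

```python
from itertools import chain


def expand(nested):
    """Promote the elements of each inner array to the top level (one level only)."""
    return list(chain.from_iterable(nested))


flat = expand([[1, 2], [3], [], [4, 5]])
```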
Filter
Use the FILTER operator to filter away elements from the specified input array. This operator takes as input an array of elements of type T and outputs an array of the same type, retaining those elements for which a predicate evaluates to true. It is the Jaql equivalent of the SQL WHERE clause.
Example:
data = [ … ];
data -> filter $.manager;
[ … ]
data -> filter $.income < 30000;
[ … ]
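The two predicates used above, on the `manager` and `income` fields, can be mirrored in Python. This is a sketch under assumptions: the records below are invented for illustration and are not the data of the original example, and `jaql_filter` is a hypothetical helper name.

```python
# Hypothetical records; only the field names manager and income come from the text.
data = [
    {"name": "A", "income": 20000, "manager": False},
    {"name": "B", "income": 32500, "manager": False},
    {"name": "C", "income": 72000, "manager": True},
    {"name": "D", "income": 25000, "manager": False},
]


def jaql_filter(items, predicate):
    """Keep the elements for which the predicate evaluates to true."""
    return [item for item in items if predicate(item)]


managers = jaql_filter(data, lambda r: r["manager"])
low_income = jaql_filter(data, lambda r: r["income"] < 30000)
```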
Group
Use the GROUP expression to group one or more input arrays on a grouping key and apply an aggregate function per group.
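A minimal Python sketch of grouping with a per-group aggregate follows. It covers only the single-input case, and the names `group_by` and `sales` as well as the sample records are assumptions made for the example.

```python
from collections import defaultdict


def group_by(items, key, aggregate):
    """Group items on key(item), then apply aggregate to each group's members."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return {k: aggregate(members) for k, members in groups.items()}


# Hypothetical input array.
sales = [
    {"dept": "a", "amount": 10},
    {"dept": "b", "amount": 5},
    {"dept": "a", "amount": 7},
]

totals = group_by(
    sales,
    key=lambda r: r["dept"],
    aggregate=lambda members: sum(r["amount"] for r in members),
)
```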
Join
Use the JOIN operator to express a join between two or more input arrays. This operator supports multiple types of joins, including natural, left-outer, right-outer, and outer joins.
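The difference between an inner and a left-outer join can be sketched in Python. This is an illustration of the join types, not of Jaql's implementation; the helper `join`, its `how` parameter, and the sample arrays are hypothetical, and only equality on a single shared field is handled.

```python
def join(left, right, key, how="inner"):
    """Equijoin two lists of dicts on a shared field."""
    # Index the right input by join key for constant-time lookups.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    out = []
    for row in left:
        matches = index.get(row[key], [])
        if matches:
            out.extend({**row, **match} for match in matches)
        elif how == "left":
            # Left-outer join keeps unmatched left rows as they are.
            out.append(dict(row))
    return out


# Hypothetical input arrays.
employees = [
    {"name": "A", "dept": 1},
    {"name": "B", "dept": 2},
    {"name": "C", "dept": 3},
]
depts = [{"dept": 1, "title": "Sales"}, {"dept": 2, "title": "Ops"}]

inner = join(employees, depts, "dept")
left_outer = join(employees, depts, "dept", how="left")
```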
Sort
Use the SORT operator to sort an input by one or more fields.
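Sorting by more than one field amounts to comparing tuples of field values. The Python sketch below shows this with hypothetical records, ordering by `dept` first and by `income` within each department.

```python
from operator import itemgetter

# Hypothetical records.
people = [
    {"dept": "b", "income": 3},
    {"dept": "a", "income": 2},
    {"dept": "a", "income": 1},
]

# itemgetter with two field names returns a (dept, income) tuple as sort key.
by_dept_then_income = sorted(people, key=itemgetter("dept", "income"))
```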
Top
The TOP expression selects the first k elements of its input. If a comparator is provided, the output is semantically equivalent to sorting the input, then selecting the first k elements.
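The stated equivalence can be checked with a small Python sketch. The function `top` is a hypothetical stand-in, and a key function is used here in place of a comparator; `heapq.nsmallest` avoids sorting the whole input while giving the same result as sorting and slicing.

```python
import heapq


def top(items, k, key=None):
    """Return the first k items, or the k smallest by key if a key is given."""
    if key is None:
        return list(items)[:k]
    return heapq.nsmallest(k, items, key=key)


values = [5, 1, 4, 2, 3]

first_two = top(values, 2)
smallest_three = top(values, 3, key=lambda v: v)
```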
Transform
Use the TRANSFORM operator to realize a projection or to apply a function to all items of an output.
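Both uses, projection and per-item function application, can be sketched in Python with a comprehension. The records and the derived `monthly` field are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical records.
records = [
    {"name": "A", "income": 24000, "manager": False},
    {"name": "B", "income": 36000, "manager": True},
]

# Projection: keep only the name field.
names = [{"name": r["name"]} for r in records]

# Function application: derive a new value for every item.
monthly = [{"name": r["name"], "monthly": r["income"] // 12} for r in records]
```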
See also
* jq
* JSONiq
* XPath
* XQuery
External links
Definition of the JAQL language
JAQL Introduction
and PIG
Adaptive Processing of User-Defined Aggregates in Jaql