A relational data stream management system (RDSMS) is a distributed, in-memory data stream management system (DSMS) that is designed to use standards-compliant SQL queries to process unstructured and structured data streams in real time. Unlike SQL queries executed in a traditional RDBMS, which return a result and exit, SQL queries executed in an RDSMS do not exit; they generate results continuously as new data become available. Continuous SQL queries in an RDSMS use the SQL window function to analyze, join and aggregate data streams over fixed or sliding windows. Windows can be specified as time-based or row-based.
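The difference between a query that exits and one that runs continuously can be sketched outside SQL. The following Python generator is only an illustration of the semantics, not the API of any RDSMS: the function name `continuous_row_avg` and its parameters are hypothetical, and a finite list stands in for an unbounded stream. It keeps a row-based window of the most recent records and emits an updated average each time a record arrives.

```python
from collections import deque
from typing import Iterable, Iterator


def continuous_row_avg(stream: Iterable[float], rows: int) -> Iterator[float]:
    """Yield the average of the last `rows` values each time a value arrives."""
    # Row-based window: the deque silently drops the oldest record
    # once it holds `rows` entries.
    window = deque(maxlen=rows)
    for value in stream:  # on an unbounded stream this loop never ends
        window.append(value)
        yield sum(window) / len(window)


# A finite list stands in for an unbounded stream (assumption for the demo).
results = list(continuous_row_avg([10.0, 20.0, 30.0, 40.0], rows=2))
print(results)  # [10.0, 15.0, 25.0, 35.0]
```

A time-based window would evict records by timestamp rather than by count; the generator shape stays the same.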
RDSMS SQL Query Examples
Continuous SQL queries in an RDSMS conform to the ANSI SQL standards. The most common RDSMS SQL query is performed with the declarative SELECT statement. A continuous SQL SELECT operates on data across one or more data streams, with optional keywords and clauses that include FROM with an optional JOIN subclause to specify the rules for joining multiple data streams, the WHERE clause and comparison predicate to restrict the records returned by the query, GROUP BY to project streams with common values into a smaller set, HAVING to filter records resulting from a GROUP BY, and ORDER BY to sort the results.
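As a rough illustration of how the SELECT, FROM and WHERE parts map onto a stream, the Python sketch below filters and projects records one at a time. It is not RDSMS code: the `Reading` record, its field names and the `select_where` helper are all hypothetical, chosen only to mirror the clauses described above.

```python
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    rowtime: float  # event time in seconds (hypothetical field)
    station: str    # hypothetical station identifier
    temp: float     # temperature reading


def select_where(
    stream: Iterable[Reading],
    predicate: Callable[[Reading], bool],
) -> Iterator[Tuple[float, float]]:
    """Roughly: SELECT STREAM rowtime, temp FROM stream WHERE predicate."""
    for record in stream:                        # FROM: read records as they arrive
        if predicate(record):                    # WHERE: keep matching records only
            yield (record.rowtime, record.temp)  # SELECT: project two columns


readings = [
    Reading(0.1, "A", 19.5),
    Reading(0.4, "B", 31.0),
    Reading(0.9, "A", 30.2),
]
hot = list(select_where(readings, lambda r: r.temp > 30.0))
print(hot)  # [(0.4, 31.0), (0.9, 30.2)]
```

Because each record is handled as it arrives, the filter never needs to see the whole stream, which is what allows the query to run without exiting.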
The following example is a continuous SELECT query that aggregates a sensor stream from a weather monitoring station. The query computes the minimum, maximum and average temperature values over each one-second period, returning a continuous stream of aggregated results at one-second intervals.
SELECT STREAM
    FLOOR(WEATHERSTREAM.ROWTIME TO SECOND) AS FLOOR_SECOND,
    MIN(TEMP) AS MIN_TEMP,
    MAX(TEMP) AS MAX_TEMP,
    AVG(TEMP) AS AVG_TEMP
FROM WEATHERSTREAM
GROUP BY FLOOR(WEATHERSTREAM.ROWTIME TO SECOND);
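The behavior of this query can be approximated in Python to make the grouping explicit. This is a minimal sketch under stated assumptions, not how any particular RDSMS implements it: records are `(rowtime, temp)` tuples with rowtime in seconds, they arrive in non-decreasing rowtime order, and the function name `tumbling_second_stats` is hypothetical. A group is emitted when the first record of a later second arrives.

```python
import math
from typing import Iterable, Iterator, List, Optional, Tuple

Stats = Tuple[int, float, float, float]  # (second, min, max, avg)


def tumbling_second_stats(stream: Iterable[Tuple[float, float]]) -> Iterator[Stats]:
    """Emit min/max/avg temperature per whole second of rowtime."""
    current: Optional[int] = None
    temps: List[float] = []
    for rowtime, temp in stream:
        second = math.floor(rowtime)  # FLOOR(ROWTIME TO SECOND)
        if current is not None and second != current:
            # Rowtime moved into a new second: close the previous group.
            yield (current, min(temps), max(temps), sum(temps) / len(temps))
            temps = []
        current = second
        temps.append(temp)
    if temps:
        # Only reached for finite input; a live stream would keep waiting.
        yield (current, min(temps), max(temps), sum(temps) / len(temps))


weather = [(0.1, 20.0), (0.6, 22.0), (1.2, 21.0), (1.9, 25.0), (2.5, 19.0)]
for row in tumbling_second_stats(weather):
    print(row)
# (0, 20.0, 22.0, 21.0)
# (1, 21.0, 25.0, 23.0)
# (2, 19.0, 19.0, 19.0)
```

Note that a result for a given second appears only after that second is known to be complete, so each output row lags its input records.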
RDSMS SQL queries also operate on data streams over time-based or row-based windows. The following example shows a second continuous SQL query using the WINDOW clause with a one-second duration. The WINDOW clause changes the behavior of the query so that it outputs a result for each new record as it arrives. Hence the output is a stream of incrementally updated results with effectively zero result latency, since no result waits for a time period to end.
SELECT STREAM
    ROWTIME,
    MIN(TEMP) OVER W1 AS WMIN_TEMP,
    MAX(TEMP) OVER W1 AS WMAX_TEMP,
    AVG(TEMP) OVER W1 AS WAVG_TEMP
FROM WEATHERSTREAM
WINDOW W1 AS ( RANGE INTERVAL '1' SECOND PRECEDING );
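The sliding-window query above can be sketched in Python in the same spirit. The assumptions here are mine, not the source's: records are `(rowtime, temp)` tuples in non-decreasing rowtime order, a record exactly one second older than the current one is still counted as inside the window, and `sliding_stats` is a hypothetical name. One output row is produced per input record.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Row = Tuple[float, float, float, float]  # (rowtime, min, max, avg)


def sliding_stats(
    stream: Iterable[Tuple[float, float]],
    range_seconds: float = 1.0,
) -> Iterator[Row]:
    """Emit min/max/avg over the preceding `range_seconds` for every record."""
    window: Deque[Tuple[float, float]] = deque()
    for rowtime, temp in stream:
        window.append((rowtime, temp))
        # Evict records older than the window range (boundary kept: assumption).
        while window[0][0] < rowtime - range_seconds:
            window.popleft()
        temps = [t for _, t in window]
        yield (rowtime, min(temps), max(temps), sum(temps) / len(temps))


weather = [(0.0, 20.0), (0.5, 22.0), (1.2, 18.0), (2.5, 25.0)]
for row in sliding_stats(weather):
    print(row)
# (0.0, 20.0, 20.0, 20.0)
# (0.5, 20.0, 22.0, 21.0)
# (1.2, 18.0, 22.0, 20.0)
# (2.5, 25.0, 25.0, 25.0)
```

Compared with the per-second grouping, this version produces as many output rows as input rows, each reflecting the window contents at the moment its record arrived.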
See also
* NoSQL
* NewSQL
External links
IBM System S transcript of a reunion meeting devoted to the personal history of relational databases, SQL System R.