Riak (pronounced "ree-ack") is a distributed NoSQL key-value data store that offers high availability, fault tolerance, operational simplicity, and scalability. Riak became an entirely open-source project in August 2017, with many of the licensed Enterprise Edition features being incorporated. Riak implements the principles from Amazon's Dynamo paper with heavy influence from the CAP theorem. Written in Erlang, Riak has fault-tolerant data replication and automatic data distribution across the cluster for performance and resilience. Riak has a pluggable backend for its core storage, with the default storage backend being Bitcask. LevelDB is also supported, with other options (such as the pure-Erlang Leveled) available depending on the version. Riak was originally developed by engineers employed by Basho Technologies and maintained by them until 2017, when the rights were sold to bet365 after Basho went into receivership.


Main features

Fault-tolerant availability: Riak replicates key/value data across a cluster of nodes with a default n_val (the number of replicas) of three. In the case of node outages due to network partition or hardware failures, data can still be written to a neighboring node beyond the initial three, and read back, thanks to Riak's "masterless" peer-to-peer architecture.

Queries: Riak provides a REST-ful API through HTTP and Protocol Buffers for basic PUT, GET, POST, and DELETE functions. More complex queries are also possible, including secondary indexes, search (via Apache Solr), and MapReduce. MapReduce has native support for both JavaScript (using the SpiderMonkey runtime) and Erlang.

Predictable latency: Riak distributes data across nodes with hashing and can provide a predictable latency profile, even in the case of multiple node failures.

Storage options: Keys/values can be stored in memory, on disk, or both.

Multi-datacenter replication: Multi-Datacenter replication (MDC) provides uni-directional and bi-directional replication of data between Riak clusters, whether locally for resilience or globally for faster regional access. Uni-directional replication is useful for read-only sinks such as backups and Disaster Recovery sites. Bi-directional replication allows multiple Riak clusters to hold eventually consistent data across vast distances. Complex replication scenarios such as chains, hub-and-spoke and mesh networks are possible due to the Cascades feature, which allows replication of data between clusters that are not directly connected. There are two primary modes of operation: fullsync and realtime. Fullsync mode ensures that all data on the source cluster is replicated to the sink cluster. Only the metadata and changes are transferred, making this fast and efficient. Realtime mode sends updates made to a source cluster to the sink cluster in realtime. These modes are designed to work together for best performance. All multi-datacenter replication occurs over multiple concurrent TCP connections to maximize performance and network utilization.

Tunable consistency: Each bucket can be configured for either eventual or strong consistency.
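The basic PUT, GET, and DELETE operations of the HTTP interface can be sketched with nothing but the Python standard library. This is a minimal illustration, not an official client: the base URL assumes a local node on port 8098, the `/buckets/<bucket>/keys/<key>` path layout is an assumption about the deployed Riak version, and the helper names (`object_url`, `store_request`, `fetch_request`, `delete_request`) are hypothetical. The functions only build requests; nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a Riak node reachable locally on its usual HTTP port.
BASE_URL = "http://127.0.0.1:8098"


def object_url(bucket: str, key: str, base_url: str = BASE_URL) -> str:
    """Build the URL that addresses one key in one bucket."""
    # Percent-encode both parts so characters such as "/" cannot alter the path.
    return f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def store_request(bucket: str, key: str, body: bytes,
                  content_type: str = "application/json") -> Request:
    """Build a PUT request that stores a value under an explicit key."""
    return Request(
        object_url(bucket, key),
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def fetch_request(bucket: str, key: str) -> Request:
    """Build a GET request that reads the value stored under a key."""
    return Request(object_url(bucket, key), method="GET")


def delete_request(bucket: str, key: str) -> Request:
    """Build a DELETE request that removes the value stored under a key."""
    return Request(object_url(bucket, key), method="DELETE")
```

Against a running node, `urllib.request.urlopen(store_request("users", "alice", b'{"name": "Alice"}'))` would issue the write. The official drivers wrap the same operations and can also use the Protocol Buffers interface.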


Main products

All versions of Riak are now entirely open-source and free, and include the extra features that Basho charged license fees for. Basho operated a freemium model, wherein they provided free versions of Riak in the form of Riak Core, Riak KV, Riak CS and Riak TS but made their money from licensing more advanced features and SLA-based support. The extra features from the Enterprise Editions have since been integrated into the open source versions, as of Riak KV release 2.2.6 and Riak CS release 2.1.2.


Riak Core and Riak Core Lite


Riak Core

riak_core is the distributed systems framework that underpins Riak, forming the foundation for all Riak versions. It is being maintained as part of Riak.


Riak Core Lite

riak_core_lite is intended for general use as a base for creating distributed systems.


Riak KV (Key-Value)

Riak KV is a distributed NoSQL database designed to deliver maximum data availability by distributing data across multiple servers: as long as a client can reach one server, it should be able to read and write data. KV went through a few names in its lifetime, starting as Riak, then Riak DS (for Data Store), and finally Riak KV (for Key-Value). When Basho Technologies went into receivership in 2017, KV development was picked up by the open source community and has continued into 2021, with 2.2.6, released in 2018, being the first community release of KV. This release integrated some features that were originally restricted to Basho's Enterprise versions of Riak. Version 2.9.0, released in November 2019, was the first major community release, with version 3.0.1 following on August 20, 2020. Development has continued since then, with the latest release being version 3.0.7.
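The idea that data stays readable and writable while some servers are unreachable can be illustrated with a toy hash ring. The sketch below is a simplified model, not Riak's actual implementation: it assumes keys are hashed with SHA-1 onto a 160-bit ring, uses an illustrative ring of 8 partitions assigned round-robin, and every name in it (`build_ring`, `partition_for`, `preference_list`) is hypothetical. It picks n_val replicas by walking the ring clockwise from the key's partition and, when a node is down, continues to neighboring partitions beyond the initial three.

```python
import hashlib

RING_SIZE = 8        # number of partitions; illustrative, real clusters use more
N_VAL = 3            # replicas per key, matching the default n_val of three
KEYSPACE = 2 ** 160  # size of the SHA-1 output space (assumed hash function)


def build_ring(nodes, ring_size=RING_SIZE):
    """Assign partitions to nodes round-robin: partition i belongs to nodes[i % len(nodes)]."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def partition_for(bucket, key, ring_size=RING_SIZE):
    """Hash bucket and key onto the ring and return the index of the partition hit."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") * ring_size // KEYSPACE


def preference_list(ring, bucket, key, down=frozenset(), n_val=N_VAL):
    """Return up to n_val (partition, node) pairs that should hold the key.

    Partitions whose owner is in `down` are skipped, so the walk extends to
    neighboring partitions beyond the first n_val (a simplified fallback).
    """
    start = partition_for(bucket, key, len(ring))
    chosen = []
    for step in range(len(ring)):
        index = (start + step) % len(ring)
        owner = ring[index]
        if owner in down:
            continue  # unreachable node: try the next partition on the ring
        chosen.append((index, owner))
        if len(chosen) == n_val:
            break
    return chosen
```

With four nodes, `preference_list(build_ring(["a", "b", "c", "d"]), "users", "alice")` yields three replicas on three different nodes, and passing `down={"a"}` still yields three replicas on the remaining nodes. Because there is no master, any reachable node can compute the same list and coordinate the request.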


Removed features

The current version of Riak no longer supports some features in the Enterprise edition of Riak, including:
* SNMP/JMX support


Separated features in Riak KV 3.0+

The following features of Riak KV 2.x have been removed by default from the Riak build. Specific builds including these features are available.
* Yokozuna


Riak CS (Cloud Storage)

Originally known as Riak MOSS (Riak Multi-tenant Object Storage System) but named Riak CS (Cloud Storage) when released, Riak CS was first publicly released in January 2012. Riak CS is object storage software built on top of Riak KV, Riak's distributed database. Riak CS is designed to provide simple, highly available, distributed cloud storage at any scale, and can be used to build cloud architectures or as storage infrastructure for heavy-duty applications and services. Riak CS also includes an application called Stanchion, which is used to manage the serialization of requests. This enables Riak CS to manage globally unique entities like users and bucket names. Serialization in this context means that the entire cluster agrees upon a single value for any globally unique entity at any given time; when that value is changed, the new value must be recognized throughout the entire cluster. Riak CS was briefly rebranded as Riak S2 to make it more obviously compatible with Amazon S3, but the name did not catch on and it reverted to Riak CS. In 2021, development of Riak CS was resumed with contributions from TI Tokyo.


Riak TS (Time Series)

Riak TS is an extension to Riak KV optimized for time series data, in that:
* it supports structured data, with a table definition (with a CREATE TABLE call) required before data can be written;
* data slices from contiguous regions in its primary index ("quanta") are stored on the same partition;
* CRUD operations are optimized for speed, at the expense of consistency.

A limited subset of SQL commands was implemented in Riak TS. There is no provision for consistency guarantees between tables (no foreign indexes). In SELECT statements, the WHERE clause is supported but HAVING is not. ORDER BY was to appear in a version that was never released. Riak TS existed as a collection of branches (in separate components of Riak KV such as riak_kv, riak_pb, etc.) and not as a product with a repository of its own. It was developed by a dedicated team consisting of Gordon Guthrie (leader), Andy Till and Andrei Zavada, with occasional contributions from other developers. Riak TS was conceived, along with the Riak Data Platform project, as an attempt to diversify Basho's product line, an undertaking many insiders regard as misguided and as eventually contributing to Basho's demise.
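The notion of a quantum, a contiguous slice of time whose rows are stored together, can be illustrated with a short sketch. It assumes millisecond timestamps and a 15-minute quantum; the column layout (region, state, time, temperature) and the function names are hypothetical and serve only to show how rows falling in the same slice end up under the same partition key.

```python
from collections import defaultdict

# Assumption: a 15-minute quantum, expressed in milliseconds.
QUANTUM_MS = 15 * 60 * 1000


def quantum_start(timestamp_ms, quantum_ms=QUANTUM_MS):
    """Floor a timestamp to the start of the quantum that contains it."""
    return timestamp_ms - (timestamp_ms % quantum_ms)


def partition_key(region, state, timestamp_ms):
    """Combine the grouping columns with the quantum start to form the partition key."""
    return (region, state, quantum_start(timestamp_ms))


def group_by_quantum(rows):
    """Group (region, state, timestamp_ms, temperature) rows by partition key."""
    groups = defaultdict(list)
    for region, state, timestamp_ms, temperature in rows:
        groups[partition_key(region, state, timestamp_ms)].append(
            (timestamp_ms, temperature)
        )
    return dict(groups)
```

A range query over a few minutes of data then touches one group, or a handful of adjacent ones, rather than keys scattered across the whole cluster, which is the motivation for storing each quantum on a single partition.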


Licensing and support

Riak was originally licensed using a freemium model: open source versions of Riak KV, Riak CS and Riak TS were available, but end users could pay for additional features and support. However, since Basho entered receivership and bet365 (purchasers of all IP) made all Riak products fully open source, all the premium features are now available in the open source versions. Since Basho's demise, community ad-hoc and paid support options have arisen.


Language support

Riak has official drivers for Ruby, Java, Erlang and Python. There are also numerous community-supported drivers for other programming languages.


Community development

After bet365 purchased the Riak IP, the Riak products were made fully open source, and work to integrate premium features into the open source versions was completed with the 2.2.6 release.


History

Riak was originally written by Andy Gross and Justin Sheehy at Basho Technologies to power a web Sales Force Automation application built by former engineers and executives from Akamai. There was more interest in the datastore technology than in the applications built on it, so the company decided to build a business around Riak itself, gaining adoption throughout the Fortune 100 and becoming a foundation for many of the world's fastest-growing Web-based, mobile and social networking applications, as well as cloud service providers. Subsequent releases include:


Riak KV

Riak 1.0 was released on September 10, 2011.


Riak CS

Riak CS was made open source on March 20, 2013.


Riak TS

Riak TS was originally released in October 2015.


Users

Notable users include AT&T, Comcast, GitHub, Best Buy, the UK National Health Service (NHS), The Weather Channel, and Riot Games.


See also

* Basho Technologies
* Apache Accumulo
* Oracle NoSQL Database
* NoSQL
* Structured storage
* Memcached
* Redis

