
Couchbase Server, originally known as Membase, is an open-source, distributed (shared-nothing architecture) multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serve many concurrent users by creating, storing, retrieving, aggregating, manipulating and presenting data. In support of these kinds of application needs, Couchbase Server is designed to provide easy-to-scale key-value or JSON document access with low latency and high sustained throughput. It is designed to be clustered from a single machine to very large-scale deployments spanning many machines.
Couchbase Server provided client protocol compatibility with memcached, but added disk persistence, data replication, live cluster reconfiguration, rebalancing and multitenancy with data partitioning.
Product history
Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, to develop a key-value store with the simplicity, speed, and scalability of memcached, but also the storage, persistence and querying capabilities of a database. The original Membase source code was contributed by NorthScale and project co-sponsors Zynga and Naver Corporation (then known as NHN) to a new project on membase.org in June 2010.
On February 8, 2011, the Membase project founders and Membase, Inc. announced a merger with CouchOne (a company with many of the principal players behind CouchDB) with an associated project merger. The merged company was called Couchbase, Inc.
In January 2012, Couchbase released Couchbase Server 1.8.
In September 2012, Orbitz said it had changed some of its systems to use Couchbase.
In December 2012, Couchbase Server 2.0 (announced in July 2011) was released and included a new JSON document store, indexing and querying, incremental MapReduce and replication across data centers.
Architecture
Every Couchbase node consists of a data service, index service, query service, and cluster manager component. Starting with the 4.0 release, the three services can be distributed to run on separate nodes of the cluster if needed.
In the parlance of Eric Brewer's CAP theorem, Couchbase is normally a CP type system, meaning it provides consistency and partition tolerance, or it can be set up as an AP system with multiple clusters.
Cluster manager
The cluster manager supervises the configuration and behavior of all the servers in a Couchbase cluster. It configures and supervises inter-node behavior such as managing replication streams and rebalancing operations. It also provides metric aggregation and consensus functions for the cluster, and a RESTful cluster management interface. The cluster manager uses the Erlang programming language and the Open Telecom Platform.
Replication and fail-over
Data replication within the nodes of a cluster can be controlled with several parameters.
In December 2012, support was added for replication between different data centers.
Data manager
The data manager stores and retrieves documents in response to data operations from applications.
It asynchronously writes data to disk after acknowledging to the client. In version 1.7 and later, applications can optionally ensure data is written to more than one server or to disk before acknowledging a write to the client.
Parameters define item ages that affect when data is persisted, and how maximum memory and migration from main memory to disk are handled.
It supports working sets greater than a memory quota per "node" or "bucket".
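The acknowledgement behavior described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Couchbase's implementation: the class `WriteBehindStore`, the `wait_for_disk` flag, and the dictionary standing in for the disk are all hypothetical, and replication to a second server is not modeled.

```python
from collections import deque
from typing import Any, Deque, Dict, Tuple


class WriteBehindStore:
    """Hypothetical store that acknowledges writes before persisting them."""

    def __init__(self) -> None:
        self.memory: Dict[str, Any] = {}  # what reads are served from
        self.disk: Dict[str, Any] = {}    # stand-in for the on-disk data file
        self._pending: Deque[Tuple[str, Any]] = deque()  # not yet persisted

    def set(self, key: str, value: Any, wait_for_disk: bool = False) -> str:
        # The write is applied in memory and queued for persistence.
        self.memory[key] = value
        self._pending.append((key, value))
        if wait_for_disk:
            # Optional stronger guarantee: persist before acknowledging.
            # Note that this also drains any earlier queued writes.
            self.flush()
        return "ack"

    def flush(self) -> int:
        # Drain the queue in arrival order; return how many items were written.
        written = 0
        while self._pending:
            key, value = self._pending.popleft()
            self.disk[key] = value
            written += 1
        return written


if __name__ == "__main__":
    store = WriteBehindStore()
    store.set("cart::7", ["book"])                 # acknowledged, not yet on disk
    print("cart::7" in store.disk)                 # False
    store.set("order::9", {"total": 12}, wait_for_disk=True)
    print("cart::7" in store.disk, "order::9" in store.disk)  # True True
```

The trade-off the sketch makes visible is latency versus durability: the default path returns as soon as memory is updated, while the opt-in path pays for the disk write before the client sees the acknowledgement.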
External systems can subscribe to filtered data streams, supporting, for example, full text search indexing, data analytics or archiving.
Data format
A document is the most basic unit of data manipulation in Couchbase Server. Documents are stored in JSON document format with no predefined schemas. Non-JSON documents (binary, serialized values, XML, etc.) can also be stored in Couchbase Server.
Object-managed cache
Couchbase Server includes a built-in multi-threaded object-managed cache that implements memcached-compatible APIs such as get, set, delete, append and prepend.
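The memcached-style operations listed above can be illustrated with a minimal in-memory sketch. This is not Couchbase's cache or any SDK's API: the class name `TinyCache` and its method signatures are hypothetical, chosen only to mirror the get, set, delete, append and prepend verbs. It assumes byte-string values and treats append or prepend on a missing key as a failure.

```python
from typing import Dict, Optional


class TinyCache:
    """Hypothetical dict-backed cache mirroring memcached-style verbs."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        # Store or overwrite the value for a key.
        self._items[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Return the stored value, or None on a cache miss.
        return self._items.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key and report whether it existed.
        return self._items.pop(key, None) is not None

    def append(self, key: str, suffix: bytes) -> bool:
        # Add bytes after an existing value; fail if the key is absent.
        if key not in self._items:
            return False
        self._items[key] += suffix
        return True

    def prepend(self, key: str, prefix: bytes) -> bool:
        # Add bytes before an existing value; fail if the key is absent.
        if key not in self._items:
            return False
        self._items[key] = prefix + self._items[key]
        return True


if __name__ == "__main__":
    cache = TinyCache()
    cache.set("greeting", b"world")
    cache.prepend("greeting", b"hello ")
    cache.append("greeting", b"!")
    print(cache.get("greeting"))  # b'hello world!'
```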
Storage engine
Couchbase Server has a tail-append storage design that is immune to data corruption, OOM killers or sudden loss of power. Data is written to the data file in an append-only manner, which enables Couchbase to do mostly sequential writes for updates and provides an optimized access pattern for disk I/O.
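The append-only idea described above can be sketched as a small key-value store whose data file is only ever extended. This is an illustration under stated assumptions, not Couchbase's storage engine: the class `AppendOnlyStore`, the one-JSON-record-per-line file format, and the in-memory offset index are all hypothetical. The sketch never reclaims space held by superseded records.

```python
import json
import os
import tempfile
from typing import Any, Dict, Optional


class AppendOnlyStore:
    """Hypothetical key-value store that only ever appends to its data file."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._index: Dict[str, int] = {}  # key -> offset of its latest record
        open(path, "ab").close()          # create the file if it is missing
        self._rebuild_index()

    def _rebuild_index(self) -> None:
        # Scan the file once; later records for a key supersede earlier ones.
        valid_end = 0
        with open(self._path, "rb") as f:
            while True:
                offset = f.tell()
                line = f.readline()
                if not line.endswith(b"\n"):
                    break  # end of file, or a torn final record
                try:
                    key = json.loads(line)["key"]
                except (ValueError, KeyError, TypeError):
                    break  # unreadable record: stop at the last good one
                self._index[key] = offset
                valid_end = f.tell()
        # Drop a partially written tail (for example after a power loss).
        if valid_end < os.path.getsize(self._path):
            with open(self._path, "r+b") as f:
                f.truncate(valid_end)

    def set(self, key: str, value: Any) -> None:
        # An update never rewrites old bytes; it appends a new record.
        record = json.dumps({"key": key, "value": value}).encode("utf-8") + b"\n"
        with open(self._path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(record)
            f.flush()
            os.fsync(f.fileno())
        self._index[key] = offset

    def get(self, key: str) -> Optional[Any]:
        # Jump straight to the latest record for the key, if there is one.
        offset = self._index.get(key)
        if offset is None:
            return None
        with open(self._path, "rb") as f:
            f.seek(offset)
            return json.loads(f.readline())["value"]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.log")
    store = AppendOnlyStore(path)
    store.set("user::1", {"name": "Ada"})
    store.set("user::1", {"name": "Ada L."})  # old record stays on disk
    reopened = AppendOnlyStore(path)          # index rebuilt by scanning
    print(reopened.get("user::1"))            # {'name': 'Ada L.'}
```

Because existing bytes are never overwritten, an interrupted write can only damage the tail of the file, which is why recovery in this sketch amounts to discarding the last incomplete record.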
Performance
A performance benchmark done by Altoros in 2012 compared Couchbase Server with other technologies.
Cisco Systems published a benchmark that measured the latency and throughput of Couchbase Server with a mixed workload in 2012.
Licensing and support
Couchbase Server is a packaged version of Couchbase's open source software technology. It is available in a community edition under the Apache 2.0 license, which does not include recent bug fixes, and in an edition for commercial use.
Couchbase Server builds are available for Ubuntu, Debian, Red Hat, SUSE, Oracle Linux,
Microsoft Windows and macOS operating systems.
Couchbase provides supported software development kits for the programming languages .NET, PHP, Ruby, Python, C, Node.js, Java, Go, and Scala.
N1QL
A query language called the non-first normal form query language, N1QL (pronounced "nickel"), is used for manipulating the JSON data in Couchbase, much as SQL manipulates data in a relational database management system (RDBMS). It has SELECT, INSERT, UPDATE, DELETE, and MERGE statements to operate on JSON data.
It was announced in March 2015 as "SQL for documents".
The N1QL data model is non-first normal form (N1NF) with support for nested attributes and domain-oriented normalization. The N1QL data model is also a proper superset and generalization of the relational model.
Example
Like query:
Array query:
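As a hedged illustration of the two query shapes named above, the sketch below pairs one N1QL statement of each kind with a plain-Python equivalent evaluated over in-memory documents. The keyspace name `profiles`, the sample documents, and the helper functions are all hypothetical. Nothing here talks to a Couchbase cluster or uses an SDK.

```python
from typing import Any, Dict, List

# Hypothetical N1QL statements, shown only for their shape.
LIKE_QUERY = 'SELECT name FROM profiles WHERE email LIKE "%@example.com"'
ARRAY_QUERY = (
    "SELECT name FROM profiles "
    "WHERE ANY c IN children SATISFIES c.age > 10 END"
)

# Hypothetical JSON documents with a nested array attribute.
PROFILES: List[Dict[str, Any]] = [
    {"name": "Dave", "email": "dave@example.com",
     "children": [{"name": "Aiden", "age": 17}, {"name": "Bill", "age": 2}]},
    {"name": "Earl", "email": "earl@example.org",
     "children": [{"name": "Xena", "age": 8}]},
    {"name": "Fred", "email": "fred@example.com", "children": []},
]


def like_suffix(docs: List[Dict[str, Any]], suffix: str) -> List[Dict[str, Any]]:
    # Python equivalent of LIKE "%<suffix>": match any string ending in suffix.
    return [{"name": d["name"]} for d in docs
            if d.get("email", "").endswith(suffix)]


def any_child_older_than(docs: List[Dict[str, Any]], age: int) -> List[Dict[str, Any]]:
    # Python equivalent of ANY ... SATISFIES ... END over a nested array.
    return [{"name": d["name"]} for d in docs
            if any(c["age"] > age for c in d.get("children", []))]


if __name__ == "__main__":
    print(LIKE_QUERY)
    print(like_suffix(PROFILES, "@example.com"))   # Dave and Fred
    print(ARRAY_QUERY)
    print(any_child_older_than(PROFILES, 10))      # Dave only
```

The array query is the part with no direct counterpart in a first-normal-form table: the predicate ranges over a list nested inside each document rather than over a joined child table.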
Couchbase Mobile
Couchbase Mobile / Couchbase Lite is a mobile database providing data replication.
Couchbase Lite (originally TouchDB) provides native libraries for offline-first NoSQL databases with built-in peer-to-peer or client-server replication mechanisms.
Sync Gateway manages secure access and synchronization of data between Couchbase Lite and Couchbase Server.
Uses
Couchbase began as an evolution of Memcached, a high-speed data cache, and can be used as a drop-in replacement for Memcached, providing high availability for Memcached applications without code changes.
Couchbase is used to support applications where a flexible data model, easy scalability, and consistent high performance are required, such as tracking real-time user activity or providing a store of user preferences or online applications.
Couchbase Mobile, which stores data locally on devices (usually mobile devices), is used to create “offline-first” applications that can operate when a device is not connected to a network and synchronize with Couchbase Server once a network connection is re-established.
The Catalyst Lab at Northwestern University uses Couchbase Mobile to support the Evo application, a healthy lifestyle research program where data is used to help participants improve dietary quality, physical activity, stress, or sleep.
Amadeus uses Couchbase with Apache Kafka to support their “open, simple, and agile” strategy to consume and integrate data on loyalty programs for airline and other travel partners. High scalability is needed when disruptive travel events create a need to recognize and compensate high value customers.
Starting in 2012, Couchbase played a role in LinkedIn's caching systems, including backend caching for recruiter and jobs products, counters for security defense mechanisms, and internal applications.
Alternatives
For caching, Couchbase competes with Memcached and Redis.
For document databases, Couchbase competes with other document-oriented database systems. It is commonly compared with MongoDB, Amazon DynamoDB, Oracle RDBMS, DataStax, Google Bigtable, MariaDB, IBM Cloudant, Redis Enterprise, SingleStore, and MarkLogic.