
Apache Cassandra is a free and open-source database management system designed to handle large volumes of data across multiple commodity servers. The system prioritizes availability and scalability over consistency, and its LSM tree storage layer makes it particularly suited for systems with high write throughput requirements. As a wide-column database, Cassandra supports flexible schemas and efficiently handles data models with numerous sparse columns. The system is optimized for applications with well-defined data access patterns that can be incorporated into the schema design. Cassandra supports computer clusters which may span multiple data centers, featuring asynchronous and masterless replication. It enables low-latency operations for all clients and incorporates Amazon's Dynamo distributed storage and replication techniques, combined with Google's Bigtable data storage engine model.


History

Avinash Lakshman, a co-author of Amazon's Dynamo, and Prashant Malik developed Cassandra at Facebook to support the inbox search functionality. Facebook released Cassandra as open-source software on Google Code in July 2008. In March 2009, it became an Apache Incubator project, and on February 17, 2010, it graduated to a top-level project. The developers at Facebook named their database after Cassandra, the mythological Trojan prophetess, referencing her curse of making prophecies that were never believed.


Features and limitations

Cassandra uses a distributed architecture where all nodes perform identical functions, eliminating single points of failure. The system employs configurable replication strategies to distribute data across clusters, providing redundancy and disaster recovery capabilities. The system is capable of linear scaling, which increases read and write throughput with the addition of new nodes, while maintaining continuous service. Cassandra is categorized as an AP (Availability and Partition Tolerance) system, emphasizing availability and partition tolerance over consistency. While it offers tunable consistency levels for both read and write operations, its architecture makes it less suitable for use cases requiring strict consistency guarantees. Additionally, Cassandra's compatibility with Hadoop and related tools allows for integration with existing big data processing workflows. Eventual consistency is maintained using tombstones to manage reads, upserts, and deletes.

The system's query capabilities have notable limitations. Cassandra does not support advanced query patterns such as multi-table JOINs, ad hoc aggregations, or complex queries. These limitations stem from its distributed architecture, which optimizes for scalability and availability rather than complex query operations.
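The tunable consistency levels mentioned above can be illustrated with a little arithmetic. The sketch below is not Cassandra code; the function names are hypothetical. It assumes the usual rule of thumb that a read is guaranteed to overlap the latest acknowledged write when the number of replicas consulted on read plus the number acknowledged on write exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_acks: int,
                           read_acks: int) -> bool:
    """True when every read replica set must overlap every write replica set."""
    return read_acks + write_acks > replication_factor


rf = 3  # assumed replication factor for illustration
print(quorum(rf))
# QUORUM writes combined with QUORUM reads overlap on at least one replica.
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))
# Writing and reading at level ONE trades that guarantee for lower latency.
print(reads_see_latest_write(rf, 1, 1))
```

With a replication factor of 3, a quorum is 2 replicas, so quorum reads and writes always share a replica, whereas single-replica reads and writes may not.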


Data model

As a wide-column store, Cassandra combines features of both key-value and tabular database systems. It implements a partitioned row store model with adjustable consistency levels. The following table compares Cassandra and relational database management systems (RDBMS). The data model consists of several hierarchical components:


Keyspace

A keyspace in Cassandra is analogous to a database in relational systems. It contains multiple tables and manages configuration information, including replication strategy and user-defined types (UDTs).


Tables

Tables (formerly called column families prior to CQL 3) are containers for rows of data. Each table has a name and configuration information for its stored data. Tables may be created, dropped, or altered at run-time without blocking updates and queries.


Rows and columns

Each row is identified by a primary key and contains columns. The first component of a table's primary key is the partition key; within a partition, rows are clustered by the remaining columns of the key. Columns contain data belonging to a row and consist of:

* A name
* A type
* A value
* Timestamp metadata (used for write conflict resolution via "last write wins")

Unlike traditional RDBMS tables, rows within the same table can have varying columns, providing a flexible structure. This flexibility distinguishes Cassandra from relational databases, as not all columns need to be specified for each row. Other columns may be indexed separately from the primary key.
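The "last write wins" rule and the sparse-row structure described above can be sketched in a few lines. This is an illustrative model only: the `Cell` type and `merge_rows` function are hypothetical names, timestamps are plain integers, and ties simply keep the existing cell, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # write time, e.g. microseconds since the epoch


def merge_rows(a: Dict[str, Cell], b: Dict[str, Cell]) -> Dict[str, Cell]:
    """Merge two versions of a row column by column; the newer timestamp wins."""
    merged = dict(a)
    for name, cell in b.items():
        current = merged.get(name)
        if current is None or cell.timestamp > current.timestamp:
            merged[name] = cell
    return merged


# Two replicas hold different, partially overlapping sets of columns.
replica_1 = {"first_name": Cell("John", 100), "last_name": Cell("Doe", 100)}
replica_2 = {"last_name": Cell("Smith", 250), "email": Cell("j@example.com", 180)}

row = merge_rows(replica_1, replica_2)
print({name: cell.value for name, cell in row.items()})
```

Because conflicts are resolved per column, the merged row keeps `first_name` from one replica and takes the newer `last_name` and the extra `email` column from the other.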


Storage model

Cassandra uses a Log Structured Merge Tree (LSM tree) index to optimize write throughput, in contrast to the B-tree indexes used by most databases. The storage architecture consists of three main components:


Core components

* Commit Log: A write-ahead log that ensures write durability
* Memtable: An in-memory data structure that stores writes, sorted by primary key
* SSTable (Sorted String Table): Immutable files containing data flushed from Memtables


Write and read processes

Write operations follow a two-stage process:

1. The write is recorded in the commit log and added to the Memtable
2. When the Memtable reaches size or time thresholds, it flushes to an SSTable

Read operations:

1. Check Memtable for latest data
2. Search SSTables from newest to oldest using bloom filters for efficiency
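The write and read steps above can be modelled with a toy in-memory structure. This is a minimal sketch under stated assumptions, not Cassandra's storage engine: `TinyLSM` and its methods are hypothetical, the flush threshold counts keys rather than bytes or time, the commit log is a plain list, and bloom filters are omitted in favour of a binary search in each SSTable.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class TinyLSM:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit
        self.commit_log: List[Tuple[str, str]] = []       # stand-in for the write-ahead log
        self.memtable: Dict[str, str] = {}                # in-memory writes
        self.sstables: List[List[Tuple[str, str]]] = []   # immutable sorted runs, oldest first

    def write(self, key: str, value: str) -> None:
        # Stage 1: record the write in the commit log and the memtable.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        # Stage 2: flush once the memtable reaches its (assumed) size threshold.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        if not self.memtable:
            return
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying

    def read(self, key: str) -> Optional[str]:
        # Step 1: the memtable holds the latest data.
        if key in self.memtable:
            return self.memtable[key]
        # Step 2: search SSTables from newest to oldest.
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.write("a", "1")
db.write("b", "2")   # second key triggers a flush to the first SSTable
db.write("a", "3")   # newer value for "a" stays in the memtable
print(db.read("a"), db.read("b"), db.read("missing"))
```

Writes never modify an existing SSTable; a newer value simply shadows the older one because reads consult the memtable first and then the most recent SSTables.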


Data management


Tombstones

Every operation (create/update/delete) generates a new entry, with deletes handled via "tombstones". While common in many databases, tombstones can cause performance degradation in delete-heavy workloads.


Compaction

Compaction consolidates multiple SSTables to:

* Reduce storage usage
* Remove deleted row tombstones
* Improve read performance
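The interaction between tombstones and compaction can be sketched as a merge that keeps only the newest entry per key and then discards deletion markers. The names below (`TOMBSTONE`, `compact`) are hypothetical, and the sketch drops tombstones immediately, whereas a real system would typically retain them for a grace period so that all replicas learn about the delete.

```python
from typing import Dict, List, Optional, Tuple

TOMBSTONE = None                    # marker meaning "this key was deleted"
Entry = Tuple[int, Optional[str]]   # (timestamp, value or TOMBSTONE)


def compact(sstables: List[Dict[str, Entry]]) -> Dict[str, Entry]:
    """Merge several SSTables into one, keeping the newest entry per key."""
    newest: Dict[str, Entry] = {}
    for table in sstables:
        for key, entry in table.items():
            if key not in newest or entry[0] > newest[key][0]:
                newest[key] = entry
    # Drop keys whose newest entry is a tombstone; keep output sorted by key.
    return {key: entry for key, entry in sorted(newest.items())
            if entry[1] is not TOMBSTONE}


older = {"a": (1, "x"), "b": (1, "y")}
newer = {"a": (2, TOMBSTONE), "c": (2, "z")}   # "a" was deleted after being written
print(compact([older, newer]))
```

The merged table is smaller than its inputs and a later read needs to consult one file instead of two, which corresponds to the storage and read-performance goals listed above.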


Cassandra Query Language

Cassandra Query Language (CQL) is the interface for accessing Cassandra, as an alternative to the traditional Structured Query Language (SQL). CQL adds an abstraction layer that hides implementation details of the underlying storage structure and provides native syntaxes for collections and other common encodings. Language drivers are available for Java (JDBC), Python (DBAPI2), Node.js (DataStax), Go (gocql), and C++.

The keyspace in Cassandra is a namespace that defines data replication across nodes; replication is therefore defined at the keyspace level. Below is an example of keyspace creation, including a column family, in CQL 3.0:

    -- The replication map is missing from the source text; the
    -- SimpleStrategy settings below are an assumed placeholder.
    CREATE KEYSPACE MyKeySpace
      WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

    USE MyKeySpace;

    CREATE COLUMNFAMILY MyColumns (id text, lastName text, firstName text, PRIMARY KEY(id));

    INSERT INTO MyColumns (id, lastName, firstName) VALUES ('1', 'Doe', 'John');

    SELECT * FROM MyColumns;

Which gives:

     id | lastName | firstName
    ----+----------+----------
      1 |      Doe |      John

    (1 rows)


Distributed architecture


Gossip protocol

Cassandra uses a peer-to-peer gossip protocol for cluster communication. Nodes routinely exchange information about cluster state, including:

* Node availability status
* Schema versions
* Generation timestamps (node bootstrap time)
* Version numbers (logical clock values)

The system uses vector clocks to track information currency and ignore outdated state data.
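How a node might ignore outdated state during a gossip exchange can be sketched with the generation and version numbers listed above. This is a simplified model rather than Cassandra's implementation: `EndpointState` and `merge_gossip` are hypothetical names, and freshness is decided by comparing (generation, version) pairs, assuming a higher generation means the node restarted more recently.

```python
from typing import Dict, NamedTuple


class EndpointState(NamedTuple):
    generation: int   # node bootstrap time
    version: int      # logical clock, increases with each state change
    status: str       # e.g. "UP" or "DOWN"


def merge_gossip(local: Dict[str, EndpointState],
                 remote: Dict[str, EndpointState]) -> Dict[str, EndpointState]:
    """Keep, for each node, whichever state is more recent."""
    merged = dict(local)
    for node, theirs in remote.items():
        ours = merged.get(node)
        if ours is None or (theirs.generation, theirs.version) > (ours.generation, ours.version):
            merged[node] = theirs
    return merged


local_view = {
    "node-a": EndpointState(generation=100, version=7, status="UP"),
    "node-b": EndpointState(generation=100, version=3, status="UP"),
}
remote_view = {
    "node-a": EndpointState(generation=100, version=5, status="DOWN"),  # stale, ignored
    "node-b": EndpointState(generation=200, version=1, status="UP"),    # restarted, newer
    "node-c": EndpointState(generation=150, version=2, status="UP"),    # previously unknown
}
for node, state in sorted(merge_gossip(local_view, remote_view).items()):
    print(node, state)
```

Repeating such pairwise merges between randomly chosen peers lets every node converge on the same view of the cluster without a central coordinator.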


Seed nodes

The architecture designates certain nodes as "seed" nodes that:

* Bootstrap the cluster
* Serve as guaranteed gossip communication points
* Prevent cluster fragmentation
* Remain discoverable via service discovery methods

This design eliminates single points of failure while maintaining cluster-wide consistency of operational knowledge.


Fault tolerance

Cassandra employs the Phi Accrual Failure Detector to manage node failures during cluster operation. Through this system, each node independently assesses the availability of other nodes during gossip communication. When a node fails to respond, it is "convicted" and removed from write operations, though it can rejoin the cluster upon resuming heartbeat signals. To maintain data integrity during node outages, Cassandra uses a "hinted handoff" mechanism. When writing to an offline node, the coordinator node temporarily stores the write data as a "hint." Once the offline node returns to service, these hints are forwarded to restore data consistency. Notably, Cassandra only permanently removes nodes through explicit administrative decommissioning or rebuilding, preventing temporary communication failures or restarts from triggering unnecessary data rebalancing.
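The idea behind an accrual failure detector can be sketched as follows. Instead of a fixed timeout, the detector turns the time since the last heartbeat into a suspicion level, phi, that grows the longer a node stays silent relative to its usual heartbeat interval. The sketch assumes exponentially distributed heartbeat intervals, so that phi is the negative base-10 logarithm of the probability of a gap at least this long; the class name, the threshold of 8, and the window size are illustrative assumptions, not a description of Cassandra's code.

```python
import math
from collections import deque
from typing import Deque, Optional


class PhiAccrualDetector:
    def __init__(self, threshold: float = 8.0, window: int = 1000) -> None:
        self.threshold = threshold                       # assumed conviction threshold
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last_heartbeat: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat and the interval since the previous one."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now: float) -> float:
        """Suspicion level: -log10 of P(gap >= elapsed) under an exponential model."""
        if not self.intervals or self.last_heartbeat is None:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        return elapsed / (mean * math.log(10))

    def is_convicted(self, now: float) -> bool:
        return self.phi(now) > self.threshold


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):        # heartbeats arriving once per second
    detector.heartbeat(t)

print(round(detector.phi(4.0), 3), detector.is_convicted(4.0))    # one interval late
print(round(detector.phi(23.0), 3), detector.is_convicted(23.0))  # twenty intervals late
```

Because phi scales with the observed mean interval, a node on a slow link is given proportionally more time before it is convicted, and a new heartbeat resets the suspicion so the node can rejoin.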


Management and monitoring

Cassandra is a Java-based system that can be managed and monitored via Java Management Extensions (JMX). The JMX-compliant Nodetool utility, for instance, can be used to manage a Cassandra cluster. Nodetool also offers a number of commands to return Cassandra metrics pertaining to disk usage, latency, compaction, garbage collection, and more. Since the release of Cassandra 2.0.2 in 2013, several metrics are produced via the Dropwizard metrics framework, and may be queried via JMX using tools such as JConsole or passed to external monitoring systems via Dropwizard-compatible reporter plugins.


Releases

Releases after graduation include:


See also

* Bigtable – Original distributed database by Google
* Distributed database
* Distributed hash table (DHT)
* Dynamo (storage system) – Cassandra borrows many elements from Dynamo

