TokuDB is an open-source, high-performance storage engine for MySQL and MariaDB. It achieves this performance by using a fractal tree index. It is scalable, ACID and MVCC compliant, provides indexing-based query improvements, offers online schema modifications, and reduces replication lag for both hard disk drives and flash memory.
TokuDB is included in Percona Server, MariaDB, and the Nagios-based OpMon. However, it is deprecated in Percona Server 8 and MariaDB 10.5.
Fractal tree indexes
Overview
TokuDB uses a fractal tree index, a tree data structure that keeps data sorted and allows searches and sequential access in the same time as a B-tree, but with insertions and deletions that are asymptotically faster than those of a B-tree. Fractal trees also allow messages to be injected into the tree in such a way that schema changes (such as adding or dropping a column, or adding an index) can be done online and in the background.
As a result, more indexes can be maintained without a drop in performance. Adding data to indexes tends to stress the performance of B-trees, whereas fractal tree indexes handle the same insertions well.
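The buffering idea behind this behaviour can be illustrated with a small toy model. The sketch below is not TokuDB's implementation and does not reproduce a real fractal tree; it assumes a single in-memory message buffer in front of a fixed set of leaves, and the names `BufferedIndex`, `pivots`, `buffer_capacity` and `leaf_writes` are hypothetical. It only shows how queuing insert and delete messages and applying them in batches can reduce the number of leaf writes compared with writing each change immediately.

```python
import bisect


class BufferedIndex:
    """Toy model of message buffering; illustrative only, not TokuDB code."""

    def __init__(self, pivots, buffer_capacity=4):
        # Leaf i holds keys below pivots[i]; the last leaf holds the rest.
        self.pivots = sorted(pivots)
        self.leaves = [dict() for _ in range(len(self.pivots) + 1)]
        self.buffer = []            # pending (op, key, value) messages, oldest first
        self.buffer_capacity = buffer_capacity
        self.leaf_writes = 0        # stand-in for the number of disk writes

    def _leaf_for(self, key):
        return bisect.bisect_right(self.pivots, key)

    def insert(self, key, value):
        self._enqueue(("put", key, value))

    def delete(self, key):
        self._enqueue(("del", key, None))

    def _enqueue(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        # Group pending messages by destination leaf, then apply each group
        # in arrival order and count one write per touched leaf.
        by_leaf = {}
        for message in self.buffer:
            by_leaf.setdefault(self._leaf_for(message[1]), []).append(message)
        for leaf_id, messages in by_leaf.items():
            leaf = self.leaves[leaf_id]
            for op, key, value in messages:
                if op == "put":
                    leaf[key] = value
                else:
                    leaf.pop(key, None)
            self.leaf_writes += 1
        self.buffer.clear()

    def get(self, key):
        # The newest buffered message for a key overrides the leaf contents.
        for op, k, value in reversed(self.buffer):
            if k == key:
                return value if op == "put" else None
        return self.leaves[self._leaf_for(key)].get(key)

    def items(self):
        # Apply pending messages, then read the leaves in key order.
        self.flush()
        return [item for leaf in self.leaves for item in sorted(leaf.items())]


index = BufferedIndex(pivots=[50], buffer_capacity=4)
for key, value in [(10, "a"), (60, "b"), (20, "c"), (70, "d")]:
    index.insert(key, value)
print(index.leaf_writes, index.items())
```

In this toy run, four inserts are applied with two leaf writes, because the messages for each leaf are written together. A real fractal tree index keeps buffers at the internal nodes of a multi-level tree, which this sketch does not attempt to model.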
Uses
Fractal tree indexes can be applied to a number of applications characterized by near-real time analysis of streaming data. They can be used as the storage layer of a database or as the storage layer of a file system. When used in a database, they can be used in any setting where a B-tree is used, with improved performance. Examples include network event management, online advertising networks, clickstream analytics, and air traffic control management.
Other uses include accelerated crawler performance for search engines for social media sites. It can also be used to create indexes and columns online, enabling query flexibility for e-commerce personalization. It is also suited to improving performance and reducing existing loads on transactional websites. In general, it performs well in applications that must simultaneously store log file data and execute ad hoc queries.
Origins
This approach to building memory-efficient systems was originally jointly developed by researchers at the Massachusetts Institute of Technology, Rutgers University, and Stony Brook University.
Role in the big data market
TokuDB is named as one of the technologies that enable big data in MySQL.
Tokutek was a Startup Showcase Finalist at the O'Reilly Strata Conference 2012 on big data.
See also
* Comparison of MySQL database engines
* NewSQL
* Database engine
External links
* TokuTek website before it was acquired by Percona, from the Wayback Machine
* DBMS2.com Overview of Tokutek
* TokuTek organization on GitHub