TokuDB

TokuDB is an open-source, high-performance storage engine for MySQL and MariaDB. It achieves its performance by using a fractal tree index. It is scalable, ACID- and MVCC-compliant, provides indexing-based query improvements, offers online schema modifications, and reduces replication lag for both hard disk drives and flash memory. TokuDB is included in Percona Server, MariaDB and Nagios-based opmon. However, it is deprecated in Percona Server 8 and MariaDB 10.5.


Fractal tree indexes


Overview

TokuDB uses a fractal tree index, a tree data structure that keeps data sorted and allows searches and sequential access in the same time as a B-tree, but with insertions and deletions that are asymptotically faster than a B-tree. Fractal trees also allow messages to be injected into the tree in such a way that schema changes (such as adding or dropping a column, or adding an index) can be done online and in the background. As a result, more indexes can be maintained without a drop in performance: adding data to many indexes tends to strain B-trees, whereas fractal tree indexes handle it well.
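
The idea of injecting messages into a tree can be illustrated with a toy Python sketch. This is not TokuDB's implementation: the class `BufferedTree`, its `buffer_capacity` parameter and the `leaf_writes` counter are hypothetical names chosen for illustration. The sketch assumes a single root buffer above a fixed set of in-memory leaves, whereas a real fractal tree index keeps buffers throughout a multi-level, disk-resident tree.

```python
import bisect


class BufferedTree:
    """Toy two-level tree: a root message buffer above sorted leaves."""

    def __init__(self, pivots, buffer_capacity=4):
        self.pivots = sorted(pivots)      # keys that separate the leaves
        self.leaves = [dict() for _ in range(len(self.pivots) + 1)]
        self.buffer = []                  # pending (op, key, value) messages
        self.buffer_capacity = buffer_capacity
        self.leaf_writes = 0              # number of times a leaf was rewritten

    def _leaf_index(self, key):
        # Keys below the first pivot go to leaf 0, and so on.
        return bisect.bisect_right(self.pivots, key)

    def insert(self, key, value):
        self._enqueue(("put", key, value))

    def delete(self, key):
        self._enqueue(("del", key, None))

    def _enqueue(self, message):
        # A write only appends a message; leaves are untouched until a flush.
        self.buffer.append(message)
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        # Group messages by target leaf so each leaf is rewritten once per flush.
        batches = {}
        for message in self.buffer:
            batches.setdefault(self._leaf_index(message[1]), []).append(message)
        for index, messages in batches.items():
            leaf = self.leaves[index]
            for op, key, value in messages:
                if op == "put":
                    leaf[key] = value
                else:
                    leaf.pop(key, None)
            self.leaf_writes += 1
        self.buffer.clear()

    def get(self, key):
        # The newest buffered message for a key overrides the leaf contents.
        for op, buffered_key, value in reversed(self.buffer):
            if buffered_key == key:
                return value if op == "put" else None
        return self.leaves[self._leaf_index(key)].get(key)

    def items(self):
        # Sequential access: apply pending messages, then walk leaves in order.
        self.flush()
        for leaf in self.leaves:
            yield from sorted(leaf.items())
```

With `BufferedTree(pivots=[100], buffer_capacity=4)`, inserting the keys 5, 150, 7 and 120 triggers one flush that rewrites each of the two leaves once, so four inserts cost two leaf writes in this toy model. Batching of this kind is what the sketch is meant to convey; the actual cost analysis of fractal tree indexes is more involved.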


Uses

Fractal tree indexes can be applied to a number of applications characterized by near-real-time analysis of streaming data. They can be used as the storage layer of a database or as the storage layer of a file system. When used in a database, they can be used in any setting where a B-tree is used, with improved performance. Examples include network event management, online advertising networks, clickstream analytics, and air traffic control management. Other uses include accelerated crawler performance for search engines for social media sites. They can also be used to create indexes and columns online, enabling query flexibility for e-commerce personalization, and are suited to improving performance and reducing existing loads on transactional websites. In general, they perform well in applications that must simultaneously store log file data and execute ''ad hoc'' queries.


Origins

This approach to building memory-efficient systems was originally developed jointly by researchers at the Massachusetts Institute of Technology, Rutgers University, and Stony Brook University.


Role in the big data market

TokuDB has been named as one of the technologies that enable big data in MySQL. Tokutek was a Startup Showcase Finalist at the O'Reilly Strata Conference 2012 on big data.


See also

* Comparison of MySQL database engines
* NewSQL
* Database engine
* TokuMX


External links

* TokuTek website before it was acquired by Percona, from the Wayback Machine
* DBMS2.com Overview of Tokutek
* TokuTek organization on GitHub