WebScaleSQL
   HOME

TheInfoList



OR:

WebScaleSQL was an open-source relational database management system (RDBMS) created as a software branch of the production-ready community releases of
MySQL MySQL () is an open-source relational database management system (RDBMS). Its name is a combination of "My", the name of co-founder Michael Widenius's daughter My, and "SQL", the acronym for Structured Query Language. A relational database ...
. By joining efforts of a few companies and incorporating various changes and new features into MySQL, WebScaleSQL aimed toward fulfilling various needs arising from the deployment of MySQL in large-scale environments, which involve large amounts of data and numerous
database server A database server is a server which uses a database application that provides database services to other computer programs or to computers, as defined by the client–server model. Database management systems (DBMSs) frequently provide database-s ...
s. The
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the w ...
of WebScaleSQL is hosted on
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continu ...
and licensed under the terms of version 2 of the
GNU General Public License The GNU General Public License (GNU GPL or simply GPL) is a series of widely used free software licenses that guarantee end users the four freedoms to run, study, share, and modify the software. The license was the first copyleft for general ...
. The project website announced in December 2016 that the companies involved would no longer contribute to the project.


Overview

Running MySQL on numerous servers with large amounts of data, at the scale of
terabyte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
s and
petabyte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
s of data, creates a set of difficulties that in many cases arise the need for implementing specific customized MySQL features, or the need for introducing functional changes to MySQL. More than a few companies have faced the same (or very similar) set of difficulties in their
production environment In software deployment, an environment or tier is a computer system or set of systems in which a computer program or software component is deployed and executed. In simple cases, such as developing and immediately executing a program on the same m ...
s, which used to result in the availability of multiple solutions for similar challenges. WebScaleSQL was announced on March 27, 2014 as a joint effort of
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dustin Mosk ...
,
Google Google LLC () is an American Multinational corporation, multinational technology company focusing on Search Engine, search engine technology, online advertising, cloud computing, software, computer software, quantum computing, e-commerce, ar ...
,
LinkedIn LinkedIn () is an American business and employment-oriented online service that operates via websites and mobile apps. Launched on May 5, 2003, the platform is primarily used for professional networking and career development, and allows job se ...
and
Twitter Twitter is an online social media and social networking service owned and operated by American company Twitter, Inc., on which users post and interact with 280-character-long messages known as "tweets". Registered users can post, like, and ...
(with Alibaba Group joining in January 2015), aiming to provide a centralized development structure for extending MySQL with new features specific to its large-scale deployments, such as building large replicated databases running on server farms. As a result, WebScaleSQL attempted to open a path toward deduplicating the efforts each founding company had been putting into maintaining its own branch of MySQL, and toward bringing together more developers. WebScaleSQL was created as a
branch A branch, sometimes called a ramus in botany, is a woody structural member connected to the central trunk of a tree (or sometimes a shrub). Large branches are known as boughs and small branches are known as twigs. The term ''twig'' usually ...
of the MySQL's latest production-ready community release, which was version 5.6 . As the project aimed to tightly follow new MySQL community releases, a branching path was selected instead of becoming a
software fork In software engineering, a project fork happens when developers take a copy of source code from one software package and start independent development on it, creating a distinct and separate piece of software. The term often implies not merely ...
of MySQL. The selection of MySQL production-ready community releases for the WebScaleSQL's upstream, instead of selecting some of the available MySQL forks was the result of a consensus between the four founding companies, which concluded that the features already existing in MySQL 5.6 are suitable for large-scale deployments, while additional features of the same kind are planned for MySQL 5.7.


Features

The initial changes and feature additions that WebScaleSQL introduced to the MySQL 5.6
codebase In software development, a codebase (or code base) is a collection of source code used to build a particular software system, application, or software component. Typically, a codebase includes only human-written source code files; thus, a codeb ...
came from the engineers employed by the four founding companies; however, the project was open to peer-reviewed community contributions. , available new features and changes included the following: * A software framework that provides automated testing of all proposed changes * A customized suite of database performance tests * Various changes to the
automated test In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive bu ...
s provided by the MySQL community releases * Performance improvements in various areas, including buffer pool flushing, execution of certain types of SQL queries, and support for
NUMA Nuclear mitotic apparatus protein 1 is a protein that in humans is encoded by the ''NUMA1'' gene. Interactions Nuclear mitotic apparatus protein 1 has been shown to interact with PIM1, Band 4.1, GPSM2 G-protein-signaling modulator 2, also call ...
architectures * Changes related to large-scale deployments, such as the ability to specify sub-second
client Client(s) or The Client may refer to: * Client (business) * Client (computing), hardware or software that accesses a remote service on another computer * Customer or client, a recipient of goods or services in return for monetary or other valuabl ...
timeouts * Performance and reliability improvements to the global transaction identifier (GTID) feature of MySQL 5.6 * So-called super_read_only operation mode for the MySQL server, which disables data modification operations even for privileged database accounts , planned new features and changes included the following: * New asynchronous MySQL client that will eliminate the client-side waiting while establishing
database connection A database connection is a facility in computer science that allows client software to talk to database server software, whether on the same machine or not. A connection is required to send commands and receive answers, usually in the form of a r ...
s, sending queries, and receiving their results * Availability of various
table Table may refer to: * Table (furniture), a piece of furniture with a flat surface and one or more legs * Table (landform), a flat area of land * Table (information), a data arrangement with rows and columns * Table (database), how the table data ...
, user and
compression Compression may refer to: Physical science *Compression (physics), size reduction due to forces *Compression member, a structural element such as a column *Compressibility, susceptibility to compression * Gas compression *Compression ratio, of a ...
statistics * Changes to the internal compression mechanisms * Addition of a logical read-ahead mechanism that will bring significant performance improvements for
full table scan A full table scan (also known as a sequential scan) is a scan made on a database where each Row (database), row of the Table (database), table is read in a sequential (serial) order and the columns encountered are checked for the validity of a cond ...
s


Availability

WebScaleSQL is distributed in a source-code-only form, with no official binaries available. ,
compiling In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
the source code and running WebScaleSQL is supported only on
x86-64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging ...
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, w ...
hosts, requiring at the same time a
toolchain In software, a toolchain is a set of programming tools that is used to perform a complex software development task or to create a software product, which is typically another computer program or a set of related programs. In general, the tools for ...
that supports
C99 C99 (previously known as C9X) is an informal name for ISO/IEC 9899:1999, a past version of the C programming language standard. It extends the previous version ( C90) with new features for the language and the standard library, and helps impl ...
and
C++11 C++11 is a version of the ISO/ IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions b ...
language standards. The source code is hosted on GitHub and available under version 2 of the GNU General Public License ( GPL v2).


End of Contributions

In December 2016, the WebScaleSQL website announced the companies originally involved in collaborating on the project (Facebook, Google, LinkedIn, Twitter, and Alibaba) would no longer contribute to the project. The announcement blamed differences among the needs of the various companies for the end of the collaboration.


See also

*
Comparison of relational database management systems The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are ba ...


References


External links

* * {{Database 2014 software Client-server database management systems Free database management systems Linux-only free software MySQL Relational database management software for Linux