A centralized database (sometimes abbreviated CDB) is a database that is located, stored, and maintained in a single location. This location is most often a central computer or database system, for example a desktop or server CPU, or a mainframe computer. In most cases, a centralized database is used by an organization (e.g. a business) or an institution (e.g. a university). Users access a centralized database through a computer network that connects them to the central computer, which in turn maintains the database itself.
Historical context
The need for databases arose in the 1960s with the invention of direct access storage, which allowed users to access records directly. Previously, computer systems were tape-based, meaning records could only be accessed sequentially.
Organizations quickly adopted databases for storage and retrieval of data. The traditional approach for storing data was to use a centralized database, and users would query the data from various points over a network.
One example is the Australian Department of Defence, which centralized its databases in the mid-1970s.
Advantages
Centralized databases offer several advantages over other types of databases, including:
* Data integrity is maximized and data redundancy is minimized, because storing all the data in a single place means that a given set of data has only one primary record. This helps keep data as accurate and consistent as possible and enhances data reliability.
* The central host computer can be more easily protected from unauthorized access.
* Data portability and database administration are generally easier.
* Data kept in the same location is easier to change, reorganize, mirror, or analyze.
* Transactions can more easily comply with the ACID properties (atomicity, consistency, isolation, durability).
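The last advantage can be illustrated with a minimal sketch. It assumes a single SQLite database (Python's standard `sqlite3` module) stands in for the central store; the `accounts` table and the `transfer` and `balances` helpers are hypothetical names chosen for this example. Because every row lives in one database, one local transaction is enough to make a two-step update atomic.

```python
import sqlite3

# A single in-memory SQLite database stands in for the one central store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"  # one primary record per account
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src has insufficient funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` fails on the second update, and the rollback also undoes the credit already applied to `bob`, so no partial update is ever visible. In a distributed setting the same guarantee would generally require coordination between sites.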
Disadvantages
Centralized databases also have limitations, such as:
* Access speed is limited by network speed.
* The central computer is a single point of failure: if it experiences downtime, users cannot access any data.
* If there is no fault-tolerant setup and a hardware failure occurs, all the data within the database can be lost.
* If an unauthorized party gains access to the central computer, all of the data can be compromised.
* Scaling is difficult, as the central computer would need to be replaced in order to scale up.
Centralized databases vs. Distributed databases
The underlying idea of a centralized database is that it should be able, by itself, to receive, maintain, and complete every request that the main system must perform. There is only one database file, kept at a single location on a given network.
A distributed database, however, is a database in which the information is stored across multiple physical locations. Distributed databases are divided into two groups: homogeneous and heterogeneous. A distributed database relies on replication and duplication within its multiple sub-databases to keep its records up to date. It is composed of multiple database files, all controlled by a central DBMS.
The main differences between centralized and distributed databases arise due to their respective basic characteristics. Differences include but are not limited to:
* Centralized databases store data on a single computer bound to a single physical or geographical location. Distributed databases, however, rely on a central DBMS that remotely manages all of their storage devices, which do not need to be kept in the same physical or geographical location.
* As outlined above, centralized databases are easier to keep up to date than distributed databases. Distributed databases require additional (often manual) work to keep the stored data current, to avoid data redundancy, and to improve overall performance.
* If data is lost in a centralized system, retrieving it is much harder. If data is lost in a distributed system, however, retrieving it is typically much easier, because a copy of the data is kept in a different location of the database.
* Designing a centralized database is generally much less complex than designing a distributed database, as distributed database systems are based on a hierarchical structure.
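Two of the differences listed above, the extra work needed to keep copies up to date and the easier recovery after data loss, can be sketched with a toy model. This is an illustration under simplifying assumptions rather than a description of any real DBMS: plain dictionaries stand in for storage sites, every write is copied to every replica, and the class and method names are hypothetical.

```python
class CentralStore:
    """All records live at one site (a single dict stands in for it)."""

    def __init__(self):
        self.site = {}

    def write(self, key, value):
        self.site[key] = value  # one copy to maintain

    def read(self, key):
        return self.site.get(key)

    def lose_site(self):
        self.site = {}  # simulated hardware failure with no backup


class ReplicatedStore:
    """Every record is copied to each of several sites (assumed full replication)."""

    def __init__(self, n_sites=3):
        self.sites = [{} for _ in range(n_sites)]

    def write(self, key, value):
        for site in self.sites:  # extra work: every replica must be updated
            site[key] = value

    def read(self, key):
        for site in self.sites:  # any surviving replica can answer
            if key in site:
                return site[key]
        return None

    def lose_site(self, index):
        self.sites[index] = {}  # simulated failure of one site
```

After `lose_site()` the central store can no longer answer a read for a previously written key, while the replicated store still can after losing one site, at the cost of one write per replica for every update.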
See also
* Database
* Distributed database
* Parallel database
* Centralized computing
* Centralization