In database normalization, unnormalized form (UNF), also known as an unnormalized relation or non-first normal form (N1NF or NF²), is a database data model (organization of data in a database) which does not meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data are sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization.
It should not be confused with denormalization, where normalization is deliberately compromised for selected tables in a relational database.
History
In 1970, E. F. Codd proposed the relational data model, now widely accepted as the standard data model. At that time, office automation was the major use of data storage systems, which resulted in the proposal of many NF² data models like the Schek model, Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model. IBM organized the first international workshop exclusively on this topic in 1987, which was held in Darmstadt, Germany.
Moreover, a considerable body of research has been published to address the shortcomings of the relational model. Since the turn of the century, NoSQL databases have become popular owing to the demands of Web 2.0.
Relational form
Normalization to first normal form requires the initial data to be viewed as relations. In database systems relations are represented as tables. The relation view implies some constraints on the tables:
* No duplicate rows. In practice, this is ensured by defining one or more columns as primary keys.
* Rows do not have an intrinsic order. While tables have to be stored and presented in some order, this order is unstable and implementation-dependent. If a specific ordering needs to be represented, it has to be in the form of data, e.g. a "number" column.
* Columns have unique names within the same table.
* Each column has a domain (or data type) which defines the allowed values in the column.
* All rows in a table have the same set of columns.
This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference from first normal form.
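The constraints listed above can be sketched as a small check. This is only an illustration, assuming a table is modelled as a Python list of dicts; the function names `is_relation` and `is_first_normal_form` and the sample data are hypothetical and not taken from any database system. Column domains are not checked here, and unique column names follow automatically from the use of dict keys.

```python
def is_relation(rows):
    """Check the relation view: same columns in every row, no duplicate rows."""
    if not rows:
        return True
    columns = set(rows[0])
    # All rows in a table must have the same set of columns.
    if any(set(row) != columns for row in rows):
        return False
    # No duplicate rows; the position of a row in the list is ignored.
    for i, row in enumerate(rows):
        if any(row == other for other in rows[i + 1:]):
            return False
    return True


def is_first_normal_form(rows):
    """A relation is in 1NF only if no column value is a set or a nested table."""
    nested_types = (list, set, frozenset, dict)
    return is_relation(rows) and not any(
        isinstance(value, nested_types)
        for row in rows
        for value in row.values()
    )


# Hypothetical data: the "transactions" column holds a nested table.
customers = [
    {"customer": "Alice", "transactions": [{"tr_id": 1, "amount": 10}]},
    {"customer": "Bob", "transactions": []},
]

relation_ok = is_relation(customers)            # a valid relation
first_nf_ok = is_first_normal_form(customers)   # but not in first normal form
```

Under these assumptions the sample table passes the relation check but fails the first normal form check, because one of its columns is relation-valued.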
NoSQL databases like document databases typically do not conform to the relational view. For example, a JSON or XML database might support duplicate records and intrinsic ordering. Such databases can be described as non-relational. But there are also database models which support the relational view but do not embrace first normal form. Such models are called non-first normal form relations (abbreviated NFR, N1NF or NF²).
Example
Consider a table that represents a relation where one of the columns (Transactions) is itself relation-valued. This is a valid relation but does not conform to first normal form, which does not allow nested relations. The table is therefore unnormalized.
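A minimal sketch of such a relation, and of one way to bring it into first normal form, is given here. The data, the column names other than Transactions, and the helper `unnest` are hypothetical and chosen only for illustration; a table is again assumed to be a Python list of dicts.

```python
def unnest(rows, nested_column):
    """Flatten a relation-valued column into one output row per nested row."""
    flat = []
    for row in rows:
        # Copy the outer columns, leaving out the nested table itself.
        outer = {key: value for key, value in row.items() if key != nested_column}
        for inner in row[nested_column]:
            flat.append({**outer, **inner})
    return flat


# Hypothetical unnormalized relation: "transactions" is relation-valued.
customers = [
    {"customer": "Alice", "cust_id": 1, "transactions": [
        {"tr_id": 100, "amount": -50},
        {"tr_id": 101, "amount": 20},
    ]},
    {"customer": "Bob", "cust_id": 2, "transactions": [
        {"tr_id": 102, "amount": -5},
    ]},
]

# Each nested transaction becomes its own row with only atomic values.
flat = unnest(customers, "transactions")
```

Note that in this sketch a customer with an empty Transactions value would produce no rows at all, so a real normalization would usually keep customers and transactions in separate tables linked by a key.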
Modern applications
Today, companies like Google, Amazon and Facebook deal with large amounts of data that are difficult to store efficiently. They use NoSQL databases, which are based on the principles of the unnormalized relational model, to deal with the storage issue.
Some examples of NoSQL databases are MongoDB, Apache Cassandra and Redis. These databases are more scalable and easier to query, as they do not involve expensive operations like JOIN.
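The difference in query shape can be illustrated with plain in-memory Python structures. This is a sketch under stated assumptions, not the API of MongoDB, Cassandra, Redis or any other product; all names and data are hypothetical.

```python
# Normalized layout: two flat tables linked by cust_id.
customers = [
    {"cust_id": 1, "name": "Alice"},
    {"cust_id": 2, "name": "Bob"},
]
transactions = [
    {"tr_id": 100, "cust_id": 1, "amount": -50},
    {"tr_id": 101, "cust_id": 1, "amount": 20},
    {"tr_id": 102, "cust_id": 2, "amount": -5},
]


def transactions_via_join(cust_id):
    """Combine both tables on cust_id, as a JOIN would."""
    return [
        {"name": c["name"], "tr_id": t["tr_id"], "amount": t["amount"]}
        for c in customers if c["cust_id"] == cust_id
        for t in transactions if t["cust_id"] == c["cust_id"]
    ]


# Unnormalized layout: each customer document embeds its transactions.
documents = {
    1: {"name": "Alice", "transactions": [
        {"tr_id": 100, "amount": -50},
        {"tr_id": 101, "amount": 20},
    ]},
    2: {"name": "Bob", "transactions": [
        {"tr_id": 102, "amount": -5},
    ]},
}


def transactions_via_lookup(cust_id):
    """Read one document by key; no second table has to be consulted."""
    doc = documents[cust_id]
    return [{"name": doc["name"], **t} for t in doc["transactions"]]
```

Both functions return the same rows for a given customer; the nested layout answers the question with a single key lookup, while the normalized layout has to match rows across two tables.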
See also
* Denormalization
* Normalization
* First normal form
* Second normal form
* Third normal form
* Boyce–Codd normal form
* NoSQL