A foreign key is a set of attributes in a table that refers to the primary key of another table. The foreign key links these two tables. Another way to put it: in the context of relational databases, a foreign key is a set of attributes subject to a certain kind of inclusion dependency constraint, specifically a constraint that the tuples consisting of the foreign key attributes in one relation, R, must also exist in some other (not necessarily distinct) relation, S, and furthermore that those attributes must also be a candidate key in S.
In simpler words, a foreign key is a set of attributes that ''references'' a candidate key. For example, a table called TEAM may have an attribute, MEMBER_NAME, which is a foreign key referencing a candidate key, PERSON_NAME, in the PERSON table. Since MEMBER_NAME is a foreign key, any value existing as the name of a member in TEAM must also exist as a person's name in the PERSON table; in other words, every member of a TEAM is also a PERSON.
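The TEAM and PERSON example can be tried out with a minimal sketch such as the following. It assumes Python's built-in sqlite3 module and an in-memory SQLite database; the TEAM_NAME column and the sample names are made up for illustration, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

# In-memory database in autocommit mode; SQLite enforces foreign keys
# only after this pragma has been enabled on the connection.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE person (
        person_name TEXT PRIMARY KEY
    );
    CREATE TABLE team (
        team_name   TEXT NOT NULL,   -- hypothetical extra column
        member_name TEXT NOT NULL REFERENCES person(person_name),
        PRIMARY KEY (team_name, member_name)
    );
    INSERT INTO person VALUES ('Alice');
    INSERT INTO team VALUES ('Red', 'Alice');  -- accepted: Alice is a PERSON
""")

try:
    # Bob does not exist in PERSON, so he cannot be a TEAM member.
    conn.execute("INSERT INTO team VALUES ('Red', 'Bob')")
    bob_accepted = True
except sqlite3.IntegrityError:
    bob_accepted = False

members = [row[0] for row in conn.execute("SELECT member_name FROM team")]
print(bob_accepted, members)
```

Only the member who also exists as a person ends up in TEAM; the second insert is refused by the database rather than by application code.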
Summary
The table containing the foreign key is called the child table, and the table containing the candidate key is called the referenced or parent table. In database relational modeling and implementation, a candidate key is a set of zero or more attributes, the values of which are guaranteed to be unique for each tuple (row) in a relation. The value or combination of values of candidate key attributes for any tuple cannot be duplicated for any other tuple in that relation.
Since the purpose of the foreign key is to identify a particular row of the referenced table, it is generally required that the foreign key be equal to the candidate key in some row of the referenced table, or else have no value (the NULL value). This rule is called a referential integrity constraint between the two tables.
Because violations of these constraints can be the source of many database problems, most database management systems provide mechanisms to ensure that every non-null foreign key corresponds to a row of the referenced table.
For example, consider a database with two tables: a CUSTOMER table that includes all customer data and an ORDER table that includes all customer orders. Suppose the business requires that each order must refer to a single customer. To reflect this in the database, a foreign key column is added to the ORDER table (e.g. CUSTOMERID), which references the primary key of CUSTOMER (e.g. ID). Because the primary key of a table must be unique, and because CUSTOMERID only contains values from that primary key field, we may assume that, when it has a value, CUSTOMERID will identify the particular customer who placed the order. However, this can no longer be assumed if the ORDER table is not kept up to date when rows of the CUSTOMER table are deleted or the ID column is altered, and working with these tables may become more difficult. Many real-world databases work around this problem by 'inactivating' referenced rows of the master table rather than physically deleting them, or by complex update programs that modify all references to a key when a change is needed.
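A rough sketch of the CUSTOMER and ORDER scenario, again assuming Python's sqlite3 module with foreign key enforcement enabled. Because ORDER is a reserved word in SQL, the sketch calls the table `orders`; the `order_id` and `name` columns and the sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite

conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customerid INTEGER REFERENCES customer(id)  -- nullable foreign key
    );
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO orders VALUES (100, 1);     -- refers to customer 1
    INSERT INTO orders VALUES (101, NULL);  -- no value: allowed
""")

def attempt(sql):
    """Run one statement and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# An order for a customer that does not exist is rejected.
dangling_insert = attempt("INSERT INTO orders VALUES (102, 999)")
# Deleting a customer that an order still refers to is rejected as well.
orphaning_delete = attempt("DELETE FROM customer WHERE id = 1")

remaining_customers = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(dangling_insert, orphaning_delete, remaining_customers)
```

With the constraint declared, the two operations that would leave an order pointing at a missing customer both fail, so CUSTOMERID can be trusted whenever it has a value.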
Foreign keys play an essential role in database design. One important part of database design is making sure that relationships between real-world entities are reflected in the database by references, using foreign keys to refer from one table to another. Another important part of database design is database normalization, in which tables are broken apart and foreign keys make it possible for them to be reconstructed.
Multiple rows in the referencing (or child) table may refer to the same row in the referenced (or parent) table. In this case, the relationship between the two tables is called a one-to-many relationship between the referenced table and the referencing table.
In addition, the child and parent table may, in fact, be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key. In database management systems, this is often accomplished by linking a first and second reference to the same table.
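A self-referencing foreign key might look like the sketch below, which assumes Python's sqlite3 module and a hypothetical `employee` table whose `manager_id` column points back at `employee.id`. The query then uses two aliases of the same table, one for the employee and one for the manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        manager_id INTEGER REFERENCES employee(id)  -- parent and child are the same table
    );
    INSERT INTO employee VALUES (1, 'Ada', NULL);  -- top of the hierarchy
    INSERT INTO employee VALUES (2, 'Ben', 1);
    INSERT INTO employee VALUES (3, 'Cy', 2);
""")

# Two references to the same table: e is the employee, m is the manager.
rows = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e
    LEFT JOIN employee AS m ON m.id = e.manager_id
    ORDER BY e.id
""").fetchall()

try:
    conn.execute("INSERT INTO employee VALUES (4, 'Di', 42)")  # no employee 42
    unknown_manager_accepted = True
except sqlite3.IntegrityError:
    unknown_manager_accepted = False

print(rows, unknown_manager_accepted)
```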
A table may have multiple foreign keys, and each foreign key can have a different parent table. Each foreign key is enforced independently by the database system. Therefore, cascading relationships between tables can be established using foreign keys.
A foreign key is defined as an attribute or set of attributes in a relation whose values match a primary key in another relation. The syntax to add such a constraint to an existing table is defined in SQL:2003 as an ALTER TABLE ... ADD [CONSTRAINT ...] FOREIGN KEY ... REFERENCES ... statement. Omitting the column list in the REFERENCES clause implies that the foreign key shall reference the primary key of the referenced table.
Likewise, foreign keys can be defined as part of the CREATE TABLE SQL statement.
CREATE TABLE child_table (
  col1 INTEGER PRIMARY KEY,
  col2 CHARACTER VARYING(20),
  col3 INTEGER,
  col4 INTEGER,
  FOREIGN KEY(col3, col4) REFERENCES parent_table(col1, col2) ON DELETE CASCADE
)
If the foreign key is a single column only, the column can be marked as such using the following syntax:
CREATE TABLE child_table (
  col1 INTEGER PRIMARY KEY,
  col2 CHARACTER VARYING(20),
  col3 INTEGER,
  col4 INTEGER REFERENCES parent_table(col1) ON DELETE CASCADE
)
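The table-level form with a two-column foreign key can be exercised with a sketch like the one below. It assumes Python's sqlite3 module; the definition of `parent_table` (two integer columns forming its primary key) is an assumption, since the article does not show it, and `col2` of the child is declared as TEXT for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Assumed parent definition: the referenced columns form its primary key.
    CREATE TABLE parent_table (
        col1 INTEGER,
        col2 INTEGER,
        PRIMARY KEY (col1, col2)
    );
    CREATE TABLE child_table (
        col1 INTEGER PRIMARY KEY,
        col2 TEXT,
        col3 INTEGER,
        col4 INTEGER,
        FOREIGN KEY (col3, col4) REFERENCES parent_table (col1, col2) ON DELETE CASCADE
    );
    INSERT INTO parent_table VALUES (3, 4);
""")

def accepted(sql):
    """Return True if the statement ran without a constraint error."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Both foreign key columns together must match one parent row.
full_match = accepted("INSERT INTO child_table VALUES (1, 'a', 3, 4)")
half_match = accepted("INSERT INTO child_table VALUES (2, 'b', 3, 5)")
print(full_match, half_match)
```

The pair (3, 5) is refused even though 3 occurs in the parent, because the constraint applies to the combination of columns rather than to each column separately.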
Foreign keys can be defined with a stored procedure statement.
sp_foreignkey child_table, parent_table, col3, col4
* child_table: the name of the table or view that contains the foreign key to be defined.
* parent_table: the name of the table or view that has the primary key to which the foreign key applies. The primary key must already be defined.
* col3 and col4: the names of the columns that make up the foreign key. The foreign key must have at least one column and at most eight columns.
Referential actions
Because the database management system enforces referential constraints, it must ensure data integrity if rows in a referenced table are to be deleted (or updated). If dependent rows in referencing tables still exist, those references have to be considered.
SQL:2003 specifies 5 different referential actions that shall take place in such occurrences:
* CASCADE
* RESTRICT
* NO ACTION
* SET NULL
* SET DEFAULT
CASCADE
Whenever rows in the parent (referenced) table are deleted (or updated), the respective rows of the child (referencing) table with a matching foreign key column will be deleted (or updated) as well. This is called a cascade delete (or update).
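A cascade update followed by a cascade delete can be observed with a small sketch, assuming Python's sqlite3 module and hypothetical `parent` and `child` tables with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id)
                  ON DELETE CASCADE ON UPDATE CASCADE
    );
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO child VALUES (10, 1), (11, 1), (12, 2);
""")

# Cascade update: renumbering parent 1 rewrites the matching child rows.
conn.execute("UPDATE parent SET id = 5 WHERE id = 1")
after_update = conn.execute(
    "SELECT id, parent_id FROM child ORDER BY id").fetchall()

# Cascade delete: removing parent 5 removes its child rows too.
conn.execute("DELETE FROM parent WHERE id = 5")
after_delete = conn.execute(
    "SELECT id, parent_id FROM child ORDER BY id").fetchall()

print(after_update)  # [(10, 5), (11, 5), (12, 2)]
print(after_delete)  # [(12, 2)]
```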
RESTRICT
A value cannot be updated or deleted when a row exists in a referencing or child table that references the value in the referenced table.
Similarly, a row cannot be deleted as long as there is a reference to it from a referencing or child table.
To understand RESTRICT (and CASCADE) better, it may be helpful to notice the following difference, which might not be immediately clear. The referential action CASCADE modifies the "behavior" of the (child) table itself where the word CASCADE is used. For example, ON DELETE CASCADE effectively says "When the referenced row is deleted from the other table (master table), then delete ''also from me''". However, the referential action RESTRICT modifies the "behavior" of the master table, ''not'' the child table, although the word RESTRICT appears in the child table and not in the master table! So, ON DELETE RESTRICT effectively says: "When someone tries to delete the row from the other table (master table), prevent deletion ''from that other table'' (and of course, also don't delete from me, but that's not the main point here)."
RESTRICT is not supported by Microsoft SQL Server 2012 and earlier.
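The effect of ON DELETE RESTRICT on the master table can be seen in a sketch like this one, which assumes Python's sqlite3 module (SQLite does support RESTRICT) and hypothetical `parent` and `child` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id) ON DELETE RESTRICT
    );
    INSERT INTO parent VALUES (1);
    INSERT INTO child VALUES (10, 1);
""")

def try_delete_parent():
    """Attempt to delete parent 1; return True if the delete went through."""
    try:
        conn.execute("DELETE FROM parent WHERE id = 1")
        return True
    except sqlite3.IntegrityError:
        return False

# The clause is written on the child table, but it blocks the delete on the parent.
blocked_while_referenced = not try_delete_parent()

# Once no child row refers to the parent any more, the delete is allowed.
conn.execute("DELETE FROM child WHERE id = 10")
allowed_after_cleanup = try_delete_parent()

print(blocked_while_referenced, allowed_after_cleanup)
```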
NO ACTION
NO ACTION and RESTRICT are very much alike. The main difference between NO ACTION and RESTRICT is that with NO ACTION the referential integrity check is done after trying to alter the table. RESTRICT does the check before trying to execute the UPDATE or DELETE statement. Both referential actions act the same if the referential integrity check fails: the UPDATE or DELETE statement will result in an error.
In other words, when an UPDATE or DELETE statement is executed on the referenced table using the referential action NO ACTION, the DBMS verifies at the end of the statement execution that none of the referential relationships are violated. This is different from RESTRICT, which assumes at the outset that the operation will violate the constraint. Using NO ACTION, the triggers or the semantics of the statement itself may yield an end state in which no foreign key relationships are violated by the time the constraint is finally checked, thus allowing the statement to complete successfully.
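The timing difference is easiest to observe when the check is postponed. The sketch below assumes Python's sqlite3 module and uses a deferred constraint, which moves the NO ACTION check to commit time rather than the end of the statement, so it illustrates SQLite's behaviour and not necessarily that of other systems. In SQLite, RESTRICT still fails immediately even when the constraint is deferred. The helper name and the tables are hypothetical.

```python
import sqlite3

def parent_delete_succeeds(action):
    """Delete a parent row and then its child inside one transaction.

    `action` is the ON DELETE referential action to test.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE parent (id INTEGER PRIMARY KEY);
        CREATE TABLE child (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES parent(id)
                      ON DELETE {action}
                      DEFERRABLE INITIALLY DEFERRED
        );
        INSERT INTO parent VALUES (1);
        INSERT INTO child VALUES (10, 1);
    """)
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM parent WHERE id = 1")  # child 10 still refers to it
        conn.execute("DELETE FROM child WHERE id = 10")  # reference removed afterwards
        conn.execute("COMMIT")                           # deferred check happens here
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False
    finally:
        conn.close()

no_action_ok = parent_delete_succeeds("NO ACTION")
restrict_ok = parent_delete_succeeds("RESTRICT")
print(no_action_ok, restrict_ok)
```

With NO ACTION the end state is consistent by the time the check runs, so the transaction commits; with RESTRICT the first DELETE is refused before the second statement gets a chance to repair the reference.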
SET NULL, SET DEFAULT
In general, the action taken by the DBMS for SET NULL or SET DEFAULT is the same for both ON DELETE and ON UPDATE: the value of the affected referencing attributes is changed to NULL for SET NULL, and to the specified default value for SET DEFAULT.
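Both actions can be compared side by side in a sketch such as the following, assuming Python's sqlite3 module. The table names and the default value 0 are illustrative; a parent row with id 0 is inserted so that the default value itself refers to an existing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child_null (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id) ON DELETE SET NULL
    );
    CREATE TABLE child_default (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER DEFAULT 0 REFERENCES parent(id) ON DELETE SET DEFAULT
    );
    INSERT INTO parent VALUES (0), (1);   -- row 0 is the target of the default
    INSERT INTO child_null VALUES (10, 1);
    INSERT INTO child_default VALUES (20, 1);
    DELETE FROM parent WHERE id = 1;      -- triggers both referential actions
""")

null_result = conn.execute(
    "SELECT parent_id FROM child_null WHERE id = 10").fetchone()[0]
default_result = conn.execute(
    "SELECT parent_id FROM child_default WHERE id = 20").fetchone()[0]

print(null_result, default_result)  # None 0
```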
Triggers
Referential actions are generally implemented as implied triggers (i.e. triggers with system-generated names, often hidden). As such, they are subject to the same limitations as user-defined triggers, and their order of execution relative to other triggers may need to be considered; in some cases it may become necessary to replace the referential action with its equivalent user-defined trigger to ensure proper execution order, or to work around mutating-table limitations.
Another important limitation appears with transaction isolation: your changes to a row may not be able to fully cascade because the row is referenced by data your transaction cannot "see", and therefore cannot cascade onto. For example, while your transaction is attempting to renumber a customer account, a simultaneous transaction is attempting to create a new invoice for that same customer. A CASCADE rule may fix all the invoice rows your transaction can see to keep them consistent with the renumbered customer row, but it won't reach into another transaction to fix the data there. Because the database cannot guarantee consistent data when the two transactions commit, one of them will be forced to roll back (often on a first-come-first-served basis).
CREATE TABLE account (acct_num INT, amount DECIMAL(10,2));
CREATE TRIGGER ins_sum BEFORE INSERT ON account
  FOR EACH ROW SET @sum = @sum + NEW.amount;
Example
As a first example to illustrate foreign keys, suppose an accounts database has a table with invoices and each invoice is associated with a particular supplier. Supplier details (such as name and address) are kept in a separate table; each supplier is given a 'supplier number' to identify it. Each invoice record has an attribute containing the supplier number for that invoice. Then, the 'supplier number' is the primary key in the Supplier table. The foreign key in the Invoice table points to that primary key. The relational schema is the following. Primary keys are marked in bold, and foreign keys are marked in italics.
Supplier ('''SupplierNumber''', Name, Address)
Invoice ('''InvoiceNumber''', Text, ''SupplierNumber'')
The corresponding Data Definition Language statements are as follows.
CREATE TABLE Supplier (
  SupplierNumber INTEGER NOT NULL,
  Name           VARCHAR(20) NOT NULL,
  Address        VARCHAR(50) NOT NULL,
  CONSTRAINT supplier_pk PRIMARY KEY(SupplierNumber),
  CONSTRAINT number_value CHECK(SupplierNumber > 0)
);

CREATE TABLE Invoice (
  InvoiceNumber  INTEGER NOT NULL,
  Text           VARCHAR(4096),
  SupplierNumber INTEGER NOT NULL,
  CONSTRAINT invoice_pk PRIMARY KEY(InvoiceNumber),
  CONSTRAINT inumber_value CHECK (InvoiceNumber > 0),
  CONSTRAINT supplier_fk
    FOREIGN KEY(SupplierNumber) REFERENCES Supplier(SupplierNumber)
    ON UPDATE CASCADE ON DELETE RESTRICT
);
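The Supplier and Invoice schema can be exercised with a sketch along these lines, assuming Python's sqlite3 module with foreign key enforcement enabled. The DDL is the one given above; the sample supplier and invoice rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Supplier (
        SupplierNumber INTEGER NOT NULL,
        Name           VARCHAR(20) NOT NULL,
        Address        VARCHAR(50) NOT NULL,
        CONSTRAINT supplier_pk PRIMARY KEY(SupplierNumber),
        CONSTRAINT number_value CHECK(SupplierNumber > 0)
    );
    CREATE TABLE Invoice (
        InvoiceNumber  INTEGER NOT NULL,
        Text           VARCHAR(4096),
        SupplierNumber INTEGER NOT NULL,
        CONSTRAINT invoice_pk PRIMARY KEY(InvoiceNumber),
        CONSTRAINT inumber_value CHECK (InvoiceNumber > 0),
        CONSTRAINT supplier_fk
            FOREIGN KEY(SupplierNumber) REFERENCES Supplier(SupplierNumber)
            ON UPDATE CASCADE ON DELETE RESTRICT
    );
    -- Made-up sample rows.
    INSERT INTO Supplier VALUES (1, 'Acme', '1 Main Street');
    INSERT INTO Invoice VALUES (500, 'Widgets', 1);
""")

# ON UPDATE CASCADE: renumbering the supplier carries over to its invoices.
conn.execute("UPDATE Supplier SET SupplierNumber = 7 WHERE SupplierNumber = 1")
invoice_supplier = conn.execute(
    "SELECT SupplierNumber FROM Invoice WHERE InvoiceNumber = 500").fetchone()[0]

# ON DELETE RESTRICT: a supplier with invoices cannot be deleted.
try:
    conn.execute("DELETE FROM Supplier WHERE SupplierNumber = 7")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False

print(invoice_supplier, delete_allowed)  # 7 False
```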
See also
* Candidate key
* Compound key
* Superkey
* Junction table
External links
* SQL-99 Foreign Keys
* Microsoft SQL 2012 table_constraint (Transact-SQL)