Single Point Of Truth

In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered (or edited) in only one place, providing data normalization to a canonical form (for example, in database normalization or content transclusion).

There are several scenarios with respect to copies and updates:

* The master data is never copied and instead only references to it are made; this means that all reads and updates go directly to the SSOT.
* The master data is copied but the copies are only read and only the master data is updated; if requests to read data are only made on copies, this is an instance of command query responsibility segregation (CQRS).
* The master data is copied and the copies are updated; this needs a reconciliation mechanism when there are concurrent updates.
** Updates on copies can be thrown out whenever a concurrent update is made on the master, so they are not considered fully committed until propagated to the master. (Many blockchains work that way.)
** Concurrent updates are merged. (If an automatic merge fails, it could fall back on another strategy, which could be the previous strategy or something else like manual intervention, which is what most source version control systems do.)

The advantages of SSOT architectures include easier prevention of mistaken inconsistencies (such as a duplicate value/copy somewhere being forgotten), and greatly simplified version control. Without an SSOT, dealing with inconsistencies implies either complex and error-prone consensus algorithms, or using a simpler architecture that is liable to lose data in the face of inconsistency (the latter may seem unacceptable but it is sometimes a very good choice; it is how most blockchains operate: a transaction is actually final only if it was included in the next block that is mined).

Ideally, SSOT systems provide data that are authentic (and authenticatable), relevant, and referable.

Deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements (a direct consequence of intentional or unintentional denormalization of any explicit data model) pose a risk for retrieval of outdated, and therefore incorrect, information. Common examples (i.e., example classes of implementation) are as follows:

* In electronic health records (EHRs), it is imperative to accurately validate patient identity against a single referential repository, which serves as the SSOT. Duplicate representations of data within the enterprise would be implemented by the use of pointers rather than duplicate database tables, rows, or cells. This ensures that data updates to elements in the authoritative location are comprehensively distributed to all federated database constituencies in the larger overall enterprise architecture. EHRs are an excellent class for exemplifying how SSOT architecture is both acutely necessary and challenging to achieve: it is challenging because inter-organization health information exchange is inherently a cybersecurity competence hurdle, and nonetheless it is necessary, to prevent medical errors, to prevent the wasted costs of inefficiency (such as duplicated work or rework), and to make the primary care and medical home concepts feasible (to achieve competent care transitions).
* Single-source publishing as a general principle or ideal in content management relies on having SSOTs, via transclusion or (otherwise, at least) substitution. Substitution happens via libraries of objects that can be propagated as static copies which are later refreshed when necessary (that is, when refreshing of the copy-paste or import is triggered by a larger updating event). Component content management systems are a class of content management systems that aim to provide competence on this level.
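
The difference between transclusion and substitution in single-source publishing can be illustrated with a minimal sketch. The `{{name}}` placeholder syntax, the `components` library, and the `render` function are assumptions made for illustration and do not describe any particular content management system.

```python
import re

# Hypothetical component library: each fragment is mastered in exactly one place.
components = {"warning": "Disconnect power before servicing."}

# A document refers to the fragment by name instead of repeating its text.
manual = "Step 1. {{warning}} Step 2. Open the cover."


def render(template: str, library: dict[str, str]) -> str:
    """Resolve every {{name}} placeholder against the library."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: library[m.group(1)], template)


# Substitution: a static copy is produced once and stays as it is until refreshed.
stale_copy = render(manual, components)

# The master fragment is edited in its single place.
components["warning"] = "Disconnect power and wait 30 seconds before servicing."

# Transclusion: resolving at display time always reflects the master.
live_view = render(manual, components)

# Refreshing the static copy simply re-runs the substitution.
refreshed_copy = render(manual, components)

print(stale_copy)
print(live_view)
```

In this sketch the static copy keeps the old warning until a refresh is triggered, whereas the view resolved at display time picks up the edit immediately.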


Implementation
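
One of the copy-and-update scenarios described in the introduction, in which an update made on a copy is thrown out when the master has changed in the meantime, can be sketched as follows. The `Master` and `Versioned` classes and the version-counter check are illustrative assumptions, not a prescribed mechanism.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    version: int = 0


class Master:
    """Holds the single authoritative value (hypothetical class)."""

    def __init__(self, value: str) -> None:
        self._record = Versioned(value)

    def read_copy(self) -> Versioned:
        # Hand out a copy, never the master record itself.
        return Versioned(self._record.value, self._record.version)

    def update(self, new_value: str) -> None:
        # A direct update on the master always succeeds and bumps the version.
        self._record = Versioned(new_value, self._record.version + 1)

    def propagate(self, copy: Versioned, new_value: str) -> bool:
        # An update made on a copy is committed only if the master has not
        # changed since the copy was taken; otherwise it is thrown out.
        if copy.version != self._record.version:
            return False
        self.update(new_value)
        return True


master = Master("10 Main St")
copy_a = master.read_copy()
copy_b = master.read_copy()

accepted_a = master.propagate(copy_a, "12 Main St")  # master unchanged: accepted
accepted_b = master.propagate(copy_b, "14 Main St")  # copy is now stale: rejected
```

A merge-based strategy would replace the early `return False` with an attempt to combine the two values, falling back to rejection or manual intervention when the merge fails.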


Ontologic interactions

An acknowledged prerequisite for any given single source of truth to exist is the ontologic condition that no more than a single truth (about any particular fact or idea) exists, an assertion that is ontologic in both the IT sense and the general sense of that word. In many instances, this presents no problem (for example, within particular namespaces, or even across them, as long as naming collisions or broader name conflicts are adequately handled). The broadest contexts (and thus thorniest, regarding ontologic discrepancies) require adequate epistemic regime comparison and reconciliation (or at least negotiation or transactional exchanges). An archetypal example of this class of reconciliation is that two theological seminary libraries, from two different religions (X and Y), could exchange information with an SSOT architecture, but the unification of truth would reside on the level of the statement that "religion X asserts that God is purple whereas religion Y asserts that God is green", rather than on the level of "God is purple" or "God is green".
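
The seminary example can be modelled, as a rough sketch, by storing attributed assertions rather than bare facts. The `Assertion` type, the `store` dictionary, and the helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    source: str   # who makes the assertion
    subject: str  # what the assertion is about
    claim: str    # what is asserted


# The shared store holds attributed statements. Each (source, subject) pair has
# exactly one entry, so every statement is still mastered in a single place.
store: dict[tuple[str, str], Assertion] = {}


def record(assertion: Assertion) -> None:
    store[(assertion.source, assertion.subject)] = assertion


def claims_about(subject: str) -> list[str]:
    return sorted(
        f"{a.source} asserts: {a.claim}"
        for a in store.values()
        if a.subject == subject
    )


record(Assertion("religion X", "colour of God", "purple"))
record(Assertion("religion Y", "colour of God", "green"))

print(claims_about("colour of God"))
```

The store never holds a bare "God is purple" entry; the single truth it maintains is the statement about who asserts what.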


Architectures or architectural features

An ideal implementation of SSOT is rarely possible in most enterprises. This is because many organisations have multiple information systems, each of which needs access to data relating to the same entities (e.g., customer). Often these systems are purchased as commercial off-the-shelf products from vendors and cannot be modified in trivial ways. Each of these systems therefore needs to store its own copy of common data or entities (hence immediately violating the SSOT approach defined above). For example, an enterprise resource planning (ERP) system (such as SAP or Oracle e-Business Suite) may store a customer record; the customer relationship management (CRM) system also needs a copy of the customer record (or part of it) and the warehouse dispatch system might also need a copy of some or all of the customer data (e.g., shipping address). In cases where vendors do not support such modifications, it is not always possible to replace these records with pointers to the SSOT.

For organisations (with more than one information system) wishing to implement a single source of truth (without modifying all but one master system to store pointers to other systems for all entities), some supporting architectures are:

* Master data management (MDM)
* Event store and event sourcing (ES)
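
The contrast between a system that keeps its own copy of a customer record and one that stores only a pointer to the master record can be sketched as follows. The dictionaries stand in for separate systems, and all names and values are made up for illustration.

```python
# Hypothetical in-memory stand-ins for separate enterprise systems.
customer_master = {"C-1": {"name": "Ada", "shipping_address": "10 Main St"}}

# Copy-based integration: the dispatch system keeps its own copy of the record.
dispatch_copy = {"C-1": dict(customer_master["C-1"])}

# Pointer-based integration: the CRM stores only the key and looks the record up.
crm_orders = [{"order": "O-7", "customer_id": "C-1"}]


def crm_shipping_address(order: dict) -> str:
    """Follow the pointer to the master record at read time."""
    return customer_master[order["customer_id"]]["shipping_address"]


# The customer moves; only the master record is edited.
customer_master["C-1"]["shipping_address"] = "22 Oak Ave"

address_via_pointer = crm_shipping_address(crm_orders[0])
address_via_copy = dispatch_copy["C-1"]["shipping_address"]

print(address_via_pointer)
print(address_via_copy)
```

The pointer-based lookup reflects the edit at once, while the copied record stays outdated until some synchronisation step refreshes it.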


Master data management (MDM)

An MDM system can act as the source of truth for any given entity that might not necessarily have an alternative "source of truth" in another system. Typically the MDM acts as a hub for multiple systems, many of which could allow (be the source of truth for) updates to different aspects of information on a given entity. For example, the CRM system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may (for example) also update their address via a customer service web site, with a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative (the golden record) and then syndicates this updated data to all subscribing systems. The MDM application normally requires an enterprise service bus (ESB) to syndicate its data to multiple subscribing systems.
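
The broker role described above can be sketched in a few lines. The per-attribute `AUTHORITY` table is an assumed policy (real MDM products use richer survivorship rules), and the in-process `subscribers` list stands in for the service bus.

```python
from typing import Callable

# Assumed policy: which source system is authoritative for which attribute.
AUTHORITY = {"phone": "crm", "address": "web"}

golden_record: dict[str, str] = {}
subscribers: list[Callable[[dict[str, str]], None]] = []


def receive_update(source: str, attribute: str, value: str) -> bool:
    """Accept the update only from the authoritative source, then syndicate."""
    if AUTHORITY.get(attribute) != source:
        return False
    golden_record[attribute] = value
    for notify in subscribers:
        notify(dict(golden_record))  # each subscriber receives its own snapshot
    return True


# A subscribing system, modelled here as a list that records what it was sent.
warehouse_inbox: list[dict[str, str]] = []
subscribers.append(warehouse_inbox.append)

receive_update("crm", "phone", "555-0100")
receive_update("web", "address", "22 Oak Ave")
rejected = receive_update("crm", "address", "10 Main St")  # CRM is not authoritative here
```

Only accepted updates reach the subscribers, so every subscribing system sees the same golden record.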


Event store and event sourcing (ES)

In event-oriented architectures, it has become increasingly common to find an implementation of the Event Sourcing pattern, which stores the system state as an ordered sequence of state changes. This requires an event store, a particular type of database designed to hold all the events that change the state of the system. The event store in an Event Sourcing + Command Query Responsibility Segregation + Domain-Driven Design + Messaging architecture is in fact a "single source of truth", with the additional advantage that it can also act as an enterprise service bus, because other components can listen directly to the event store for state changes as everything passes through it. In addition, by saving all the events, it also plays the role of a data warehouse. One last advantage is that through this system the Shared Database pattern can be implemented, another technique, not otherwise covered here, for obtaining a single source of truth.
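
As a minimal sketch of the pattern, the state below is never stored directly: it is derived by replaying an append-only list of events, and a listener keeps a separate read model up to date. The account domain, the event names, and the in-memory list used as the event store are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


event_store: list[Event] = []                  # append-only, ordered log
listeners: list[Callable[[Event], None]] = []  # components watching the store


def append(event: Event) -> None:
    """Record the event, then let every listener see it pass by."""
    event_store.append(event)
    for listener in listeners:
        listener(event)


def apply(balance: int, event: Event) -> int:
    """Apply one state change to the current state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    return balance


def replay(events: list[Event]) -> int:
    """Rebuild the state from scratch by folding over the events in order."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance


# A read model kept current by listening to the store (the query side in CQRS).
read_model = {"balance": 0}


def update_read_model(event: Event) -> None:
    read_model["balance"] = apply(read_model["balance"], event)


listeners.append(update_read_model)

append(Event("deposited", 100))
append(Event("withdrawn", 30))
append(Event("deposited", 5))

print(replay(event_store), read_model["balance"])
```

Because every change is kept, any earlier state can be recovered by replaying a prefix of the log, for example `replay(event_store[:2])`.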


Data warehouse (DW)

While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that such data has been combined (according to business logic embedded in the data transformation and integration processes) means that the data warehouse is often used as a ''de facto'' SSOT. Generally, however, the data available from the data warehouse are not used to update other systems; rather the DW becomes the "single source of truth" for reporting to multiple stakeholders. In this context, the data warehouse is more correctly referred to as a "single version of the truth" since other versions of the truth exist in its operational data sources (no data originates in the DW; it is simply a reporting mechanism for data loaded from operational systems).


See also

* Blockchain, distributed data store for digital transactions
* Circular reporting, problem where a source gets information from somewhere that in turn uses that source as a reference
* Database normalization, technique for designing tables in relational databases such that duplication of information is minimised
* Don't repeat yourself
* Single version of the truth, ideal where all the data of an organisation is stored in a consistent and non-redundant form
* Single version of facts, concept in data vault
* System of record, the authoritative data source for a given data element


Source code

In software design, the same schema, business logic and other components are often repeated in multiple different contexts, while each version refers to itself as "Source Code". To address this problem, the concepts of SSOT can also be applied to software development principles using processes like recursive transcompiling to iteratively turn a single source of truth into many different kinds of source code, which will match each other structurally because they are all derived from the same SSOT (see "Why Google stores billions of lines of code in a single repository").
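
The idea of deriving several kinds of source code from one definition can be sketched with a plain code generator. This is not an implementation of recursive transcompiling; the `SCHEMA` layout, the type mappings, and the two generator functions are assumptions made for illustration.

```python
# Hypothetical single source of truth for one entity's schema.
SCHEMA = {"name": "customer", "fields": [("id", "int"), ("email", "str")]}

# Assumed mappings from the schema's abstract types to each target language.
SQL_TYPES = {"int": "INTEGER", "str": "TEXT"}
TS_TYPES = {"int": "number", "str": "string"}


def to_sql(schema: dict) -> str:
    """Generate a CREATE TABLE statement from the schema."""
    columns = ", ".join(f"{name} {SQL_TYPES[kind]}" for name, kind in schema["fields"])
    return f"CREATE TABLE {schema['name']} ({columns});"


def to_typescript(schema: dict) -> str:
    """Generate a TypeScript interface from the same schema."""
    props = " ".join(f"{name}: {TS_TYPES[kind]};" for name, kind in schema["fields"])
    return f"interface {schema['name'].capitalize()} {{ {props} }}"


print(to_sql(SCHEMA))
print(to_typescript(SCHEMA))
```

Adding a field to `SCHEMA` changes both generated outputs together, which is what keeps them structurally aligned.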

