HOME

TheInfoList



OR:

In
graph theory In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
and
theoretical computer science Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory. It is difficult to circumsc ...
, a maximum common induced subgraph of two graphs ''G'' and ''H'' is a graph that is an
induced subgraph In the mathematical field of graph theory, an induced subgraph of a graph is another graph, formed from a subset of the vertices of the graph and ''all'' of the edges (from the original graph) connecting pairs of vertices in that subset. Definit ...
of both ''G'' and ''H'', and that has as many vertices as possible. Finding this graph is
NP-hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
. In the associated
decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whethe ...
, the input is two graphs ''G'' and ''H'' and a number ''k''. The problem is to decide whether ''G'' and ''H'' have a common induced subgraph with at least ''k'' vertices. This problem is
NP-complete In computational complexity theory, a problem is NP-complete when: # it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryin ...
. It is a generalization of the induced
subgraph isomorphism problem In theoretical computer science, the subgraph isomorphism problem is a computational task in which two graphs ''G'' and ''H'' are given as input, and one must determine whether ''G'' contains a subgraph that is isomorphic to ''H''. Subgraph isomor ...
, which arises when ''k'' equals the number of vertices in the smaller of ''G'' and ''H'', so that this entire graph must appear as an induced subgraph of the other graph. Based on
hardness of approximation In computer science, hardness of approximation is a field that studies the algorithmic complexity of finding near-optimal solutions to optimization problems. Scope Hardness of approximation complements the study of approximation algorithms by pro ...
results for the
maximum independent set In graph theory, an independent set, stable set, coclique or anticlique is a set of vertices in a graph, no two of which are adjacent. That is, it is a set S of vertices such that for every two vertices in S, there is no edge connecting the two ...
problem, the maximum common induced subgraph problem is also hard to approximate. This implies that, unless
P = NP The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved. The informal term ''quickly'', used above ...
, there is no
approximation algorithm In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
that, in
polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
on n-vertex graphs, always finds a solution within a factor of n^ of optimal, for any \epsilon > 0. One possible solution for this problem is to build a modular product graph of ''G'' and ''H''. In this graph, the largest
clique A clique ( AusE, CanE, or ), in the social sciences, is a group of individuals who interact with one another and share similar interests. Interacting with cliques is part of normative social development regardless of gender, ethnicity, or popula ...
corresponds to a maximum common induced subgraph of ''G'' and ''H''. Therefore, algorithms for finding maximum cliques can be used to find the maximum common induced subgraph. Moreover, a modified maximum-clique algorithm can be used to find a maximum common ''connected'' subgraph. The McSplit algorithm (along with its McSplit↓ variant) is a forward checking algorithm that does not use the clique encoding, but uses a compact data structure to keep track of the vertices in graph ''H'' to which each vertex in graph ''G'' may be mapped. Both versions of the McSplit algorithm outperform the clique encoding for many graph classes.A more efficient implementation of McSplit is McSplitDAL+PR, which combines a
Reinforcement Learning Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine ...
agent with some heuristic scores computed with the
PageRank PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results. It is named after both the term "web page" and co-founder Larry Page. PageRank is a way of measuring the importance of website pages. Accordi ...
algorithm. Maximum common induced subgraph algorithms have a long tradition in
cheminformatics Cheminformatics (also known as chemoinformatics) refers to use of physical chemistry theory with computer and information science techniques—so called "''in silico''" techniques—in application to a range of descriptive and prescriptive proble ...
and pharmacophore mapping..


See also

*
Molecule mining This page describes mining for molecules. Since molecules may be represented by molecular graphs this is strongly related to graph mining and structured data mining. The main problem is how to represent molecules while discriminating the data i ...
*
Maximum common edge subgraph Given two graphs G and G', the maximum common edge subgraph problem is the problem of finding a graph H with as many edges as possible which is isomorphic to both a subgraph of G and a subgraph of G'. The maximum common edge subgraph problem on ...


References

{{reflist NP-complete problems Cheminformatics Computational problems in graph theory