Graph Kernel
   HOME

TheInfoList



OR:

In
structure mining Structure mining or structured data mining is the process of finding and extracting useful information from semi-structured data sets. Graph mining, sequential pattern mining and molecule mining are special cases of structured data mining. Des ...
, a graph kernel is a
kernel function In operator theory, a branch of mathematics, a positive-definite kernel is a generalization of a positive-definite function or a positive-definite matrix. It was first introduced by James Mercer in the early 20th century, in the context of solvi ...
that computes an
inner product In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, ofte ...
on
graphs Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties * Graph (topology), a topological space resembling a graph in the sense of discre ...
. Graph kernels can be intuitively understood as functions measuring the similarity of pairs of graphs. They allow kernelized learning algorithms such as
support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborato ...
s to work directly on graphs, without having to do
feature extraction Feature may refer to: Computing * Feature recognition, could be a hole, pocket, or notch * Feature (computer vision), could be an edge, corner or blob * Feature (machine learning), in statistics: individual measurable properties of the phenome ...
to transform them to fixed-length, real-valued
feature vector In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a data set. Choosing informative, discriminating, and independent features is crucial to produce effective algorithms for pattern re ...
s. They find applications in
bioinformatics Bioinformatics () is an interdisciplinary field of science that develops methods and Bioinformatics software, software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, ...
, in
chemoinformatics Cheminformatics (also known as chemoinformatics) refers to the use of physical chemistry theory with computer and information science techniques—so called "''in silico''" techniques—in application to a range of descriptive and prescriptive pr ...
(as a type of molecule kernels), and in
social network analysis Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of ''nodes'' (individual actors, people, or things within the network) ...
. Concepts of graph kernels have been around since the 1999, when D. Haussler introduced convolutional kernels on discrete structures. The term graph kernels was more officially coined in 2002 by R. I. Kondor and J. Lafferty as kernels ''on'' graphs, i.e. similarity functions between the nodes of a single graph, with the
World Wide Web The World Wide Web (WWW or simply the Web) is an information system that enables Content (media), content sharing over the Internet through user-friendly ways meant to appeal to users beyond Information technology, IT specialists and hobbyis ...
hyperlink In computing, a hyperlink, or simply a link, is a digital reference providing direct access to Data (computing), data by a user (computing), user's point and click, clicking or touchscreen, tapping. A hyperlink points to a whole document or to ...
graph as a suggested application. In 2003, Gärtner ''et al.'' and Kashima ''et al.'' defined kernels ''between'' graphs. In 2010, Vishwanathan ''et al.'' gave their unified framework. In 2018, Ghosh et al. described the history of graph kernels and their evolution over two decades.


Applications

The marginalized graph kernel has been shown to allow accurate predictions of the atomization energy of small organic molecules.


Example Kernels

An example of a kernel between graphs is the random walk kernel, which conceptually performs
random walk In mathematics, a random walk, sometimes known as a drunkard's walk, is a stochastic process that describes a path that consists of a succession of random steps on some Space (mathematics), mathematical space. An elementary example of a rand ...
s on two graphs simultaneously, then counts the number of
path A path is a route for physical travel – see Trail. Path or PATH may also refer to: Physical paths of different types * Bicycle path * Bridle path, used by people on horseback * Course (navigation), the intended path of a vehicle * Desir ...
s that were produced by ''both'' walks. This is equivalent to doing random walks on the
direct product In mathematics, a direct product of objects already known can often be defined by giving a new one. That induces a structure on the Cartesian product of the underlying sets from that of the contributing objects. The categorical product is an abs ...
of the pair of graphs, and from this, a kernel can be derived that can be efficiently computed. Another examples is the Weisfeiler-Leman graph kernelShervashidze, Nino, et al. "Weisfeiler-lehman graph kernels." Journal of Machine Learning Research 12.9 (2011). which computes multiple rounds of the Weisfeiler-Leman algorithm and then computes the similarity of two graphs as the inner product of the histogram vectors of both graphs. In those histogram vectors the kernel collects the number of times a color occurs in the graph in every iteration. Note that the Weisfeiler-Leman kernel in theory has an infinite dimension as the number of possible colors assigned by the Weisfeiler-Leman algorithm is infinite. By restricting to the colors that occur in both graphs, the computation is still feasible.


See also

* Tree kernel, as special case of non-cyclic graphs *
Molecule mining Molecule mining is the process of data mining, or extracting and discovering patterns, as applied to molecules. Since molecules may be represented by molecular graphs, this is strongly related to graph mining and structured data mining. The m ...
, as special case of small multi-label graphs


References

Graph algorithms Kernel methods for machine learning {{comp-sci-stub