
A growing self-organizing map (GSOM) is a growing variant of a self-organizing map (SOM). The GSOM was developed to address the problem of identifying a suitable map size in the SOM. It starts with a minimal number of nodes (usually four) and grows new nodes on the boundary based on a heuristic. Through a parameter called the spread factor (SF), the data analyst can control the growth of the GSOM. All the starting nodes of the GSOM are boundary nodes, i.e. each node has the freedom to grow in its own direction at the beginning (Fig. 1). New nodes are grown from the boundary nodes. Once a node is selected for growing, new nodes are grown at all of its free neighboring positions. The figure shows the three possible node growth options for a rectangular GSOM.
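The boundary-growth rule described above can be illustrated with a minimal sketch. It assumes a rectangular lattice with four neighbors per node and represents the map as a set of integer grid positions; the names `free_neighbors`, `is_boundary` and `grow` are hypothetical helpers chosen for this illustration, not part of any published GSOM implementation.

```python
from typing import Set, Tuple

Position = Tuple[int, int]

# Assumption: a rectangular lattice, so each node has four neighbor slots.
NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def free_neighbors(node: Position, occupied: Set[Position]) -> Set[Position]:
    """Return the lattice positions next to `node` that hold no node yet."""
    x, y = node
    candidates = {(x + dx, y + dy) for dx, dy in NEIGHBOR_OFFSETS}
    return candidates - occupied


def is_boundary(node: Position, occupied: Set[Position]) -> bool:
    """A node is a boundary node while at least one neighbor slot is free."""
    return bool(free_neighbors(node, occupied))


def grow(node: Position, occupied: Set[Position]) -> Set[Position]:
    """Fill every free neighbor slot of `node`; return the new positions."""
    new_nodes = free_neighbors(node, occupied)
    occupied |= new_nodes
    return new_nodes


# The usual 2 x 2 starting map: every node is a boundary node.
grid = {(0, 0), (0, 1), (1, 0), (1, 1)}
all_boundary_at_start = all(is_boundary(n, grid) for n in grid)

# Growing from the corner node (0, 0) fills its two free slots.
added = grow((0, 0), grid)
```

After this call the map has six nodes, and `(0, 0)` is no longer a boundary node because all four of its neighbor slots are occupied.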


The algorithm

The GSOM process is as follows:

1. Initialization phase:
   1. Initialize the weight vectors of the starting nodes (usually four) with random numbers between 0 and 1.
   2. Calculate the growth threshold (GT) for the given data set of dimension $D$ according to the spread factor (SF) using the formula $GT = -D \times \ln(SF)$.
2. Growing phase:
   1. Present input to the network.
   2. Determine the weight vector that is closest to the input vector mapped to the current feature map (the winner), using Euclidean distance (as in the SOM). This step can be summarized as: find $q'$ such that $\left\vert v - w_{q'} \right\vert \le \left\vert v - w_q \right\vert \ \forall q \in \mathbb{N}$, where $v$ and $w$ are the input and weight vectors respectively, $q$ is the position vector for nodes and $\mathbb{N}$ is the set of natural numbers.
   3. Apply the weight vector adaptation only to the winner and its neighborhood. The neighborhood is a set of neurons around the winner, but in the GSOM the starting neighborhood selected for weight adaptation is smaller than in the SOM (localized weight adaptation). The amount of adaptation (learning rate) is also reduced exponentially over the iterations. Even within the neighborhood, weights that are closer to the winner are adapted more than those further away. The weight adaptation can be described by

      $$w_j(k+1) = \begin{cases} w_j(k) & \text{if } j \notin N_{k+1} \\ w_j(k) + LR(k) \times (x_k - w_j(k)) & \text{if } j \in N_{k+1} \end{cases}$$

      where the learning rate $LR(k)$, $k \in \mathbb{N}$, is a sequence of positive parameters converging to zero as $k \to \infty$; $w_j(k)$ and $w_j(k+1)$ are the weight vectors of node $j$ before and after the adaptation; $x_k$ is the input vector presented at iteration $k$; and $N_{k+1}$ is the neighborhood of the winning neuron at the $(k+1)$th iteration. The decreasing value of $LR(k)$ in the GSOM depends on the number of nodes existing in the map at time $k$.
   4. Increase the error value of the winner (the error value is the difference between the input vector and the weight vectors).
   5. When $TE_i > GT$ (where $TE_i$ is the total error of node $i$ and GT is the growth threshold): grow nodes if $i$ is a boundary node; distribute weights to neighbors if $i$ is a non-boundary node.
   6. Initialize the new node weight vectors to match the neighboring node weights.
   7. Initialize the learning rate (LR) to its starting value.
   8. Repeat steps 2–7 until all inputs have been presented and node growth is reduced to a minimum level.
3. Smoothing phase:
   1. Reduce the learning rate and fix a small starting neighborhood.
   2. Find the winner and adapt the weights of the winner and its neighbors in the same way as in the growing phase.
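The core of the growing phase (growth threshold, winner search, weight adaptation and error accumulation) can be sketched as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the map is a dictionary from grid positions to weight vectors, the neighborhood is taken as all nodes within a Manhattan lattice distance `radius` of the winner, every node in the neighborhood is adapted with the same learning rate (as in the piecewise formula, without distance weighting), and the winner's error is accumulated as the Euclidean distance to the input. All function and parameter names are hypothetical. Learning-rate decay, node growth and the distribution step for non-boundary nodes are left out.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[int, int]
Vector = List[float]


def growth_threshold(dimension: int, spread_factor: float) -> float:
    """GT = -D * ln(SF). Assumption: 0 < SF <= 1, so that GT >= 0."""
    if not 0.0 < spread_factor <= 1.0:
        raise ValueError("spread_factor is assumed to lie in (0, 1]")
    return -dimension * math.log(spread_factor)


def find_winner(weights: Dict[Position, Vector], x: Vector) -> Position:
    """Position of the node whose weight vector is closest to x (Euclidean)."""
    return min(weights, key=lambda q: math.dist(weights[q], x))


def adapt(weights: Dict[Position, Vector], winner: Position, x: Vector,
          learning_rate: float, radius: int = 1) -> None:
    """Move the winner and its lattice neighbors towards x, in place.

    Nodes outside the neighborhood keep their weights unchanged.
    """
    for q, w in weights.items():
        lattice_distance = abs(q[0] - winner[0]) + abs(q[1] - winner[1])
        if lattice_distance <= radius:
            weights[q] = [wi + learning_rate * (xi - wi)
                          for wi, xi in zip(w, x)]


def present(weights: Dict[Position, Vector], errors: Dict[Position, float],
            x: Vector, learning_rate: float, gt: float) -> bool:
    """Present one input; return True if the winner's total error exceeds GT."""
    winner = find_winner(weights, x)
    # Accumulate the error before the weights are moved towards x.
    errors[winner] += math.dist(weights[winner], x)
    adapt(weights, winner, x, learning_rate)
    return errors[winner] > gt


# Four starting nodes on a 2 x 2 lattice with two-dimensional weights.
weights = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
           (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
errors = {q: 0.0 for q in weights}
gt = growth_threshold(dimension=2, spread_factor=0.5)
should_grow = present(weights, errors, [0.9, 0.8], learning_rate=0.5, gt=gt)
```

In this example the winner is node `(1, 1)`, whose accumulated error after one input is still below the threshold, so no growth is triggered. A smaller spread factor gives a larger GT and therefore a map that grows less readily.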


Applications

The GSOM can be used for many preprocessing tasks in data mining, for nonlinear dimensionality reduction, for approximation of principal curves and manifolds, and for clustering and classification. It often gives a better representation of the data geometry than the SOM (see the classical benchmark for principal curves in the figure).


Bibliography

* Hsu, A., Tang, S. and Halgamuge, S. K. (2003) An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data, Bioinformatics, 19 (16), pp 2131–2140. doi:10.1093/bioinformatics/btg296. PMID 14594719.
* Alahakoon, D., Halgamuge, S. K. and Srinivasan, B. (2000) Dynamic Self Organizing Maps With Controlled Growth for Knowledge Discovery, IEEE Transactions on Neural Networks, Special Issue on Knowledge Discovery and Data Mining, 11, pp 601–614.
* Alahakoon, D., Halgamuge, S. K. and Srinivasan, B. (1998) A Structure Adapting Feature Map for Optimal Cluster Representation, in Proceedings of the 5th International Conference on Neural Information Processing (ICONIP 98), Kitakyushu, Japan, pp 809–812.
* Alahakoon, D., Halgamuge, S. K. and Srinivasan, B. (1998) A Self Growing Cluster Development Approach to Data Mining, in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, San Diego, USA, pp 2901–2906.
* Alahakoon, D. and Halgamuge, S. K. (1998) Knowledge Discovery with Supervised and Unsupervised Self Evolving Neural Networks, in Proceedings of 5th International Conference on Soft Computing and Information/Intelligent Systems, Fukuoka, Japan, pp 907–910.


See also

* Self-organizing map
* Time Adaptive Self-Organizing Map
* Elastic map
* Artificial intelligence
* Machine learning
* Data mining
* Nonlinear dimensionality reduction