Mode Collapse

In machine learning, mode collapse is a failure mode observed in generative models, originally noted in generative adversarial networks (GANs). It occurs when the model produces outputs that are less diverse than expected, effectively "collapsing" to generate only a few modes of the data distribution while ignoring others. This phenomenon undermines the goal of generative models, which is to capture the full diversity of the training data. A model can typically collapse at either of two stages: during training or during post-training finetuning. Mode collapse reduces the utility of generative models in applications such as:
* image synthesis (repetitive or near-identical images);
* data augmentation (limited diversity in synthetic data);
* scientific simulations (failure to explore all plausible scenarios).
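
As a toy illustration (a hypothetical sketch, not part of the original description), the following Python snippet samples from a generator that has collapsed onto two modes of an eight-mode Gaussian mixture and counts how many modes its samples cover; a well-trained generator would cover all eight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target distribution: eight Gaussian modes arranged on a ring (a common toy benchmark).
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
mode_centers = 2.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)

def sample_collapsed_generator(n):
    # Hypothetical collapsed generator: it only ever emits samples near 2 of the 8 modes.
    centers = mode_centers[rng.integers(0, 2, size=n)]
    return centers + 0.05 * rng.standard_normal((n, 2))

def modes_covered(samples, centers, radius=0.3):
    # A mode counts as covered if at least one sample falls within `radius` of its center.
    dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=-1)
    return int((dists.min(axis=0) < radius).sum())

samples = sample_collapsed_generator(1000)
print(f"modes covered: {modes_covered(samples, mode_centers)} / 8")  # covers only 2 of 8 modes
```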


Distinctions

Mode collapse is distinct from overfitting, where a model learns detailed patterns in the training data that do not generalize to the test data, and from underfitting, where it fails to learn such patterns. Memorization, where a model learns to reproduce data from the training set, is often confused with mode collapse. However, a model can memorize the training dataset without mode collapse; indeed, if a model is severely mode-collapsed, then it has failed to memorize large parts of the training dataset. Model collapse is one particular mechanism for mode collapse: a generative model 2 is pretrained mainly on the outputs of model 1, a further generative model 3 is pretrained mainly on the outputs of model 2, and so on. When models are trained in this way, each model is typically more mode-collapsed than the previous one. However, there are other mechanisms for mode collapse.
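
To make the distinction concrete, the hypothetical sketch below probes memorization and mode collapse with two separate checks on a toy dataset: the average distance from each generated sample to its nearest training example (near zero suggests memorization) and the number of distinct outputs produced (very few suggests collapse). A generator that copies many different training points memorizes without collapsing; one that emits a single point is collapsed regardless.

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.standard_normal((1000, 2))              # toy training set

memorizing_gen = train[rng.integers(0, 1000, 500)]  # copies many different training points
collapsed_gen = np.tile(train[0], (500, 1))         # emits essentially one point

def nearest_train_distance(gen, train):
    # Average distance to the closest training example; near zero suggests memorization.
    d = np.linalg.norm(gen[:, None, :] - train[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def distinct_outputs(gen, decimals=6):
    # Number of distinct generated points; very few suggests mode collapse.
    return len(np.unique(np.round(gen, decimals), axis=0))

for name, gen in [("memorizing", memorizing_gen), ("collapsed", collapsed_gen)]:
    print(name, nearest_train_distance(gen, train), distinct_outputs(gen))
```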


In GANs

Training-time mode collapse was originally noted and studied in GANs, where it arises primarily from imbalances in the training dynamics between the generator and the discriminator. In the original GAN paper, it was also called the "Helvetica scenario". Common causes include:
* If the discriminator learns too slowly, the generator may exploit its weaknesses by producing a narrow set of outputs that consistently fool it.
* Traditional GAN loss functions (e.g., those based on the Jensen–Shannon divergence) may penalize same-looking outputs too weakly.
* The adversarial training process can lead to oscillatory behavior, in which the generator and discriminator fail to converge to a stable equilibrium and instead engage in rock–paper–scissors-style cycling: the generator produces only "rock" until the discriminator learns to classify that as generated, then the generator switches to producing only "scissors", and so on. The generator is always mode-collapsed, though the particular mode it collapses to changes during training.

Several GAN-specific strategies were developed to mitigate mode collapse:
* The two time-scale update rule.
* Mini-batch discrimination, which allows the discriminator to evaluate entire batches of samples, encouraging diversity.
* Unrolled GANs, which optimize the generator against future states of the discriminator.
* Wasserstein GAN, which uses the Earth Mover's distance to provide more stable gradients.
* Using a large and balanced training dataset.
* Regularization methods such as gradient penalty and spectral normalization (a sketch of the gradient penalty follows this list).
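
A minimal sketch of one such regularizer, the WGAN-GP gradient penalty, is shown below. It assumes PyTorch and a discriminator that maps a batch of flattened samples to one score per sample; the function name and interface are illustrative, not taken from any specific library.

```python
import torch

def gradient_penalty(discriminator, real, fake):
    """WGAN-GP penalty: push the discriminator's gradient norm on interpolated
    samples towards 1, which stabilizes training and helps reduce mode collapse."""
    batch_size = real.size(0)
    # Random interpolation between real and generated samples.
    alpha = torch.rand(batch_size, 1, device=real.device).expand_as(real)
    interpolated = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = discriminator(interpolated)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )[0]
    grad_norm = grads.view(batch_size, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()
```

In a WGAN-GP training loop, this term is added to the discriminator (critic) loss with a weighting coefficient, commonly denoted λ and often set to 10.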


Finetuning

Large language models are usually trained in two steps. In the first step ("pretraining"), the model is trained to simply generate text sampled from a large dataset. In the second step ("finetuning"), the model is trained to perform specific tasks by training it on a small dataset containing just the task-specific data. For example, to build a chatbot this way, one first pretrains a large transformer model over a few trillion words of text scraped from the Internet, then finetunes it on a few million words of example chatlogs that the model should imitate. Mode collapse may occur during finetuning, as the model learns to generate text that accomplishes the specific task but loses the ability to generate other forms of text; it may also come to generate only a smaller subset of the texts that accomplish the task. It is hypothesized that there is a tradeoff between quality and diversity: given a single pretrained model, more finetuning on a specific task results in higher average task performance but less diverse outputs, while less finetuning results in lower average performance but more diverse outputs. A similar tradeoff has been observed in image generation models and GAN-based text generators. Similarly, mode collapse may occur during RLHF, for example via reward hacking of the reward model or other mechanisms.
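
As an illustrative check for this kind of collapse (a hypothetical sketch, not a method from the article), one can compare the lexical diversity of samples drawn from the same prompt before and after finetuning, for example with a distinct-n statistic, the fraction of generated n-grams that are unique:

```python
from collections import Counter

def distinct_n(samples, n=2):
    """Fraction of n-grams across all samples that are unique.
    Values near 1 indicate diverse outputs; values near 0 indicate
    highly repetitive (possibly mode-collapsed) generations."""
    ngrams = Counter()
    for text in samples:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# Hypothetical generations from the same prompt before and after finetuning.
before = ["the cat sat on the mat", "a dog ran across the yard", "rain fell over the city"]
after = ["the answer is yes", "the answer is yes", "the answer is yes"]
print(distinct_n(before), distinct_n(after))  # higher before finetuning, much lower after
```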


See also

* Variational autoencoder
* Generative model
* Generative artificial intelligence
* Generative pre-trained transformer
* Overfitting

