Transformer (neural network)




The transformer is a deep learning architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism, allowing the signal for key tokens to be amplified and less important tokens to be diminished. Transformers have the advantage of having no recurrent units, and therefore require less training time than earlier recurrent neural architectures (RNNs) such as long short-term memory (LSTM). Later variations have been widely adopted for training large language models (LLMs) on large (language) datasets.
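The pipeline described above (token ids, embedding lookup, then attention-based contextualization) can be sketched in a few lines of NumPy. This is a minimal, single-head illustration, not a full transformer layer: the toy token ids, vocabulary size, and embedding dimension are arbitrary assumptions, and real implementations add learned query/key/value projections, multiple heads, and feed-forward sublayers.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    # scores[i, j]: how strongly query token i attends to key token j
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    if mask is not None:
        # masked positions get a large negative score, i.e. ~zero weight
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ V

# toy example: 4 tokens, embedding dimension 8, one attention head
rng = np.random.default_rng(0)
tokens = [3, 1, 4, 1]                    # token ids (hypothetical vocabulary)
embed_table = rng.normal(size=(10, 8))   # embedding table: vocab 10, dim 8
X = embed_table[tokens]                  # lookup: each token id -> a vector
out = scaled_dot_product_attention(X, X, X)  # self-attention over the context
print(out.shape)  # (4, 8): one contextualized vector per token
```

Each output row is a weighted average of all (unmasked) token vectors, with weights given by the softmax over similarity scores; this is the amplification/diminution of tokens the paragraph describes. Passing a causal `mask` restricts each token to attend only to earlier positions, as in decoder-style language models.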


