Reflection (artificial intelligence)
Reflection is the term used for the way some large language models, specifically reasoning language models (RLMs), feed information back to their input or previous layers based on their outputs or subsequent layers. This process is designed to mimic self-assessment and internal deliberation, aiming to minimize errors (such as hallucinations) and increase interpretability. Reflection is a form of "test-time compute", where additional computational resources are used during inference.

Introduction

Traditional neural networks process inputs in a feedforward manner, generating outputs in a single pass. However, their limitations in handling complex tasks, especially compositional ones, have led to the development of methods that simulate internal deliberation. Techniques such as chain-of-thought prompting encourage models to generate intermediate reasoning steps, thereby improving their performance on such tasks. The feedback can take place either after a full network pass and de ...
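Conceptually, this deliberation can be pictured as a draft-critique-revise loop over the model's own output. The sketch below is a minimal illustration of that idea, not any particular system's implementation; generate is a hypothetical stand-in for an LLM call, stubbed here so the example runs end to end.

    # Minimal sketch of a reflection loop: draft, self-critique, revise.
    # `generate` is a hypothetical stand-in for an LLM call; this stub
    # just returns canned text so the sketch is runnable.
    def generate(prompt: str) -> str:
        return "OK" if "List any errors" in prompt else "draft answer"

    def reflect(question: str, max_rounds: int = 3) -> str:
        answer = generate(f"Question: {question}\nAnswer:")
        for _ in range(max_rounds):
            critique = generate(
                f"Question: {question}\nDraft: {answer}\n"
                "List any errors in the draft, or reply OK if none:"
            )
            if critique.strip() == "OK":
                break                      # model accepts its own draft
            answer = generate(
                f"Question: {question}\nDraft: {answer}\n"
                f"Critique: {critique}\nRevised answer:"
            )
        return answer

    print(reflect("What is 2 + 2?"))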



Large Language Model
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pretrained transformers (GPTs), which are largely used in generative chatbots such as ChatGPT or Gemini. LLMs can be fine-tuned for specific tasks or guided by prompt engineering. These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained on.

History

Before the emergence of transformer-based models in 2017, some language models were considered large relative to the computational and data constraints of their time. In the early 1990s, IBM's statistical models pioneered word alignment techniques for machine translation, laying the groundwork for corpus-based language modeling. A sm ...



Chain-of-Thought Prompting
Prompt engineering is the process of structuring or crafting an instruction in order to produce the best possible output from a generative artificial intelligence (AI) model. A ''prompt'' is natural language text describing the task that an AI should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement including context, instructions, and conversation history. Prompt engineering may involve phrasing a query, specifying a style, word choice, and grammar, providing relevant context, or describing a character for the AI to mimic. When communicating with a text-to-image or a text-to-audio model, a typical prompt is a description of a desired output such as "a high-quality photo of an astronaut riding a horse" or "Lo-fi slow BPM electro chill with organic samples". Prompting a text-to-image model may involve adding, removing, or emphasizing words to achieve a desired subject, style, layout, lighting, and aesthetic.

History

In 2018, ...
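As a concrete illustration of the kind of prompt this passage describes, the snippet below assembles a chain-of-thought style text prompt. The template wording and the example problem are invented for illustration, not a prescribed format.

    # Illustrative chain-of-thought prompt for a text-to-text model.
    # Both the instruction wording and the problem are made up here.
    question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
    prompt = (
        "Solve the problem step by step, then state the final answer.\n"
        f"Problem: {question}\n"
        "Let's think step by step:"
    )
    print(prompt)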


Reflective Programming
In computer science, reflective programming or reflection is the ability of a process to examine, introspect, and modify its own structure and behavior.

Historical background

The earliest computers were programmed in their native assembly languages, which were inherently reflective, as these original architectures could be programmed by defining instructions as data and using self-modifying code. As the bulk of programming moved to higher-level compiled languages such as ALGOL, COBOL, Fortran, Pascal, and C, this reflective ability largely disappeared until new programming languages with reflection built into their type systems appeared. Brian Cantwell Smith's 1982 doctoral dissertation introduced the notion of computational reflection in procedural programming languages and the notion of the meta-circular interpreter as a component of 3-Lisp.

Uses

Reflection helps programmers make generic software libraries to display data, process different formats of data, perform ...
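A short example makes the definition concrete. The Python sketch below examines an object's structure at runtime and invokes a method looked up by its string name; the Greeter class is invented for illustration.

    # Reflection in Python: the running program examines its own
    # structure and invokes a method resolved by name at runtime.
    class Greeter:
        def hello(self, name: str) -> str:
            return f"Hello, {name}!"

    obj = Greeter()
    # Examine: list the object's public callable attributes.
    methods = [m for m in dir(obj)
               if callable(getattr(obj, m)) and not m.startswith("_")]
    print(methods)                          # ['hello']
    # Invoke: resolve the method by its string name and call it.
    print(getattr(obj, "hello")("world"))   # Hello, world!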


Google DeepMind
DeepMind Technologies Limited, trading as Google DeepMind or simply DeepMind, is a British–American artificial intelligence research laboratory which serves as a subsidiary of Alphabet Inc. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is headquartered in London, with research centres in the United States, Canada, France, Germany, and Switzerland. DeepMind introduced neural Turing machines (neural networks that can access external memory like a conventional Turing machine), resulting in a computer that loosely resembles short-term memory in the human brain. DeepMind has created neural network models to play video games and board games. It made headlines in 2016 after its AlphaGo program beat the human professional Go player Lee Sedol, a world champion, in a five-game match, which was the subject of a documentary film. A more general program, AlphaZero, ...


Normalization (machine learning)
In machine learning, normalization is a statistical technique with various applications. There are two main forms of normalization, namely ''data normalization'' and ''activation normalization''. Data normalization (or feature scaling) includes methods that rescale input data so that the features have the same range, mean, variance, or other statistical properties. For instance, a popular choice of feature scaling method is min-max normalization, where each feature is transformed to have the same range (typically [0, 1] or [-1, 1]). This solves the problem of different features having vastly different scales, for example if one feature is measured in kilometers and another in nanometers. Activation normalization, on the other hand, is specific to deep learning, and includes methods that rescale the activation of hidden neurons inside neural networks. Normalization is often used to:
* increase the speed of training convergence,
* reduce sensitivity to variations and feat ...
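Min-max normalization as described above is a one-line transform per feature. The sketch below rescales each column of a small made-up array to [0, 1]; the sample values are invented for illustration.

    import numpy as np

    # Min-max normalization: rescale each feature (column) to [0, 1]
    # via x' = (x - min) / (max - min), applied per feature.
    X = np.array([[1000.0, 0.002],    # e.g. kilometers vs. nanometers
                  [2000.0, 0.004],
                  [3000.0, 0.010]])
    X_min = X.min(axis=0)
    X_max = X.max(axis=0)
    X_scaled = (X - X_min) / (X_max - X_min)
    print(X_scaled)  # every column now spans exactly [0, 1]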




Policy Gradient
Policy gradient methods are a class of reinforcement learning algorithms and a sub-class of policy optimization methods. Unlike value-based methods, which learn a value function to derive a policy, policy optimization methods directly learn a policy function \pi that selects actions without consulting a value function. For policy gradient to apply, the policy function \pi_\theta must be parameterized by a differentiable parameter \theta.

Overview

In policy-based RL, the actor is a parameterized policy function \pi_\theta, where \theta are the parameters of the actor. The actor takes as argument the state of the environment s and produces a probability distribution \pi_\theta(\cdot \mid s). If the action space is discrete, then \sum_a \pi_\theta(a \mid s) = 1. If the action space is continuous, then \int \pi_\theta(a \mid s) \, \mathrm{d}a = 1. The goal of policy optimization is to find some \theta that maximizes the expected episodic reward J(\theta):

J(\theta) = ...
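For a discrete action space, \pi_\theta is commonly parameterized as a softmax over per-action scores, which guarantees the outputs form a probability distribution summing to 1 and remain differentiable in \theta. The linear scoring below is an illustrative assumption, not a construction taken from the text.

    import numpy as np

    # A softmax policy over a discrete action space: pi_theta(a | s).
    # theta maps state features to one score per action; softmax turns
    # the scores into a probability distribution that sums to 1.
    def policy(theta: np.ndarray, state: np.ndarray) -> np.ndarray:
        scores = theta @ state                    # one score per action
        scores -= scores.max()                    # numerical stability
        return np.exp(scores) / np.exp(scores).sum()

    rng = np.random.default_rng(0)
    theta = rng.normal(size=(3, 4))               # 3 actions, 4 state features
    s = rng.normal(size=4)
    p = policy(theta, s)
    print(p, p.sum())                             # valid distribution; sums to 1.0
    action = rng.choice(3, p=p)                   # sample from pi_theta(. | s)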



DeepSeek-R1
Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., doing business as DeepSeek, is a Chinese artificial intelligence company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is owned and funded by the Chinese hedge fund High-Flyer. DeepSeek was founded in July 2023 by Liang Wenfeng, the co-founder of High-Flyer, who also serves as the CEO for both companies. The company launched an eponymous chatbot alongside its DeepSeek-R1 model in January 2025. Released under the MIT License, DeepSeek-R1 provides responses comparable to other contemporary large language models, such as OpenAI's GPT-4 and o1. Its training cost was reported to be significantly lower than that of other LLMs. The company claims that it trained its V3 model for US$6 million (far less than the US$100 million cost for OpenAI's GPT-4 in 2023) and using approximately one-tenth the computing power consumed by Meta's comparable model, Llama 3.1. DeepS ...


Reinforcement Learning
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input-output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge) with the goal of maximizing the cumulative reward (the feedback of which might be incomplete or delayed). The search for this balance is known as the exploration–exploitation dilemma. The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dyn ...
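The exploration-exploitation balance described above is often illustrated with an epsilon-greedy rule: with probability epsilon take a random action (explore), otherwise take the action with the best current value estimate (exploit). The multi-armed bandit below is a standard toy setting, not an example from the text.

    import numpy as np

    # Epsilon-greedy action selection on a 3-armed bandit: explore with
    # probability eps, otherwise exploit the current value estimates Q.
    rng = np.random.default_rng(0)
    true_means = np.array([0.2, 0.5, 0.8])    # hidden reward means
    Q = np.zeros(3)                           # estimated value per action
    counts = np.zeros(3)
    eps = 0.1

    for _ in range(1000):
        if rng.random() < eps:
            a = rng.integers(3)               # explore: random action
        else:
            a = int(np.argmax(Q))             # exploit: best estimate so far
        reward = rng.normal(true_means[a], 1.0)
        counts[a] += 1
        Q[a] += (reward - Q[a]) / counts[a]   # incremental mean update

    print(Q)  # estimates approach the true means; the best arm dominates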



Metacognition
Metacognition is an awareness of one's thought processes and an understanding of the patterns behind them. The term comes from the root word ''meta'', meaning "beyond" or "on top of". (Metcalfe, J., & Shimamura, A. P. (1994). ''Metacognition: knowing about knowing''. Cambridge, MA: MIT Press.) Metacognition can take many forms, such as reflecting on one's ways of thinking, and knowing when and how oneself and others use particular strategies for problem-solving. There are generally two components of metacognition: (1) cognitive conceptions and (2) a cognitive regulation system. (Hartelt, T. & Martens, H. (2024). Influence of self-assessment and conditional metaconceptual knowledge on students' self-regulation of intuitive and scientific conceptions of evolution. Journal of Research in Science Teaching, 61(5), 1134–1180. https://doi.org/10.1002/tea.21938) Research has shown that both components of metacognition play key roles in metaconceptual knowledge and ...


Reasoning Language Model
Reasoning language models (RLMs) are large language models that have been further trained to solve multi-step reasoning tasks. These models perform better on logical, mathematical or programmatic tasks than traditional autoregressive LLMs, have the ability to backtrack, and employ test-time compute as an additional scaling axis beyond training examples, parameter count, and train-time compute.

History

2024

o1-preview, an LLM with enhanced reasoning, was released in September 2024. The full version, o1, followed in December 2024. OpenAI also began sharing results on its successor, o3. The development of reasoning LLMs has illustrated what Rich Sutton termed the "bitter lesson": that general methods leveraging computation often outperform those relying on specific human insights. For instance, some research groups, such as the Generative AI Research Lab (GAIR), initially explored complex techniques like tree search and reinforcement learning in attempts to replicate o1's c ...
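One simple way to spend test-time compute, the extra scaling axis mentioned above, is best-of-N sampling: draw several candidate solutions and keep the one a scorer prefers. Both generate and score below are hypothetical stand-ins, not APIs named in the text; calling best_of_n requires plugging in a real model and verifier.

    # Best-of-N sampling: a basic form of test-time compute scaling.
    # `generate` and `score` are hypothetical stand-ins for a sampled
    # LLM call and a verifier/reward model.
    def generate(prompt: str) -> str:
        raise NotImplementedError("plug in a sampled LLM call here")

    def score(prompt: str, candidate: str) -> float:
        raise NotImplementedError("plug in a verifier or reward model here")

    def best_of_n(prompt: str, n: int = 8) -> str:
        candidates = [generate(prompt) for _ in range(n)]  # more samples = more compute
        return max(candidates, key=lambda c: score(prompt, c))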



Latent Space
A latent space, also known as a latent feature space or embedding space, is an embedding of a set of items within a manifold in which items resembling each other are positioned closer to one another. Position within the latent space can be viewed as being defined by a set of latent variables that emerge from the resemblances among the objects. In most cases, the dimensionality of the latent space is chosen to be lower than the dimensionality of the feature space from which the data points are drawn, making the construction of a latent space an example of dimensionality reduction, which can also be viewed as a form of data compression. Latent spaces are usually fit via machine learning, and they can then be used as feature spaces in machine learning models, including classifiers and other supervised predictors. The interpretation of the latent spaces of machine learning models is an active field of study, but latent space interpretation is difficult to achieve. Due to the black-box ...
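As one concrete way a latent space can be fit by dimensionality reduction, the sketch below uses PCA, a simple linear method chosen here only for illustration, to embed 10-dimensional points into 2 latent dimensions; the random data is made up.

    import numpy as np

    # Fit a 2-D latent space for 10-D data via PCA: nearby points in
    # the latent space correspond to resembling items in feature space.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))            # 100 items, 10 features
    X_centered = X - X.mean(axis=0)
    # Principal directions = top right-singular vectors of the data.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    Z = X_centered @ Vt[:2].T                 # latent coordinates, shape (100, 2)
    print(Z.shape)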

