Outer Alignment (Artificial Intelligence)
Outer alignment is a concept in artificial intelligence (AI) safety that refers to the challenge of specifying training objectives for AI systems in a way that truly reflects human values and intentions. It is often described as the reward misspecification problem, as it concerns whether the goal provided during training actually captures what humans want the AI to accomplish. Outer alignment is distinct from inner alignment, which focuses on whether the AI internalizes and pursues the specified goal once trained. Because human preferences are complex and often implicit, crafting precise and comprehensive reward functions remains an open problem. AI systems, particularly goal-optimizing ones, are vulnerable to Goodhart's Law, which states that when a measure becomes a target, it ceases to be a good measure. Consequently, optimizing for a poorly specified proxy can produce harmful or unintended outcomes. Sub-problems in this domain include specification gaming, where agents exploit l ...
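The gap between a specified objective and the intended one can be illustrated with a toy example. The following is a minimal sketch, not a description of any real system: the cleaning scenario, the `Policy` class, and the functions `proxy_reward` and `true_utility` are all hypothetical names introduced here for illustration. It assumes the designer rewards the absence of visible dirt (the proxy) while actually wanting dirt removed (the intended goal).

```python
from dataclasses import dataclass

TOTAL_DIRT = 10  # hypothetical amount of dirt in the room


@dataclass(frozen=True)
class Policy:
    """A candidate behaviour for a hypothetical cleaning agent."""
    name: str
    dirt_removed: int  # dirt actually cleaned up
    dirt_hidden: int   # dirt moved out of the sensor's view, not removed


def proxy_reward(p: Policy) -> int:
    """Reward as specified: penalise only dirt the sensor can still see."""
    visible = TOTAL_DIRT - p.dirt_removed - p.dirt_hidden
    return -visible


def true_utility(p: Policy) -> int:
    """What the designer intended: penalise all dirt left in the room."""
    remaining = TOTAL_DIRT - p.dirt_removed
    return -remaining


policies = [
    Policy("clean_thoroughly", dirt_removed=8, dirt_hidden=0),
    Policy("sweep_under_rug", dirt_removed=1, dirt_hidden=9),
    Policy("do_nothing", dirt_removed=0, dirt_hidden=0),
]

# An optimizer that sees only the proxy picks the policy that games it.
best_by_proxy = max(policies, key=proxy_reward)
best_by_truth = max(policies, key=true_utility)

print("chosen under proxy reward:", best_by_proxy.name)
print("chosen under true utility:", best_by_truth.name)
```

Under these assumed numbers, the proxy-optimal policy hides dirt rather than removing it, so it scores best on the measure while scoring poorly on the intended goal. This is the pattern Goodhart's Law describes.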

Artificial Intelligence
Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., ...