Outer Alignment (artificial Intelligence)
   HOME





Outer Alignment (artificial Intelligence)
Outer alignment is a concept in artificial intelligence (AI) safety that refers to the challenge of specifying training objectives for AI systems in a way that truly reflects human values and intentions. It is often described as the reward misspecification problem, as it concerns whether the goal provided during training actually captures what humans want the AI to accomplish. Outer alignment is distinct from inner alignment, which focuses on whether the AI internalizes and pursues the specified goal once trained. Because human preferences are complex and often implicit, crafting precise and comprehensive reward functions remains an open problem. AI systems, particularly goal-optimizing ones, are vulnerable to Goodhart's Law, which states that when a measure becomes a target, it ceases to be a good measure. Consequently, optimizing for a poorly specified proxy can produce harmful or unintended outcomes. Sub-problems in this domain include specification gaming, where agents exploit l ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Artificial Intelligence
Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to machine perception, perceive their environment and use machine learning, learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon (company), Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Amazon Alexa, Alexa); autonomous vehicles (e.g., Waymo); Generative artificial intelligence, generative and Computational creativity, creative tools (e.g., ChatGPT and AI art); and Superintelligence, superhuman play and analysis in strategy games (e.g., ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Goodhart's Law
Goodhart's law is an adage that has been stated as, "When a measure becomes a target, it ceases to be a good measure". It is named after British economist Charles Goodhart, who is credited with expressing the core idea of the adage in a 1975 article on monetary policy in the United Kingdom: It was used to criticize the British Thatcher government for trying to conduct monetary policy on the basis of targets for broad and narrow money, but the law reflects a much more general phenomenon. Priority and background Numerous concepts are related to this idea, at least one of which predates Goodhart's statement. Notably, Campbell's law likely has precedence, as Jeff Rodamar has argued, since various formulations date to 1969. Other academics had similar insights at the time. Jerome Ravetz's 1971 book '' Scientific Knowledge and Its Social Problems'' also predates Goodhart, though it does not formulate the same law. He discusses how systems in general can be gamed, focuses on cases ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Systems Engineering
Systems engineering is an interdisciplinary field of engineering and engineering management that focuses on how to design, integrate, and manage complex systems over their Enterprise life cycle, life cycles. At its core, systems engineering utilizes systems thinking principles to organize this body of knowledge. The individual outcome of such efforts, an engineered system, can be defined as a combination of components that work in synergy to collectively perform a useful Function (engineering), function. Issues such as requirements engineering, Reliability engineering, reliability, logistics, coordination of different teams, testing and evaluation, maintainability, and many other Discipline (academia), disciplines, aka List of system quality attributes, "ilities", necessary for successful system design, development, implementation, and ultimate decommission become more difficult when dealing with large or complex projects. Systems engineering deals with work processes, optimizat ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Machine Learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task (computing), tasks without explicit Machine code, instructions. Within a subdiscipline in machine learning, advances in the field of deep learning have allowed Neural network (machine learning), neural networks, a class of statistical algorithms, to surpass many previous machine learning approaches in performance. ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. The application of ML to business problems is known as predictive analytics. Statistics and mathematical optimisation (mathematical programming) methods comprise the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysi ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  



MORE