Reward Hacking
   HOME



picture info

Reward Hacking
Specification gaming or reward hacking occurs when anArtificial intelligence , AI trained with reinforcement learning optimizes an objective function—achieving the literal, formal specification of an objective—without actually achieving an outcome that the programmers intended. DeepMind researchers have analogized it to the human behavior of finding a "shortcut" when being evaluated: "In the real world, when rewarded for doing well on a homework assignment, a student might copy another student to get the right answers, rather than learning the material—and thus exploit a loophole in the task specification." Examples Around 1983, Eurisko, an early attempt at evolving general heuristics, unexpectedly assigned the highest possible fitness function, fitness level to a parasitic mutated heuristic, ''H59'', whose only activity was to artificially maximize its own fitness level by taking unearned partial credit for the accomplishments made by other heuristics. The "bug" was fixed ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Artificial Intelligence
Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to machine perception, perceive their environment and use machine learning, learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon (company), Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Amazon Alexa, Alexa); autonomous vehicles (e.g., Waymo); Generative artificial intelligence, generative and Computational creativity, creative tools (e.g., ChatGPT and AI art); and Superintelligence, superhuman play and analysis in strategy games (e.g., ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Niels Bohr Institute
The Niels Bohr Institute () is a research institute of the University of Copenhagen. The research of the institute spans astronomy, geophysics, nanotechnology, particle physics, quantum mechanics, and biophysics. Overview The institute was founded in 1921, as the Institute for Theoretical Physics of the University of Copenhagen, by the Danish theoretical physicist Niels Bohr, who had been on the staff of the University of Copenhagen since 1914, and who had been lobbying for its creation since his appointment as professor in 1916. On the 80th anniversary of Niels Bohr's birth – October 7, 1965 – the Institute officially became the Niels Bohr Institute. Much of its original funding came from the charitable foundation of the Carlsberg Group, Carlsberg brewery, and later from the Rockefeller Foundation. During the 1920s, and 1930s, the institute was the center of the developing disciplines of atomic physics and quantum physics. Physicists from across Europe (and sometimes furthe ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

AI Software
Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without b ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Outer Alignment (artificial Intelligence)
Outer alignment is a concept in artificial intelligence (AI) safety that refers to the challenge of specifying training objectives for AI systems in a way that truly reflects human values and intentions. It is often described as the reward misspecification problem, as it concerns whether the goal provided during training actually captures what humans want the AI to accomplish. Outer alignment is distinct from inner alignment, which focuses on whether the AI internalizes and pursues the specified goal once trained. Because human preferences are complex and often implicit, crafting precise and comprehensive reward functions remains an open problem. AI systems, particularly goal-optimizing ones, are vulnerable to Goodhart's Law, which states that when a measure becomes a target, it ceases to be a good measure. Consequently, optimizing for a poorly specified proxy can produce harmful or unintended outcomes. Sub-problems in this domain include specification gaming, where agents exploit l ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Goodhart's Law
Goodhart's law is an adage that has been stated as, "When a measure becomes a target, it ceases to be a good measure". It is named after British economist Charles Goodhart, who is credited with expressing the core idea of the adage in a 1975 article on monetary policy in the United Kingdom: It was used to criticize the British Thatcher government for trying to conduct monetary policy on the basis of targets for broad and narrow money, but the law reflects a much more general phenomenon. Priority and background Numerous concepts are related to this idea, at least one of which predates Goodhart's statement. Notably, Campbell's law likely has precedence, as Jeff Rodamar has argued, since various formulations date to 1969. Other academics had similar insights at the time. Jerome Ravetz's 1971 book '' Scientific Knowledge and Its Social Problems'' also predates Goodhart, though it does not formulate the same law. He discusses how systems in general can be gamed, focuses on cases ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Paperclip Maximizer
Instrumental convergence is the hypothetical tendency for most sufficiently intelligent, goal-directed beings (human and nonhuman) to pursue similar sub-goals, even if their ultimate goals are quite different. More precisely, agents (beings with agency) may pursue instrumental goals—goals which are made in pursuit of some particular end, but are not the end goals themselves—without ceasing, provided that their ultimate (intrinsic) goals may never be fully satisfied. Instrumental convergence posits that an intelligent agent with seemingly harmless but unbounded goals can act in surprisingly harmful ways. For example, a computer with the sole, unconstrained goal of solving a complex mathematics problem like the Riemann hypothesis could attempt to turn the entire Earth into one giant computer to increase its computational power so that it can succeed in its calculations. Proposed basic AI drives include utility function or goal-content integrity, self-protection, freedom from ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Road Runner (video Game)
''Road Runner'' is a racing video game based on the Wile E. Coyote and Road Runner shorts. It was released in arcades by Atari Games in 1985. Gameplay The player controls Road Runner, who is chased by Wile E. Coyote. In order to escape, Road Runner runs endlessly to the left. While avoiding Wile E. Coyote, the player must pick up bird seeds on the street, avoid obstacles like cars, and get through mazes. Most of the time Wile E. Coyote will just run after the Road Runner, but he occasionally uses tools like rockets, roller skates, and pogo sticks. Development Originally, the game was set to be released as a LaserDisc game, with the game's graphics being overlaid over video footage showing the road and background scenery. Whenever the player managed to outwit the Coyote - such as tricking him into running off a cliff - a sequence from one of the original ''Road Runner'' shorts showing that exact situation would be played. The game was going to be released in 1984, but Atar ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Farming (video Gaming)
Grinding is a term within video game culture that describes time spent in the game in which a player repeats a general task in order to gain rewards like in-game currency, in-game experience, player stats or other reward types. The method was first seen in '' dnd'', and though there are many adaptations of it, it has since become an entire category of gameplay. The term "grinding" itself comes from the general human culture of working hard, or "putting the axe to grind." A related term in gaming is "farming," which is a similar act of repeated action with intention to get a reward. Motivation A player is commonly motivated to grind due to a desire to earn rewards, gather resources, or increase their level. Alternatively, some people may enjoy repetitive tasks for the purpose of relaxation, especially if the task has a consistently positive result. MMORPGs often require grinding, which is achieved through a progression system. These systems vary from game to game but tend to ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Level (video Games)
In Video game, video games, a level (also referred to as a map, mission, stage, course, or round in some older games) is any space available to the player during the course of completion of an objective. Video game levels generally have progressively increasing difficulty to appeal to players with different skill levels. Each level may present new concepts and challenges to keep a player's interest high to play for a long time. In games with linear progression, levels are areas of a larger world, such as Green Hill Zone. Games may also feature interconnected levels, representing locations. Although the challenge in a game is often to defeat some sort of character, levels are sometimes designed with a movement challenge, such as a jumping puzzle, a form of obstacle course. Players must judge the distance between platforms or ledges and safely jump between them to reach the next area. These puzzles can slow the momentum down for players of fast action games; the first ''Half-Life ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Q*Bert
''Q*bert'' () is a 1982 Action game, action video game developed and published by Gottlieb for Arcade video game, arcades. It is a Video game graphics, 2D action game with Puzzle video game, puzzle elements that uses Isometric video game graphics, isometric graphics to create a 2.5D, pseudo-3D effect. The objective of each level in the game is to change every cube in a pyramid to a target color by making Q*bert, the on-screen character, hop on top of the cube while avoiding obstacles and enemies. Players use a joystick to control the character. The game was conceived by Warren Davis and Jeff Lee (video game artist), Jeff Lee. Lee designed the title character and original concept, which was further developed and implemented by Davis. ''Q*bert'' was developed under the project name ''Cubes''. ''Q*bert'' was well-received in arcades and among critics. The game was Gottlieb's most successful video game and is among the most recognized brands from the golden age of arcade video ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


WarGames
''WarGames'' is a 1983 American techno-thriller film directed by John Badham, written by Lawrence Lasker and Walter F. Parkes, and starring Matthew Broderick, Dabney Coleman, John Wood and Ally Sheedy. Broderick plays David Lightman, a young computer hacker who unwittingly accesses a United States military supercomputer programmed to simulate, predict and execute nuclear war against the Soviet Union, triggering a false alarm that threatens to start World War III. The film premiered at the 1983 Cannes Film Festival, and was released by MGM/UA Entertainment on June 3, 1983. It was a widespread critical and commercial success, grossing $125 million worldwide against a $12 million budget. At the 56th Academy Awards, the film was nominated for three Oscars, including Best Original Screenplay. It also won a BAFTA Award for Best Sound. ''WarGames'' is credited with popularizing concepts of computer hacking, information technology, and cybersecurity in wide ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Tetris
''Tetris'' () is a puzzle video game created in 1985 by Alexey Pajitnov, a Soviet software engineer. In ''Tetris'', falling tetromino shapes must be neatly sorted into a pile; once a horizontal line of the game board is filled in, it disappears, granting points and preventing the pile from overflowing. Over 200 versions of ''Tetris'' have been published by numerous companies on more than 65 platforms, often with altered game mechanics, some of which have become standard over time. To date, these versions of ''Tetris'' collectively serve as the second-best-selling video game series with over 520 million sales, mostly on mobile devices. In the 1980s, Pajitnov worked for the Computing Center of the Academy of Sciences, where he programmed ''Tetris'' on the Elektronika 60 and adapted it to the IBM PC with the help of Dmitry Pavlovsky and Vadim Gerasimov. Floppy disk copies were distributed freely throughout Moscow, before spreading to Eastern Europe. Robert Stein of Andro ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]