Sabermetrician
   HOME

TheInfoList



OR:

Sabermetrics, or originally SABRmetrics, is the empirical analysis of
baseball Baseball is a bat-and-ball sport played between two teams of nine players each, taking turns batting and fielding. The game occurs over the course of several plays, with each play generally beginning when a player on the fielding t ...
, especially
baseball statistics Baseball statistics play an important role in evaluating the progress of a player or team. Since the flow of a baseball game has natural breaks to it, and normally players act individually rather than performing in clusters, the sport lends itsel ...
that measure in-game activity. Sabermetricians collect and summarize the relevant data from this in-game activity to answer specific questions. The term is derived from the acronym SABR, which stands for the
Society for American Baseball Research The Society for American Baseball Research (SABR) is a membership organization dedicated to fostering the research and dissemination of the history and record of baseball primarily through the use of statistics. Established in Cooperstown, New ...
, founded in 1971. The term "sabermetrics" was coined by Bill James, who is one of its pioneers and is often considered its most prominent advocate and public face.


Early history

Henry Chadwick, a sportswriter in New York, developed the
box score A box score is a structured summary of the results from a sport competition. The box score lists the game score as well as individual and team achievements in the game. Among the sports in which box scores are common are baseball, basketball, f ...
in 1858. This was the first way statisticians were able to describe the sport of baseball by numerically tracking various aspects of game play. The creation of the box score has given baseball statisticians a summary of the individual and team performances for a given game. Sabermetrics research began in the middle of the 20th century with the writings of
Earnshaw Cook Earnshaw Cook (March 28, 1900 in Reisterstown, Maryland – November 11, 1987 in Baltimore, Maryland) was an early researcher and proponent of sabermetrics, the analysis of baseball through statistical means. Engineering A member of the Princeto ...
, one of the earliest sabermetricians. Cook's 1964 book ''Percentage Baseball'' was one of the first of its kind. At first, most organized baseball teams and professionals dismissed Cook's work as meaningless. The idea of a science of baseball statistics began to achieve legitimacy in 1977 when Bill James began releasing ''Baseball Abstracts'', his annual compendium of baseball data. However, James's ideas were slow to find widespread acceptance. Bill James believed there was a widespread misunderstanding about how the game of baseball was played, claiming the sport was not defined by its rules but actually, as summarized by engineering professor Richard J. Puerzer, "defined by the conditions under which the game is played--specifically, the ballparks but also the players, the ethics, the strategies, the equipment, and the expectations of the public." Sabermetricians—sometimes considered baseball statisticians—began trying to replace the longtime favorite statistic known as the batting average. It has been claimed that team batting average provides a relatively poor fit for team runs scored. Sabermetric reasoning would say that runs win ballgames, and that a good measure of a player's worth is his ability to help his team score more runs than the opposing team. Before Bill James popularized sabermetrics,
Davey Johnson David Allen Johnson (born January 30, 1943) is an American former professional baseball player and manager. He played as a second baseman from through , most notably as a member of the Baltimore Orioles dynasty that won four American League ...
used an IBM System/360 at team owner
Jerold Hoffberger Jerold Charles Hoffberger (April 7, 1919 – April 9, 1999) was an American businessman. He was president of the National Brewing Company from 1946 to 1973. He was also part-owner of the Baltimore Orioles of the American League from 1954 to 1965 ...
's brewery to write a FORTRAN baseball computer simulation while playing for the
Baltimore Orioles The Baltimore Orioles are an American professional baseball team based in Baltimore. The Orioles compete in Major League Baseball (MLB) as a member club of the American League (AL) East division. As one of the American League's eight charter ...
in the early 1970s. He used his results in an unsuccessful attempt to promote to his manager
Earl Weaver Earl Sidney Weaver (August 14, 1930 – January 19, 2013) was an American professional baseball manager, author, and television broadcaster. After playing in minor league baseball, he retired without playing in Major League Baseball (MLB). He be ...
the idea that he should bat second in the lineup. He wrote
IBM BASIC The IBM Personal Computer Basic, commonly shortened to IBM BASIC, is a programming language first released by IBM with the IBM Personal Computer, Model 5150 (IBM PC) in 1981. IBM released four different versions of the Microsoft BASIC interpre ...
programs to help him manage the
Tidewater Tides The Norfolk Tides are a Minor League Baseball team of the International League and the Triple-A affiliate of the Baltimore Orioles. They are located in Norfolk, Virginia, and are named in nautical reference to the city's location on the Chesap ...
, and after becoming manager of the
New York Mets The New York Mets are an American professional baseball team based in the New York City borough of Queens. The Mets compete in Major League Baseball (MLB) as a member of the National League (NL) East division. They are one of two major league ...
in 1984, he arranged for a team employee to write a
dBASE II dBase (also stylized dBASE) was one of the first database management systems for microcomputers and the most successful in its day. The dBase system includes the core database engine, a query system, a forms engine, and a programming language ...
application to compile and store
advanced metrics Sports analytics are a collection of relevant, historical, statistics that can provide a competitive advantage to a team or individual. Through the collection and analyzation of these data, sports analytics inform players, coaches and other staff in ...
on team statistics. Craig R. Wright was another employee in Major League Baseball, working with the Texas Rangers in the early 1980s. During his time with the Rangers, he became known as the first front office employee in MLB history to work under the title Sabermetrician. David Smith founded
Retrosheet Retrosheet is a nonprofit organization whose website features box scores of Major League Baseball (MLB) games from 1906 to the present, and play-by-play narratives for almost every contest since the 1930s. It also includes scores from every major ...
in 1989, with the objective of computerizing the box score of every major league baseball game ever played, in order to more accurately collect and compare the statistics of the game. The Oakland Athletics began to use a more quantitative approach to baseball by focusing on sabermetric principles in the 1990s. This initially began with
Sandy Alderson Richard Lynn "Sandy" Alderson (born November 22, 1947) is an American baseball executive. He is currently the president of the New York Mets. He previously served as the general manager of the New York Mets from 2011 to 2018, an executive in the O ...
as the general manager of the team when he used the principles toward obtaining relatively undervalued players. His ideas were continued when Billy Beane took over as general manager in 1997, a job he held until 2015, and hired his assistant
Paul DePodesta Paul DePodesta (born December 16, 1972) is an American football executive and former baseball executive who is the chief strategy officer of the Cleveland Browns of the National Football League (NFL). He previously served as a front-office assista ...
. Through the statistical analysis done by Beane and DePodesta in the 2002 season, the Oakland A's went on to win 20 games in a row. This was a historic moment for the franchise, in which the 20th game was played at the Alameda County Coliseum. His approaches to baseball soon gained national recognition when
Michael Lewis Michael Monroe Lewis (born October 15, 1960) Gale Biography In Context. is an American author and financial journalist. He has also been a contributing editor to ''Vanity Fair'' since 2009, writing mostly on business, finance, and economics. He ...
published '' Moneyball: The Art of Winning an Unfair Game'' in 2003 to detail Beane's use of Sabermetrics. In 2011, a film based on Lewis' book—also called '' Moneyball''—was released and gave broad exposure to the techniques used in the Oakland Athletics' front office.


Traditional measurements

Sabermetrics was created in an attempt for baseball fans to learn about the sport through objective evidence. This is performed by evaluating players in every aspect of the game, specifically batting, pitching, and fielding. These evaluation measures are usually phrased in terms of either runs or team wins as older statistics were deemed ineffective.


Batting measurements

The traditional measure of batting performance is considered to be hits divided by the total number of at-bats. Bill James, along with other fathers of sabermetrics, found this measure to be flawed, as it ignores any other way a batter can reach base besides a hit. This led to the creation of the On-base percentage, which takes walks and hit-by-pitches into consideration. To calculate the On-Base percentage, the total number of hits + bases on balls + hit by pitch are divided by at bats + bases on balls + hit by pitch + sacrifice flies. Another issue with the traditional measure of the batting average is that it does not distinguish between hits (i.e., singles, doubles, triples, and home runs) and gives each hit equal value. Thus, a measure that differentiates among these four hit outcomes, the slugging percentage, was created. To calculate the slugging percentage, the total number of bases of all hits is divided by the total number of times at bat. Stephen Jay Gould proposed that the disappearance of .400 batting average is actually a sign of general improvement in batting. This is because, in the modern era, players are becoming more focused on hitting for power than for average. Therefore, it has become more valuable to compare players using the slugging percentage and on-base percentage over the batting average. These two improved sabermetric measures are important skills to measure in a batter and have been combined to create the modern statistic
on-base plus slugging On-base plus slugging (OPS) is a sabermetric baseball statistic calculated as the sum of a player's on-base percentage and slugging percentage. The ability of a player both to get on base and to hit for power, two important offensive skills, are ...
(OPS). OPS is the sum of the on-base percentage and the slugging percentage. This modern statistic has become useful in comparing players and is a powerful method of predicting runs scored from a certain player. Some of the other statistics that sabermetricians use to evaluate batting performance are weighted on-base average, secondary average,
runs created Runs created (RC) is a baseball statistic invented by Bill James to estimate the number of runs a hitter contributes to their team. Purpose James explains in his book, ''The Bill James Historical Baseball Abstract'', why he believes runs created is ...
, and equivalent average.


Pitching measurements

The traditional measure of pitching performance is earned run average. It is calculated as
earned run In baseball, an earned run is any run that was fully enabled by the offensive team's production in the face of competent play from the defensive team. Conversely, an unearned run is a run that would not have been scored without the aid of an erro ...
s allowed per 9 innings. Earned run average does not separate the ability of the pitcher from the abilities of the fielders that he plays with. Another classic measure for pitching is a pitcher's
winning percentage In sports, a winning percentage is the fraction of games or matches a team or individual has won. The statistic is commonly used in standings or rankings to compare teams or individuals. It is defined as wins divided by the total number of match ...
. Winning percentage is calculated by dividing wins by the number of decisions (wins and losses). Winning percentage is also heavily dependent on the pitcher's team, particularly on the number of runs it scores. Sabermetricians have attempted to find different measures of pitching performance that exclude the performances of the fielders involved. One of the earliest developed, and one of the most popular in use, is
walks plus hits per inning pitched In baseball statistics, walks plus hits per inning pitched (WHIP) is a sabermetric measurement of the number of baserunners a pitcher has allowed per inning pitched. WHIP is calculated by adding the number of walks and hits allowed and divid ...
(WHIP), which while not completely defense-independent, tends to indicate how many times a pitcher is likely to put a player on base (either by base-on-balls, hit-by-pitch, or base hit) and thus how effective batters are against a particular pitcher in reaching base. A more recent development is the creation of defense independent pitching statistics (DIPS) system.
Voros McCracken Robert "Voros" McCracken (born August 17, 1971, Chicago) is an American baseball sabermetrician. "Voros" (vörös in hungarian = red in english) is a nickname from his partial Hungarian heritage. He is widely recognized for his pioneering work on ...
has been credited with the development of this system in 1999. Through his research, McCracken was able to show that there is little to no difference between pitchers in the number of hits they allow on balls put into play—regardless of their skill level. Some examples of these statistics are
defense-independent ERA In baseball statistics, Defense-Independent ERA (dERA), created by Voros McCracken, projects what a pitcher's earned run average (ERA) would have been, if not for the effects of defense and luck on the actual games in which he pitched. Method Ver ...
, fielding independent pitching, and defense-independent component ERA. Other sabermetricians have furthered the work in DIPS, such as Tom Tango who runs the ''Tango on Baseball'' sabermetrics website. '' Baseball Prospectus'' created another statistics called the
peripheral ERA Peripheral ERA (PERA) is a pitching statistic created by the Baseball Prospectus team. It is the expected earned run average taking into account park-adjusted hits, walks, strikeouts, and home runs allowed. Unlike Voros McCracken's DIPS, hits a ...
. This measure of a pitcher's performance takes hits, walks, home runs allowed, and strikeouts while adjusting for ballpark factors. Each ballpark has different dimensions when it comes to the outfield wall so a pitcher should not be measured the same for each of these parks.
Batting average on balls in play In baseball statistics, batting average on balls in play (abbreviated BABIP) is a measurement of how often batted balls result in hits, excluding home runs. It can be expressed as, "when you hit the ball and it’s not a home run, what’s your b ...
(BABIP) is another useful measurement for determining pitchers’ performance. When a pitcher has a high BABIP, they will often show improvements in the following season, while a pitcher with low BABIP will often show a decline in the following season. This is based on the statistical concept of
regression to the mean In statistics, regression toward the mean (also called reversion to the mean, and reversion to mediocrity) is the fact that if one sample of a random variable is extreme, the next sampling of the same random variable is likely to be closer to it ...
. Others have created various means of attempting to quantify individual pitches based on characteristics of the pitch, as opposed to runs earned or balls hit.


Advanced methods

Value over replacement player (VORP) was once considered a popular sabermetric statistic. This statistic demonstrates how much a player contributes to his team in comparison to a hypothetical player that performs at the minimum level needed to hold a roster position on a major league team. This measurement was invented by Keith Woolner, a former writer for the sabermetric group/website ''Baseball Prospectus''.
Wins above replacement Wins Above Replacement or Wins Above Replacement Player, commonly abbreviated to WAR or WARP, is a non-standardized sabermetric baseball statistic developed to sum up "a player's total contributions to his team". A player's WAR value is claimed to ...
(WAR) is another popular sabermetric statistic for evaluating a player's contributions to his team. Similar to VORP, WAR compares a given player to a replacement-level player in order to determine the number of additional wins the player has provided to his team. WAR values vary with hitting positions and are largely determined by a player's successful performance and amount of playing time.


Quantitative analysis in baseball

Many traditional and modern statistics, such as ERA and Wins Shared, don't give a full understanding of what is taking place on the field. Simple ratios are not sufficient to understand the statistical data of baseball. Structured quantitative analysis is capable of explaining many aspects of the game, for example, to examine how often a team should attempt to steal.


Applications

Sabermetrics can be used for multiple purposes, but the most common are evaluating past performance and predicting future performance to determine a player's contributions to his team. These may be useful when determining who should win end-of-the-season awards such as MVP and when determining the value of making a certain trade. Most baseball players tend to play a few years in the minor leagues before they are called up to the major league. The competitive differences coupled with ballpark effects make the exact comparison of a player's statistics a problem. Sabermetricians have been able to clear this problem by adjusting the player's minor league statistics, also known as the Minor-League Equivalency. Through these adjustments, teams are able to look at a player's performance in both AA and AAA to determine if he is fit to be called up to the majors.


Applied statistics

Sabermetrics methods are generally used for three purposes: # To compare key performances among certain specific players under realistic data conditions. The evaluation of past performance of a player enables an analytic overview. The comparison of this data between players can help one understand key points such as their market values. In that way, the role and the salary that should be given to that player can be defined. # To provide prediction of future performance of a given player or a team. When past data is available about the performance of a team or a specific player, Sabermetrics can be used to predict the average future performances for the next season. Thus, a prediction can be made with a certain probability about the number of wins and losses. # To provide a useful function of the player's contributions to his team. When analyzing data, one is able to understand the contributions a player makes to the success/failure of his team. Given that correlation, one can objectively sign or release players with certain characteristics.


Machine learning model

A
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
model can be built using data sets available at sources such as baseball-reference. This model will give probability estimates for the outcome of specific games or the performance of particular players. These estimates are increasingly accurate when applied to a large number of events over a long term. The game outcome (win/lose) is treated as having a binomial distribution. Predictions can be made using a logistic regression model with explanatory variables including: opponents' runs scored, runs scored, shutouts time at bat, winning rate, and pitcher whip.


Recent advances

Many sabermetricians are still working hard to contribute to the field through creating new measures and asking new questions. Bill James' two '' Historical Baseball Abstract'' editions and '' Win Shares'' book have continued to advance the field of sabermetrics, 25 years after he helped start the movement. His former assistant
Rob Neyer Rob Neyer (born June 22, 1966) is an American baseball writer known for his use of statistical analysis or sabermetrics. He started his career working for Bill James and STATS and then joined ESPN.com as a columnist and blogger from 1996 to 2011. ...
, who is now a senior writer at ESPN.com and national baseball editor of SBNation, also worked on popularizing sabermetrics since the mid-1980s.
Nate Silver Nathaniel Read Silver (born January 13, 1978) is an American statistician, writer, and poker player who analyzes baseball (see sabermetrics), basketball, and elections (see psephology). He is the founder and editor-in-chief of ''FiveThirtyEigh ...
, a former writer and managing partner of ''Baseball Prospectus'', invented
PECOTA PECOTA, an acronym for ''Player Empirical Comparison and Optimization Test Algorithm'', is a sabermetric system for forecasting Major League Baseball player performance. The word is a backronym based on the name of journeyman major league player Bil ...
. This acronym stands for ''Player Empirical Comparison and Optimization Test Algorithm'', and is a sabermetric system for forecasting Major League Baseball player performance. Simply put, it assumes that the careers of similar players will follow a similar trajectory. This system has been owned by ''Baseball Prospectus'' since 2003 and helps the website's authors invent or improve widely relied-upon sabermetric measures and techniques. Beginning in the 2007 baseball season, the MLB started looking at technology to record detailed information regarding each pitch that is thrown in a game. This became known as the
PITCHf/x PITCHf/x is a system created and maintained by Sportvision that tracks the speeds and trajectories of pitched baseballs. This system, which made its debut in the 2006 Major League Baseball (MLB) postseason, is installed in every MLB stadium. The ...
system which is able to record the speed of the pitch, at its release point and as it crossed the plate, as well as the location and angle of the break of certain pitches through video cameras. FanGraphs is a website that favors this system as well as the analysis of play-by-play data. The website also specializes in publishing advanced baseball statistics as well as graphics that evaluate and track the performance of players and teams.


In popular culture

* '' Moneyball'', the 2011 film about Billy Beane's use of sabermetrics to build the Oakland Athletics. The film is based on
Michael Lewis Michael Monroe Lewis (born October 15, 1960) Gale Biography In Context. is an American author and financial journalist. He has also been a contributing editor to ''Vanity Fair'' since 2009, writing mostly on business, finance, and economics. He ...
's book of the same name. * The
season 3 A season is a division of the year based on changes in weather, ecology, and the number of daylight hours in a given region. On Earth, seasons are the result of the axial parallelism of Earth's axial tilt, tilted orbit around the Sun. In tempera ...
''
Numb3rs ''Numbers'' (stylized as ''NUMB3RS'') is an American crime drama television series that was broadcast on CBS from January 23, 2005, to March 12, 2010, for six seasons and 118 episodes. The series was created by Nicolas Falacci and Cheryl Heu ...
'' episode "Hardball" focuses on sabermetrics, and the
season 1 Season One may refer to: Albums * ''Season One'' (Suburban Legends album), 2004 * ''Season One'' (All Sons & Daughters album), 2012 * ''Season One'' (Saukrates album), 2012 See also * * * Season 2 (disambiguation) * Season 4 (disambiguati ...
episode "Sacrifice" also covers the subject. * " MoneyBART", the third episode of ''
The Simpsons ''The Simpsons'' is an American animated sitcom created by Matt Groening for the Fox Broadcasting Company. The series is a satirical depiction of American life, epitomized by the Simpson family, which consists of Homer Simpson, Homer, Marge ...
'' 22nd season, in which
Lisa Lisa or LISA may refer to: People People with the mononym * Lisa Lisa (born 1967), American actress and lead singer of the Cult Jam * Lisa (Japanese musician, born 1974), stylized "LISA", Japanese singer and producer * Lisa Komine (born 1978), J ...
utilizes sabermetrics to coach Bart's Little League Baseball team.


See also

*
Analytics (ice hockey) In ice hockey, analytics is the analysis of the characteristics of hockey players and teams through the use of statistics and other tools to gain a greater understanding of the effects of their performance. Three commonly used basic statistics in i ...
, the
ice hockey Ice hockey (or simply hockey) is a team sport played on ice skates, usually on an ice skating rink with lines and markings specific to the sport. It belongs to a family of sports called hockey. In ice hockey, two opposing teams use ice h ...
equivalent *
Advanced statistics in basketball Advanced statistics (also known as analytics or APBRmetrics) in basketball refers to analyzing basketball statistics through objective evidence. APBRmetrics is a cousin to the study of baseball statistics, known as sabermetrics, and similarly tak ...
, the
basketball Basketball is a team sport in which two teams, most commonly of five players each, opposing one another on a rectangular court, compete with the primary objective of shooting a basketball (approximately in diameter) through the defender's h ...
equivalent *
Fielding Bible Award A Fielding Bible Award recognizes the best defensive player for each fielding position in Major League Baseball (MLB) based on statistical analysis. John Dewan and Baseball Info Solutions conduct the annual selection process, which commenced in 2 ...
* Kyle Boddy, founder of Driveline Baseball *
PITCHf/x PITCHf/x is a system created and maintained by Sportvision that tracks the speeds and trajectories of pitched baseballs. This system, which made its debut in the 2006 Major League Baseball (MLB) postseason, is installed in every MLB stadium. The ...
* Pitch quantification * Sports analytics *
Statcast Statcast is a high-speed, high-accuracy, automated tool developed to analyze player movements and athletic abilities in Major League Baseball (MLB). Statcast was introduced to all thirty MLB stadiums in 2015, a year now considered the beginning ...
* ''
The Hardball Times The Hardball Times (abbreviated as THT) is a website which publishes news, original comments and statistical analysis of baseball each week Monday through Friday, in addition to the Hardball Times Annual book which features essays by leading sabe ...
'' * Theorycraft * ''
Total Baseball ''Total Baseball'' (latest edition , first published 1989) is a baseball encyclopedia first compiled by John Thorn and Pete Palmer in 1989. The latest edition, published in 2004, is its eighth.John Thorn John A. Thorn (born April 17, 1947) is a German-born sports historian, author, publisher, and cultural commentator. Since March 1, 2011, he has been the Official Baseball Historian for Major League Baseball. Personal profile Thorn was born in ...
and
Pete Palmer Pete Palmer (born January 30, 1938) is an American sports statistician and encyclopedia editor. He is a major contributor to the applied mathematical field referred to as sabermetrics. Along with the Bill James '' Baseball Abstracts'', Palmer ...
* '' Whatever Happened to the Hall of Fame?'' by Bill James


References

;Notes


External links


The Society for American Baseball Research (SABR)

This Guy's Quest to Track Every Shot in the NBA Changed Basketball Forever
– Wired, Mark McClusky, 28 October 2014 {{Sports rating systems Baseball culture Baseball statistics Baseball strategy Bill James Sports science