Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges. Kaggle launched in 2010 offering machine learning competitions and now also offers a public data platform, a cloud-based workbench for data science, and artificial intelligence education. Its key personnel were Anthony Goldbloom and Jeremy Howard. Nicholas Gruen was the founding chair, succeeded by Max Levchin. Equity was raised in 2011, valuing the company at $25.2 million. On 8 March 2017, Google announced that it was acquiring Kaggle.


Kaggle community

In June 2017, Kaggle claimed to have surpassed 1 million registered users, and as of 2021 it had over 8 million. The users come from 194 countries. By March 2017, the Two Sigma Investments fund was running a competition on Kaggle to code a trading algorithm.


Overview

# The competition host prepares the data and a description of the problem; the host may choose whether the competition offers a monetary prize or is unpaid.
# Participants experiment with different techniques and compete against each other to produce the best models. Work is shared publicly through Kaggle Kernels to achieve a better benchmark and to inspire new ideas. Submissions can be made through Kaggle Kernels, through manual upload or using the Kaggle API. For most competitions, submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard.
# After the deadline passes, the competition host pays the prize money in exchange for "a worldwide, perpetual, irrevocable and royalty-free license ... to use the winning Entry", i.e. the algorithm, software and related intellectual property developed, which is "non-exclusive unless otherwise specified".

Alongside its public competitions, Kaggle also offers private competitions limited to Kaggle's top participants. Kaggle offers a free tool for data science teachers to run academic machine-learning competitions. Kaggle also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies like Facebook, Winton Capital, and Walmart.
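
The scoring step described above, in which a submission is compared against a hidden solution file and the result is placed on a leaderboard, can be illustrated with a minimal Python sketch. This is not Kaggle's actual implementation: the `id,label` file layout, the use of plain accuracy as the metric, the team names, and the helper functions `load_labels`, `accuracy` and `leaderboard` are all assumptions made for illustration.

```python
import csv
import io


def load_labels(csv_text):
    """Parse 'id,label' CSV text into a dict mapping id -> label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: row["label"] for row in reader}


def accuracy(submission, solution):
    """Fraction of solution ids whose label the submission got right.

    Ids missing from the submission count as wrong.
    """
    if not solution:
        raise ValueError("solution file is empty")
    correct = sum(
        1 for key, truth in solution.items() if submission.get(key) == truth
    )
    return correct / len(solution)


def leaderboard(scores):
    """Return (team, score) pairs sorted from best to worst score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hidden solution file, known only to the competition host.
SOLUTION_CSV = "id,label\n1,cat\n2,dog\n3,cat\n4,dog\n"

# Hypothetical submissions from two teams.
SUBMISSIONS = {
    "team_a": "id,label\n1,cat\n2,dog\n3,dog\n4,dog\n",
    "team_b": "id,label\n1,cat\n2,cat\n3,dog\n4,dog\n",
}

solution = load_labels(SOLUTION_CSV)
scores = {
    team: accuracy(load_labels(text), solution)
    for team, text in SUBMISSIONS.items()
}
board = leaderboard(scores)

for rank, (team, score) in enumerate(board, start=1):
    print(f"{rank}. {team}: {score:.2f}")
```

In this toy example team_a matches three of the four hidden labels and team_b matches two, so team_a ranks first. Real competitions each define their own evaluation metric, which may differ from plain accuracy.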


Competitions

Hundreds of machine-learning competitions have been run on Kaggle since the company was founded. Competitions have ranged from improving gesture recognition for Microsoft Kinect to making a football AI for Manchester City to improving the search for the Higgs boson at CERN. Competitions have resulted in many successful projects, including furthering the state of the art in HIV research, chess ratings and traffic forecasting. Geoffrey Hinton and George Dahl used deep neural networks to win a competition hosted by Merck, and Vlad Mnih (one of Hinton's students) used deep neural networks to win a competition hosted by Adzuna. This resulted in the technique being taken up by others in the Kaggle community. Tianqi Chen from the University of Washington also used Kaggle to show the power of XGBoost, which has since taken over from Random Forest as one of the main methods used to win Kaggle competitions. Several academic papers have been published on the basis of findings made in Kaggle competitions. A key to this is the effect of the live leaderboard, which encourages participants to continue innovating beyond existing best practices. The winning methods are frequently written up on the Kaggle Winner's Blog.


Financials

In March 2017, Fei-Fei Li, Chief Scientist at Google, announced during her keynote at Google Next that Google was acquiring Kaggle.


See also

* Data science competition platform
* Anthony Goldbloom


Further reading


* "Competition shines light on dark matter", Office of Science and Technology Policy, Whitehouse website, June 2011
* "May the best algorithm win...", ''The Wall Street Journal'', March 2011
* "Verification of systems biology research in the age of collaborative competition", ''Nature Biotechnology'', September 2011, http://www.nature.com/nbt/journal/v29/n9/full/nbt.1968.html