S2CID (identifier)
   HOME

TheInfoList



OR:

Semantic Scholar is an
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
–powered research tool for scientific literature developed at the
Allen Institute for AI The Allen Institute for AI (abbreviated AI2) is a research institute founded by late Microsoft co-founder Paul Allen. The institute seeks to achieve scientific breakthroughs by constructing AI systems with reasoning, learning, and reading capabi ...
and publicly released in November 2015. It uses advances in natural language processing to provide summaries for scholarly papers. The Semantic Scholar team is actively researching the use of artificial-intelligence in natural language processing,
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
, Human-Computer interaction, and information retrieval. Semantic Scholar began as a database surrounding the topics of
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
,
geoscience Earth science or geoscience includes all fields of natural science related to the planet Earth. This is a branch of science dealing with the physical, chemical, and biological complex constitutions and synergistic linkages of Earth's four sphere ...
, and
neuroscience Neuroscience is the scientific study of the nervous system (the brain, spinal cord, and peripheral nervous system), its functions and disorders. It is a multidisciplinary science that combines physiology, anatomy, molecular biology, developme ...
. However, in 2017 the system began including biomedical literature in its corpus. As of September 2022, they now include over 200 million publications from all fields of science.


Technology

Semantic Scholar provides a one-sentence summary of
scientific literature : ''For a broader class of literature, see Academic publishing.'' Scientific literature comprises scholarly publications that report original empirical and theoretical work in the natural and social sciences. Within an academic field, scie ...
. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices. It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature are ever read. Artificial intelligence is used to capture the essence of a paper, generating it through an "abstractive" technique. The project uses a combination of
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
, natural language processing, and
machine vision Machine vision (MV) is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to ...
to add a layer of semantic analysis to the traditional methods of
citation analysis Citation analysis is the examination of the frequency, patterns, and graphs of citations in documents. It uses the directed graph of citations — links from one document to another document — to reveal properties of the documents. A t ...
, and to extract relevant figures,
tables Table may refer to: * Table (furniture), a piece of furniture with a flat surface and one or more legs * Table (landform), a flat area of land * Table (information), a data arrangement with rows and columns * Table (database), how the table d ...
, entities, and venues from papers. In contrast with
Google Scholar Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes ...
and
PubMed PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintain t ...
, Semantic Scholar is designed to highlight the most important and influential elements of a paper. The AI technology is designed to identify hidden connections and links between research topics. Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's
SciGraph SciGraph is a search engine tool developed by Springer Nature. The technology, which is considered a Linked Open Data (LOD) platform, collects information that covers the research landscape, which includes research projects, publications, conferen ...
, and the Semantic Scholar Corpus. Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). The following entry is an example: :: Semantic Scholar is free to use and unlike similar search engines (i.e.
Google Scholar Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes ...
) does not search for material that is behind a paywall. One study compared the search abilities of Semantic Scholar through a systematic approach, and found the search engine to be 98.88% accurate when attempting to uncover the data. The same study examined other Semantic Scholar functions, including tools to survey metadata as well as several citation tools.


Number of users and publications

As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
and
biomedicine Biomedicine (also referred to as Western medicine, mainstream medicine or conventional medicine)
. In March 2018, Doug Raymond, who developed
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
initiatives for the
Amazon Alexa Amazon Alexa, also known simply as Alexa, is a virtual assistant technology largely based on a Polish speech synthesiser named Ivona, bought by Amazon in 2013. It was first used in the Amazon Echo smart speaker and the Echo Dot, Echo Studio ...
platform, was hired to lead the Semantic Scholar project. As of August 2019, the number of included papers metadata (not the actual PDFs) had grown to more than 173 million after the addition of the
Microsoft Academic Graph Microsoft Academic was a free internet-based academic search engines for academic publications and literature, developed by Microsoft Research, shut down in 2022. At the same time, OpenAlex launched and claimed to be a successor to Microsoft Ac ...
records. In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus. At the end of 2020, Semantic Scholar had indexed 190 million papers. In 2020, users of Semantic Scholar reached seven million a month.


See also

* * * * List of academic databases and search engines *


References


External links

* {{Authority control Bibliographic databases in computer science Scholarly search services Applications of artificial intelligence