The Computer Science Ontology (CSO) is an automatically generated
taxonomy of research topics in the field of
Computer Science. It was produced by the
Open University in collaboration with
Springer Nature
by running an information extraction system over a large corpus of scientific articles. Several branches were manually improved by domain experts. The current version (CSO 3.2) includes about 14K research topics and 160K semantic relationships.
[Salatino, A.A., Thanapalasingam, T., Mannocci, A., Birukou, A., Osborne, F. and Motta, E. (2019) The Computer Science Ontology: A Comprehensive Automatically-Generated Taxonomy of Research Areas, Data Intelligence]
CSO is available in the Web Ontology Language (OWL), Turtle, and N-Triples.
It is aligned with several other knowledge graphs, including DBpedia, Wikidata, YAGO, Freebase, and Cyc. New versions of CSO are regularly released on the CSO Portal.
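As an illustration of working with the N-Triples serialization, the following minimal sketch reads triples line by line and collects a topic hierarchy. The `example.org` IRIs, the `superTopicOf` predicate name, and the sample data are placeholders assumed for this example; the identifiers used in an actual CSO release should be taken from the release itself.

```python
import re
from collections import defaultdict

# Placeholder predicate IRI (an assumption, not taken from a CSO release).
SUPER_TOPIC_OF = "http://example.org/cso#superTopicOf"

# Matches only triples whose subject, predicate, and object are all IRIs.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")

def load_children(lines):
    """Map each topic IRI to the set of its direct sub-topic IRIs."""
    children = defaultdict(set)
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == SUPER_TOPIC_OF:
            children[match.group(1)].add(match.group(3))
    return children

# Hand-written sample in N-Triples syntax, for illustration only.
SAMPLE = """
<http://example.org/topics/artificial_intelligence> <http://example.org/cso#superTopicOf> <http://example.org/topics/machine_learning> .
<http://example.org/topics/machine_learning> <http://example.org/cso#superTopicOf> <http://example.org/topics/neural_networks> .
""".strip().splitlines()

children = load_children(SAMPLE)
```

A full RDF library would be the more robust choice for real data, since this regular expression deliberately ignores literals and blank nodes.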
CSO is mostly used to characterise scientific papers and other documents according to their research areas, in order to enable different kinds of analytics. The CSO Classifier is an open-source Python tool for automatically annotating documents with CSO.
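The idea of annotating a document with topics can be sketched as plain label matching. This is not the CSO Classifier's method or API; the `annotate` function and the topic labels below are hypothetical and chosen only to show the general principle.

```python
import re

# Hypothetical topic labels; a real run would take them from a CSO release.
TOPIC_LABELS = ["machine learning", "neural networks", "semantic web", "ontology"]

def annotate(text, labels=TOPIC_LABELS):
    """Return the labels that occur in the text as whole words or phrases."""
    # Lowercase the text and collapse runs of whitespace before matching.
    normalized = re.sub(r"\s+", " ", text.lower())
    found = []
    for label in labels:
        if re.search(r"\b" + re.escape(label) + r"\b", normalized):
            found.append(label)
    return found

abstract = "We apply neural networks to Semantic Web data and compare machine   learning baselines."
topics = annotate(abstract)
```

Exact matching of this kind misses synonyms and paraphrases, which is one reason a dedicated tool is preferable for real analyses.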
Applications
* Recommender systems.
* Computing the semantic similarity of documents.
* Extracting metadata from video lecture subtitles.
* Performing bibliometric analysis.[Zhang, X., Chandrasegaran, S. and Ma, K.L., 2020. ConceptScope: Organizing and Visualizing Knowledge in Documents based on Domain Ontology. arXiv preprint arXiv:2003.05108]
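One possible way to compute the semantic similarity of two documents from their topic annotations is to expand each topic set with its broader topics and then measure the overlap. The sketch below uses Jaccard overlap, which is an assumption made for illustration and not a method attributed to CSO or its authors; the `PARENT` table is hypothetical and gives each topic a single parent for simplicity.

```python
# Hypothetical child -> parent links; a real run would derive them from CSO.
PARENT = {
    "neural networks": "machine learning",
    "support vector machines": "machine learning",
    "machine learning": "artificial intelligence",
}

def expand(topics):
    """Add every ancestor of each topic by following PARENT links upward."""
    expanded = set()
    for topic in topics:
        # Stop at the root or at a topic whose ancestors were already added.
        while topic is not None and topic not in expanded:
            expanded.add(topic)
            topic = PARENT.get(topic)
    return expanded

def similarity(topics_a, topics_b):
    """Jaccard overlap of the two ancestor-expanded topic sets."""
    a, b = expand(topics_a), expand(topics_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

score = similarity({"neural networks"}, {"support vector machines"})
```

Here the two documents share no topic directly, yet `score` is 0.5 because both topics fall under the same broader areas; without the expansion step the overlap would be zero.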
See also
* Ontology (information science)
* Semantic Web
* Knowledge graph
* DBpedia
* YAGO
* Freebase
* Cyc
* ACM Computing Classification System
* Mathematics Subject Classification (MSC)
* Physics and Astronomy Classification Scheme (PACS)
* PhySH (Physics Subject Headings)
References
External links
* Official website: https://cso.kmi.open.ac.uk/