The Bulgarian WordNet (BulNet) is an electronic multilingual dictionary of synonym sets along with their explanatory definitions and sets of semantic relations with other words in the language.
It follows the Princeton WordNet (PWN) framework, which implements a traditional semantic network whose structure consists of nodes and relations between the nodes.
General information
BulNet was started within the EU-funded project BalkaNet, a Multilingual Semantic Network of the Balkan Languages. After BalkaNet's completion, development of BulNet continued with Bulgarian government support.
Contents of BulNet
Categories
As of 2015, BulNet contained more than 80,000 synonym sets distributed into nine parts of speech: nouns, verbs, adjectives, adverbs, pronouns, prepositions, conjunctions, particles and interjections.
The words included in BulNet have been selected according to different criteria. The main criteria are the frequency analysis of word occurrences in large text corpora and the inclusion of synsets. The synsets include those already featured in the wordnets of other languages and synsets that correspond to high-frequency word senses found in parallel corpora.
Synsets
Each synset encodes the relation of equivalence between a number of lexical items, the LITERALS (at least one should be explicitly represented in the SYNSET). Each literal has a unique meaning (specified by the value of SENSE), and all literals in a synset pertain to one and the same part of speech (specified as the value of POS) and represent one and the same lexical meaning (specified as the value of DEF). Each synset is linked to its counterpart in PWN 3.0 by means of a unique identification number, the ID. The common synsets in the Balkan languages are marked as common concepts subsets (BCS).
In a monolingual database, a synset should be linked to at least one other synset through an intralingual relation. Non-obligatory information may also be encoded, such as examples of usage, stylistic peculiarities, morphological or syntactic properties, and author and last edit details.
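The synset fields described above can be pictured as a small record type. The following Python sketch is only an illustration under stated assumptions: the class and attribute names, the integer representation of BCS, and the example identifier and definition are hypothetical and do not reproduce BulNet's actual storage format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Literal:
    word: str   # the lexical item itself
    sense: int  # value of SENSE: identifies this meaning of the word


@dataclass
class Synset:
    id: str                    # ID shared with the PWN 3.0 counterpart
    pos: str                   # POS: one part of speech for all literals
    definition: str            # DEF: the single lexical meaning
    literals: List[Literal]    # LITERALS: at least one is required
    bcs: Optional[int] = None  # BCS marker (representation is assumed)
    # (relation name, target synset ID) pairs; names are hypothetical
    relations: List[Tuple[str, str]] = field(default_factory=list)
    usage: List[str] = field(default_factory=list)  # optional examples

    def __post_init__(self) -> None:
        # A synset must explicitly contain at least one literal.
        if not self.literals:
            raise ValueError("a synset needs at least one literal")

    def is_linked(self) -> bool:
        """True if at least one relation points to another synset."""
        return any(target != self.id for _, target in self.relations)


# Made-up example record; the ID and definition are placeholders.
dog = Synset(
    id="EXAMPLE-0001-n",
    pos="n",
    definition="placeholder definition",
    literals=[Literal("куче", 1)],
)
dog.relations.append(("hypernym", "EXAMPLE-0002-n"))
```

The check in `__post_init__` mirrors the requirement that at least one literal be present, and `is_linked` mirrors the expectation that a synset in a monolingual database has at least one intralingual relation.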
Semantic relations
The large number of relations encoded in BulNet illustrates the semantic and derivational richness of the language and offers diverse opportunities for applications of the multilingual database. BulNet offers linguistic solutions at the semantic level, such as options for synonym selection, queries for the semantic relations of a word in the language's lexical system (antonymy, holonymy, etc.), explanatory definition queries and translation equivalents for a lexical item.
[Koeva, S. Derivational and morphosemantic relations in Bulgarian Wordnet. In: Intelligent Information Systems, XVI, Warsaw, Academic Publishing House, 2008, pp. 359-389.]
[Tsvetana Dimitrova, Ekaterina Tarpomanova and Borislav Rizov. Coping with Derivation in the Bulgarian Wordnet. In: Heili Orav, Christiane Fellbaum and Piek Vossen (Eds.) Proceedings of the Seventh Global Wordnet Conference, Tartu, Estonia, 2014, pp. 109-117.]
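Queries for the semantic relations of a word, such as antonymy or holonymy, amount to lookups over labelled links between synsets. The Python sketch below shows one possible way to index such links. The relation names, the synset identifiers and the `RelationIndex` class are illustrative assumptions, not BulNet's interface or data.

```python
from collections import defaultdict

# (source synset, relation, target synset); all entries are made-up examples.
# Here ("finger.n.1", "holonym", "hand.n.1") reads: the holonym of finger is hand.
TRIPLES = [
    ("long.a.1", "antonym", "short.a.1"),
    ("short.a.1", "antonym", "long.a.1"),
    ("finger.n.1", "holonym", "hand.n.1"),
    ("hand.n.1", "holonym", "arm.n.1"),
]


class RelationIndex:
    """Index of labelled links between synsets (hypothetical helper)."""

    def __init__(self, triples):
        self._out = defaultdict(list)
        for source, relation, target in triples:
            self._out[(source, relation)].append(target)

    def related(self, synset_id, relation):
        """Synsets directly linked to synset_id by the given relation."""
        return list(self._out.get((synset_id, relation), []))

    def closure(self, synset_id, relation):
        """All synsets reachable by following the relation repeatedly."""
        seen, stack = [], [synset_id]
        while stack:
            current = stack.pop()
            for target in self.related(current, relation):
                # Skip already visited nodes so cycles terminate.
                if target not in seen and target != synset_id:
                    seen.append(target)
                    stack.append(target)
        return seen


index = RelationIndex(TRIPLES)
print(index.related("long.a.1", "antonym"))    # direct antonyms
print(index.closure("finger.n.1", "holonym"))  # wholes, followed upward
```

Storing symmetric relations such as antonymy in both directions keeps the lookup uniform, at the cost of duplicating entries.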
Hydra
Hydra is an OS-independent system designed for wordnet development, validation and exploration. The program enables users to browse and edit any number of monolingual wordnets at a time. The individual wordnets are synchronised, so that equivalent synonym sets, or synsets, may be viewed and explored in parallel.
[Borislav Rizov. Hydra: A Software System for Wordnet. In: Heili Orav, Christiane Fellbaum and Piek Vossen (Eds.) Proceedings of the Seventh Global Wordnet Conference, Tartu, Estonia, 2014, pp. 142-147]
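Viewing equivalent synsets in parallel presupposes that the wordnets share synset identifiers. The Python sketch below shows one way such an alignment could be computed. It is not Hydra's implementation: the `align` function, the language codes, the identifiers and the sample entries are assumptions made for illustration.

```python
def align(wordnets):
    """Pair up synsets that carry the same ID in every wordnet.

    wordnets maps a language code to a dict of synset ID -> list of literals.
    Returns a dict of synset ID -> {language: literals} for the shared IDs.
    """
    if not wordnets:
        return {}
    languages = sorted(wordnets)
    # Keep only the IDs that occur in all of the wordnets.
    shared = set.intersection(*(set(wn) for wn in wordnets.values()))
    return {
        sid: {lang: wordnets[lang][sid] for lang in languages}
        for sid in sorted(shared)
    }


# Made-up sample data; IDs and contents are placeholders.
sample = {
    "bg": {"EXAMPLE-0001-n": ["куче"], "EXAMPLE-0005-n": ["котка"]},
    "en": {"EXAMPLE-0001-n": ["dog"]},
}

for synset_id, row in align(sample).items():
    print(synset_id, row)
```

Only identifiers present in every wordnet are returned, so a synset that has no counterpart in one of the languages is left out of the parallel view in this sketch.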
Sources
BulNet
External links
* BulNet search engine
* Hydra
* BulNet in META-SHARE
* BulSemCor - Bulgarian sense-annotated corpus
* BulNC: Bulgarian National Corpus