ChaSen
ChaSen is a morphological parser for the Japanese language. It was developed at the Matsumoto Laboratory of the Nara Institute of Science and Technology. See also: MeCab.


MeCab
MeCab is an open-source text segmentation library for text written in the Japanese language. It was originally developed by the Nara Institute of Science and Technology and is currently maintained by Taku Kudou (工藤拓) as part of his work on the Google Japanese Input project. The name derives from the developer's favorite food, mekabu (和布蕪), a Japanese dish made from wakame leaves. The software was originally based on ChaSen and was developed under the name ChaSenTNG, but it has since been rewritten from scratch and is now developed independently of ChaSen. MeCab's analysis accuracy is comparable to ChaSen's, and its analysis speed is 3–4 times faster on average. MeCab can segment a sentence and label each segment with its part of speech. There are several dictionaries available for MeCab, but IPADIC is the most commonly used one, as with ChaSen. In 2007, Google used MeCab to generate n-gram data for a large corpus of Japanese text, which it published on its Google Japan blog. MeCab is als ...
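To make the idea of dictionary-based segmentation concrete, here is a minimal sketch in Python. It is not MeCab's algorithm or API: it uses a greedy longest-match lookup against a tiny hypothetical dictionary (`DICTIONARY` and `segment` are names invented for this illustration), whereas real analyzers choose among competing segmentations with a trained model.

```python
# Hypothetical mini-dictionary mapping surface forms to part-of-speech labels.
# A real dictionary such as IPADIC is far larger and carries richer features.
DICTIONARY = {
    "もも": "noun",
    "うち": "noun",
    "の": "particle",
    "も": "particle",
}


def segment(text):
    """Greedy longest-match segmentation (illustrative only, not MeCab's method).

    Returns a list of (surface, part_of_speech) pairs. Characters that match
    no dictionary entry are emitted one at a time with the label "unknown".
    """
    tokens = []
    max_len = max(len(word) for word in DICTIONARY)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in DICTIONARY:
                tokens.append((candidate, DICTIONARY[candidate]))
                i += length
                break
        else:
            # No dictionary entry starts here: fall back to a single character.
            tokens.append((text[i], "unknown"))
            i += 1
    return tokens


print(segment("もものうち"))
```

Greedy matching is easy to follow but picks the wrong split whenever a long entry hides a better sequence of short ones, which is one reason practical analyzers score whole candidate segmentations instead.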


Nara Institute Of Science And Technology
The Nara Institute of Science and Technology, abbreviated as NAIST, is a Japanese national university located in Ikoma, Nara, in Kansai Science City. It was founded in 1991 with a focus on research and consists solely of graduate schools in three integrated areas: Biological Sciences, Information Sciences, and Material Sciences. It has 1,043 postgraduate students and 374 administrative staff on a suburban campus of 139,967 m². NAIST is one of the most prestigious research institutions in Japan. In the "Evaluation of Achievements Related to the 2nd Medium-term Goals and Plans" (2010-2015) conducted by the Japanese government for national universities, NAIST was evaluated as exceedingly superior, especially concerning research levels, one of 5 institutions among the 86 national universities. In 2010, NAIST ranked first overall among the 86 Japanese nati ...


Morphological Parsing
Morphological parsing, in natural language processing, is the process of determining the morphemes from which a given word is constructed. It must be able to distinguish between orthographic rules and morphological rules. For example, the word 'foxes' can be decomposed into 'fox' (the stem), and 'es' (a suffix indicating plurality). The generally accepted approach to morphological parsing is through the use of a finite state transducer (FST), which inputs words and outputs their stem and modifiers. The FST is initially created through algorithmic parsing of some word source, such as a dictionary, complete with modifier markups. Another approach is through the use of an indexed lookup method, which uses a constructed radix tree. This is not an often-taken route because it breaks down for morphologically complex languages. With the advancement of neural networks in natural language processing, it became less common to use FST for morphological analysis, especially for language ...
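The 'foxes' example can be sketched in a few lines of Python. This is a plain suffix-stripping lookup, not a finite state transducer, and the lexicon and function name are hypothetical, chosen only for illustration. It does show the distinction drawn above: the plural morpheme is a morphological rule, while the inserted 'e' after a sibilant is an orthographic rule.

```python
# Hypothetical toy lexicon of noun stems (a stand-in for a real dictionary).
LEXICON = {"fox", "cat", "dog", "bus", "church"}

# Stem endings after which English spelling inserts an 'e' before the plural 's'.
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def parse_plural(word):
    """Split a regular English noun into (stem, suffix).

    Returns (word, "") for a bare stem found in the lexicon, and None when
    no analysis is found. Illustrative sketch only.
    """
    if word in LEXICON:
        return (word, "")
    # Orthographic rule: e-insertion after a sibilant, e.g. fox + s -> foxes.
    if word.endswith("es"):
        stem = word[:-2]
        if stem in LEXICON and stem.endswith(SIBILANT_ENDINGS):
            return (stem, "es")
    # Morphological rule: the plain plural suffix, e.g. cat + s -> cats.
    if word.endswith("s"):
        stem = word[:-1]
        if stem in LEXICON and not stem.endswith(SIBILANT_ENDINGS):
            return (stem, "s")
    return None


for w in ["foxes", "cats", "foxs", "bus"]:
    print(w, parse_plural(w))
```

The misspelling 'foxs' is rejected because the stem 'fox' ends in a sibilant and therefore requires the 'es' spelling; a transducer encodes the same constraint as state transitions rather than as explicit conditionals.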


Japanese Language
Japanese is spoken natively by about 128 million people, primarily by Japanese people and primarily in Japan, the only country where it is the national language. Japanese belongs to the Japonic or Japanese-Ryukyuan language family. There have been many attempts to group the Japonic languages with other families such as the Ainu, Austroasiatic, Koreanic, and the now-discredited Altaic, but none of these proposals has gained widespread acceptance. Little is known of the language's prehistory, or when it first appeared in Japan. Chinese documents from the 3rd century AD recorded a few Japanese words, but substantial Old Japanese texts did not appear until the 8th century. From the Heian period (794–1185), there was a massive influx of Sino-Japanese vocabulary into the language, affecting the phonology of Early Middle Japanese. Late Middle Japanese (1185–1600) saw extensive grammatical changes and the first appearance of European loanwords. The basis of the standard dial ...




Morpheme
A morpheme is the smallest meaningful constituent of a linguistic expression. The field of linguistic study dedicated to morphemes is called morphology. In English, morphemes are often but not necessarily words. Morphemes that stand alone are considered roots (such as the morpheme 'cat'); other morphemes, called affixes, are found only in combination with other morphemes. For example, the '-s' in 'cats' indicates the concept of plurality but is always bound to another concept to indicate a specific kind of plurality. This distinction is not universal and does not apply to, for example, Latin, in which many roots cannot stand alone. For instance, the Latin root 'reg-' (‘king’) must always be suffixed with a case marker: 'rex' ('reg-s'), 'reg-is', 'reg-i', etc. For a language like Latin, a root can be defined as the main lexical morpheme of a word. These sample English words have the following morphological analyses: * "Unbreakable" is composed of three ...


Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.
History
Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, ...