The Apache OpenNLP library is a machine learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services (Apache OpenNLP Proposal).
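To make two of the listed tasks concrete, the following is a minimal sketch of sentence segmentation and tokenization in Java. It is an illustration only and does not use the OpenNLP API: it assumes the JDK's rule-based `java.text.BreakIterator` as a stand-in, whereas OpenNLP performs these tasks with trained statistical models. The class name `TextTasksSketch` and its helper methods are hypothetical names chosen for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; NOT part of Apache OpenNLP.
class TextTasksSketch {

    // Cuts the text at every boundary the iterator reports and keeps
    // the non-blank pieces, trimmed of surrounding whitespace.
    private static List<String> split(BreakIterator it, String text) {
        List<String> pieces = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (!piece.isEmpty()) {
                pieces.add(piece);
            }
        }
        return pieces;
    }

    // Sentence segmentation: rule-based stand-in for a trained sentence detector.
    static List<String> sentences(String text) {
        return split(BreakIterator.getSentenceInstance(Locale.ENGLISH), text);
    }

    // Tokenization: rule-based stand-in for a trained tokenizer.
    static List<String> tokens(String text) {
        return split(BreakIterator.getWordInstance(Locale.ENGLISH), text);
    }

    public static void main(String[] args) {
        String text = "OpenNLP is a toolkit. It supports tokenization.";
        // Segment into sentences first, then tokenize each sentence.
        for (String sentence : sentences(text)) {
            System.out.println(sentence + " -> " + tokens(sentence));
        }
    }
}
```

The order shown here, sentences first and tokens second, reflects how such tasks are commonly chained: later steps like part-of-speech tagging or named entity extraction typically consume the tokens of one sentence at a time.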

References

External links

Apache OpenNLP Website