MALLET is a Java "Machine Learning for Language Toolkit".
Description
MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, cluster analysis, information extraction, topic modeling and other machine learning applications to text.
History
MALLET was developed primarily by Andrew McCallum, of the University of Massachusetts Amherst, with assistance from graduate students and faculty from both UMass Amherst and the University of Pennsylvania.
External links
* Official website of the project at the University of Massachusetts Amherst
* The Topic Modeling Tool is an independently developed GUI that outputs MALLET results in CSV and HTML files