NetOwl is a suite of multilingual text and identity analytics products that analyze big data in the form of text data (reports, web, social media, etc.) as well as structured entity data about people, organizations, places, and things. NetOwl utilizes artificial intelligence (AI)-based approaches, including natural language processing (NLP), machine learning (ML), and computational linguistics, to extract entities, relationships, and events; to perform sentiment analysis; to assign latitude/longitude to geographical references in text; to translate names written in foreign languages; and to perform name matching and identity resolution. ["SRA International." Washington Post. Retrieved 2013-07-02.] [Zelenko, Dmitry, and Chinatsu Aone. "Discriminative Methods for Transliteration." In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (2006). Retrieved 2013-05-20.] [Maybury, Mark (2012). Multimedia Information Extraction. Hoboken, New Jersey: John Wiley & Sons, Inc., p. 18. Retrieved 2013-07-02.]

NetOwl's uses include semantic search and discovery, geospatial analysis [Smith, Susan. "Notes from the GEOINT 2007 Symposium." GISCafe (2007-10-29). Retrieved 2013-07-02.], intelligence analysis, content enrichment [Guess, Angela (2012-01-19). "LexisNexis Releases New Version of Lexis Advance". semanticweb.com. Retrieved 2013-07-28.], compliance monitoring [Aone, Chinatsu, et al. "Assentor®: an NLP-based Solution to E-mail Monitoring." In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence (2000), pp. 945-540. Retrieved 2013-05-20.], cyber threat monitoring, risk management, and bioinformatics.

History

The first NetOwl product was NetOwl Extractor, which was initially released in 1996. Since then, Extractor has added many new capabilities, including relationship and event extraction, categorization, name translation, geotagging, and sentiment analysis, as well as entity extraction in other languages. Other products were added later to the NetOwl suite, namely TextMiner, NameMatcher, and EntityMatcher. NetOwl has participated in several third-party-sponsored text and entity analytics software benchmarking events. NetOwl Extractor was the top-scoring named entity extraction system at the DARPA-sponsored Message Understanding Conference MUC-6 and the top-scoring link and event extraction system in MUC-7. It was also the top-scoring system at several of the NIST-sponsored Automatic Content Extraction (ACE) evaluation tasks. [The ACE 2005 (ACE'05) Evaluation Plan. Retrieved 2013-05-20.] NetOwl NameMatcher was the top-scoring system at the MITRE Challenge for Multicultural Person Name Matching.

Products

The NetOwl suite includes, among others, the following text and entity analytics products:

Text Analytics

NetOwl Extractor performs entity extraction from unstructured texts using natural language processing (NLP), machine learning (ML), and computational linguistics. Extractor also performs semantic relationship and event extraction as well as geotagging of text. It is used for a variety of data sources including both traditional sources (e.g., news, reports, web pages, email) and social media (e.g., Twitter, Facebook, chats, blogs). It runs on a variety of big data analytics platforms, including Apache Hadoop and LexisNexis's High-Performance Computing Cluster (HPCC) technology. It has been integrated with a number of third-party analytical tools such as Esri ArcGIS and Google Earth/Maps.
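
As a rough illustration of what geotagging of text involves, the sketch below looks up place names in a tiny gazetteer and attaches coordinates to each mention. It is not NetOwl's method or API: the `GAZETTEER` table, the `geotag` function, and the sample sentence are all hypothetical, and a plain dictionary lookup stands in for the NLP and ML techniques described above.

```python
import re

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
GAZETTEER = {
    "Paris": (48.8566, 2.3522),
    "New York": (40.7128, -74.0060),
    "Tokyo": (35.6762, 139.6503),
}

def geotag(text, gazetteer=GAZETTEER):
    """Return (name, start offset, latitude, longitude) for each gazetteer match."""
    # List longer names first so multi-word names win over shorter overlaps.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    results = []
    for match in pattern.finditer(text):
        lat, lon = gazetteer[match.group(1)]
        results.append((match.group(1), match.start(), lat, lon))
    return results

SAMPLE = "The delegation flew from New York to Paris."
tags = geotag(SAMPLE)
for tag in tags:
    print(tag)
```

A lookup like this cannot tell Paris, France from Paris, Texas, or a place from a person with the same name, which is the kind of ambiguity that context-aware extraction is meant to resolve.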

Identity Analytics

NetOwl NameMatcher and EntityMatcher perform name matching and identity resolution for large multicultural and multilingual entity databases using machine learning (ML) and computational linguistics approaches. They are used for applications such as anti-money laundering (AML), watch lists, regulatory compliance, and fraud detection.
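
To make the idea of name matching against a watch list concrete, here is a minimal sketch that uses plain string similarity from the Python standard library. It is a swapped-in technique, not the ML-based approach the products use: the `normalize`, `name_similarity`, and `best_match` helpers, the 0.8 threshold, and the `WATCH_LIST` entries are all assumptions made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())

def name_similarity(a, b):
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def best_match(query, candidates, threshold=0.8):
    """Return the best-scoring candidate at or above threshold, else None."""
    # Assumes candidates is non-empty; max() raises ValueError otherwise.
    scored = [(name_similarity(query, c), c) for c in candidates]
    score, name = max(scored)
    return name if score >= threshold else None

# Hypothetical watch-list entries, for illustration only.
WATCH_LIST = ["José García", "Mohammed Al-Rashid", "Anna Schmidt"]

print(best_match("Jose Garcia", WATCH_LIST))
print(best_match("Li Wei", WATCH_LIST))
```

Character-level similarity handles accents and small spelling differences, but it does not cover transliteration variants, nicknames, or culture-specific name order, which is where dedicated multicultural name matching differs from a generic string comparison.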

External links

NetOwl website