Diffeo (company)

Diffeo, Inc. is a software company that developed a collaborative intelligence text mining product for defense, intelligence and financial services customers. The Diffeo product is a recommender engine that analyzes text in a user's working documents, such as draft emails and web pages, identifying named entities and proposing related entities. Diffeo was founded in 2012 and was acquired by Salesforce in 2019. The company grew out of NIST's Text Retrieval Conference, where the founding team organized the Knowledge Base Acceleration (KBA) evaluation to measure the effectiveness of recommender engines.


History


Founding

The company was founded by three Hertz Fellows: Dan Roberts, Max Kleiman-Weiner, and John Frank, who was also a co-founder of MetaCarta. The name Diffeo is a shortening of diffeomorphism, a concept two of the co-founders were studying in a class on black holes taught by Andrew Strominger. Diffeo was one of the first residents in hack/reduce.


Funding

In 2016, the company raised a seed round of approximately two million dollars from investors including Basis Technology and Carahsoft. Also in 2016, Diffeo acquired Meta, a search engine company founded by Jason Briggs, Emily Pavlini, and Aaron Taylor through a business plan competition at Williams College.


Research

Diffeo's research focused on recommender engines and evaluation protocols for measuring the benefits of recommender engines for end users. As part of running the Knowledge Base Acceleration (KBA) track in NIST's Text Retrieval Conference from 2012 to 2014, the co-founders of Diffeo released a public dataset of timestamped news and blogs spanning approximately 12,000 hours. The KBA track aimed to measure approaches to accelerating the assimilation of knowledge into knowledge bases like Wikipedia. The company's researchers published papers and open source code on machine learning techniques including Jacobian regularization, singular spectrum analysis, and hierarchical agglomerative clustering for entity disambiguation.
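As a rough illustration of how hierarchical agglomerative clustering can be applied to entity disambiguation, the following minimal sketch groups name mentions that likely refer to the same entity. It is not Diffeo's implementation: the token-overlap distance, the single-linkage rule, the 0.7 threshold, and the function names are all assumptions chosen to keep the example short.

```python
from itertools import combinations


def token_jaccard_distance(a: str, b: str) -> float:
    """Distance in [0, 1] based on overlap of lowercase word tokens (illustrative choice)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def agglomerate(mentions, threshold=0.7):
    """Single-linkage agglomerative clustering, stopped at a distance threshold.

    Starts with one cluster per mention and repeatedly merges the two closest
    clusters until no pair of clusters is closer than `threshold`.
    """
    clusters = [[m] for m in mentions]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(token_jaccard_distance(x, y) for x in c1 for y in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters in the current partition.
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best >= threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


mentions = [
    "John Frank",
    "John R. Frank",
    "Andrew Strominger",
    "A. Strominger",
    "Williams College",
]
for cluster in agglomerate(mentions):
    print(cluster)
```

A production system would typically use richer features than shared name tokens, such as surrounding context or co-occurring entities, and a more efficient merge procedure than this quadratic search.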


Post-acquisition

In 2021, Salesforce announced an AI-powered assistant that helps B2B salespeople with their deals. Briggs, who was previously CEO at Diffeo, is the Senior Director of Product Management and helped create the assistant. The technology comes from Salesforce's acquisition of Diffeo, which also brought Briggs to the company.


Product & technology

The Diffeo product, Diffeo Enterprise HierCoref (DEHC), is a recommender engine that allows users to "invite" an agent into their work documents in order to identify named entities and recommend related entities that it identifies by crawling the Web and the user's data repositories. For example, the product has plugins that enable it to analyze a user's emails and web pages open in their web browser. The company's user meetings, called ''The AI<>Tradecraft Forum,'' brought together speakers from the information extraction industry and the US Intelligence Community, including NGA, United States Army, AFOSI, and NSA.


Awards

Diffeo won the 2019 MassChallenge FinTech grand prize, was selected for the 2018 FinTech Innovation Lab, and was one of 13 companies in the 2017 Salesforce AI Incubator. Diffeo won the Hertz Foundation's 2015 Newman Entrepreneurial Initiative. The company was also a performer in DARPA's Memex program and won the grand prize in the NGA Disparate Data Challenge.


See also

* Collaborative intelligence
* Text Retrieval Conference
* Recommender engine
* Named-entity recognition
* Salesforce


External links

* Diffeo on Github.com
* Hierarchical agglomerative clustering library written in Rust: https://github.com/diffeo/kodama
* https://trec-kba.org/
* https://trec-dd.org/
* TREC KBA Streamcorpus at http://s3.amazonaws.com/aws-publicdatasets/trec/kba/index.html
* TREC KBA corpus information at NIST https://trec.nist.gov/data/kba.html

