HOME



picture info

Solr
Solr (pronounced "solar") is an open-source enterprise-search platform, written in Java. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is widely used for enterprise search and analytics use cases and has an active development community and regular releases. Solr runs as a standalone full-text search server. It uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr's external configuration allows it to be tailored to many types of applications without Java coding, and it has a plugin architecture to support more advanced customization. Apache Solr is developed in an open, collaborativ ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Carrot2
Carrot² is an open source search results clustering engine. It can automatically cluster small collections of documents, e.g. search results or document abstracts, into thematic categories. Carrot² is written in Java and distributed under the BSD license. History The initial version of Carrot² was implemented in 2001 by Dawid Weiss as part of his MSc thesis to validate the applicability of the STC clustering algorithm to clustering search results in Polish. In 2003, a number of other search results clustering algorithms were added, including Lingo, a novel text clustering algorithm designed specifically for clustering of search results. While the source code of Carrot² was available since 2002, it was only in 2006 when version 1.0 was officially released. In the same year, version 2.0 was released with improved user interface and extended tool set. In 2009, version 3.0 brought significant improvements in clustering quality, simplified API and new GUI application for tuning ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Lucene
Apache Lucene is a free and open-source search engine software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License. Lucene is widely used as a standard foundation for production search applications. Lucene has been ported to other programming languages including Object Pascal, Perl, C#, C++, Python, Ruby and PHP. History Doug Cutting originally wrote Lucene in 1999. Lucene was his fifth search engine. He had previously written two while at Xerox PARC, one at Apple, and a fourth at Excite. It was initially available for download from its home at the SourceForge web site. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005. The name Lucene is Doug Cutting's wife's middle name and her maternal grandmother's first name. Lucene formerly included a number of sub-p ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Full Text Search
In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases (such as titles, abstracts, selected sections, or bibliographical references). In a full-text search, a search engine examines all of the words in every stored document as it tries to match search criteria (for example, text specified by a user). Full-text-searching techniques appeared in the 1960s, for example IBM STAIRS from 1969, and became common in online bibliographic databases in the 1990s. Many websites and application programs (such as word processing software) provide full-text-search capabilities. Some web search engines, such as the former AltaVista, employ full-text-search techniques, while others index only a portion of the web pages examined by their indexing systems. Indexing When dealing with a sm ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Apache Solr Logo
The Apache ( ) are several Southern Athabaskan language-speaking peoples of the Southwest, the Southern Plains and Northern Mexico. They are linguistically related to the Navajo. They migrated from the Athabascan homelands in the north into the Southwest between 1000 and 1500 CE. Apache bands include the Chiricahua, Jicarilla, Lipan, Mescalero, Mimbreño, Salinero, Plains, and Western Apache ( Aravaipa, Pinaleño, Coyotero, and Tonto). Today, Apache tribes and reservations are headquartered in Arizona, New Mexico, Texas, and Oklahoma, while in Mexico the Apache are settled in Sonora, Chihuahua, Coahuila and areas of Tamaulipas. Each tribe is politically autonomous. Historically, the Apache homelands have consisted of high mountains, sheltered and watered valleys, deep canyons, deserts, and the southern Great Plains, including areas in what is now Eastern Arizona, Northern Mexico (Sonora and Chihuahua) and New Mexico, West Texas, and Southern Colorado. These areas are c ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Hortonworks
Hortonworks, Inc. was a data software company based in Santa Clara, California that developed and supported open-source software (primarily around Apache Hadoop) designed to manage big data and associated processing. Hortonworks software was used to build enterprise data services and applications such as IoT (connected cars, for example), single view of X (such as customer, risk, patient), and advanced analytics and machine learning (such as next best action and realtime cybersecurity). Hortonworks had three interoperable product lines: * Hortonworks Data Platform (HDP): based on Apache Hadoop, Apache Hive, Apache Spark * Hortonworks DataFlow (HDF): based on Apache NiFi, Apache Storm, Apache Kafka * Hortonworks DataPlane services (DPS): based on Apache Atlas and Cloudbreak and a pluggable architecture into which partners such as IBM can add their services. In January 2019, Hortonworks completed its merger with Cloudera. History Hortonworks was formed in June 2011 as an inde ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Cloudera
Cloudera, Inc. is an American data lake software company. History Cloudera, Inc. was formed on June 27, 2008 in Burlingame, California by Christophe Bisciglia, Amr Awadallah, Jeff Hammerbacher, and chief executive Mike Olson. Prior to Cloudera, Bisciglia, Awadallah, and Hammerbacher were engineers at Google, Yahoo!, and Facebook respectively, and Olson was a database executive at Oracle after his previous company Sleepycat was acquired by Oracle in 2006. The four were joined in 2009 by Doug Cutting, a co-founder of Hadoop. Cloudera originally offered a free product based on Hadoop, earning revenue by selling support and consulting services around it. In March 2009, the company began offering a commercial distribution of Hadoop. In 2009 the company received a $5 million investment led by Accel Partners. This was followed by a $25 million funding round in October 2010 and a $40M funding round in November 2011. In June 2013, Olson transitioned from CEO to Chairman of the Bo ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Hadoop
Apache Hadoop () is a collection of Open-source software, open-source software utilities for reliable, scalable, distributed computing. It provides a software framework for Clustered file system, distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework. Overview The core of Apache Hadoop consists of a storage part, known as Hadoop Distributed File System (HDFS), and a processing part which is a MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers JAR (file format), packaged code into nodes to process the data in parallel. This appro ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Enterprise Content Management
Enterprise content management (ECM) extends the concept of content management by adding a timeline for each content item and, possibly, enforcing processes for its creation, approval, and distribution. Systems using ECM generally provide a secure repository for managed items, analog or digital. They also include one (or more) methods for importing content to manage new items, and several presentation methods to make items available for use. Although ECM content may be protected by digital rights management (DRM), it is not required. ECM is distinguished from general content management by its cognizance of the processes and procedures of the enterprise for which it is created. Definitions * Late 2005: The technology was used to capture, manage, store, preserve, and deliver content and documents related to organizational processes * Early 2006: ECM tools and strategies allowed the management of an organization's unstructured information, wherever that information exists. * Early 2 ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Content Management System
A content management system (CMS) is computer software used to manage the creation and modification of digital content ( content management).''Managing Enterprise Content: A Unified Content Strategy''. Ann Rockley, Pamela Kostur, Steve Manning. New Riders, 2003. It is typically used for enterprise content management (ECM) and web content management (WCM). ECM typically supports multiple users in a collaborative environment, by integrating document management, digital asset management, and record retention. Alternatively, WCM is the collaborative authoring for websites and may include text and embed graphics, photos, video, audio, maps, and program code that display content and interact with the user. ECM typically includes a WCM function. Structure A CMS typically has two major components: a content management application (CMA), as the front-end user interface that allows a user, even with limited expertise, to add, modify, and remove content from a website without the interve ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Kubernetes
Kubernetes (), also known as K8s is an open-source software, open-source OS-level virtualization, container orchestration (computing), orchestration system for automating software deployment, scaling, and management. Originally designed by Google, the project is now maintained by a worldwide community of contributors, and the trademark is held by the Cloud Native Computing Foundation. The name ''Kubernetes'' originates from the Greek language, Greek κυβερνήτης (kubernḗtēs), meaning 'governor', 'helmsman' or 'pilot'. ''Kubernetes'' is often abbreviated as ''K8s'', counting the eight letters between the ''K'' and the ''s'' (a numeronym). Kubernetes assembles one or more computers, either virtual machines or bare machine, bare metal, into a Computer cluster, cluster which can run workloads in containers. It works with various container runtimes, such as containerd and Cloud Native Computing Foundation#CRI-O, CRI-O. Its suitability for running and managing workloads of ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Apache Zeppelin
The Apache ( ) are several Southern Athabaskan language-speaking peoples of the Southwest, the Southern Plains and Northern Mexico. They are linguistically related to the Navajo. They migrated from the Athabascan homelands in the north into the Southwest between 1000 and 1500 CE. Apache bands include the Chiricahua, Jicarilla, Lipan, Mescalero, Mimbreño, Salinero, Plains, and Western Apache ( Aravaipa, Pinaleño, Coyotero, and Tonto). Today, Apache tribes and reservations are headquartered in Arizona, New Mexico, Texas, and Oklahoma, while in Mexico the Apache are settled in Sonora, Chihuahua, Coahuila and areas of Tamaulipas. Each tribe is politically autonomous. Historically, the Apache homelands have consisted of high mountains, sheltered and watered valleys, deep canyons, deserts, and the southern Great Plains, including areas in what is now Eastern Arizona, Northern Mexico (Sonora and Chihuahua) and New Mexico, West Texas, and Southern Colorado. These areas are c ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]