Pentaho

Pentaho is business intelligence (BI) software that provides data integration, OLAP services, reporting, information dashboards, data mining, and extract, transform, load (ETL) capabilities. Its headquarters are in Orlando, Florida. Pentaho was acquired by Hitachi Data Systems in 2015 and in 2017 became part of Hitachi Vantara.


Overview

Pentaho is a Java framework for creating business intelligence solutions. Although best known for its Business Analysis Server (formerly the Business Intelligence Server), the Pentaho software is at its core a set of Java classes with specific functionality, on top of which any BI solution can be built. The one exception to this model is the ETL tool Pentaho Data Integration (PDI, formerly known as Kettle). PDI is a suite of tools used to design data flows that can be run either on a server or as standalone processes. PDI encompasses Kitchen, a job and transformation runner, and Spoon, a graphical user interface for designing such jobs and transformations.

Features such as reporting and OLAP are achieved by integrating subprojects into the Pentaho framework, such as the Mondrian OLAP engine and JFreeReport. These projects have since been brought under Pentaho's stewardship. Some of the subprojects also have standalone clients, such as Pentaho Report Designer, a front-end for JFreeReport, and Pentaho Schema Workbench, a GUI for writing the XML schemas that Mondrian uses to serve OLAP cubes.

Pentaho offers enterprise and community editions of this software. The enterprise edition is obtained through an annual subscription and contains extra features and support not found in the community edition. Pentaho's core offering is frequently enhanced by add-on products, usually in the form of plug-ins, from the company and the broader community of users.
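As an illustration of the XML schemas mentioned above, a minimal Mondrian cube definition maps a relational fact table and its dimensions to an OLAP cube. The following sketch uses hypothetical table and column names; a real schema would be generated with Pentaho Schema Workbench against an actual database:

```xml
<Schema name="SalesSchema">
  <Cube name="Sales">
    <!-- Fact table holding one row per sale (hypothetical name) -->
    <Table name="sales_fact"/>
    <!-- A time dimension joined to the fact table via a foreign key -->
    <Dimension name="Time" foreignKey="time_id">
      <Hierarchy hasAll="true" primaryKey="time_id">
        <Table name="time_dim"/>
        <Level name="Year" column="year" type="Numeric" uniqueMembers="true"/>
        <Level name="Month" column="month" uniqueMembers="false"/>
      </Hierarchy>
    </Dimension>
    <!-- Measures aggregate columns of the fact table -->
    <Measure name="Unit Sales" column="units" aggregator="sum" formatString="#,###"/>
    <Measure name="Revenue" column="revenue" aggregator="sum" formatString="$#,##0.00"/>
  </Cube>
</Schema>
```

Mondrian reads a schema like this at runtime and answers MDX queries against the cube by translating them into SQL over the underlying tables.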


Products


Server applications

Pentaho Enterprise Edition (EE) and Pentaho Community Edition (CE).


Desktop/client applications


Community driven, open-source Pentaho server plug-ins

All of these plug-ins function with Pentaho Enterprise Edition (EE) and Pentaho Community Edition (CE).


Licensing

Pentaho follows an open-core business model. It provides two editions of Pentaho Business Analytics: a community edition and an enterprise edition. The enterprise edition must be purchased on a subscription model, which includes support, services, and product enhancements via annual subscription. (Torben Pedersen and Mukesh Mohania. Data Warehousing and Knowledge Discovery. Heidelberg, Germany: Springer Science and Business Media, 2009. pp. 296-298. Retrieved April 6, 2012.) The enterprise edition is available under a commercial license at three levels: Enterprise, Premium, and Standard. The community edition is a free open-source product licensed under the GNU General Public License version 2.0 (GPLv2), the GNU Lesser General Public License version 2.0 (LGPLv2), and the Mozilla Public License 1.1 (MPL 1.1).


Recognition

* InfoWorld Bossie Award 2008, 2009, 2010, 2011, 2012
* Ventana Research Leadership Award 2010 for StoneGate Senior Care
* CRN Emerging Technology Vendor 201
* ROI Awards 2012 - Nucleus Research


See also

* Nutch - an effort to build an open source search engine based on Lucene and Hadoop, also created by Doug Cutting
* Apache Accumulo - secure Bigtable implementation
* HBase - Bigtable-model database
* Hypertable - HBase alternative
* MapReduce - Google's fundamental data filtering algorithm
* Apache Mahout - machine learning algorithms implemented on Hadoop
* Apache Cassandra - a column-oriented database that supports access from Hadoop
* HPCC - LexisNexis Risk Solutions High Performance Computing Cluster
* Sector/Sphere - open-source distributed storage and processing
* Cloud computing
* Big data
* Data-intensive computing

