HOME

TheInfoList



OR:

Apache Apex is a
YARN Yarn is a long continuous length of interlocked fibres, used in sewing, crocheting, knitting, weaving, embroidery, ropemaking, and the production of textiles. Thread is a type of yarn intended for sewing by hand or machine. Modern manufac ...
-native platform that unifies stream and
batch processing Computerized batch processing is a method of running software programs called jobs in batches automatically. While users are required to submit the jobs, no other interaction by the user is required to process the batch. Batches may automatically ...
. It processes big data-in-motion in a way that is scalable, performant,
fault-tolerant Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of one or more faults within some of its components. If its operating quality decreases at all, the decrease is proportional to the ...
, stateful, secure, distributed, and easily operable. Apache Apex was named a top-level project by The Apache Software Foundation on April 25, 2016. As of September 2019, it is no longer actively developed.


Overview

Apache Apex is developed under the Apache License 2.0. The project was driven by the San Jose, California-based start-up company DataTorrent. There are two parts of Apache Apex: Apex Core and Apex Malhar. Apex Core is the platform or framework for building distributed applications on
Hadoop Apache Hadoop () is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage ...
. The core Apex platform is supplemented by Malhar, a library of connector and logic functions, enabling rapid application development. These input and output operators provide templates to sources and sinks such as
Alluxio Alluxio is an open-source virtual distributed file system (VDFS). Initially as research project "Tachyon", Alluxio was created at the University of California, Berkeley's AMPLab as Haoyuan Li's Ph.D. Thesis, advised by Professor Scott Shenker ...
, S3,
HDFS Apache Hadoop () is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage an ...
, NFS, FTP,
Kafka Franz Kafka (3 July 1883 – 3 June 1924) was a German-speaking Bohemian novelist and short-story writer, widely regarded as one of the major figures of 20th-century literature. His work fuses elements of realism and the fantastic. It typ ...
, ActiveMQ,
RabbitMQ RabbitMQ is an open-source message-broker software (sometimes called message-oriented middleware) that originally implemented the Advanced Message Queuing Protocol (AMQP) and has since been extended with a plug-in architecture to support Stre ...
, JMS,
Cassandra Cassandra or Kassandra (; Ancient Greek: Κασσάνδρα, , also , and sometimes referred to as Alexandra) in Greek mythology was a Trojan priestess dedicated to the god Apollo and fated by him to utter true prophecies but never to be believe ...
,
MongoDB MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Ser ...
,
Redis Redis (; Remote Dictionary Server) is an in-memory data structure store, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Redis supports different kinds of abstract data structures, suc ...
,
HBase HBase is an open-source non-relational distributed database modeled after Google's Bigtable and written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed Fil ...
,
CouchDB Apache CouchDB is an open-source document-oriented NoSQL database, implemented in Erlang. CouchDB uses multiple formats and protocols to store, transfer, and process its data. It uses JSON to store data, JavaScript as its query language using ...
, generic
JDBC Java Database Connectivity (JDBC) is an application programming interface (API) for the programming language Java, which defines how a client may access a database. It is a Java-based data access technology used for Java database connectivity. ...
, and other database connectors.


History

DataTorrent has developed the platform since 2012 and then decided to open source the core that became Apache Apex. It entered incubation in August 2015 and became Apache Software Foundation top level project within 8 months. DataTorrent itself shut down in May 2018. As of September 2019, Apache Apex is no longer being developed.


Apex Big Data World

Apex Big Data World is a conference about Apache Apex. The first conference of Apex Big Data World took place in 2017. They were held in Pune, India and Mountain View, California, USA.


References


External links

* {{DEFAULTSORT:Apex Apache Software Foundation projects Free software programmed in Java (programming language) Apache Software Foundation Software using the Apache license Free system software Distributed stream processing