Apache Iceberg is a high-performance open-source format for large analytic tables. Iceberg enables the use of SQL tables for big data while making it possible for engines like Spark, Trino, Flink, Presto, Hive, Impala, StarRocks, Doris, and Pig to work safely with the same tables at the same time. Iceberg is released under the Apache License. Iceberg addresses the performance and usability challenges of Apache Hive tables in large and demanding data lake environments. Vendors currently supporting Apache Iceberg tables include Buster, CelerData, Cloudera, Crunchy Data, Dremio, IOMETE, Snowflake, Starburst, Tabular, AWS, and Google Cloud.
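
One common way to let several writers share a table safely is optimistic concurrency: each writer prepares a new table version from the one it read, and the commit succeeds only if no other writer has committed in the meantime. The Python sketch below illustrates that general idea under simplifying assumptions. It is not Iceberg's API or commit protocol; `Catalog`, `commit`, `append_file`, and `run_writers` are hypothetical names introduced for illustration, and an in-memory lock stands in for whatever atomic operation a real catalog provides.

```python
import threading


class Catalog:
    """Hypothetical in-memory catalog tracking one table's current version."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._files = ()  # immutable tuple of data file names

    def current(self):
        """Return the committed (version, files) pair."""
        with self._lock:
            return self._version, self._files

    def commit(self, expected_version, new_files):
        """Atomically install new_files only if the table is still at expected_version."""
        with self._lock:
            if self._version != expected_version:
                return False  # another writer committed first
            self._version += 1
            self._files = tuple(new_files)
            return True


def append_file(catalog, file_name, max_retries=1000):
    """Append one file, re-reading the table and retrying when a commit conflicts."""
    for _ in range(max_retries):
        version, files = catalog.current()
        if catalog.commit(version, files + (file_name,)):
            return
    raise RuntimeError("commit failed after retries")


def run_writers(catalog, writers, files_per_writer):
    """Start several concurrent writers that each append their own files."""
    def work(writer_id):
        for i in range(files_per_writer):
            append_file(catalog, f"writer{writer_id}-file{i}.parquet")

    threads = [threading.Thread(target=work, args=(w,)) for w in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

In this sketch a writer never overwrites another writer's commit: a conflicting commit is rejected, and the writer rebuilds its change on top of the latest version before trying again, so readers only ever see complete versions.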


History

Iceberg was started at Netflix by Ryan Blue and Dan Weeks. Hive was used by many services and engines in the Netflix infrastructure, but it could not guarantee correctness and did not provide stable atomic transactions. Many at Netflix avoided these services, and avoided changing the data, to avert unintended consequences of the Hive format. Ryan Blue set out to address three problems with Hive tables by creating Iceberg:
# Ensure the correctness of the data and support ACID transactions.
# Improve performance by enabling finer-grained operations at file granularity for optimal writes.
# Simplify and abstract the general operation and maintenance of tables.
Iceberg development started in 2017. The project was open-sourced and donated to the Apache Software Foundation in November 2018. In May 2020, the Iceberg project graduated to become a top-level Apache project.

Iceberg is used by multiple companies including Airbnb, Apple, Expedia, LinkedIn, Adobe, Lyft, and many more.


See also

* List of Apache Software Foundation projects

