Alluxio
Alluxio is an open-source virtual distributed file system (VDFS). Initially the research project "Tachyon", Alluxio was created at the University of California, Berkeley's AMPLab as Haoyuan Li's Ph.D. thesis, advised by Professor Scott Shenker and Professor Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License. Data-driven applications, such as data analytics, machine learning, and AI, use APIs provided by Alluxio (such as the Hadoop HDFS API, S3 API, and FUSE API) to interact with data from various storage systems at high speed. Popular frameworks running on top of Alluxio include Apache Spark, Presto, TensorFlow, Trino, Apache Hive, and P ...
Haoyuan Li
Haoyuan (H.Y.) Li is a computer scientist and entrepreneur specializing in distributed systems, big data, and cloud computing. He is best known for proposing the Virtual Distributed File System (VDFS) and for creating Alluxio, an open-source data orchestration system. He is the Founder, Chairman, and CEO of Alluxio, Inc., a company commercializing the Alluxio data orchestration technology. He is also an adjunct professor at Peking University and a frequent conference speaker on AI, big data, cloud computing, and open source. Biography: Li was born and raised in China. He attended Peking University, where he received a BS in Computer Science. While at university, he represented Peking University in programming contests, placing 11th worldwide (bronze medal) in ACM ICPC 2005 and 13th worldwide in 2006. He then studied at Cornell University, where he received an MS in Computer Science. He received his Computer Science PhD from the UC Berkeley AMPLab ...
Presto (SQL Query Engine)
Presto (including PrestoDB and PrestoSQL, the latter since rebranded as Trino) is a distributed query engine for big data using the SQL query language. Its architecture allows users to query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata, and allows multiple data sources to be used within a single query. Presto is community-driven open-source software released under the Apache License. History: Presto was originally designed and developed at Facebook, Inc. (later renamed Meta) so that its data analysts could run interactive queries on its large data warehouse in Apache Hadoop. The first four developers were Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang. Before Presto, the data analysts at Facebook relied on Apache Hive for running SQL analytics on their multi-petabyte data warehouse. Hive was deemed too slow for Facebook's scale, and Presto was created to fill the gap by running fast queries. Original development started in 2012 and deploye ...
Ion Stoica
Ion Stoica is a Romanian–American computer scientist specializing in distributed systems, cloud computing and computer networking. He is a professor of computer science at the University of California, Berkeley and co-director of AMPLab. He co-founded Conviva and Databricks with other original developers of Apache Spark, and Anyscale with other original developers of Ray. As of April 2025, Forbes ranked him and Matei Zaharia as the third-richest people in Romania, with a net worth of $2.7 billion. Education: Stoica was born in Romania, where he grew up and attended the Polytechnic University of Bucharest, receiving an MS in Electrical Engineering and Computer Science in 1989. He moved to the U.S. in 1994 to start a PhD at Old Dominion University with computer-science professor Hussein Abdel-Wahab. Together with Abdel-Wahab, in 1995 he published the algorithm for earliest eligible virtual deadline first scheduling, which is the current process scheduler in the Linux kernel. In 1996 ...
Apache Spark
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab starting in 2009, the Spark codebase was donated in 2013 to the Apache Software Foundation, which has maintained it since. Overview: Apache Spark has its architectural foundation in the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. The DataFrame API was released as an abstraction on top of the RDD, followed by the Dataset API. In Spark 1.x, the RDD was the primary application programming interface (API), but as of Spark 2.x use of the Dataset API is encouraged, even though the RDD API is not deprecated. The RDD technology still underlies the Dataset API. Spark and its RDDs were developed in 2012 in respon ...
Distributed File System
A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy, which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spreads data across multiple storage nodes, usually for redundancy or performance. Shared-disk file system: A shared-disk file system uses a storage area network (SAN) to allow multiple computers to gain direct disk access at the block level. Access control and translation from the file-level operations that applications use to the block-level operations used by the SAN must take place on the client node. The mos ...
Apache Hive
Apache Hive is a data warehouse software project. It is built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. Hive facilitates the integration of SQL-based querying languages with Hadoop, which is commonly used in data warehousing applications. While initially developed by Facebook, Apache Hive is used and developed by other companies such as Netflix and the Financial Industry Regulatory Authority (FINRA). Amazon maintains a software fork of Apache Hive included in Apache Hadoop#On Amazon Elastic MapR ...
AMPLab
AMPLab was a University of California, Berkeley lab focused on big data analytics, located in Soda Hall. The name stands for the Algorithms, Machines and People Lab. It had been publishing papers since 2008 and was officially launched in 2011. The AMPLab was co-directed by Professors Michael J. Franklin, Michael I. Jordan, and Ion Stoica. While AMPLab worked on a wide variety of big data projects (known as BDAS, the Berkeley Data Analytics Stack), many know it as the lab that invented Apache Mesos, Apache Spark, and Alluxio. Berkeley launched RISELab as the successor to AMPLab in 2017.
Comcast
Comcast Corporation, formerly known as Comcast Holdings, is an American multinational mass media, telecommunications, and entertainment conglomerate. (Before the AT&T Broadband merger in 2001, the parent company was Comcast Holdings Corporation. Comcast Holdings Corporation now refers to a subsidiary of Comcast Corporation, not the parent company; see the Bloomberg profile on Comcast Holdings Corporation. Technically, the current parent company was founded December 7, 2001 as CAB Holdings Corporation, which changed its name to AT&T Comcast Corporation before finally taking on the Comcast Corporation name; see the Nov 2002 8K/A form and the Nov 2002 S-4.) Headquartered at the Comcast Center in Philadelphia, the company was ranked 51st in the Forbes Global 2000 in 2023. It is the fourth-largest telecommunications company by worldwide revenue, after Deutsche Telekom, China Mobile, and Verizon. Comcast i ...
China Unicom
China United Network Communications Group (China Unicom) is a Chinese state-owned telecommunications operator. Originally founded (on January 6, 2009) as a wireless paging and GSM mobile operator, it currently provides a range of services including mobile network, long-distance, local calling, data communication, Internet services, and IP telephony. As of 2022, China Unicom is the third-largest wireless carrier in China and the sixth-largest mobile provider in the world. History: China Unicom was founded as a state-owned enterprise in 1994 by the Ministry of Railways, the Ministry of Electronics and the Ministry of Electric Power Industry; the establishment was approved by the State Council in December 1993. China Unicom was among six state-owned companies that built the communications infrastructure and assisted in financing the Ministry of Industry and Information Technology's Connecting Every Village Project, which began in 2004. The project aimed to p ...
Barclays
Barclays PLC is a British multinational universal bank, headquartered in London, England. Barclays operates as two divisions, Barclays UK and Barclays International, supported by a service company, Barclays Execution Services. Barclays traces its origins to the goldsmith banking business established in the City of London in 1690. James Barclay became a partner in the business in 1736. In 1896, twelve banks in London and the English provinces, including Goslings Bank, Backhouse's Bank and Gurney, Peckover and Company, united as a joint-stock bank under the name Barclays and Co. Over the following decades, Barclays expanded to become a nationwide bank. In 1967, Barclays deployed the world's first cash dispenser. Barclays has made numerous corporate acquisitions, including London, Provincial and South Western Bank in 1918, British Linen Bank in 1919, Mercantile Credit in 1975, the Woolwich in 2000 and the North American operations of Lehman Brothers i ...
Baidu
Baidu, Inc. is a Chinese multinational technology company specializing in Internet services and artificial intelligence. It holds a dominant position in China's search engine market (via Baidu Search) and provides a wide variety of other internet services, such as Baidu App (Baidu's flagship app for search and newsfeed), Baidu Baike (an online, user-created, Wikipedia-like encyclopedia), iQIYI (a video streaming service), and Baidu Tieba (a keyword-based discussion forum similar to Reddit). Besides its core internet search business, Baidu has diversified into several high-growth areas. The company is a leading player in autonomous driving (Baidu Apollo) and smart consumer electronics (Xiaodu). With over a decade of investment in artificial intelligence, Baidu is one of the few tech companies globally to offer a full-service AI stack, including software, chips, cloud infrastructure, foundation models, and applications. The holding company of the group is incorpo ...
Amazon Web Services
Amazon Web Services, Inc. (AWS) is a subsidiary of Amazon that provides on-demand cloud computing platforms and APIs to individuals, companies, and governments, on a metered, pay-as-you-go basis. Clients will often use this in combination with autoscaling (a process that allows a client to use more computing in times of high application usage, and then scale down to reduce costs when there is less traffic). These cloud computing web services provide various services related to networking, compute, storage, middleware, IoT and other processing capacity, as well as software tools via AWS server farms. This frees clients from managing, scaling, and patching hardware and operating systems. One of the foundational services is Amazon Elastic Compute Cloud (EC2), which allows users to have at their disposal a virtual Computer clus ...