DBRX

DBRX is an open-sourced large language model (LLM) developed by Mosaic under its parent company Databricks, released on March 27, 2024. It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token. The released model comes in either a base foundation model version or an instruction-tuned variant. At the time of its release, DBRX outperformed other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok on several benchmarks covering language understanding, programming ability, and mathematics. It was trained for 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabytes per second of InfiniBand bandwidth, at a training cost of about US$10 million.

