Scribe was a server for aggregating log data streamed in real time from many servers. It was designed to be scalable, extensible without client-side modification, and robust to failure of the network or any specific machine. Scribe was developed at Facebook and released in 2008 as open source.
Scribe servers are arranged in a directed graph, with each server knowing only about the next server in the graph. This network topology allows for adding extra layers of fan-in as a system grows, and for batching messages before sending them between datacenters. No code needs to understand the datacenter topology explicitly; a simple configuration is enough.
[https://www.facebook.com/note.php?note_id=32008268919&id=9445547199 ]
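The topology described above can be illustrated with a minimal sketch. This is not Scribe's code or its configuration format: the `Node` class, the `build` helper, the `CONFIG` layout, and the node names are all hypothetical, chosen only to show how a node that knows nothing but its next hop can still take part in fan-in and batching.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical aggregation node that knows only its next hop."""
    name: str
    next_hop: Optional["Node"] = None  # None marks the final store
    batch_size: int = 1                # messages to hold before forwarding
    pending: list = field(default_factory=list)
    stored: list = field(default_factory=list)
    sends: int = 0                     # number of batches forwarded downstream

    def receive(self, batch: list) -> None:
        """Accept messages and forward once enough have accumulated."""
        self.pending.extend(batch)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        batch, self.pending = self.pending, []
        if self.next_hop is None:
            self.stored.extend(batch)      # terminal node keeps the data
        else:
            self.sends += 1
            self.next_hop.receive(batch)   # one send carries the whole batch


def build(config: dict) -> dict:
    """Wire nodes together from a plain mapping (assumed format)."""
    nodes = {
        name: Node(name, batch_size=spec.get("batch_size", 1))
        for name, spec in config.items()
    }
    for name, spec in config.items():
        if "next" in spec:
            nodes[name].next_hop = nodes[spec["next"]]
    return nodes


# Two front-end hosts fan in to one aggregator, which batches
# before sending to a central store.
CONFIG = {
    "web1": {"next": "dc_aggregator"},
    "web2": {"next": "dc_aggregator"},
    "dc_aggregator": {"next": "central", "batch_size": 3},
    "central": {},
}

nodes = build(CONFIG)
nodes["web1"].receive([("clicks", "a")])
nodes["web2"].receive([("clicks", "b")])
held_before_flush = len(nodes["dc_aggregator"].pending)
nodes["web1"].receive([("clicks", "c")])
```

Adding another layer of fan-in in this sketch means adding an entry to `CONFIG` and changing a `next` value; no node's code changes.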
Scribe was designed with reliability in mind, but without requiring heavyweight protocols or extensive disk usage. Scribe spools data to disk on any node to handle intermittent connectivity or node failure, but doesn't sync a log file for every message. This creates a possibility of a small amount of data loss in the event of a crash or catastrophic hardware failure. However, this degree of reliability is suitable for most Facebook use cases.
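The trade-off described above can be sketched as follows. This is an illustration under stated assumptions, not Scribe's implementation: `SpoolingForwarder`, its `send` callable, and the one-message-per-line spool file are hypothetical. The point is that a failed send falls back to a local file, and that the file is written without a per-message `os.fsync()`, so a crash could lose the most recent writes.

```python
import os
import tempfile


class SpoolingForwarder:
    """Hypothetical forwarder that spools to disk when downstream is down."""

    def __init__(self, send, spool_path):
        self.send = send              # callable; raises ConnectionError on failure
        self.spool_path = spool_path  # local file, one message per line (assumed)

    def log(self, message: str) -> None:
        try:
            self.send(message)
        except ConnectionError:
            # Append to the local spool. Closing the file hands the data to the
            # OS, but there is deliberately no os.fsync() per message, so a
            # machine crash right after this call could lose it.
            with open(self.spool_path, "a", encoding="utf-8") as f:
                f.write(message + "\n")

    def replay(self) -> int:
        """Resend spooled messages; returns how many were replayed."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as f:
            messages = [line.rstrip("\n") for line in f]
        for message in messages:
            # If this raises, the spool file is left in place for a later
            # retry, so already-sent messages may be delivered twice.
            self.send(message)
        os.remove(self.spool_path)
        return len(messages)


# Simulated downstream that can be switched on and off.
delivered = []
state = {"up": False}


def send(message: str) -> None:
    if not state["up"]:
        raise ConnectionError("downstream unreachable")
    delivered.append(message)


spool = os.path.join(tempfile.mkdtemp(), "spool.log")
forwarder = SpoolingForwarder(send, spool)
forwarder.log("m1")   # downstream is down: spooled
forwarder.log("m2")   # spooled
state["up"] = True
replayed = forwarder.replay()
forwarder.log("m3")   # downstream is up: sent directly
```

Calling `os.fsync()` after every write would close the loss window at the cost of a disk flush per message, which is the overhead the design described above avoids.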
See also
Apache Flume
Fluentd: Log Everything in JSON
Enabling Facebook’s Log Infrastructure with Fluentd
Notes and references
External links
Open Source - Facebook Developers
The real value of Scribe for open source
Scribe project on GitHub