Scribe was a server for aggregating log data streamed in real time from many servers. It was designed to be scalable, extensible without client-side modification, and robust to failure of the network or any specific machine. Scribe was developed at Facebook and released in 2008 as open source.
Scribe servers are arranged in a directed graph, with each server knowing only about the next server in the graph. This network topology allows for adding extra layers of fan-in as a system grows, and for batching messages before sending them between datacenters, without any code that explicitly needs to understand datacenter topology; only a simple configuration is required. [https://www.facebook.com/note.php?note_id=32008268919&id=9445547199]
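The forwarding arrangement described above can be illustrated with a small in-memory sketch. This is not Scribe's actual code or API: the `Node` class, its fields, the `(category, text)` message shape, and the batch sizes are all hypothetical, chosen only to show that each node needs nothing more than a reference to its next hop.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (category, text); this message shape is an assumption for illustration.
Message = Tuple[str, str]


@dataclass
class Node:
    """Hypothetical aggregation node that knows only its next hop."""
    name: str
    next_hop: Optional["Node"] = None  # None marks a final sink
    batch_size: int = 3                # forward once this many messages are buffered
    buffer: List[Message] = field(default_factory=list)
    stored: List[Message] = field(default_factory=list)

    def log(self, messages: List[Message]) -> None:
        # A sink keeps what it receives; any other node buffers and forwards.
        if self.next_hop is None:
            self.stored.extend(messages)
            return
        self.buffer.extend(messages)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Send the whole buffer to the next hop as one batch.
        if self.next_hop is not None and self.buffer:
            batch, self.buffer = self.buffer, []
            self.next_hop.log(batch)


# Two leaf servers fan in to one aggregator, which batches toward a central sink.
central = Node("central")
aggregator = Node("aggregator", next_hop=central, batch_size=4)
web1 = Node("web1", next_hop=aggregator, batch_size=2)
web2 = Node("web2", next_hop=aggregator, batch_size=2)

web1.log([("access", "GET /a"), ("access", "GET /b")])
web2.log([("access", "GET /c"), ("error", "timeout")])
```

In this sketch, adding another layer of fan-in means constructing one more `Node` and pointing existing nodes at it; no node inspects the overall graph.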
Scribe was designed to take reliability into account without requiring heavyweight protocols or expansive disk usage. Scribe spools data to disk on any node to handle intermittent connectivity or node failure, but does not sync a log file for every message. This creates the possibility of a small amount of data loss in the event of a crash or catastrophic hardware failure. However, this degree of reliability is suitable for most Facebook use cases.
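The trade-off described above can be sketched as follows. This is a simplified illustration, not Scribe's implementation: the `SpoolingSender` class, the `send` callback, and the spool file layout are hypothetical, and messages are assumed to be single-line strings. The point is that failed sends are appended to a local file without an `os.fsync()` per message, so a crash may lose writes the operating system has not yet flushed.

```python
import os
import tempfile
from typing import Callable, List


class SpoolingSender:
    """Hypothetical sender that spools to a local file while the next hop is down."""

    def __init__(self, send: Callable[[str], None], spool_path: str) -> None:
        self.send = send
        self.spool_path = spool_path

    def log(self, message: str) -> None:
        try:
            self.send(message)
        except ConnectionError:
            # Append to the spool. There is deliberately no os.fsync() here:
            # skipping the per-message sync is cheaper but can lose recent
            # writes if the machine crashes.
            with open(self.spool_path, "a", encoding="utf-8") as spool:
                spool.write(message + "\n")

    def replay(self) -> int:
        """Resend spooled messages; returns how many were replayed."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as spool:
            pending = [line.rstrip("\n") for line in spool]
        for message in pending:
            # If this raises, the spool file is kept, so a later replay
            # may resend messages that were already delivered.
            self.send(message)
        os.remove(self.spool_path)
        return len(pending)


# Simulated next hop that is unreachable at first.
delivered: List[str] = []
link = {"up": False}


def send(message: str) -> None:
    if not link["up"]:
        raise ConnectionError("next hop unreachable")
    delivered.append(message)


spool_file = os.path.join(tempfile.mkdtemp(), "spool.log")
sender = SpoolingSender(send, spool_file)
sender.log("first")    # spooled to disk
sender.log("second")   # spooled to disk
link["up"] = True
replayed = sender.replay()
sender.log("third")    # sent directly
```

A stricter design would call `os.fsync()` after each write, at the cost of one disk sync per message, which is the overhead the text says Scribe avoids.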
See also
Apache Flume
Fluentd: Log Everything in JSON
Enabling Facebook’s Log Infrastructure with Fluentd
Notes and references
External links
Open Source - Facebook Developers
The real value of Scribe for open source
Scribe project on GitHub