Berkeley Packet Filter
   HOME

TheInfoList



OR:

The Berkeley Packet Filter (BPF) is a technology used in certain computer operating systems for programs that need to, among other things, analyze network traffic. It provides a raw interface to
data link layer The data link layer, or layer 2, is the second layer of the seven-layer OSI model of computer networking. This layer is the protocol layer that transfers data between nodes on a network segment across the physical layer. The data link layer ...
s, permitting raw link-layer packets to be sent and received. In addition, if the driver for the network interface supports
promiscuous mode In computer networking, promiscuous mode is a mode for a wired network interface controller (NIC) or wireless network interface controller (WNIC) that causes the controller to pass all traffic it receives to the central processing unit (CPU) rath ...
, it allows the interface to be put into that mode so that all packets on the
network Network, networking and networked may refer to: Science and technology * Network theory, the study of graphs as a representation of relations between discrete objects * Network science, an academic field that studies complex networks Mathematics ...
can be received, even those destined to other hosts. BPF supports filtering packets, allowing a
userspace A modern computer operating system usually segregates virtual memory into user space and kernel space. Primarily, this separation serves to provide memory protection and hardware protection from malicious or errant software behaviour. Kernel ...
process A process is a series or set of activities that interact to produce a result; it may occur once-only or be recurrent or periodic. Things called a process include: Business and management *Business process, activities that produce a specific se ...
to supply a filter program that specifies which packets it wants to receive. For example, a
tcpdump tcpdump is a data-network packet analyzer computer program that runs under a command line interface. It allows the user to display TCP/IP and other packets being transmitted or received over a network to which the computer is attached. Distribut ...
process may want to receive only packets that initiate a TCP connection. BPF returns only packets that pass the filter that the process supplies. This avoids copying unwanted packets from the
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs. Time-sharing operating systems schedule tasks for efficient use of the system and may also i ...
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
to the process, greatly improving performance. The filter program is in the form of instructions for a
virtual machine In computing, a virtual machine (VM) is the virtualization/ emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized h ...
, which are interpreted, or compiled into machine code by a just-in-time (JIT) mechanism and executed, in the kernel. BPF is sometimes used to refer to just the filtering mechanism, rather than to the entire interface. Some systems, such as
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, w ...
and Tru64 UNIX, provide a raw interface to the data link layer other than the BPF raw interface but use the BPF filtering mechanisms for that raw interface. The BPF filtering mechanism is available on most
Unix-like A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Unix-li ...
operating systems. The Linux kernel provides an extended version of the BPF filtering mechanism, called eBPF, which uses a JIT mechanism, and which is used for packet filtering, as well as for other purposes in the kernel. eBPF is also available for Microsoft Windows.


Raw data-link interface

BPF provides
pseudo-device In Unix-like operating systems, a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow a ...
s that can be bound to a network interface; reads from the device will read buffers full of packets received on the network interface, and writes to the device will inject packets on the network interface. In 2007, Robert Watson and Christian Peron added
zero-copy "Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another or in which unnecessary data copies are avoided. This is frequently used to save CPU cycles and memory bandwi ...
buffer extensions to the BPF implementation in the FreeBSD operating system, allowing kernel packet capture in the device driver interrupt handler to write directly to user process memory in order to avoid the requirement for two copies for all packet data received via the BPF device. While one copy remains in the receipt path for user processes, this preserves the independence of different BPF device consumers, as well as allowing the packing of headers into the BPF buffer rather than copying complete packet data.


Filtering

BPF's filtering capabilities are implemented as an interpreter for a
machine language In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
for the BPF
virtual machine In computing, a virtual machine (VM) is the virtualization/ emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized h ...
, a 32-bit machine with fixed-length instructions, one accumulator, and one
index register An index register in a computer's CPU is a processor register (or an assigned memory location) used for pointing to operand addresses during the run of a program. It is useful for stepping through strings and arrays. It can also be used for hol ...
. Programs in that language can fetch data from the packet, perform arithmetic operations on data from the packet, and compare the results against constants or against data in the packet or test
bit The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...
s in the results, accepting or rejecting the packet based on the results of those tests. BPF is often extended by "overloading" the load (ld) and store (str) instructions. Traditional Unix-like BPF implementations can be used in userspace, despite being written for kernel-space. This is accomplished using
preprocessor In computer science, a preprocessor (or precompiler) is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by so ...
conditions.


Extensions and optimizations

Some projects use BPF instruction sets or execution techniques different from the originals. Some platforms, including FreeBSD, NetBSD, and WinPcap, use a just-in-time (JIT) compiler to convert BPF instructions into
native code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ver ...
in order to improve performance. Linux includes a BPF JIT compiler which is disabled by default. Kernel-mode interpreters for that same virtual machine language are used in raw data link layer mechanisms in other operating systems, such as Tru64 Unix, and for socket filters in the Linux kernel and in the WinPcap and
Npcap In the field of computer network administration, pcap is an application programming interface (API) for capturing network traffic. While the name is an abbreviation of ''packet capture'', that is not the API's proper name. Unix-like systems ...
packet capture mechanism. Since version 3.18, the Linux kernel includes an extended BPF virtual machine with ten 64-bit registers, termed extended BPF (eBPF). It can be used for non-networking purposes, such as for attaching eBPF programs to various tracepoints. Since kernel version 3.19, eBPF filters can be attached to sockets, and, since kernel version 4.1, to
traffic control Traffic management is a key branch within logistics. It concerns the planning control and purchasing of transport services needed to physically move vehicles (for example aircraft, road vehicles, rolling stock and watercraft) and freight. Traffi ...
classifiers for the ingress and egress networking data path. The original and obsolete version has been retroactively renamed to classic BPF (cBPF). Nowadays, the Linux kernel runs eBPF only and loaded cBPF bytecode is transparently translated into an eBPF representation in the kernel before program execution. All bytecode is verified before running to prevent denial-of-service attacks. Until Linux 5.3, the verifier prohibited the use of loops, to prevent potentially unbounded execution times; loops with bounded execution time are now permitted in more recent kernels. A user-mode interpreter for BPF is provided with the libpcap/WinPcap/Npcap implementation of the
pcap In the field of computer network administration, pcap is an application programming interface (API) for capturing network traffic. While the name is an abbreviation of ''packet capture'', that is not the API's proper name. Unix-like systems ...
API An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how ...
, so that, when capturing packets on systems without kernel-mode support for that filtering mechanism, packets can be filtered in user mode; code using the pcap API will work on both types of systems, although, on systems where the filtering is done in user mode, all packets, including those that will be filtered out, are copied from the kernel to user space. That interpreter can also be used when reading a file containing packets captured using pcap. Another user-mode interpreter is uBPF, which supports JIT and eBPF (without cBPF). Its code has been reused to provide eBPF support in non-Linux systems. Microsoft's eBPF on Windows builds on uBPF and the PREVAIL formal verifier. rBPF, a Rust rewrite of uBPF, is used by the Solana blockchain platform as the execution engine.


Programming

Classic BPF is generally emitted by a program from some very high-level textual rule describing the pattern to match. One such representation is found in libpcap. Classic BPF and eBPF can also be written either directly as
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...
, or using an assembly language for a textual representation. Notable assemblers include Linux kernel's tool (cBPF), (cBPF), and the assembler (eBPF). The command can also act as a disassembler for both flavors of BPF. The assembly languages are not necessarily compatible with each other. eBPF bytecode has recently become a target of higher-level languages.
LLVM LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate repre ...
added eBPF support in 2014, and GCC followed in 2019. Both toolkits allow compiling C and other supported languages to eBPF. A subset of P4 can also be compiled into eBPF using BCC, an LLVM-based compiler kit.


History

The original paper was written by Steven McCanne and
Van Jacobson Van Jacobson (born 1950) is an American computer scientist, renowned for his work on TCP/IP network performance and scaling.
in 1992 while at
Lawrence Berkeley Laboratory Lawrence Berkeley National Laboratory (LBNL), commonly referred to as the Berkeley Lab, is a United States national laboratory that is owned by, and conducts scientific research on behalf of, the United States Department of Energy. Located in ...
. In August 2003,
SCO Group The SCO Group (often referred to SCO and later called The TSG Group) was an American software company in existence from 2002 to 2012 that became known for owning Unix operating system assets that had belonged to the Santa Cruz Operation (the ...
publicly claimed that the Linux kernel was infringing Unix code which they owned. Programmers quickly discovered that one example they gave was the Berkeley Packet Filter, which in fact SCO never owned. SCO has not explained or acknowledged the mistake but the ongoing legal action may eventually force an answer.


Security concerns

The Spectre attack could leverage the Linux kernel's eBPF interpreter or JIT compiler to extract data from other kernel processes. A JIT hardening feature in the kernel mitigates this vulnerability. Chinese computer security group Pangu Lab said the
NSA The National Security Agency (NSA) is a national-level intelligence agency of the United States Department of Defense, under the authority of the Director of National Intelligence (DNI). The NSA is responsible for global monitoring, collecti ...
used BPF to conceal network communications as part of a complex Linux backdoor.


See also

*
Data link layer The data link layer, or layer 2, is the second layer of the seven-layer OSI model of computer networking. This layer is the protocol layer that transfers data between nodes on a network segment across the physical layer. The data link layer ...
* Proof-carrying code * Express Data Path


References


Further reading

*


External links

* – an example of conventional BPF
eBPF.io - Introduction, Tutorials & Community Resources

bpfc, a Berkeley Packet Filter compiler, Linux BPF JIT disassembler
(part of netsniff-ng)

for Linux kernel
Linux filter documentation
for both cBPF and eBPF bytecode formats * {{GitHub, https://github.com/microsoft/ebpf-for-windows Internet Protocol based network software Packets (information technology)