FAISS (Facebook AI Similarity Search) is an

open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use and view the source code, design documents, or content of the product. The open source model is a decentrali ...

library A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...

for similarity search and clustering of vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in

RAM Ram, ram, or RAM most commonly refers to: * A male sheep * Random-access memory, computer memory * Ram Trucks, US, since 2009 ** List of vehicles named Dodge Ram, trucks and vans ** Ram Pickup, produced by Ram Trucks Ram, ram, or RAM may also ref ...

. It also contains supporting code for evaluation and parameter tuning. FAISS is written in C++ with complete wrappers for Python and C. Some of the most useful algorithms are implemented on the GPU using

CUDA In computing, CUDA (Compute Unified Device Architecture) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated gene ...

Features

FAISS is organized as a toolbox that contains a variety of indexing methods that commonly involve a chain of components (preprocessing, compression, non-exhaustive search, etc.). The scope of the library is intentionally limited to focus on ANNS algorithmic implementation and to avoid facilities related to

database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...

functionality,

distributed computing Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different networked computers. The components of a distributed system commu ...

feature extraction Feature may refer to: Computing * Feature recognition, could be a hole, pocket, or notch * Feature (computer vision), could be an edge, corner or blob * Feature (machine learning), in statistics: individual measurable properties of the phenome ...

algorithms. FAISS is designed with the following assumptions: * Primary data type for vector representation is FP32. The support of other floating-point formats, such as

BF16 The bfloat16 (brain floating point) floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a shortened (16-bi ...

and

FP16 In computing, half precision (sometimes called FP16 or float16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory. It is intended for storage of floating-point values in a ...

, is provided. * Prefer batches of input queries over a single input query for the search. * Emphasize on allowing users to write a fast prototyping code using its Python wrappers. * The code should be as open as possible, so that users can access all the implementation details of the indexes. The following major categories of indexing methods are supported: *

Brute-force search In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of Iteration#Computing, systematically checking all possible candida ...

* Inverted-lists based indices * Graph indices, including (Hierarchical navigable small world) HNSW and Navigating Spread-out Graph (NSG) * Locality-sensitive hashing (LSH) The following families of

vector quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was ori ...

methods are supported: * Binary Quantization * Scalar Quantization (SQ) *

Product Quantization Product may refer to: Business * Product (business), an item that can be offered to a market to satisfy the desire or need of a customer. * Product (project management), a deliverable or set of deliverables that contribute to a business solution ...

(PQ), including Polysemous Codes, Optimized Product Quantization (OPQ) and Quicker ADC (PQFastScan) * Additive Quantization (AQ), including Residual Quantization (RQ) and Local Search Quantization (LSQ) * Neural Quantization, including QINCO FAISS focuses on

euclidean distance In mathematics, the Euclidean distance between two points in Euclidean space is the length of the line segment between them. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, and therefore is o ...

and inner product distance for floating-point data. The limited support of other distances (

manhattan distance Taxicab geometry or Manhattan geometry is geometry where the familiar Euclidean distance is ignored, and the distance between two point (geometry), points is instead defined to be the sum of the absolute differences of their respective Cartesian ...

, Lp distance, etc.) is also available. FAISS code supports multithreading via

OpenMP OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, ...

, utilizes

BLAS Blas is mainly a Spanish given name and surname, related to Blaise. It may refer to Places *Piz Blas, mountain in Switzerland * San Blas (disambiguation) People * Ricardo Blas Jr. (born 1986) Judo athlete from Guam * Blas Antonio Sáenz (fl. 18 ...

via OpenBLAS or Intel MKL, and also uses custom

SIMD Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...

kernels for

x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...

and ARM Neon CPUs. Besides the similarity search, FAISS provides the following useful facilities: *

k-means clustering ''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition of a set, partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster (statistics), cluste ...

* Random-matrix rotations for spreading the variance over all the dimensions without changing the measured distances *

Principal component analysis Principal component analysis (PCA) is a linear dimensionality reduction technique with applications in exploratory data analysis, visualization and data preprocessing. The data is linearly transformed onto a new coordinate system such that th ...

Data deduplication In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve storage utilization, which may in turn lower capital expenditure by reducing the overall amou ...

, which is especially useful for image datasets. FAISS has a standalone ''Vector Codec'' functionality for the

lossy compression In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size ...

of vectors, allowing to trade the representation accuracy for the binary size.

Applications

Typical FAISS applications include recommender systems,

data mining Data mining is the process of extracting and finding patterns in massive data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and ...

text retrieval Document retrieval is defined as the matching of some stated user query against a set of free-text records. These records could be any type of mainly natural language, unstructured text, such as newspaper articles, real estate records or paragraphs ...

and content moderation. FAISS was reported to index 1.5 trillion 144-dimensional vectors for internal

Meta Platforms Meta Platforms, Inc. is an American multinational technology company headquartered in Menlo Park, California. Meta owns and operates several prominent social media platforms and communication services, including Facebook, Instagram, Threads ...

applications. FAISS is used in vector databases as a core component of a search engine ( OpenSearch, Milvus, Vearch). FAISS is often considered as a baseline in similarity search benchmarks. FAISS has an integration with Haystack,

LangChain LangChain is a software framework that helps facilitate the integration of large language models (LLMs) into applications. As a language model integration framework, LangChain's use-cases largely overlap with those of language models in general, ...

frameworks. Various advanced code snippets for FAISS can be found on it
snippets wiki page
an
case studies wiki page

References

External links

* * {{GitHub, facebookresearch/faiss
Official FAISS wiki

Guidelines to choose a FAISS index

Autofaiss
- automatically create Faiss knn indices with the most optimal similarity search parameters Free software programmed in C++ Software using the MIT license

Features

Applications

See also

References

External links