FaceNet is a facial recognition system developed by Florian Schroff, Dmitry Kalenichenko and James Philbin, a group of researchers affiliated with Google. The system was first presented at the 2015 IEEE Conference on Computer Vision and Pattern Recognition.
The system uses a deep convolutional neural network to learn a mapping (also called an embedding) from a set of face images to a 128-dimensional Euclidean space, and assesses the similarity between faces based on the square of the Euclidean distance between the images' corresponding normalized vectors in that space. The system uses the triplet loss function as its cost function and introduced a new online triplet mining method. The system achieved an accuracy of 99.63% on the Labeled Faces in the Wild dataset using the "unrestricted with labeled outside data" protocol, which was the highest score reported at the time of publication.
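The comparison step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names and the decision threshold of 1.0 are assumptions introduced here, and real embeddings would have 128 components rather than the two used in the usage note.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(embedding_a, embedding_b, threshold=1.0):
    """Compare two embeddings after normalization.

    The threshold is a hypothetical value chosen for illustration;
    in practice it would be tuned on validation data.
    """
    distance = squared_distance(l2_normalize(embedding_a),
                                l2_normalize(embedding_b))
    return distance <= threshold
```

Because both vectors are normalized to unit length, the squared distance always lies between 0 (same direction) and 4 (opposite directions). For example, `same_identity([3.0, 4.0], [6.0, 8.0])` compares two vectors that point the same way, so their normalized distance is zero.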
Structure
Basic structure
The structure of FaceNet is represented schematically in Figure 1.

For training, the researchers used input batches of about 1800 images. For each identity represented in the input batches, there were 40 images of that identity and several randomly selected images of other identities. These batches were fed to a deep convolutional neural network, which was trained using stochastic gradient descent with standard backpropagation and the Adaptive Gradient Optimizer (AdaGrad) algorithm. The learning rate was initially set at 0.05 and was later lowered while finalizing the model.
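For readers unfamiliar with AdaGrad, the following sketch shows a single parameter update in plain Python. It illustrates the general AdaGrad rule rather than FaceNet's actual training code: the function name, the flat-list representation of parameters, and the `eps` stabilizer are assumptions made here. Only the initial learning rate of 0.05 comes from the text above.

```python
import math


def adagrad_step(params, grads, accum, learning_rate=0.05, eps=1e-8):
    """Apply one AdaGrad update and return (new_params, new_accum).

    accum holds the running sum of squared gradients per parameter,
    so parameters with a history of large gradients take smaller steps.
    eps is a small assumed constant that avoids division by zero.
    """
    new_params = []
    new_accum = []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g
        new_accum.append(a)
        new_params.append(p - learning_rate * g / (math.sqrt(a) + eps))
    return new_params, new_accum
```

Starting from `params=[1.0]`, `grads=[2.0]` and `accum=[0.0]`, the first step moves the parameter by roughly the learning rate, because the gradient is divided by its own magnitude. Later steps with the same gradient are smaller as the accumulator grows.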
Structure of the CNN
The researchers used two types of architectures, which they called NN1 and NN2, and explored their trade-offs. In practice, the models differ mainly in their number of parameters and FLOPS. The details of the NN1 model are presented in the table below.
Triplet loss
A key innovation of the system was the triplet loss function and its associated mining method. This function has since become central to a variety of other one-shot learning problems.
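A triplet loss compares an anchor embedding with a matching (positive) and a non-matching (negative) embedding, penalizing cases where the positive is not closer to the anchor than the negative by at least some margin. The sketch below shows the loss for a single triplet. The function names and the default margin of 0.2 are assumptions for illustration, and the mining method that selects which triplets to train on is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed value here).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)
```

With `anchor=[1.0, 0.0]`, `positive=[1.0, 0.0]` and `negative=[0.0, 1.0]`, the triplet is already well separated and the loss is zero. Swapping the positive and negative yields a positive loss, which is the signal that would drive the embeddings apart during training.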
Performance
On the widely used Labeled Faces in the Wild (LFW) dataset, the FaceNet system achieved an accuracy of 99.63%, which was the highest score reported on LFW at the time under the "unrestricted with labeled outside data" protocol.
On YouTube Faces DB the system achieved an accuracy of 95.12%.
See also
* DeepFace
* FindFace
Further reading
*For a discussion of the vulnerabilities of FaceNet-based face recognition algorithms when applied to deepfake videos:
*For a discussion on applying FaceNet for verifying faces in Android:
Amazon