3D Face Morphable Model

In computer vision and computer graphics, the 3D Morphable Model (3DMM) is a generative technique that uses methods of statistical shape analysis to model 3D objects. The model follows an analysis-by-synthesis approach over a dataset of 3D example shapes of a single class of objects (e.g., face, hand). The main prerequisite is that all the 3D shapes are in dense point-to-point correspondence, meaning that each point has the same semantic meaning across all the shapes. This makes it possible to extract meaningful statistics from the dataset and use them to represent new plausible shapes of the object class. Given a 2D image, its 3D shape can be recovered via a fitting process, and novel shapes can be generated by sampling directly from the statistical shape distribution of that class. The question that initiated the research on 3DMMs was how a visual system could handle the vast variety of images produced by a single class of objects, and how these images can be represented. The primary assumption in developing 3DMMs was that prior knowledge about object classes is crucial in vision. 3D Face Morphable Models are the most popular 3DMMs, since they were the first to be developed, in the field of facial recognition. 3DMMs have also been applied to the whole human body, the hand, the ear, cars, and animals.


3D Face Morphable Model

In computer vision and computer graphics, the 3D Face Morphable Model (3DFMM) is a generative technique for modeling textured 3D faces. New faces are generated from a pre-existing database of example faces acquired through a 3D scanning procedure. All these faces are in dense point-to-point correspondence, which enables the generation of a new realistic face (''morph'') by combining the acquired faces. A new 3D face can be inferred from one or multiple existing images of a face, or produced by arbitrarily combining the example faces. The 3DFMM provides a way to represent face shape and texture disentangled from external factors, such as camera parameters and illumination.

The 3D Morphable Model (3DMM) is a general framework that has been applied to various objects other than faces, e.g., the whole human body, specific body parts, and animals. 3DMMs were first developed to solve vision tasks by representing objects in terms of the prior knowledge that can be gathered from that object class. The prior knowledge is statistically extracted from a database of 3D examples and used as a basis to represent or generate new plausible objects of that class. Its effectiveness lies in the ability to efficiently encode this prior information, enabling the solution of otherwise ill-posed problems (such as single-view 3D object reconstruction).

Historically, face models were the first example of morphable models, and 3DFMM remains a very active field of research today. 3DFMMs have been successfully employed in face recognition, the entertainment industry (gaming and extended reality, virtual try-on, face replacement, face reenactment), digital forensics, and medical applications.


Modeling

In general, 3D faces can be modeled by three variational components extracted from the face dataset:
* shape model - model of the distribution of geometrical shape across different subjects
* expression model - model of the distribution of geometrical shape across different facial expressions
* appearance model - model of the distribution of surface textures (color and illumination)


Shape modeling

The 3DFMM uses statistical analysis to define a ''statistical shape space'', a vector space equipped with a probability distribution, or ''prior''. To extract the ''prior'' from the example dataset, all the 3D faces must be in dense point-to-point correspondence. This means that each point has the same semantic meaning on each face (e.g., nose tip, edge of the eye). In this way, by fixing a point, we can, for example, derive the probability distribution of the texture's red channel values over all the faces.

A face shape S of n vertices is defined as the vector containing the 3D coordinates of the n vertices in a specified order, that is, S \in \mathbb{R}^{3n}. A shape space is regarded as a d-dimensional space that generates plausible 3D faces by performing a lower-dimensional (d \ll n) parametrization of the database. Thus, a shape S can be represented through a generator function \mathbf{c}: \mathbb{R}^d \rightarrow \mathbb{R}^{3n} by the parameters \mathbf{w} \in \mathbb{R}^d, with \mathbf{c}(\mathbf{w}) = S \in \mathbb{R}^{3n}.

The most common statistical technique used in 3DFMM to generate the shape space is Principal Component Analysis (PCA), which generates a basis that maximizes the variance of the data. With PCA, the generator function is linear and defined as

\mathbf{c}(\mathbf{w}) = \bar{\mathbf{c}} + \mathbf{E}\mathbf{w}

where \bar{\mathbf{c}} is the mean over the training data and \mathbf{E} \in \mathbb{R}^{3n \times d} is the matrix that contains the d most dominant eigenvectors.

Using a single generator function for the whole face leads to an imperfect representation of finer details. A solution is to use local models of the face by segmenting important parts such as the eyes, mouth, and nose.
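The linear PCA generator described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used by any particular 3DFMM: the function names (`fit_pca_shape_model`, `generate_shape`, `project_shape`), the random toy data, and the choice of an SVD to obtain the principal directions are all hypothetical choices made for this example.

```python
import numpy as np


def fit_pca_shape_model(shapes, d):
    """Fit a linear shape model to m shapes given as rows of length 3n.

    Returns the mean shape (3n,), a basis (3n, d) whose columns are the d
    dominant principal directions, and the standard deviation along each (d,).
    """
    shapes = np.asarray(shapes, dtype=float)
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data: the rows of vt are the eigenvectors of the
    # sample covariance matrix, sorted by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:d].T
    stddev = singular_values[:d] / np.sqrt(max(len(shapes) - 1, 1))
    return mean, basis, stddev


def generate_shape(mean, basis, w):
    """Linear generator: mean + basis @ w, a flat vector of length 3n."""
    return mean + basis @ np.asarray(w, dtype=float)


def project_shape(mean, basis, shape):
    """Coefficients of the closest model shape (orthogonal projection)."""
    return basis.T @ (np.asarray(shape, dtype=float) - mean)


# Toy data (assumption): m = 20 "faces" with n = 5 vertices that vary
# along only 2 hidden directions around a common offset.
rng = np.random.default_rng(0)
n_vertices, m, d = 5, 20, 2
hidden_directions = rng.normal(size=(2, 3 * n_vertices))
toy_shapes = rng.normal(size=(m, 2)) @ hidden_directions + rng.normal(size=3 * n_vertices)

mean, basis, stddev = fit_pca_shape_model(toy_shapes, d)
w = project_shape(mean, basis, toy_shapes[0])
reconstruction = generate_shape(mean, basis, w)
```

Because the toy shapes vary along only two directions, two components are enough to reconstruct them up to floating-point error. Real scan data would generally need many more components and would leave a residual.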


Expression modeling

The modeling of the expression is performed by explicitly separating the representation of identity from that of facial expression. Depending on how identity and expression are combined, these methods can be classified as additive, multiplicative, and nonlinear.

The additive model is a linear model in which the expression is an additive offset with respect to the identity:

\mathbf{c}(\mathbf{w}^s, \mathbf{w}^e) = \bar{\mathbf{c}} + \mathbf{E}^s\mathbf{w}^s + \mathbf{E}^e\mathbf{w}^e

where \mathbf{E}^s, \mathbf{w}^s and \mathbf{E}^e, \mathbf{w}^e are the basis matrices and the coefficient vectors of the shape and expression spaces, respectively. With this model, given the 3D shape of a subject in a neutral expression \mathbf{c}_{ne} and in a particular expression \mathbf{c}_{exp}, we can transfer the expression to a different subject by adding the offset \Delta_{\mathbf{c}} = \mathbf{c}_{exp} - \mathbf{c}_{ne} to that subject's neutral shape. Two PCAs can be performed to learn two different spaces for shape and expression.

In a multiplicative model, shape and expression can be combined in different ways. For example, by exploiting d_e operators \mathbf{T}_j: \mathbb{R}^{3n} \rightarrow \mathbb{R}^{3n} that transform a neutral expression into a target blendshape, we can write

\mathbf{c}(\mathbf{w}^s, \mathbf{w}^e) = \sum_{j=1}^{d_e} w_j^e \mathbf{T}_j(\mathbf{c}(\mathbf{w}^s) + \boldsymbol{\delta}^s) + \boldsymbol{\delta}_j^e

where \boldsymbol{\delta}^s and \boldsymbol{\delta}_j^e are vectors to correct to the target expression.

The nonlinear model uses nonlinear transformations to represent an expression.
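The expression transfer allowed by the additive model can be illustrated with a short NumPy sketch. The bases below are random stand-ins rather than learned PCA bases, and the function names (`additive_face`, `transfer_expression`) and coefficient values are hypothetical. The sketch only checks that adding one subject's expression offset to another subject's neutral shape gives the same result as evaluating the additive model directly.

```python
import numpy as np


def additive_face(mean, shape_basis, w_s, expr_basis, w_e):
    """Additive model: mean + shape_basis @ w_s + expr_basis @ w_e."""
    return mean + shape_basis @ w_s + expr_basis @ w_e


def transfer_expression(source_neutral, source_expression, target_neutral):
    """Add the source subject's expression offset to the target's neutral shape."""
    offset = source_expression - source_neutral
    return target_neutral + offset


rng = np.random.default_rng(1)
n_vertices, d_s, d_e = 4, 3, 2
mean = rng.normal(size=3 * n_vertices)
shape_basis = rng.normal(size=(3 * n_vertices, d_s))  # stand-in for a learned identity basis
expr_basis = rng.normal(size=(3 * n_vertices, d_e))   # stand-in for a learned expression basis

w_s_a = rng.normal(size=d_s)        # identity coefficients of subject A
w_s_b = rng.normal(size=d_s)        # identity coefficients of subject B
w_smile = np.array([1.5, -0.5])     # one expression (toy values)
no_expression = np.zeros(d_e)

a_neutral = additive_face(mean, shape_basis, w_s_a, expr_basis, no_expression)
a_smiling = additive_face(mean, shape_basis, w_s_a, expr_basis, w_smile)
b_neutral = additive_face(mean, shape_basis, w_s_b, expr_basis, no_expression)

# Transfer A's expression to B using only the three observed shapes.
b_smiling = transfer_expression(a_neutral, a_smiling, b_neutral)
```

The transfer is exact here only because the synthetic data follows the additive model. In the multiplicative and nonlinear models the expression offset depends on the identity, so this simple subtraction would not carry over.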


Appearance modeling

The color information is often associated with each vertex of a 3D shape. This one-to-one correspondence allows us to represent appearance analogously to the linear shape model:

\mathbf{d}(\mathbf{w}^t) = \bar{\mathbf{d}} + \mathbf{E}^t\mathbf{w}^t

where \bar{\mathbf{d}} is the mean appearance and \mathbf{w}^t is the coefficient vector defined over the basis matrix \mathbf{E}^t. PCA can again be used to learn the appearance space.
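As a rough illustration of the linear appearance model, the sketch below draws texture coefficients at random and turns them into per-vertex RGB values. Several things are assumptions made for the example and are not stated in the text above: the Gaussian prior on the coefficients, the clipping of colors to [0, 1], the toy mean and standard deviations, and the orthonormal basis built with a QR decomposition as a stand-in for PCA eigenvectors. The function name `sample_texture` is hypothetical.

```python
import numpy as np


def sample_texture(mean_texture, texture_basis, stddev, rng):
    """Draw coefficients from N(0, diag(stddev**2)) and return per-vertex RGB.

    Returns an (n, 3) array of colors and the sampled coefficient vector.
    """
    w_t = rng.normal(size=stddev.shape) * stddev
    texture = mean_texture + texture_basis @ w_t
    # Colors are stored as (R_1, G_1, B_1, ..., R_n, G_n, B_n);
    # clip to the valid range before reshaping to one row per vertex.
    return np.clip(texture, 0.0, 1.0).reshape(-1, 3), w_t


rng = np.random.default_rng(2)
n_vertices, d_t = 6, 2
mean_texture = np.full(3 * n_vertices, 0.5)  # mid-grey mean (toy value)
# Orthonormal toy basis from a QR decomposition (stand-in for PCA eigenvectors).
texture_basis, _ = np.linalg.qr(rng.normal(size=(3 * n_vertices, d_t)))
stddev = np.array([0.2, 0.1])                # toy per-component standard deviations

vertex_colors, w_t = sample_texture(mean_texture, texture_basis, stddev, rng)
```

Each row of `vertex_colors` is the color of one vertex, so the array can be attached directly to a mesh whose vertices are in the same order as the shape vector.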


History

Facial recognition can be considered the field that originated the concepts that later converged into the formalization of morphable models. The eigenface approach used in face recognition represented faces in a vector space and used principal component analysis to identify the main modes of variation. However, this method had limitations: it was constrained to fixed poses and illumination and lacked an effective representation of shape differences. As a result, changes along the eigenvectors did not accurately represent shifts in facial structures but caused structures to fade in and out. To address these limitations, researchers added an eigendecomposition of 2D shape variations between faces. The original eigenface approach aligned images based on a single point, while the new methods established correspondences on many points. Landmark-based face warping was introduced by Craw and Cameron (1991), and the first statistical shape model, the Active Shape Model, was proposed by Cootes et al. (1995). This model used shape alone, but the Active Appearance Model by Cootes et al. (1998) combined shape and appearance.

Since these 2D methods were effective only for fixed poses and illumination, they were extended by Vetter and Poggio (1997) to handle more diverse settings. Even though separating shape and texture was effective for face representation, handling pose and illumination variations required many separate models. On the other hand, advances in 3D computer graphics showed that simulating pose and illumination variations was straightforward. The combination of graphics methods with face modeling led to the first formulation of 3DMMs by Blanz and Vetter (1999). The analysis-by-synthesis approach enabled the mapping between the 3D and 2D domains and a new representation of 3D shape and appearance. Their work was the first to introduce a statistical model for faces that enabled 3D reconstruction from 2D images and a parametric face space for controlled manipulation.

In the original definition of Blanz and Vetter, the shape of a face is represented as the vector \mathbf{S} = (X_1, Y_1, Z_1, ..., X_n, Y_n, Z_n)^T \in \mathbb{R}^{3n} that contains the 3D coordinates of the n vertices. Similarly, the texture is represented as a vector \mathbf{T} = (R_1, G_1, B_1, ..., R_n, G_n, B_n)^T \in \mathbb{R}^{3n} that contains the three RGB color channels associated with each corresponding vertex. Due to the full correspondence between exemplar 3D faces, new shapes \mathbf{S}_{mod} and textures \mathbf{T}_{mod} can be defined as a linear combination of the m example faces:

\mathbf{S}_{mod} = \sum_{i=1}^m a_i \mathbf{S}_i \qquad \mathbf{T}_{mod} = \sum_{i=1}^m b_i \mathbf{T}_i \qquad \text{with} \; \sum_{i=1}^m a_i = \sum_{i=1}^m b_i = 1

Thus, a new face shape and texture are parametrized by the shape coefficients \mathbf{a} = (a_1, a_2, ..., a_m)^T and the texture coefficients \mathbf{b} = (b_1, b_2, ..., b_m)^T. To extract the statistics from the dataset, they performed PCA to generate a shape space of dimension d and used a linear model for shape and appearance modeling. In this case, a new model can be generated in the orthogonal basis using the shape and texture eigenvectors \mathbf{s}_i and \mathbf{t}_i, respectively:

\mathbf{S}_{mod} = \bar{\mathbf{S}} + \sum_{i=1}^m a_i \mathbf{s}_i \qquad \mathbf{T}_{mod} = \bar{\mathbf{T}} + \sum_{i=1}^m b_i \mathbf{t}_i

where \bar{\mathbf{S}} and \bar{\mathbf{T}} are the mean shape and texture of the dataset.
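The linear combination of example faces, with coefficients constrained to sum to 1, can be illustrated with a small NumPy sketch. The function name `morph`, the three tiny exemplar vectors, and the coefficient values are hypothetical and chosen only so that the result is easy to check by hand. The same function would apply unchanged to texture vectors.

```python
import numpy as np


def morph(examples, coefficients):
    """Linear combination sum_i a_i * S_i of exemplar vectors (rows of `examples`)."""
    examples = np.asarray(examples, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    if not np.isclose(coefficients.sum(), 1.0):
        raise ValueError("coefficients must sum to 1")
    return coefficients @ examples


# Three toy exemplar "faces" with n = 2 vertices each,
# stored as (X_1, Y_1, Z_1, X_2, Y_2, Z_2).
example_shapes = np.array([
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 1.2, 0.0, 0.0],
    [0.0, 0.0, 0.4, 1.0, 0.4, 0.0],
])
a = np.array([0.5, 0.25, 0.25])

new_shape = morph(example_shapes, a)
# new_shape is [0.0, 0.05, 0.1, 1.05, 0.1, 0.0]
```

One consequence of the sum-to-one constraint is that translating every exemplar by the same offset translates the morphed result by that offset, so the combination does not depend on where the coordinate origin is placed.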


Publicly available databases

In the following table, we list the publicly available databases of human faces that can be used for the 3DFMM.


See also

* Statistical shape analysis


External links

* Curated List of 3D Morphable Model Software and Data, GitHub, https://github.com/3d-morphable-models/curated-list-of-awesome-3D-Morphable-Model-software-and-data (accessed 2024-07-11)