Reconstruction of Personalized 3D Face Rigs from Monocular Video
We present a novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data, e.g. vintage movies. Our rig is based on three distinct layers that allow us to model the actor's facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles. At the heart of our approach is a parametric shape prior that encodes the plausible sub-space of facial identity and expression variations. Based on this prior, a coarse-scale reconstruction is obtained by means of a novel variational fitting approach. We represent person specific idiosyncrasies, which can not be represented in the restricted shape and expression space, by learning a set of medium-scale corrective shapes. Fine-scale skin detail, such as wrinkles, are captured from video via shading-based refinement, and a generative detail formation model is learned. Both the medium and fine-scale detail layers are coupled with the parametric prior by means of a novel sparse linear regression formulation. Once reconstructed, all layers of the face rig can be conveniently controlled by a low number of blendshape expression parameters, as widely used by animation artists. We show captured face rigs and their motions for several actors filmed in different monocular video formats, including legacy footage from YouTube, and demonstrate how they can be used for 3D animation and 2D video editing. Finally, we evaluate our approach qualitatively and quantitatively and compare to related state-of-the-art methods.