• Announcements

    Our 3D reconstruction method (3DI) was published in TPAMI

    3DI was published in the Feb’24 issue of T-PAMI: https://ieeexplore.ieee.org/abstract/document/10330115. This paper shows how to implement the 3DI method efficiently and provides experiments that show the ability of 3DI to capture person-specific facial morphology. Further, experiments show that the reconstruction accuracy of 3DI keeps improving as more frames per person are provided, whereas the performance of the compared deep learning-based methods tends to saturate earlier, with fewer frames.

  • Deep Learning

    On the Attention Mechanism

    What is revolutionary about the attention mechanism is that it allows the weight of a piece of information to change dynamically. This is not possible with, say, traditional RNNs or LSTMs. Attention mechanisms were proposed for sequence-to-sequence translation, where they made a significant difference by allowing networks to identify which words are more relevant to the $t$th word in the translation (i.e., soft alignment), and to put more weight on them even if they are far apart from the word that is…
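    As a minimal sketch of the soft alignment idea, the snippet below computes attention weights for one query over a few keys. It assumes dot-product scoring for brevity, which is only one of several scoring functions used in practice; the names `attend`, `keys`, and `values` are illustrative and not taken from the post.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Dot-product attention for a single query (assumed scoring function)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The context vector is the weighted average of the values.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Toy source "words" (keys) and their contents (values).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]

# Two different queries put their weight on different source positions.
w_a, ctx_a = attend([4.0, 0.0], keys, values)
w_b, ctx_b = attend([0.0, 4.0], keys, values)
print(w_a, ctx_a)
print(w_b, ctx_b)
```

    The weights are recomputed for every query, so the same source item can receive a large weight for one output position and a small one for another, regardless of how far apart the positions are.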

  • Deep Learning

    LSTM Examples #1: Basic Time Series Prediction

    The results above illustrate how LSTM works with continuous input/output on simple 1D time series prediction tasks. We aim to estimate simple shapes by using past data. We follow increasingly complex scenarios. Results show the limitations of continuous time-series prediction via LSTM; the last slides show that even simple shapes cannot be accurately predicted. But the code is helpful to illustrate the basics of time-series prediction. Below we provide all the commands and the Python script needed to generate…

  • Deep Learning

    Self-supervision on Deep Nets

    Arguably, the main reason that deep nets became so powerful is self-supervision. In many domains, from images to text to DNA analysis, the concept of self-supervision was sufficient to generate practically infinite “labelled” data for training deep models. The idea is simple yet extremely powerful: just hide some parts of the (unlabelled) data and turn the hidden parts into the labels to predict. Here are some notes (mostly to myself) about self-supervision. There are two standard ways to do self-supervision:…
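    The "hide and predict" idea can be sketched in a few lines. The example below is only an illustration on a toy token list: the `[MASK]` placeholder, the masking probability, and the function name `make_masked_example` are assumptions, not details from the post.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a hidden token

def make_masked_example(tokens, mask_prob=0.3, rng=None):
    """Hide random tokens; the hidden tokens become the labels to predict."""
    if rng is None:
        rng = random.Random(0)
    inputs, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels[i] = tok  # position -> original token
        else:
            inputs.append(tok)
    return inputs, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
inputs, labels = make_masked_example(tokens, mask_prob=0.3, rng=random.Random(42))
print(inputs)
print(labels)
```

    No manual annotation is involved: every unlabelled sequence yields a training pair, and re-sampling the mask yields new pairs from the same sequence.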

  • Computer Vision,  Research

    State of the art in 3D face reconstruction may be wrong

    Our IJCB’23 study exposed a significant problem with the benchmark procedures of 3D face reconstruction, something that should make any researcher in the field worry. That is, we showed that the standard metric for evaluating 3D face reconstruction methods, namely geometric error via Chamfer (i.e., nearest-neighbor) correspondence, is highly problematic. Results showed that the Chamfer error not only significantly underestimates the true error, but does so inconsistently across reconstruction methods, so the ranking between methods can be artificially…
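    A toy example helps to see why nearest-neighbor correspondence can underestimate the error. The sketch below is not the evaluation protocol of the study; it assumes a one-directional nearest-neighbor error and a made-up point set in which point `i` of the reconstruction truly corresponds to point `i` of the ground truth.

```python
import math  # math.dist requires Python 3.8+

def chamfer_error(recon, truth):
    """Mean distance from each reconstructed point to its nearest truth point."""
    return sum(min(math.dist(p, q) for q in truth) for p in recon) / len(recon)

def true_error(recon, truth):
    """Mean distance under the known correspondence (index i <-> index i)."""
    return sum(math.dist(p, q) for p, q in zip(recon, truth)) / len(recon)

# Hypothetical ground-truth points lying on a line.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
# Hypothetical reconstruction: every point has slid one unit along the surface.
recon = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

print("true error:   ", true_error(recon, truth))
print("chamfer error:", chamfer_error(recon, truth))
```

    Each reconstructed point is off by one unit, yet two of the three points land exactly on some other ground-truth point, so the nearest-neighbor error is much smaller than the true one. Because the nearest neighbor is never farther than the true correspondent, this kind of error can only be underestimated, never overestimated.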

  • Computer Vision,  Research,  Software

    3DI: Face Reconstruction via Inequality Constraints

    We present the CUDA code of our optimization-based 3DMM fitting method (i.e., no learning), which was first presented in ECCV’20. The journal version of the paper (currently under revision) presented a significant speedup with a new optimization approach, making the method feasible for real-life applications. 3DI is an optimization-based 3DMM fitting (3D reconstruction) method that enforces inequality constraints on 3DMM parameters and landmarks, thereby significantly restricting the search space and ruling out implausible solutions (Figure 1). 3DI is not a…

  • Research,  Signal Processing,  Software

    SyncRef: Fast & Scalable Way to Find Synchronized Time Series

    In CVPR’20, we presented a new method (with code) for finding the largest subset of synchronized time series from a given set of time series. Specifically, we aim to find the largest subset of time series such that all pairs in the subset are correlated by at least a (given) threshold value $\rho$. This is an NP-hard problem and the exact solution is, in general, infeasible. We propose a new method, called SyncRef, for finding an approximate solution in an efficient…
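    To make the problem statement concrete, the sketch below solves it by exhaustive search on a tiny made-up data set. This is explicitly not the SyncRef algorithm: it is the brute-force baseline whose exponential cost motivates an approximate method. It assumes Pearson correlation as the correlation measure and non-constant series; all function names are illustrative.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def is_synchronized(series, subset, rho):
    """True if every pair in the subset is correlated by at least rho."""
    return all(pearson(series[i], series[j]) >= rho
               for i, j in combinations(subset, 2))

def largest_synchronized_subset(series, rho):
    """Exhaustive search, largest subsets first; exponential in len(series)."""
    indices = range(len(series))
    for size in range(len(series), 0, -1):
        for subset in combinations(indices, size):
            if is_synchronized(series, subset, rho):
                return list(subset)
    return []

# Made-up data: series 0, 1 and 3 rise together, series 2 falls.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 6.0, 8.2, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [1.1, 1.9, 3.2, 3.9, 5.1],
]
best = largest_synchronized_subset(series, rho=0.9)
print(best)
```

    With four series the search is instant, but the number of candidate subsets doubles with every added series, which is why the exact solution does not scale.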

  • Computer Vision,  Research

    Is Pose & Expression Separable with WP Camera?

    This study on facial analysis pipelines shows the limitations of using a Weak Perspective (WP) camera when it comes to decoupling pose and expression from face videos. That is, decoupling facial pose and expression within images requires a camera model for 3D-to-2D mapping when done through 3D reconstruction. The weak perspective (WP) camera has been the most popular choice; it is the default, if not the only option, in state-of-the-art facial analysis methods and software. The WP camera is justified…
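    For readers unfamiliar with the two camera models, the sketch below compares a full perspective projection with a weak perspective one on a made-up three-point "face". It is only an illustration of the camera models, not the analysis of the study: rotation is omitted, the point coordinates are hypothetical, and the focal length is set equal to the distance so that the face has roughly the same image size in both cases.

```python
import math  # math.dist requires Python 3.8+

def perspective(points, f):
    """Full perspective: each point is divided by its own depth."""
    return [(f * x / z, f * y / z) for x, y, z in points]

def weak_perspective(points, f):
    """Weak perspective: all points share one scale, based on the mean depth."""
    z_mean = sum(p[2] for p in points) / len(points)
    s = f / z_mean
    return [(s * x, s * y) for x, y, _ in points]

def max_gap(points, f):
    """Largest 2D disagreement between the two camera models."""
    pp = perspective(points, f)
    wp = weak_perspective(points, f)
    return max(math.dist(a, b) for a, b in zip(pp, wp))

def face_at(distance):
    """Hypothetical face: two points at the same depth, one 0.05 units closer."""
    return [(-0.05, 0.0, distance),
            (0.05, 0.0, distance),
            (0.02, -0.03, distance - 0.05)]

near = max_gap(face_at(0.3), f=0.3)  # face close to the camera
far = max_gap(face_at(3.0), f=3.0)   # face far from the camera
print(near, far)
```

    The weak perspective model ignores per-point depth, so its disagreement with full perspective grows as the depth variation of the face becomes large relative to the distance from the camera.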