Alignment has long been considered by many to be a pillar of computer vision. Over the past decades, great progress has been made with systems that find a set of sparse correspondences between an image and a template model. The pipeline built around such systems involves a sequence of steps: first object detection, then classification, then sparse landmark registration, and finally 3D model registration, which yields the per-pixel correspondences between the image and the object of interest. This not only results in a complex system, but also makes it difficult to distil the information encapsulated in the different stages.
In this presentation I will describe an end-to-end trainable method that not only runs in real time, but also directly produces a dense registration of the object at hand. Due to its generic nature, it can easily be applied to an array of objects such as human faces and ears, and even to deformable objects with large pose variability such as human bodies. Finally, I will discuss possible future extensions of this work towards a more robust system using ideas from domain adaptation.
George Trigeorgis is currently a machine learning researcher with Citadel, London. George obtained his Ph.D. in machine learning from Imperial College London (2013-2017), during which he co-authored an array of papers on the unification of component analysis techniques with deep learning, with applications in 3D reconstruction, object alignment, facial biometrics, and speech emotion recognition. He published in top-tier venues of his field such as T-PAMI, NIPS, ICML & CVPR, and in 2017 he was awarded the prestigious Google fellowship for his contributions to machine perception, speech technology, and computer vision. More recently he has been working with the asset management group Citadel, where he helps build data-driven models that identify potential investment opportunities in the markets.