3D Common Corruptions and Data Augmentation



Oğuzhan Fatih Kar (EPFL)

Oğuzhan Fatih Kar is a Ph.D. student in computer science at the Swiss Federal Institute of Technology Lausanne (EPFL), advised by Amir Zamir. He received his B.S. and M.S. degrees in electrical engineering from METU, Turkey, in 2017 and 2019, respectively. His research focuses on building robust and general visual perception systems that can operate in the real world.



Short Abstract: Computer vision models deployed in the real world will encounter naturally occurring distribution shifts from their training data. These shifts range from lower-level distortions, such as motion blur and illumination changes, to semantic ones, like object occlusion. Each of them represents a possible failure mode of a model and has frequently been shown to result in profoundly unreliable predictions. Thus, understanding how models fail under these shifts and developing better robustness mechanisms are critical before deploying these models in the real world. Our work presents a set of image transformations that can be used both as corruptions to evaluate the robustness of models and as data augmentation mechanisms for training neural networks. The primary distinction of the proposed transformations is that, unlike existing approaches such as Common Corruptions, they incorporate the geometry and semantics of the scene, leading to corruptions that are more likely to occur in the real world. In this talk, I will discuss several properties of these transformations: they are "efficient" (can be computed on-the-fly), "extendable" (can be applied to most image datasets), expose vulnerabilities of existing models, and can effectively make models more robust when employed as "3D data augmentation" mechanisms.