
Machine Learning with Synthetic Data - Research Unit Augmented Vision receives Best Paper Award at ICPR 2022

Scientists from the Research Unit Augmented Vision received an award for the best industry paper at the 26th International Conference on Pattern Recognition (ICPR 2022). Their work focuses on training learning systems with synthetic data.

Impossible Instance Extractor Triplet Autoencoder (II-E-TAE) model architecture.

The paper, titled "Autoencoder for Synthetic to Real Generalization: From Simple to More Complex Scenes" by Steve Dias Da Cruz, Prof. Didier Stricker, Dr. Bertram Taetz, and Thomas Stifter, examines training on purely synthetic data and discusses, with a focus on practical solutions, the potential of this approach in terms of cost efficiency and safety.

ICPR is the world's leading conference on pattern recognition, computer vision, and image processing. It took place in Montréal, Québec, from August 21 to 25, 2022.

Learning on synthetic data and transferring the resulting features to their real-world counterparts is an important challenge in machine learning, because it can save costs and significantly increase safety. Synthetic data is generated by a computer algorithm. Unlike the original data, it no longer contains any personal references. Nevertheless, synthetic data is designed to mirror the original data sets so that reliable statistical statements can still be made.

In the approach of the DFKI scientists, only synthetic images are used to train the algorithms. This increases generalizability, and the preservation of meaning on real data sets improves as visual complexity increases. In addition, a new sampling technique is introduced that matches the semantically important parts of the image while the remaining parts are selected at random. As a result, the model extracts salient features and neglects the unimportant parts. This facilitates generalization to real data, and the results show that the approach performs better than fine-tuned classification models.



    Press contact:

    Christian Heyer

    Head of Corporate Communications, DFKI Kaiserslautern