Efficient Data Processing for Deep Learning

The astonishing speed of developments in deep learning, especially in computer vision, is largely based on the availability of increasingly powerful co-processors. With each new hardware generation, they make it possible to process ever larger data sets, to test more hypotheses in less time, and to increase the complexity of the models under consideration. The most frequently used co-processors are Graphics Processing Units (GPUs) manufactured by NVIDIA. However, the undeniable successes of recent years are increasingly accompanied by problems in practice. While GPUs gain significant speed from generation to generation (e.g. through further parallelization or the integration of special computing units for deep learning), the speed of main processors (Central Processing Units; CPUs) and hard disks is stagnating. When training artificial neural networks, this growing speed imbalance manifests itself in very low GPU utilization (often below 20% when the recommended procedures are followed).

Such low utilization almost always results from long waiting times for new input data. While modern GPUs could easily process hundreds or even thousands of images per second during training, frameworks such as PyTorch and TensorFlow deliver fewer than 100 images per second per CPU core. The recommendation to close this gap with additional CPU resources often fails even today because too few CPU cores are available per compute node (server) and thus per GPU, and a further increase in GPU density per node is already foreseeable. One of EDeL's goals is therefore to develop software solutions that achieve the highest possible data throughput per CPU core for efficient training of deep learning models. This is to be achieved both through optimized software and by offloading computations that currently run on CPUs to GPUs. This will reduce idle times and thus increase the efficiency of training machine learning methods.
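The core idea behind reducing GPU idle time is to overlap data preparation with compute instead of running them serially. The following is a minimal, framework-free sketch of that prefetching pattern using only the Python standard library; the function names, the bounded queue size, and the sleep-based stand-ins for decoding and training work are illustrative assumptions, not part of the project's actual software.

```python
import queue
import threading
import time

def loader(n_batches, out_q):
    """Simulated CPU-side loading: each batch takes time to prepare."""
    for i in range(n_batches):
        time.sleep(0.01)          # stand-in for decode/augment work on the CPU
        out_q.put(i)
    out_q.put(None)               # sentinel: no more batches

def train(n_batches):
    """Consume batches while the next ones are prepared in the background."""
    q = queue.Queue(maxsize=4)    # bounded buffer keeps memory use in check
    threading.Thread(target=loader, args=(n_batches, q), daemon=True).start()
    seen = []
    while (batch := q.get()) is not None:
        time.sleep(0.01)          # stand-in for the training step on the GPU
        seen.append(batch)
    return seen

print(train(8))  # batches arrive in order while later ones load in parallel
```

With loading and compute overlapped, the wall-clock time approaches the slower of the two stages rather than their sum; real data loaders (e.g. PyTorch's `DataLoader` with worker processes) apply the same principle at larger scale.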

Another aspect that stands in the way of efficient hardware use is the large variety of data sets and their formats. Even very similar data sets are often delivered in different formats, which makes it impossible to exchange them without adapting the training program. In addition, these formats can usually only be read at insufficient speed or at high CPU load, which again prevents full utilization of the GPUs. Another main goal of EDeL is therefore the development of a unified storage format for data sets that makes them exchangeable and, above all, quickly loadable with low CPU load.
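To make concrete why a purpose-built container can be read with low CPU cost, here is a hypothetical toy format (not the actual EDeL format): a record count, a fixed-size offset table, then the raw record bytes. The offset table gives O(1) random access to any sample without scanning or parsing; all names and layout choices below are assumptions for illustration.

```python
import os
import struct
import tempfile

def write_dataset(path, records):
    """Write a list of byte records preceded by count and offset table."""
    index, offset = [], 0
    for rec in records:
        index.append((offset, len(rec)))
        offset += len(rec)
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(records)))      # record count
        for off, length in index:
            f.write(struct.pack("<QQ", off, length))  # fixed-size index entry
        for rec in records:
            f.write(rec)                              # payload, back to back

def read_record(path, i):
    """Fetch record i by seeking straight to its offset -- no scanning."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(8 + 16 * i)                            # i-th index entry
        off, length = struct.unpack("<QQ", f.read(16))
        f.seek(8 + 16 * n + off)                      # payload start + offset
        return f.read(length)

# Tiny round-trip demo with three toy records.
path = os.path.join(tempfile.gettempdir(), "edel_demo.bin")
write_dataset(path, [b"cat", b"dog", b"bird"])
print(read_record(path, 2))  # b'bird'
```

Because each read is a seek plus a raw copy, per-sample CPU work stays near zero; production formats would additionally memory-map the file and handle metadata, but the access pattern is the same.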

The planned developments are to be made available to the widest possible audience from research and industry. To this end, the resulting libraries are to be released under an open source software license, as is customary in the deep learning sector.



BMBF - Federal Ministry of Education and Research



Publications about the project

Andrey Guzhov; Federico Raue; Jörn Hees; Andreas Dengel

In: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings. International Conference on Acoustics, Speech and Signal Processing (ICASSP-2022), May 7-13, Singapore, Singapore, Pages 976-980, ISBN 978-1-6654-0540-9, IEEE, 5/2022.


Joachim Folz; Sebastian Palacio; Jörn Hees; Andreas Dengel

In: WACV. IEEE Winter Conference on Applications of Computer Vision (WACV-2020), March 2-5, Snowmass Village, Colorado, USA, IEEE, 2020.
