What actually are DeepFakes, Prof. Krüger?

07/14/2022 | Cognitive Assistants

There is currently a lot of talk in the media about so-called DeepFakes, i.e., content manipulated with the help of artificial intelligence. In this interview, Prof. Dr. Antonio Krüger, CEO of DFKI, explains what DeepFakes are and how to recognize and counter them.

1. What are DeepFakes?
The term DeepFake is a portmanteau of "deep," which refers to Deep Learning, an AI technique based on artificial neural networks, and "fake." DeepFakes are media content, such as audio, photos, and video, that has been manipulated using artificial intelligence methods to a quality at which the fake is difficult or even impossible to recognize as such.

Probably the best-known variant of DeepFakes is the so-called face swap. This means exchanging the face of a source person with the face of a target person in a picture or video. It can be used to attribute statements and actions to people who never made or performed them.

2. How are such DeepFakes designed and programmed?
AI systems for generating face swaps refine the fake incrementally using Deep Learning methods with artificial neural networks. In this process, a so-called encoder reads and learns a person's face by analyzing image material based on various biometric parameters and decomposing it into feature vectors. These are then combined layer by layer to form a model. By constantly comparing the model of the face generated by the encoder with the original, the AI system gradually optimizes the result. This often involves so-called GANs (Generative Adversarial Networks), in which two neural networks compete with each other and thereby progressively improve the result. For really good forgeries, training takes 50,000 iterations or more. The decoder then generates the fake image or video by inserting the model of the face into the target image or video.
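The loop of "encode, rebuild, compare with the original, adjust" can be illustrated with a deliberately tiny sketch. It is not how a real face-swap system is built: the "images" here are made-up four-number vectors, the encoder and decoder are single linear layers, and there is no GAN. All names (`encode`, `decode`, `reconstruction_loss`, `train`) and all numbers are assumptions chosen for illustration only.

```python
# Toy illustration only: a linear encoder compresses each sample to one
# feature value, a linear decoder rebuilds the sample from that value, and
# gradient descent shrinks the difference between rebuild and original.

def encode(w, x):
    """Compress sample x to a single feature value."""
    return sum(wi * xi for wi, xi in zip(w, x))

def decode(v, z):
    """Rebuild a sample from the feature value z."""
    return [vi * z for vi in v]

def reconstruction_loss(w, v, samples):
    """Average squared difference between rebuilt and original samples."""
    total = 0.0
    for x in samples:
        x_hat = decode(v, encode(w, x))
        total += sum((h - xi) ** 2 for h, xi in zip(x_hat, x))
    return total / len(samples)

def train(samples, steps=500, lr=0.05):
    """Repeatedly compare the rebuild with the original and adjust weights."""
    dim = len(samples[0])
    w = [0.1] * dim  # encoder weights (hypothetical starting values)
    v = [0.1] * dim  # decoder weights (hypothetical starting values)
    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_v = [0.0] * dim
        for x in samples:
            z = encode(w, x)
            err = [h - xi for h, xi in zip(decode(v, z), x)]
            # Chain rule: how the loss changes with the feature value z.
            dz = 2.0 * sum(e * vi for e, vi in zip(err, v))
            for j in range(dim):
                grad_v[j] += 2.0 * err[j] * z / len(samples)
                grad_w[j] += dz * x[j] / len(samples)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        v = [vi - lr * g for vi, g in zip(v, grad_v)]
    return w, v

# Made-up "images": scaled copies of one pattern, so one feature suffices.
pattern = [1.0, 0.5, -0.5, 0.25]
samples = [[t * p for p in pattern] for t in (-1.0, -0.5, 0.5, 1.0)]

initial_loss = reconstruction_loss([0.1] * 4, [0.1] * 4, samples)
w, v = train(samples)
final_loss = reconstruction_loss(w, v, samples)
print("loss before training:", initial_loss)
print("loss after training:", final_loss)
```

Real systems use deep, nonlinear networks and far larger inputs, but the principle of iterating until the rebuild matches the original is the same.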

In the field of audio, AI technologies such as "Text-to-Speech" (TTS), a subset of "Natural Language Processing" (NLP), are already very advanced at imitating voices, and synthetic voices are becoming ever more similar to human ones. Several AI applications are already available ("Voice Mimicry," "Lyrebird," "Voice Cloning"...) that, for example, deceptively imitate the voices of real people and that anyone can try out at home.

3. How can DeepFakes be recognized?
In technical forensics, the higher the resolution of a faked image or video, the more likely it is that a person can recognize the fake with their own eyes and ears, as it were, on the basis of minimal artifacts and tiny errors, without the support of a computer. On social media, however, where DeepFakes are mainly spread, the videos and images are usually of relatively poor quality. Such low-resolution media content can be unmasked as a fake, if at all, only by special AI systems that are trained to do just that. Systems such as "Reality Defender" (AI Foundation) or "FaceForensics" (TU Munich) provide strong assistance to media forensic experts. They include additional parameters and metadata in their examination, such as spatial environment, voice, and time and place of publication.

Regardless of this, media should not be consumed passively and credulously; everyone should routinely check content for its cultural and factual plausibility.

4. What are the dangers associated with DeepFakes?
Humans are social creatures but not always honest. Publishing and spreading fake news and untrue claims is very old, probably as old as language itself. Humans are also, above all, audiovisual beings. When we hear what someone says or see what they do in a film sequence, we initially tend to find it credible.

Today, the distribution of news and media content is no longer solely in the hands of the classic media (print, TV, radio), whose overriding editorial control legitimized a presumption of truth to a certain extent. Through the Internet, especially social media, and through the rapid technical development of mobile devices, everyone is now able not only to create audiovisual content but also to distribute it with relevant reach. Apps such as "Reface," "Impressions," or "DeepFaceLab" are practically freely available and require only limited programming skills, so that theoretically anyone can generate and distribute DeepFakes.

In fact, fake media content can cause immense damage if it is used for social engineering, for manipulating opinion, and for influencing politics, business, and society. One can think of using it, for example, to manipulate elections or to discredit individuals and companies.

5. How can DeepFakes be prevented and countered in the context of disinformation?
This requires action on the scientific and technological, political, legal, and cultural levels. Technologically, work is being done in various ways to prevent DeepFakes or to facilitate their unmasking. In particular, the use of digital watermarks, blockchain technology, or the consistent certification of software is being examined. The EU Commission is working on the regulation of AI technology. Policymakers are looking to hold content distributors accountable to prevent a flood of DeepFakes. Companies such as Facebook, Microsoft, Google, and Amazon, as well as government institutions such as the U.S. Department of Defense, are investing large sums in developing DeepFake detection tools. Culturally, the aim is to teach media and information literacy in schools and to raise awareness among citizens so that, for example, election decisions cannot be influenced by fakes and false claims.
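To give a feel for the digital watermark idea mentioned above, here is a minimal sketch of one very simple variant, least-significant-bit embedding. It is an illustration under stated assumptions, not the scheme used by any particular product: the pixel values and the bit pattern are made up, the function names `embed_watermark` and `extract_watermark` are hypothetical, and a mark like this would not survive compression or resizing.

```python
# Toy illustration only: hide a short bit pattern in the lowest bit of each
# pixel value, then read it back to check whether the mark is still intact.

def embed_watermark(pixels, bits):
    """Return a copy of pixels with bits stored in the lowest bit of each."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the watermark")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked

def extract_watermark(pixels, length):
    """Read the lowest bit of the first `length` pixels."""
    return [p & 1 for p in pixels[:length]]

# Made-up 8-bit grayscale values and a made-up watermark pattern.
pixels = [200, 13, 77, 154, 90, 31, 255, 0]
mark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_watermark(pixels, mark)
recovered = extract_watermark(marked, len(mark))
print("watermark intact:", recovered == mark)

# Simulate manipulation of one pixel; the recovered pattern no longer matches.
tampered = list(marked)
tampered[1] = 13
print("after tampering:", extract_watermark(tampered, len(mark)) == mark)
```

Each pixel changes by at most one brightness step, so the mark is invisible to the eye, while a mismatch on extraction hints that the content was altered.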

6. How do DeepFakes affect trust in the media?
The line between permissible editing of media content and DeepFakes is blurred. Moderate editing of photos has been common and well known for years. However, the new dimension of DeepFakes makes fake news even more dangerous. It is to be expected that the credibility of media content will suffer and that trust in media in general will continue to erode. Striking statements by public figures such as politicians will then be subject to a general caveat. Conversely, if they were filmed making sexist or racist remarks or lying, for example, they could always claim it was a DeepFake. But this is where artificial intelligence will be able to make an effective contribution and support forensic experts in detecting media fakes.

Prof. Krüger, thank you very much for the interview!

Press contact:

DFKI Unternehmenskommunikation

communications@dfki.de