• DFKI

The VOICE Awards Corpus in Numbers

Number of dialogs and systems per year

Year    Dialogs  Systems
2005        213       34
2006        529       42
2007        452       26
2008        471       29
2009        305       18
Total      1970      120
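
The totals in the table above can be checked with a few lines of Python. This is only a sketch: the split of the fused digits into a dialog count and a system count (for example 213 and 34 for 2005) is an assumption, supported by the fact that the dialog column then adds up to the reported 1970. The variable names are illustrative and not part of the corpus.

```python
# Per-year figures read from the table above: year -> (dialogs, systems).
# The column split is an assumption about the flattened source table.
per_year = {
    2005: (213, 34),
    2006: (529, 42),
    2007: (452, 26),
    2008: (471, 29),
    2009: (305, 18),
}
reported_total = (1970, 120)  # the "Total" row as printed on the page

# Add up each column independently of the printed total.
dialog_sum = sum(dialogs for dialogs, _ in per_year.values())
system_sum = sum(systems for _, systems in per_year.values())

print(f"dialogs: sum={dialog_sum}, reported={reported_total[0]}")
print(f"systems: sum={system_sum}, reported={reported_total[1]}")
```

The dialog column matches the reported total. The per-year system counts add up to 149 rather than 120; the page does not explain the difference. One possible reading, which is a guess and not stated in the source, is that a system entered in more than one year is counted only once in the total.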

Domains

All systems in the VOICE Awards Corpus were hand-classified by domain along two axes: first, the kind of interaction taking place between the user and the system ("goal domain"), and second, the topic of the dialog system ("content domain").

Goal domain   Systems
Banking            23
Connect             4
Data entry         18
Game                8
Information        36
Order              18
Transit             8
Other              40
 
Content domain     Systems
Banking                 23
Cards                    5
Communication           11
Flight                   3
Hotel                    3
Insurance                2
Lotto                    2
Medical                  3
Meters                   3
Mobile phone            13
Movies                   3
News                     4
Package tracking         5
Prices                   2
Product ordering         3
Public service           4
Ringtones                3
Sports                   8
Taxi                     4
Traffic                  2
Transit                  9
TV                       4
University               1
Weather                  3
Other                   33

Comparing VOICE Awards with other corpora

With a total of 1970 dialogs, the VOICE Awards Corpus is the largest existing German human-machine dialog corpus. It combines the advantages of several other corpora: Verbmobil consists of dialogs but is a human-human corpus, while Smartweb is also a very large human-machine corpus but contains queries rather than dialogs. In addition, the VOICE Awards Corpus is the only human-machine dialog corpus that is annotated with dialog acts and a range of other information, such as miscommunication and success measures.