DFKI-LT - A DevOps Manifesto for Speech Corpus Management
28th Conference on Electronic Speech Signal Processing (ESSV)
In this paper, we introduce certain concepts from the DevOps philosophy, and more generally from the software development lifecycle. We argue that the separation between source code and how it is built and released for distribution can be applied to speech corpora as well. We draw a distinction between the developers and maintainers of a speech corpus on the one hand, and the researchers who use it on the other. We propose conventions for efficiently managing corpus metadata in the same way as source code, and speech data in the same way as static assets that can be retrieved automatically. Finally, we mention several use cases which illustrate the merits of these conventions.
Files: BibTeX, Steiner_2.pdf