In-Memory Distributed Training of Linear-Chain Conditional Random Fields, with an Application to Fine-Grained Named Entity Recognition

Robert Schwarzenberg, Leonhard Hennig, Holmer Hemsen

In: Springer Lecture Notes in Artificial Intelligence. Conference of the German Society for Computational Linguistics and Language Technology (GSCL-2017), September 13-14, Berlin, Germany. Springer International Publishing, 1/2018.


Recognizing fine-grained named entities, e.g. "street" and "city" instead of just the coarse type "location", has been shown to increase task performance in several contexts. Fine-grained types, however, amplify the problem of data sparsity during training, which is why larger amounts of training data are needed. In this contribution, we address scalability issues caused by the larger training sets. We distribute and parallelize feature extraction and parameter estimation in linear-chain conditional random fields, which are a popular choice for sequence labeling tasks such as named entity recognition (NER) and part-of-speech (POS) tagging. To this end, we employ the parallel stream processing framework Apache Flink, which supports in-memory distributed iterations. Due to this feature, and in contrast to prior approaches, our system becomes iteration-aware during gradient descent. We experimentally demonstrate the scalability of our approach and also validate the parameters learned during distributed training in a fine-grained NER task.


Further Links

in-memory-distributed-training-gscl-2017.pdf (pdf, 149 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence