DFKI-LT - In-Memory Distributed Training of Linear-Chain Conditional Random Fields, with an Application to Fine-Grained Named Entity Recognition
Springer Lecture Notes in Artificial Intelligence, Berlin, Germany, Springer International Publishing, 1/2018
Recognizing fine-grained named entities, e.g., "street" and "city" instead of just the coarse type "location", has been shown to increase task performance in several contexts. Fine-grained types, however, amplify the problem of data sparsity during training, which is why larger amounts of training data are needed. In this contribution, we address the scalability issues caused by these larger training sets. We distribute and parallelize feature extraction and parameter estimation in linear-chain conditional random fields, which are a popular choice for sequence labeling tasks such as named entity recognition (NER) and part-of-speech (POS) tagging. To this end, we employ the parallel stream processing framework Apache Flink, which supports in-memory distributed iterations. Due to this feature, and contrary to prior approaches, our system becomes iteration-aware during gradient descent. We experimentally demonstrate the scalability of our approach and also validate the parameters learned during distributed training in a fine-grained NER task.
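The idea of distributing parameter estimation for a linear-chain CRF can be illustrated with a minimal, single-process sketch. It is not the authors' system and does not use Apache Flink: a plain Python loop over data partitions stands in for the distributed workers, and a simple sum stands in for the reduce step. The toy model (one emission weight per label/token pair, one transition weight per label pair), the toy data, and all function names (`nll_and_grad`, `partition_grad`, `train`) are assumptions made for illustration only.

```python
import math

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def zeros(K, V):
    # E[y][x]: emission weights, T[p][y]: transition weights (hypothetical layout)
    return {"E": [[0.0] * V for _ in range(K)], "T": [[0.0] * K for _ in range(K)]}

def add_into(acc, g):
    for name in ("E", "T"):
        for row_acc, row_g in zip(acc[name], g[name]):
            for j, v in enumerate(row_g):
                row_acc[j] += v

def nll_and_grad(params, xs, ys, K, V):
    """Negative log-likelihood of one sequence and its gradient
    (expected feature counts minus observed counts), via forward-backward."""
    E, T = params["E"], params["T"]
    n = len(xs)
    alpha = [[E[y][xs[0]] for y in range(K)]]
    for t in range(1, n):
        alpha.append([E[y][xs[t]] + logsumexp([alpha[t - 1][p] + T[p][y] for p in range(K)])
                      for y in range(K)])
    beta = [[0.0] * K for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [logsumexp([T[y][q] + E[q][xs[t + 1]] + beta[t + 1][q] for q in range(K)])
                   for y in range(K)]
    log_z = logsumexp(alpha[-1])
    grad, score = zeros(K, V), 0.0
    for t in range(n):
        score += E[ys[t]][xs[t]]
        grad["E"][ys[t]][xs[t]] -= 1.0
        for y in range(K):
            grad["E"][y][xs[t]] += math.exp(alpha[t][y] + beta[t][y] - log_z)
        if t > 0:
            score += T[ys[t - 1]][ys[t]]
            grad["T"][ys[t - 1]][ys[t]] -= 1.0
            for p in range(K):
                for y in range(K):
                    grad["T"][p][y] += math.exp(
                        alpha[t - 1][p] + T[p][y] + E[y][xs[t]] + beta[t][y] - log_z)
    return log_z - score, grad

def partition_grad(params, partition, K, V):
    # Work done independently per partition (would run on one worker).
    loss, grad = 0.0, zeros(K, V)
    for xs, ys in partition:
        l, g = nll_and_grad(params, xs, ys, K, V)
        loss += l
        add_into(grad, g)
    return loss, grad

def train(partitions, K, V, steps=30, lr=0.1):
    params, losses = zeros(K, V), []
    n_seqs = sum(len(p) for p in partitions)
    for _ in range(steps):
        results = [partition_grad(params, p, K, V) for p in partitions]  # "map"
        total = zeros(K, V)
        for _, g in results:                                             # "reduce"
            add_into(total, g)
        losses.append(sum(l for l, _ in results))
        for name in ("E", "T"):                                          # one update per iteration
            for row_p, row_g in zip(params[name], total[name]):
                for j in range(len(row_p)):
                    row_p[j] -= lr * row_g[j] / n_seqs
    return params, losses

# Toy data: 2 labels, 3 token ids; token 1 always carries label 1.
K, V = 2, 3
partitions = [
    [([0, 1, 2], [0, 1, 0]), ([1, 0, 2], [1, 0, 0])],
    [([0, 0, 1], [0, 0, 1]), ([2, 1, 0], [0, 1, 0])],
]
params, losses = train(partitions, K, V)
```

Because the negative log-likelihood is a sum over training sequences, the per-partition gradients can be computed independently and added before each parameter update; only the current parameters and the summed gradient need to be exchanged between iterations.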
Files: BibTeX, in-memory-distributed-training-gscl-2017.pdf, 978-3-319-73706-5_13