Advanced similarity measures using Word Embeddings and Siamese Networks in CBR

Kareem Amin, George Lancaster, Stelios Kapetanakis, Klaus-Dieter Althoff, Andreas Dengel, Miltos Petridis

In: Proceedings of the 2019 Intelligent Systems Conference (IntelliSys-2019), September 5-6, London, United Kingdom. Advances in Intelligent Systems and Computing, Springer, 9/2019.


Automatic fuzzy text processing, context extraction and disambiguation are three challenging research areas with high relevance to complex business domains. Business knowledge can be found in plain text message exchanges, emails, support tickets, internal chat messengers and other volatile means, making the decoding of text-based domain knowledge a challenging task. Traditional natural language processing approaches focus on a comprehensive representation of business knowledge and any relevant mappings. However, such approaches can be highly complex, costly and maintenance-heavy, especially in environments that experience frequent changes. This work applies LSTM Siamese Networks to measure text similarities in ambiguous domains. We implement the Manhattan LSTM (MaLSTM) Siamese neural network for semi-automatic acquisition of business knowledge and decoding of domain-relevant features that enable building similarity measures. Our aim is to minimize the effort required from human experts while extracting domain knowledge from rich text that contains context-free abbreviations, grammatically incorrect text and mixed language.
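As an illustration of the similarity function commonly associated with the Manhattan LSTM, the sketch below computes exp(-L1 distance) between two fixed-length text encodings. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name manhattan_similarity and the three example vectors are hypothetical, and the vectors stand in for the final hidden states that the shared LSTM encoder would produce for each text. The encoder itself and the word embeddings are not shown.

```python
import math
from typing import Sequence


def manhattan_similarity(h_left: Sequence[float], h_right: Sequence[float]) -> float:
    """Return exp(-||h_left - h_right||_1), a score in (0, 1].

    Identical encodings score 1.0; the score decays towards 0 as the
    L1 (Manhattan) distance between the two encodings grows.
    """
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same dimensionality")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


# Hypothetical encodings: in a Siamese setup both texts would pass through
# the same LSTM (shared weights) and these would be its final hidden states.
ticket_a = [0.20, -0.10, 0.40]
ticket_b = [0.25, -0.05, 0.30]   # assumed to be close in meaning to ticket_a
ticket_c = [-0.90, 0.80, -0.60]  # assumed to be unrelated to ticket_a

sim_ab = manhattan_similarity(ticket_a, ticket_b)
sim_ac = manhattan_similarity(ticket_a, ticket_c)
print(f"similar pair:   {sim_ab:.4f}")
print(f"unrelated pair: {sim_ac:.4f}")
```

Because the score is bounded between 0 and 1, it can be used directly as a local similarity value in a case-based reasoning retrieval step, assuming the encoder has been trained on labelled pairs from the target domain.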
