Relational Sequence Alignments and Logos
Andreas Karwath; Kristian Kersting
In: Stephen H. Muggleton; Ramón P. Otero; Alireza Tamaddoni-Nezhad (Eds.). Inductive Logic Programming, 16th International Conference. International Conference on Inductive Logic Programming (ILP-2006), August 24-27, Santiago de Compostela, Spain, Pages 290-304, Lecture Notes in Computer Science, Vol. 4455, Springer, 2006.
The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar they are, it also shows where they differ and where they correspond. Traditionally, alignment has been considered for sequences of flat symbols only. Many real-world sequences, such as natural language sentences and protein secondary structures, however, exhibit rich internal structure. This is akin to the problem of dealing with structured examples studied in the field of inductive logic programming (ILP). In this paper, we introduce a powerful yet simple approach that aligns sequences of structured symbols by using well-established ILP distance measures within traditional alignment methods. Although the approach is straightforward, experiments on protein data and Medline abstracts show that it works well in practice, that the resulting alignments can indeed provide more information than flat ones, and that they are meaningful to experts when represented graphically.
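The idea of plugging a distance on structured symbols into a traditional alignment method can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: symbols are ground atoms encoded as tuples, `atom_distance` is a simple hypothetical mismatch-based distance standing in for an ILP distance measure, `align` is a standard global alignment by dynamic programming (Needleman-Wunsch style, minimizing cost), and the gap cost of 0.6 and the example atoms are made up.

```python
from typing import List, Optional, Tuple

# A structured symbol: a ground atom encoded as (predicate, arg1, arg2, ...).
Atom = Tuple[str, ...]


def atom_distance(a: Atom, b: Atom) -> float:
    """Hypothetical distance in [0, 1] between two ground atoms.

    Identical atoms have distance 0; atoms with a different predicate or
    arity have distance 1; otherwise the distance grows with the number
    of mismatching arguments (illustrative stand-in for an ILP distance).
    """
    if a == b:
        return 0.0
    if a[0] != b[0] or len(a) != len(b):
        return 1.0
    n_args = len(a) - 1
    mismatches = sum(x != y for x, y in zip(a[1:], b[1:]))
    return mismatches / (2 * n_args)


def align(
    s: List[Atom], t: List[Atom], gap: float = 0.6
) -> Tuple[float, List[Tuple[Optional[Atom], Optional[Atom]]]]:
    """Global alignment minimizing total substitution and gap cost."""
    m, n = len(s), len(t)
    cost = [[0.0] * (n + 1) for _ in range(m + 1)]
    # Borders are accumulated so the traceback can recompute them exactly.
    for i in range(1, m + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, n + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + atom_distance(s[i - 1], t[j - 1]),
                cost[i - 1][j] + gap,  # s[i-1] aligned with a gap
                cost[i][j - 1] + gap,  # t[j-1] aligned with a gap
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Tuple[Optional[Atom], Optional[Atom]]] = []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j]
                == cost[i - 1][j - 1] + atom_distance(s[i - 1], t[j - 1])):
            pairs.append((s[i - 1], t[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + gap):
            pairs.append((s[i - 1], None))
            i -= 1
        else:
            pairs.append((None, t[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[m][n], pairs


# Made-up sequences of secondary-structure-like atoms.
s = [("helix", "alpha", "long"), ("strand", "plus", "medium"),
     ("helix", "alpha", "short")]
t = [("helix", "alpha", "medium"), ("helix", "alpha", "short")]

total, pairs = align(s, t)
for left, right in pairs:
    print(left, "<->", right)
```

In this toy example the two helix atoms that differ only in their length argument are aligned at a small cost, whereas a flat comparison would treat them as entirely different symbols. That partial credit for shared structure is the kind of extra information a structured alignment can carry.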