Parallel Mining for Frequent Fragments on a Shared-Memory Multiprocessor - Results and Java-Obstacles -

Thorsten Meinl, Ingrid Fischer, Michael Philippsen

Abstract

Although in the last years about a dozen sophisticated algorithms for mining frequent subgraphs have been proposed, it still takes too long to search big databases with 100,000 graphs and more. Even the currently fastest algorithms like gSpan, FFSM, Gaston, or MoFa need hours to complete their tasks. This paper presents a thread-based parallel version of MoFa, [5] that achieves a speedup of about 7 on a shared-memory SMP system equipped with 12 processors. We discuss the design space of the parallelization, the results, the obstacles, that are caused by the irregular search space and by the current state of Java technology, and reason about ways to achieve even better speedups in future.

[article]