Journal of Computers, Vol 5, No 4 (2010), 565-572, Apr 2010
doi:10.4304/jcp.5.4.565-572
Mining Top-K Graph Patterns that Jointly Maximize Some Significance Measure
Abstract
Most of graph pattern mining algorithms focus on finding frequent subgraphs and its compact representations, such as closed frequent subgraphs and maximal frequent subgraphs. However, little attention has been paid to mining graph patterns with user-specified significance measure. In this paper, we study a new problem of mining top-k graph patterns that jointly maximize some significance measure from graph databases. Exploiting entropy and information gain, we give two problem formulations, EM and IGM. We first prove them to be NP-hard and then propose two efficient algorithms, PP-TopK and DM-TopK, to solve them. PP-TopK greedily selects top-k graph patterns among frequent subgraphs. DM-TopK integrates the pruning techniques into the mining framework, and directly mines top-k graph patterns from graph databases. Empirical results demonstrate the quality of our top-k graph patterns, and validate the efficiency and scalability of our algorithms.
Keywords
graph mining;graph database;frequent subgraph;top-k
References
Full Text: PDF


