Journal of Multimedia, Vol. 4, No. 5 (October 2009), pp. 313-320
doi:10.4304/jmm.4.5.313-320

A Multimodal Data Mining Framework for Revealing Common Sources of Spam Images

Chengcui Zhang, Wei-Bang Chen, Xin Chen, Richa Tiwari, Lin Yang, Gary Warner

Abstract


This paper proposes a multimodal framework that clusters spam images so that images originating from the same spam source are grouped together. Identifying the common sources of spam images can provide evidence for tracking spam gangs. To this end, the framework performs text recognition and visual feature extraction on each image. A two-level clustering method is then applied: images with visually similar illustrations are first grouped together, and the first-level result is then refined using the textual clues (where available) contained in the spam images. Experimental results show the effectiveness of the proposed framework.
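
The two-level idea can be illustrated with a minimal Python sketch. The abstract does not state which features, distance measures, or clustering algorithms the framework uses, so everything concrete here is an assumption made for illustration: the `SpamImage` record, the greedy threshold clustering, Euclidean distance on a toy visual feature vector, word-set Jaccard similarity on the recognized text, and both threshold values.

```python
from dataclasses import dataclass
from math import dist
from typing import Optional


@dataclass
class SpamImage:
    name: str
    visual: tuple[float, ...]    # hypothetical visual feature vector
    text: Optional[str] = None   # recognized text, None if none was found


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity between two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def greedy_cluster(items, is_similar):
    """Add each item to the first cluster whose first member it resembles."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if is_similar(cluster[0], item):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def two_level_cluster(images, visual_eps=0.5, text_min_sim=0.5):
    # Level 1: group images whose visual features are close together.
    level1 = greedy_cluster(
        images, lambda a, b: dist(a.visual, b.visual) <= visual_eps
    )
    refined = []
    for group in level1:
        with_text = [img for img in group if img.text]
        without_text = [img for img in group if not img.text]
        # Level 2: split a visual group by text similarity where text exists.
        refined.extend(
            greedy_cluster(
                with_text, lambda a, b: jaccard(a.text, b.text) >= text_min_sim
            )
        )
        # Images with no recognized text keep their level-1 grouping.
        if without_text:
            refined.append(without_text)
    return refined


images = [
    SpamImage("a.gif", (0.10, 0.10), "cheap watches online"),
    SpamImage("b.gif", (0.20, 0.10), "cheap watches online now"),
    SpamImage("c.gif", (0.15, 0.12), "hot stock alert buy"),
    SpamImage("d.gif", (5.00, 5.00)),
]
for cluster in two_level_cluster(images):
    print([img.name for img in cluster])
```

In this toy input, the first three images are visually close and would share a first-level cluster; the textual refinement then separates the third image because its text has no words in common with the other two. The image without recognized text is left in its visual cluster, mirroring the "where available" condition in the abstract.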



Keywords


spam image, clustering, multimodal analysis, botnet, computer forensics



Journal of Multimedia (JMM, ISSN 1796-2048)

Copyright © 2006-2011 by ACADEMY PUBLISHER. All rights reserved.