Journal of Software, Vol 6, No 10 (2011), 1937-1944, Oct 2011
doi:10.4304/jsw.6.10.1937-1944

Multi-dimensional k-anonymity Based on Mapping for Protecting Privacy

Qian Wang, Cong Xu, Min Sun

Abstract


Releasing data without any protection policy carries a risk of privacy disclosure. Although attributes that clearly identify individuals, such as Name and Identity Number, are generally removed or encrypted, attackers can still link a released database with other released databases on shared attributes (quasi-identifiers) to re-identify individuals' private information. K-anonymity is an important method for privacy protection in microdata release. However, finding an optimal k-anonymization of a dataset with multiple attributes is an NP-hard problem. Most partitioning schemes currently used for k-anonymity are single-dimensional. Research on k-anonymity focuses on achieving high-quality anonymization while reducing time complexity. This paper proposes a new multi-dimensional k-anonymity algorithm based on mapping and a divide-and-conquer strategy. Multi-dimensional data are mapped to a single dimension, and k-anonymity on multiple attributes is then achieved in polynomial time using the divide-and-conquer strategy. The dimension to divide on is chosen by priority according to information dependency, which significantly reduces information loss. Experiments show that the proposed algorithm is feasible and performs much better in k-anonymization.
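The general idea of mapping followed by divide-and-conquer can be illustrated with a minimal sketch. It is not the algorithm from the paper: the abstract does not specify the mapping function or the dependency-based dimension selection, so the mixed-radix key in `map_to_1d`, the midpoint split in `partition`, and the range generalization in `generalize` are all assumptions chosen only for illustration, and every function name is hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Record = Tuple[int, ...]


def is_k_anonymous(rows: Sequence[tuple], k: int) -> bool:
    """True if every distinct quasi-identifier tuple occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())


def map_to_1d(record: Record, widths: Sequence[int]) -> int:
    """Assumed mapping: mixed-radix encoding of a record into one integer.

    Requires 0 <= value < width for each attribute. This is a stand-in,
    not the mapping defined in the paper.
    """
    key = 0
    for value, width in zip(record, widths):
        key = key * width + value
    return key


def partition(ordered: List[Record], k: int) -> List[List[Record]]:
    """Divide and conquer: halve the ordered list while both halves keep >= k rows."""
    n = len(ordered)
    if n < 2 * k:
        return [ordered]
    mid = n // 2
    return partition(ordered[:mid], k) + partition(ordered[mid:], k)


def generalize(group: List[Record]) -> List[tuple]:
    """Replace each attribute with the (min, max) range observed in the group."""
    dims = len(group[0])
    ranges = tuple(
        (min(r[d] for r in group), max(r[d] for r in group)) for d in range(dims)
    )
    return [ranges] * len(group)


def anonymize(rows: Sequence[Record], widths: Sequence[int], k: int) -> List[tuple]:
    """Sort by the 1-D key, split into groups of at least k rows, generalize each."""
    if len(rows) < k:
        raise ValueError("need at least k records to reach k-anonymity")
    ordered = sorted(rows, key=lambda r: map_to_1d(r, widths))
    result: List[tuple] = []
    for group in partition(ordered, k):
        result.extend(generalize(group))
    return result


# Toy quasi-identifiers (age, zip-code bucket); the values are made up.
rows = [(25, 3), (27, 3), (26, 4), (40, 1), (41, 1), (43, 2)]
released = anonymize(rows, widths=(100, 10), k=2)
```

In this sketch each group ends up with between k and 2k - 1 records, so the released table satisfies k-anonymity on the generalized attributes. How much information is lost depends on how well the 1-D ordering keeps similar records adjacent, which is the aspect the paper addresses through its choice of mapping and dimension priority.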


Keywords


privacy protection; k-anonymity; multi-dimension; mapping; partitioning




Journal of Software (JSW, ISSN 1796-217X)

Copyright © 2006-2013 by ACADEMY PUBLISHER. All rights reserved.