Privacy against Aggregate Knowledge Attacks
This paper focuses on protecting the privacy of individuals in publication scenarios where the attacker is expected to have only abstract or aggregate knowledge about each record. Whereas data privacy research usually focuses on defining stricter privacy guarantees that assume increasingly sophisticated attack scenarios, it is also important to have anonymization methods and guarantees that address any given attack scenario. Enforcing a stricter guarantee than required unnecessarily increases the information loss. Consider, for example, the publication of tax records, where attackers might know only the total income and not its constituent parts. Traditional anonymization methods would protect user privacy by creating equivalence classes of identical records. In this work, we instead propose an anonymization technique that generalizes attributes only as much as needed to guarantee that aggregate values over the complete record create equivalence classes of size at least k. The experimental evaluation on real data shows that the proposed method produces anonymized data that lie closer to the original data than those produced by traditional anonymization algorithms.
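To make the guarantee concrete, the following minimal sketch checks one possible reading of it. It assumes that each generalized attribute is a numeric interval, that the aggregate is a sum, and that two records fall in the same equivalence class when their aggregate intervals coincide. The function names and the sample tax records are hypothetical and serve only as illustration; this is not the paper's algorithm.

```python
from collections import Counter


def aggregate_range(record):
    """Sum the lower and upper bounds of a record's generalized attributes."""
    low = sum(lo for lo, _ in record)
    high = sum(hi for _, hi in record)
    return (low, high)


def is_aggregate_k_anonymous(records, k):
    """Return True if every aggregate range is shared by at least k records.

    Assumption: an equivalence class is the set of records with the same
    aggregate interval, which is one interpretation of the guarantee.
    """
    counts = Counter(aggregate_range(r) for r in records)
    return all(count >= k for count in counts.values())


# Hypothetical tax records: (salary interval, investment income interval).
generalized = [
    [(30, 40), (0, 10)],
    [(20, 30), (10, 20)],
    [(50, 60), (5, 5)],
    [(45, 55), (10, 10)],
]

print(is_aggregate_k_anonymous(generalized, 2))
print(is_aggregate_k_anonymous(generalized, 3))
```

In this toy data no two records are identical, yet each total-income interval is shared by two records. Under the stated assumptions, an attacker who knows only the total cannot narrow a target down to fewer than two records, whereas a guarantee based on identical records would require further generalization.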