Assessing Disclosure Risk in Anonymized Datasets
White Paper
Publisher
Software Engineering Institute
Abstract
Sharing log data is a valuable step toward improving network security. However, logs often contain sensitive information, so organizations are hesitant to share them. Anonymization methods increase protection by lowering the disclosure risk to a level considered safe. A metric for anonymity is therefore necessary to assess that risk quantitatively before log data is released. In this paper, we propose a general framework for estimating disclosure risk using the conditional entropy between the original and the anonymized datasets. We demonstrate our approach using network log files.
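As a rough illustration of the quantity named in the abstract, the sketch below estimates the conditional entropy H(X | Y) in bits from paired samples, where X is a field in the original log and Y is the same field after anonymization. It is not the framework from the paper, whose details are not given here; the function name `conditional_entropy`, the sample IP addresses, and the /24 truncation scheme are all assumptions chosen for the example. A higher value means an observer of the anonymized value remains more uncertain about the original, which suggests lower disclosure risk.

```python
import math
from collections import Counter


def conditional_entropy(original, anonymized):
    """Estimate H(X | Y) in bits from paired samples.

    X is the original value and Y is its anonymized counterpart.
    Hypothetical helper for illustration, not the paper's estimator.
    """
    if len(original) != len(anonymized):
        raise ValueError("original and anonymized must have the same length")
    n = len(original)
    if n == 0:
        return 0.0

    joint = Counter(zip(original, anonymized))  # counts of (x, y) pairs
    marginal = Counter(anonymized)              # counts of y alone

    h = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n                    # empirical p(x, y)
        p_x_given_y = count / marginal[y]   # empirical p(x | y)
        h -= p_xy * math.log2(p_x_given_y)
    return h


# Assumed sample data: four source addresses from a log.
addresses = ["10.0.0.1", "10.0.0.2", "10.0.1.1", "10.0.1.2"]

# Assumed scheme 1: truncate each address to its /24 prefix.
truncated = ["10.0.0.0", "10.0.0.0", "10.0.1.0", "10.0.1.0"]

# Assumed scheme 2: one-to-one pseudonyms (fully reversible given the mapping).
pseudonyms = ["a", "b", "c", "d"]

h_truncated = conditional_entropy(addresses, truncated)
h_pseudonyms = conditional_entropy(addresses, pseudonyms)
```

Under these assumptions, truncation leaves two equally likely originals behind each anonymized value, while the one-to-one mapping leaves none, so the estimate is larger for the truncated data. The estimate is purely empirical and ignores any side information an attacker might hold.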
Part of a Collection
FloCon 2008 Collection
This content was created for a conference series or symposium and does not necessarily reflect the positions and views of the Software Engineering Institute.