In many cases entropy is a useful way to calculate how identifying fingerprintable surfaces are, but it can be misleading when applied to certain distributions. For example, in a population of a billion people where half are in one bucket and half are in singleton buckets, half are uniquely identifiable even though the distribution has only about 16 bits of entropy (closer to 15.4 if you work it out), far fewer than the roughly 30 bits needed to single out one person in a billion.
I'd recommend adding a warning about this and mentioning that, when using entropy, it's always good to also check the percentage of the population in buckets of size less than n. You could also mention that differential privacy can be used to offer stronger guarantees of anonymity.
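To make the example above concrete, here is a rough sketch of the two checks side by side. The function names and the "bucket size -> number of buckets" representation are my own assumptions for illustration, not anything taken from the document.

```python
import math

def entropy_bits(bucket_counts):
    """Shannon entropy in bits.

    bucket_counts maps a bucket size to the number of buckets of that size
    (a hypothetical compact representation, so that half a billion
    singleton buckets do not need to be stored individually).
    """
    total = sum(size * count for size, count in bucket_counts.items())
    h = 0.0
    for size, count in bucket_counts.items():
        p = size / total  # probability of landing in one bucket of this size
        h -= count * p * math.log2(p)
    return h

def fraction_in_small_buckets(bucket_counts, n):
    """Fraction of the population that sits in buckets of size less than n."""
    total = sum(size * count for size, count in bucket_counts.items())
    small = sum(size * count for size, count in bucket_counts.items() if size < n)
    return small / total

population = 1_000_000_000

# The skewed case from the example: one bucket holding half the population,
# and the other half spread across singleton buckets.
skewed = {population // 2: 1, 1: population // 2}

# A uniform comparison: 2**16 equally sized buckets (exactly 16 bits).
uniform = {15_258: 2**16}

for name, dist in [("skewed", skewed), ("uniform", uniform)]:
    print(name,
          round(entropy_bits(dist), 2), "bits,",
          fraction_in_small_buckets(dist, n=2), "of population uniquely identifiable")
```

The two distributions have similar entropy, but in the skewed one half the population is in a bucket of size 1, while in the uniform one nobody is. That is the gap the "percentage in buckets of size less than n" check is meant to catch.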
Absolutely, we don't want a single "number of bits" to be the only consideration of entropy as that can be very misleading. Currently the document notes this re: entropy:
Consider both the possible variations and the likely distribution of values.
Could you suggest something with more detail about how we should think about entropy?