Should we add a warning about entropy? #40

JensenPaul · 2020-01-24T19:55:07Z

In many cases entropy is a useful way to calculate how identifying fingerprintable surfaces are, but it can be misleading when applied to certain distributions. For example, in a population of a billion people where half are in one bucket and half are in singleton buckets, half are uniquely identifiable with only about 15.4 bits of entropy, well short of the roughly 30 bits needed to single out one person in a billion.
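As a rough check of the example above, here is the Shannon entropy worked out by hand. It assumes exactly 10^9 people, one bucket holding 5 × 10^8 of them, and 5 × 10^8 singleton buckets; the symbol H is introduced here for illustration.

```latex
\begin{aligned}
H &= -\tfrac{1}{2}\log_2\tfrac{1}{2}
     \;-\; \left(5\times10^{8}\right)\cdot 10^{-9}\,\log_2 10^{-9} \\
  &= \tfrac{1}{2} + \tfrac{1}{2}\log_2 10^{9} \\
  &\approx 0.5 + 14.95 \;\approx\; 15.4 \text{ bits}
\end{aligned}
```

So the average comes out a little under 16 bits, even though every person in a singleton bucket individually reveals log2(10^9) ≈ 29.9 bits.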

I'd recommend adding a warning about this and mentioning that, when using entropy, it's always good to also check the percentage of the population in buckets of size less than n. You could also mention that differential privacy can be used to offer stronger guarantees of anonymity.
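A minimal sketch of the suggested check, reporting entropy alongside the share of the population in buckets smaller than n. The function names and the `{bucket_size: number_of_buckets}` representation are assumptions made for this illustration, not anything taken from the document under discussion.

```python
import math


def total_population(size_counts):
    """Total number of people, given {bucket_size: number_of_buckets}."""
    return sum(size * count for size, count in size_counts.items())


def shannon_entropy_bits(size_counts):
    """Shannon entropy (in bits) of the bucket a random person falls into."""
    total = total_population(size_counts)
    bits = 0.0
    for size, count in size_counts.items():
        p = size / total  # probability of landing in one bucket of this size
        bits -= count * p * math.log2(p)
    return bits


def fraction_in_small_buckets(size_counts, n):
    """Share of the population sitting in buckets with fewer than n people."""
    total = total_population(size_counts)
    small = sum(size * count for size, count in size_counts.items() if size < n)
    return small / total


# The example from this thread: one bucket of 500 million people,
# plus 500 million singleton buckets.
example = {500_000_000: 1, 1: 500_000_000}

print(f"entropy: {shannon_entropy_bits(example):.2f} bits")  # entropy: 15.45 bits
print(f"share in buckets smaller than 2: {fraction_in_small_buckets(example, 2):.0%}")  # 50%
```

Keeping the distribution as size counts avoids materializing half a billion singleton buckets. The two numbers together tell the story that the entropy figure alone hides: a moderate bit count, yet half the population is unique.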

npdoty · 2020-08-05T17:46:28Z

Absolutely, we don't want a single "number of bits" to be the only consideration of entropy as that can be very misleading. Currently the document notes this re: entropy:

Consider both the possible variations and the likely distribution of values.

Could you suggest something with more detail about how we should think about entropy?

tomrittervg · 2025-01-17T20:51:48Z

FWIW #69 adds a half-sentence addressing this I think.
