Skip to content

Protection against guessing #9

@chrysn

Description

@chrysn

If I understand draft-mcnally-envelope-07 right, any receiver of an elided document can undo the elisions (and get all the cryptographic verification that indeed what they found is right) if they guess.

For example, if my data is

"Alice" [
    "knows": "Bob"
]

and I publish it as

"Alice" [
    "knows": ELIDED
]

then any receiver who knows who is in our class (Alice, Bob, Eve, Mallory and Sybil) can find out whom Alice knows by just trying all the available values.

This is similar to the trouble we've had in RDF/FoaF when foaf:mbox_sha1sum was popular: You may not be able to find the address in open space, but if you got your hands on the list of pseudonyms used as hotmail addresses those days, you could easily deanonymize an otherwise private email address. Similarly (although admittedly I didn't find any reliable sources on a quick search, but I do remember there was a case) censored documents having been uncensored simply because there was only a specific set of characters whose width would sum up to the distance of the words before and after, reducing the problem to a matter of finding anagrams.

There is an alternative, but it is costly (and actually that cost is one of the factors deterring me from using Gordian for CoRAL, but then if that cost is not there, neither is the elision benefit for these applications):

All elidible items (envelope, leaf, node, assertion IIUC) could get a salt prepended to them -- a random, maybe 128bit value. Thus I'd use (with abbreviated salts in parentheses, although they wouldn't usually be shown in envelope hierarchies)

(123)
(456)"Alice" [
    (789) (0ab)"knows": (cde)"Bob"
]

and publish

(123)
(456)"Alice" [
    (789) (0ab)"knows": ELIDED
]

Now someone who has an idea of whom Alice might know does not just have to guess that it is Bob (something that may just take a hand full of guesses) but also has to guess the salt associated with that ("cde" would actually be much longer).

This is particularly costly when the individual elided items are short (like, when you have a list of countless sensor data values), but that is also when the chances of guessing the items is the largest because the items are so small that guessing is feasible.

(Don't pin this on the term of "salt" … it neither fits "salt" precisely because there it is secret, or maybe it fits because it is as public as the salted value … at any rate it's not a nonce because there is no strong requirement that it is only used once).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions