Comment by vlovich123

17 hours ago

Yeah, the more robust protection is to widen every redaction to a constant width, but in the general case that could require reflowing the PDF. Honestly, though, single-word redactions are probably useless now that cheap AI can fill in the gaps with high accuracy.

Depends what you're trying to hide.

If the redaction is a person's name, and there's nothing else to give the person's identity away, a single-word redaction probably works reasonably well, AI or no AI.

  •   > If the redaction is a person's name
    

    I'm not sure if you're aware, but people's names vary in length. We are talking about a system that can identify single-character differences, so that does reduce the search space, especially since names are not all possible letter permutations. Combine that with the fact that it isn't uncommon to see partial first letters show up. You can even see some instances in the Epstein files.

    Of course, you can also take this further. Even if you can't recover names, you can get meta information about how many parties are involved by recognizing that different-length redactions correspond to different entities. A same-length redaction doesn't guarantee the same entity, but it is a hint.

    • It is also common for authors to misspell names (proper nouns) in an attempt to determine who leaks docs (and to force non-matches for FOIA requests).

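To make the width argument from the thread concrete, here is a minimal sketch of how a redaction box's width could narrow a list of candidate names, and how distinct box widths hint at the number of parties. Everything in it is an assumption for illustration: the `ADVANCE` table holds made-up character widths (a real attempt would read glyph widths from the font embedded in the document), and the function names are hypothetical.

```python
# Hypothetical per-character advance widths in arbitrary units; a real
# attempt would read these from the font embedded in the document.
ADVANCE = {"i": 3, "l": 3, "j": 3, "t": 4, "f": 4, "r": 4, "m": 9, "w": 9}
DEFAULT_ADVANCE = 6  # assumed width for every other character


def text_width(text):
    """Sum the assumed advance width of each character in text."""
    return sum(ADVANCE.get(ch.lower(), DEFAULT_ADVANCE) for ch in text)


def candidates_for_width(names, box_width, tolerance=0):
    """Keep the names whose assumed rendered width fits the redaction box."""
    return [n for n in names if abs(text_width(n) - box_width) <= tolerance]


def distinct_width_groups(box_widths, tolerance=0):
    """Count clusters of box widths: a hint at, not proof of, how many
    different entities were redacted."""
    groups = []
    for width in sorted(box_widths):
        # Start a new cluster when the gap to the last cluster exceeds tolerance.
        if not groups or width - groups[-1] > tolerance:
            groups.append(width)
    return len(groups)


names = ["Alice", "Bob", "Carol", "William", "Tim"]
exact = candidates_for_width(names, 24)
loose = candidates_for_width(names, 24, tolerance=1)
parties_hint = distinct_width_groups([24, 36, 24, 18])
```

With these made-up widths, an exact match leaves one candidate and a one-unit tolerance leaves two, which is the "reduced search space" point. Padding every box to a constant width would make both functions useless, which is why that is the more robust fix.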