Comment by thangalin

16 hours ago

> I haven't seen any redaction system yet that protects against this.

The linked article suggests widening redacted areas more than needed with some randomization applied to the width. Strikes me that that wouldn't do much except add a few more possible solutions.

Yeah, the more robust protection is to widen every redaction to a constant width, though in the general case that could require reflowing the PDF. Honestly, single-word redactions are probably useless now that cheap AI can fill in the gaps with high accuracy.
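To make the difference between the two padding strategies concrete, here is a minimal sketch. It assumes the attacker knows the font and the maximum padding, and it uses a made-up glyph width table and name list (`WIDTHS`, `DEFAULT_WIDTH`, `NAMES`, and `candidates` are all hypothetical, not taken from the linked article or any real tool).

```python
# Hypothetical glyph advance widths in arbitrary units; NOT from a real font.
WIDTHS = {"i": 3, "j": 3, "l": 3, "f": 4, "r": 4, "t": 4, "m": 9, "w": 9}
DEFAULT_WIDTH = 6  # assumed width for every character not listed above


def text_width(text):
    """Rendered width of `text` under the toy width table."""
    return sum(WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text.lower())


def candidates(box_width, names, max_padding=0):
    """Names that fit the box with at most `max_padding` units left over."""
    return [n for n in names if 0 <= box_width - text_width(n) <= max_padding]


NAMES = ["Ann", "Bill", "Emma", "Jill", "Maria", "William"]
secret = "Maria"

# Box drawn exactly around the word.
exact = candidates(text_width(secret), NAMES)

# Box widened by a random amount (here 3) drawn from a known range of 0..4.
jittered = candidates(text_width(secret) + 3, NAMES, max_padding=4)

# Every box widened to the same constant width of 40 units.
constant = candidates(40, NAMES, max_padding=40)

print(len(exact), len(jittered), len(constant))
```

With this toy data the exact box leaves one candidate, the randomized box leaves two, and the constant-width box leaves every name that fits, which is the pattern described above: random padding only adds a few more possible solutions.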

  • Depends what you're trying to hide.

    If the redaction is a person's name, and there's nothing else to give the person's identity away, a single-word redaction probably works reasonably well, AI or no AI.

    •   > If the redaction is a person's name
      

      I'm not sure if you're aware, but people's names vary in length. We are talking about a system that can identify single-character differences, so the width of the redaction does reduce the search space, especially since names are not all possible letter permutations. Combine that with the fact that it isn't uncommon for part of the first letter to remain visible. You can even see some instances in the Epstein files.

      Of course, you can also take this further. Even if you can't recover names, you can get meta-information about how many parties are involved by recognizing that redactions of different lengths correspond to different entities. While a same-length redaction doesn't guarantee the same entity, it is a hint.
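The meta-information point can be sketched roughly as follows. This is an illustration only: the box widths are invented, and `group_by_width` and its `tolerance` parameter are hypothetical names, with the tolerance standing in for measurement noise.

```python
def group_by_width(box_widths, tolerance=0.5):
    """Cluster sorted box widths whose neighbours differ by <= tolerance."""
    groups = []
    for width in sorted(box_widths):
        if groups and width - groups[-1][-1] <= tolerance:
            groups[-1].append(width)  # close enough: same group
        else:
            groups.append([width])  # noticeably wider: new group
    return groups


# Invented measurements of redaction boxes, in points.
measured = [41.2, 63.0, 41.3, 63.1, 41.2, 88.7]
groups = group_by_width(measured)

# Each group is a distinct width, so there are at least this many
# distinct redacted strings. Equal widths are only a hint of equality.
print(len(groups), [len(g) for g in groups])
```

The number of groups is a lower bound on the number of distinct redacted strings, not an exact count of parties, since two different names can share a width and one party can be written in more than one way.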
