← Back to context

Comment by hsbauauvhabzb

11 hours ago

Presumably with font kerning and pixel perfect recreation of the source, it would be possible to guess the word very accurately.

The strings oioioi and oooiii will have different widths in some fonts because character organisation matters a lot.

I suppose it gets a bit more complex again if you enable stuff like microtype, but even then you can probably measure how much inter-letter and inter-word spacing has been adjusted by just scanning other text in the same line.

I think the conclusion is honestly that PDF is an outdated format for keeping records that might have to be redacted in the future, like court documents. Something reflowable like epub could have the text replaced with constant-space black squares instead no hints leaked as someone mentioned in a parallel comment.

  • I’ve never heard anyone suggest PDF is a good format, and while I don’t know the spec, I imagine based on the acrobat cve list it’s an absolute clusterfuck.