It could still be identifiable, for example if the document has been prepared such that the intended recipient's identity is encoded into subtle modulation of the widths of spaces.
Isn't that outside this threat model? The idea here is to foil outside analysis, not to limit the document's authors (who are allowed to add or update anything, and could even openly write 'the intended recipient's identity').
Print and re-scan wouldn’t fix that though.
That was my point. If you want to erase its origin you need to semantically extract the contents and reduce them to their most basic representation.
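To make "most basic representation" concrete, here is a minimal sketch in Python. It assumes the text has already been extracted from the document, and it uses a toy watermark (thin space versus normal space) as a stand-in for the space-width modulation mentioned above. The names `mark` and `canonicalize` are made up for illustration. In a real PDF the widths would live in glyph positioning rather than in the characters, and none of this touches watermarks carried by word choice.

```python
import re
import unicodedata

# Invisible characters that can carry hidden bits without changing how text looks.
_INVISIBLE = re.compile("[\u200b\u200c\u200d\u2060\ufeff\u00ad]")


def mark(words, bits):
    """Toy watermark (hypothetical): thin space (U+2009) = 1, normal space = 0."""
    out = words[0]
    for bit, word in zip(bits, words[1:]):
        out += ("\u2009" if bit else " ") + word
    return out


def canonicalize(text):
    """Reduce text to a plain form: NFKC, no invisible characters, single ASCII spaces."""
    text = unicodedata.normalize("NFKC", text)
    text = _INVISIBLE.sub("", text)
    # str.split() with no arguments splits on any run of Unicode whitespace.
    return " ".join(text.split())


words = ["send", "the", "report", "today"]
copy_a = mark(words, [1, 0, 1])  # copy prepared for recipient A
copy_b = mark(words, [0, 1, 1])  # copy prepared for recipient B
```

The two copies differ character by character, but after `canonicalize` they are the same string, so the spacing no longer says which copy it came from.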
Sure, but all the non-essential information hidden in the PDF format is removed.
In the PDF file format?