Comment by anigbrowl
18 days ago
I found this part interesting:
There are also other documents that appear to simulate a scanned document but completely lack the “real-world noise” expected with physical paper-based workflows. The much crisper images appear almost perfect without random artifacts or background noise, and with the exact same amount of image skew across multiple pages. Thanks to the borders around each page of text, page skew can easily be measured, such as with VOL00007\IMAGES\0001\EFTA00009229.pdf. It is highly likely these PDFs were created by rendering original content (from a digital document) to an image (e.g., via print to image or save to image functionality) and then applying image processing such as skew, downscaling, and color reduction.
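For anyone who wants to check the "exact same skew on every page" observation themselves, here is a rough sketch. It assumes ImageMagick 7 (`magick`) is installed, that the pages have already been extracted to image files, and that the `deskew:angle` property is populated after `-deskew`. The helper names `page_skew` and `uniform_skew` are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the article being quoted.

# Print the skew angle ImageMagick estimates for one page image.
# Assumption: -deskew stores its estimate in the deskew:angle property.
page_skew() {
    magick "$1" -deskew 40% -format '%[deskew:angle]\n' info:
}

# Read one angle per line on stdin; succeed if there is at most one
# distinct value, i.e. every page reports exactly the same skew.
uniform_skew() {
    [ "$(sort -u | grep -c .)" -le 1 ]
}
```

Something like `for f in page-*.png; do page_skew "$f"; done | uniform_skew && echo "identical skew"` would flag a suspicious file. The comparison is exact string equality, so a real check would probably want a small tolerance, since the estimate itself can vary slightly from page to page.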
GNOME Desktop users can put this in a Bash script in ~/.local/share/nautilus/scripts/ for more convincing-looking fake PDF scans, accessible from your right-click menu. I do not recall where I originally copied it from, so I can't give proper credit. Thanks, random internet person (probably on Stack Exchange). It works perfectly.
That seq is probably supposed to be $(seq 0.05 0.05 0.5). Right now it's always 0.05.
Note that you can get random numbers straight from bash with $RANDOM. It's 15 bit (0 to 32767) but good enough here; this would get between 0.05 and 0.5: $(printf "0.%.4d\n" $((500 + RANDOM % 4501)))
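Wrapped in a function, the `$RANDOM` approach could look like the sketch below. The function name `random_rotation` is made up here; the arithmetic is the same as in the one-liner above (`%04d` pads identically to `%.4d` for non-negative numbers).

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a random angle between 0.0500 and 0.5000.
random_rotation() {
    # RANDOM % 4501 is 0..4500, so the integer part is 500..5000,
    # zero-padded to four digits after the "0."
    printf '0.%04d\n' $(( 500 + RANDOM % 4501 ))
}
```

Calling `ROTATION=$(random_rotation)` gives a fresh value each time. The modulo introduces a slight bias because 32768 is not a multiple of 4501, which should not matter for faking scan skew.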
Shouldn't $ROTATION be set inside the loop and actually used in the magick command?
You know, now that you point it out that seems obvious. I think maybe I was experimenting with rotation and left that in, unused. I did this years ago. The loop works OK though. Thanks for the feedback (and now I have to finish editing that script ...)
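A minimal sketch of what "set it inside the loop and actually use it" might look like. This is not the original script: the function name `fake_scan`, the density, the noise level, and the output naming are all assumptions, and it needs ImageMagick 7 (`magick`) with PDF support. The output directory must already exist.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: render each PDF page to an image with its own skew.
# Usage: fake_scan INPUT.pdf OUTDIR PAGE_COUNT
fake_scan() {
    local input=$1 outdir=$2 pages=$3
    local page rotation out
    for (( page = 0; page < pages; page++ )); do
        # Picked inside the loop, so every page gets a different angle.
        rotation=$(printf '0.%04d' $(( 500 + RANDOM % 4501 )))
        # Zero-padded names keep the pages in order when globbed later.
        printf -v out '%s/page-%04d.png' "$outdir" "$page"
        magick -density 150 "${input}[${page}]" \
            -colorspace Gray \
            -background white -rotate "$rotation" \
            -attenuate 0.25 +noise Gaussian \
            "$out"
    done
}
```

After `fake_scan input.pdf ./pages 12`, something like `magick ./pages/page-*.png scanned.pdf` would stitch the pages back into one PDF. Varying the angle per page also avoids the identical-skew giveaway described in the article.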
Nothing about this is specific to GNOME, right? ImageMagick is cross-platform.
I guess the GNOME-specific part is that GNOME comes with the Nautilus file browser, and the instructions add a script for Nautilus.
But yeah, this will work as long as you have ImageMagick and Nautilus installed.
I like https://lookscanned.io/
[flagged]
you sound as grumpy as my cat looks. there's no need for this language
The real question is: Which of the documents are the ones that are "simulating" scanned documents, and what political narrative do they reinforce?
The only reason I can think of for why someone would want to do this is to pass off fraudulent or AI generated images as real.
A simpler explanation could be wanting to skip the print->sign->scan ceremony required by some institutions.
This. Slip in a few thousand “fakes” with the trove of goods to be able to fabricate a narrative.
Another explanation is that it's simply one form of lazy ineffective obfuscation performed by inexperienced relative luddites in an attempt to walk the fine line between complying with the supreme court directive & not releasing anything useful.
Other investigations into the files have found oddities like redaction of the word "don't" indicating a haphazard find-&-replace approach to redaction, possibly LLM-aided.
The DOJ/Akamai online hosted search feature is also incomplete - potentially due to some of these "digitally scanned" files not being subject to OCR.
> to pass off fraudulent or AI generated images as real.
Possibly but I don't find it compelling, if only because a significant portion of the media reportage on the files has made claims that are entirely baseless - if there were a narrative to be sold one would expect such reportage to be actively leveraging such fraudulent images.
Very interesting. That document in particular seems to be an interview of A. Acosta by the DoJ from 2019. But what reason would the FBI have for pretending it's a scanned document, if it is genuine? Perhaps there's some aspect of Epstein's deal with Acosta that they'd rather not reveal to the public?
https://www.justice.gov/epstein/files/DataSet%207/EFTA000092...
Not that I can speak from personal experience or anything... But somebody on an email chain may have requested a scanned version of the document to ensure there is no metadata and the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document. There might even be a webtool available somewhere to do so, I wouldn't know...
> the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document
Is that remotely plausible? I can't imagine faking a scan being easier than just walking down the hall to the copier room.
I am only guessing that they had to remove the document from a classified network in a way where data won't possibly leak
Such a weird way to do it when it would be vastly easier to just blow the document out to paper and re-scan it.
Vastly easier when you do it to one or a handful of documents.
But if you want to do it to 2000 documents...
But at that point why bother with the fakery? Why does it matter if it's obviously of digital origin? As long as it's rendered down to an image, problem solved.
Was the motivation for this benign (an employee skirting regulations) or malicious?
4 reams (4×500) is hardly a lot for commercial equipment to handle - paper trays will take a ream at a time. Document analysis would still show some shenanigans were in play, but you'd get a bit of variation at least.
I mean, I do that all the time when they ask me to print something, sign it, and then scan it.
Sign a blank paper, scan it, paste the original doc on it. Then keep the scan for future docs.
An easier trick I've used is just sign directly on the computer screen over the displayed document with a whiteboard marker and take a photo with my phone.