Comment by OneMorePerson

2 months ago

It's funny seeing this play out because in my personal life anytime I'm sharing a sensitive document where someone needs to see part of it but I don't want them to see the rest that's not relevant, I'll first block out/redact the text I don't want them to see (covering it, using a redacting highlighter thing, etc.), and then I'll screenshot the page and make that image a PDF.

I always felt paranoid (without any real evidence, just a guess) that there would always be a chance that anything done in software could be reversed somehow.


GistNoesis 2 months ago

If it's not done properly, and at any point in the chain you put black blocks on a compressed image (and PDFs do compress internal images), you are leaking some bits of information in the shadow cast by the compression algorithm. (Self-plug: https://github.com/unrealwill/jpguncrop )
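
A toy sketch of that leak, under loud assumptions: a 1-D row of 8 pixels instead of JPEG's 2-D 8x8 blocks, an orthonormal DCT, and a single made-up quantization step of 40 instead of real per-frequency tables. The function names are illustrative and not taken from jpguncrop. Two rows share the same visible left half but differ in the half that later gets blacked out; after a lossy round trip the visible halves no longer match, so the surviving pixels say something about what was covered.

```python
import math

N = 8  # block length, mirroring the 8-pixel edge of a JPEG block


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """Orthonormal DCT-II of an 8-sample row."""
    return [
        _scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n, x in enumerate(block))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct() (a DCT-III)."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]


def lossy_roundtrip(block, step=40):
    """Quantize DCT coefficients to multiples of `step`, then decode."""
    quantized = [round(c / step) * step for c in dct(block)]
    return [round(v) for v in idct(quantized)]


def redact_right_half(block):
    """Black out the last four pixels after compression has already happened."""
    return block[:4] + [0, 0, 0, 0]


visible = [200, 200, 200, 200]
row_a = lossy_roundtrip(visible + [200, 200, 200, 200])  # blank paper under the box
row_b = lossy_roundtrip(visible + [0, 0, 0, 0])          # dark ink under the box

redacted_a = redact_right_half(row_a)
redacted_b = redact_right_half(row_b)

# Identical before compression, the visible halves now differ.
print(redacted_a[:4], redacted_b[:4])
```

Redacting before any lossy step, or re-rendering the redacted image from scratch, avoids this particular channel.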

GistNoesis 2 months ago
And that's just in the non-adversarial simple case.
If you don't know the provenance of the images you are putting black boxes on (for example because a rogue employee intentionally wants to leak them, or because the image sensor of your target has been compromised by another team to leak some info), your redaction can be rendered ineffective, as some images can be made uncroppable by construction.
(Self-plug : https://github.com/unrealwill/uncroppable )
And also be aware that compression is hiding everywhere : https://en.wikipedia.org/wiki/Compressed_sensing
- ThePowerOfFuet 2 months ago
  
  >Let's crop it anyway
  That is not cropping.
  https://en.wikipedia.org/wiki/Cropping_(image)
  >Cropping is the removal of unwanted _outer_ areas from a photographic or illustrated image.
  
- layla5alive 2 months ago
  
  Right, using steganography to encode some parity bits into an image so that lost information can be reconstructed seems like an obvious approach - all sorts of approaches you could use, akin to FEC. Haven't looked at your site yet, will be interested to see what you've built :)
  Edit: I checked it out, nice, I like the lower res steganography approach, can work very nicely with good upscaling filters - gave it a star :)
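
The parity idea can be sketched with plain XOR. This assumes, hypothetically, that a parity row is stashed somewhere outside the redacted area; it only illustrates the FEC analogy and is not how the linked repository works.

```python
from functools import reduce


def xor_rows(rows):
    """XOR equal-length rows of 8-bit pixel values column by column."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*rows)]


image = [
    [12, 200, 37, 90],
    [255, 0, 18, 64],
    [7, 7, 130, 201],
]
parity = xor_rows(image)  # the extra row an adversary would hide in the image

# Someone blacks out row 1.
redacted = [row[:] for row in image]
redacted[1] = [0, 0, 0, 0]

# XOR of the surviving rows with the parity row rebuilds the lost one.
survivors = [row for i, row in enumerate(redacted) if i != 1]
recovered = xor_rows(survivors + [parity])
print(recovered == image[1])
```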
  
RobotToaster 2 months ago
Somewhat related, I once sent a FOI request to a government agency that decided the most secure way to redact documents was to print them, use a permanent marker, and then scan them. Unfortunately they used dye based markers over laser print, so simply throwing the document into Photoshop and turning up the contrast made it readable.
- cout 2 months ago
  
  I remember noticing that a teacher in high school had used white-out to hide the marks for the correct multiple choice answer on final exam practice questions before copying them. Then she literally cut-and-pasted questions from the practice questions for the final. I did mediocre on the essay, but got the highest score in the class on the multiple choice questions, because I could see little black dots where the white out was used.
RamRodification 2 months ago
I was thinking I understand what's going on but then I came to the image showing the diff and I don't understand at all how that diff can unredact anything.
- OtherShrezzing 2 months ago
  
  It's not that you can unredact them from scratch (you could never get the blue circle back from this software). It's that you can tell which of the redacted images is which of the origin images. Investigative teams often find themselves in a situation where they have all four images, but need to work out which redacted files are which of the origins. Take for example, where headed paper is otherwise entirely redacted.
  So with this technique, you can definitively say "Redacted-file-A is definitely a redacted version of Origin-file-A". Super useful for identifying forgeries in a stack of otherwise legitimate files.
  Also good for saying "the date on origin-file-B is 1993, and the file you've presented as evidence is provable as origin-file-B, so you definitely know of [whatever event] in 1993".
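
A minimal sketch of that matching step, assuming images are small grayscale grids and the position of the redaction box is known. `mismatch`, `best_origin`, and the sample data are made up for illustration; they are not from the tool being discussed.

```python
def mismatch(redacted, origin, box):
    """Sum absolute pixel differences outside the redaction box.

    box is (top, left, bottom, right), with bottom/right exclusive.
    """
    top, left, bottom, right = box
    total = 0
    for y, (r_row, o_row) in enumerate(zip(redacted, origin)):
        for x, (r, o) in enumerate(zip(r_row, o_row)):
            if top <= y < bottom and left <= x < right:
                continue  # hidden pixels carry no usable signal here
            total += abs(r - o)
    return total


def best_origin(redacted, origins, box):
    """Return the name of the origin whose visible pixels are closest."""
    return min(origins, key=lambda name: mismatch(redacted, origins[name], box))


origins = {
    "origin-file-A": [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
    "origin-file-B": [[11, 25, 30], [40, 99, 60], [75, 80, 95]],
}
box = (1, 1, 2, 2)  # only the centre pixel was blacked out
redacted = [[10, 21, 30], [40, 0, 60], [70, 80, 89]]  # noisy copy of A

print(best_origin(redacted, origins, box))
```

Note that nothing under the box is recovered; the visible pixels alone are enough to pair a redacted file with its origin.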
  
OneMorePerson 2 months ago

I'm trying to understand this cause it sounds fascinating but I don't get it. I don't have an advanced understanding of compression so that might be part of why.
If you compare an image to another image, you could guess from the compression what is under the blocked part; that makes some sense to me conceptually. What I don't get is, for the PDF specifically, why would compressing the black boxes I added carry any risk? Isn't it compressing the internal image, which is just the black box part? Or are you saying the whole screenshot is an internal image?

userbinator 2 months ago

I'll just send an image and not bother with a PDF.

(Note there's also other metadata in a PDF, which you may not want your recipient to know either.)

PeterStuer 2 months ago
There's also metadata in the image files. What specifically would be sensitive in the metadata of a PDF containing screenshots that is not also present in the screenshot image metadata?
- userbinator 2 months ago
  
  PDF has something called an "info dictionary", which most mainstream PDF-writing software will fill out with various bits of info that you might not want known.
  Image files usually have substantially less metadata by default, unless it's one taken by a camera.
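
A rough way to peek at those entries without a PDF library, as a sketch only: it assumes the Info dictionary is stored uncompressed and unencrypted, with literal (parenthesised) strings that contain no nested parentheses. The sample bytes and the `scan_info_strings` helper are invented for illustration; a real PDF parser should be preferred for anything that matters.

```python
import re

# Standard Info dictionary keys that writers commonly fill in.
INFO_KEYS = (b"Title", b"Author", b"Creator", b"Producer", b"CreationDate", b"ModDate")


def scan_info_strings(pdf_bytes):
    """Return {key: value} for literal-string Info entries found in raw bytes."""
    found = {}
    for key in INFO_KEYS:
        match = re.search(rb"/" + key + rb"\s*\(([^()]*)\)", pdf_bytes)
        if match:
            found[key.decode()] = match.group(1).decode("latin-1")
    return found


sample = (
    b"1 0 obj\n"
    b"<< /Producer (ExamplePDF 1.0) /Author (jdoe)\n"
    b"   /CreationDate (D:20240101120000Z) >>\n"
    b"endobj\n"
)
print(scan_info_strings(sample))
```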
  

agentifysh 2 months ago

it's absolutely bewildering how ridiculous everything has been so far in terms of competence and this really takes the cherry on the top near Christmas too.

how much lower can they go ?!

yetihehe 2 months ago
USA is still very high, so they can go much much lower, but I think they might go to some still lower places, finding them where we didn't even know such places could exist. Some ideas:
- Leave NATO
- Start openly supporting Russia and North Korea
- Arrest whole International Criminal Court
- Preventively invade China
- baby 2 months ago
  
  I'm convinced slavery will be reintroduced before 2028
  
- rurban 2 months ago
  
  Reintroduce witch burning.
  Reintroduce death penalties on public squares.
  Taking Greenland and Venezuela is given, as they took most of Latin America already. Just the new Mexican president looks like the next thorn in their eyes. Too competent, too social, too anti-corruption.
  
- RonanSoleste 2 months ago
  
  They effectively already left NATO and openly support Russia already. ICC members are already under fire and some had their Microsoft accounts banned by Trump. Trump will invade Greenland and Canada first. China is less of a priority.
  
- potato3732842 2 months ago
  
  Support for NATO within the US is Israel-lite for different demographics. Pouring resources into it isn't without downsides.
- ycombigrator 2 months ago
  
  Trump is a born banana Republic dictator...
  
- ThePowerOfFuet 2 months ago
  
  >Start openly supporting Russia
  Already done.
imiric 2 months ago
I'm not too concerned about the US. They've made their bed.
I'm more concerned with them dragging everyone else down, and someone much worse taking their place.
- tsunamifury 2 months ago
  
  Pray tell what western country right now isn’t in the same position?
coldstartops 2 months ago
Maybe it was always part of the plan. Plausible Deniability.
- pjc50 2 months ago
  
  Good Soldier Svejk working at the FBI decided to follow an illegal order as badly as possible.
lostlogin 2 months ago
The really interesting bit is whether they can go another term.
- vanviegen 2 months ago
  
  They seem to be ahead of schedule abolishing a working democracy before the midterms.
darubedarob 2 months ago
This low https://en.wikipedia.org/wiki/Child_abuse_in_Pakistan aka a society where child abuse is simply accepted and mainstream, with the child abuse of child labour and jihadism being just additional nightmare fuel on top.
- bamboozled 2 months ago
  
  If we survive long enough I do believe historians will look back on this period and state as a matter of fact, rape and child abuse were completely acceptable, because it seems it’s totally fine with our elected leaders. If these leaders were democratically elected there is only one conclusion to draw from it…

amelius 2 months ago

Maybe the person tasked with the redacting didn't agree so they chose the worst possible way to do it.

noduerme 2 months ago
Normally, I'd never attribute to intention what can be blamed on incompetence. Especially if the government is doing it. But sure, if I were the intern tasked with this job...
- amelius 2 months ago
  
  > Especially if the government is doing it.
  Also if doing it right means more work?
  

jwrallie 2 months ago

I learned that a long time ago when I was a student and wanted to submit a pdf generated by a trial version of some software as an assignment and was trying to be clever and cover the watermark that said unregistered with a white box.

When opening the file in my slow computer, I could see all the rendering of the watermark happening in slow motion until the white box would pop up on top of the text.

barrkel 2 months ago
When I was a student, and using a shareware or trial version of some software and wanted some printed output from it without a watermark, I printed to postscript (chose a printer that supported postscript and the driver used it instead of rasterized images), but using a file instead of a printer.
I could then open up the postscript, delete the commands that rendered the watermark, save it, then I converted it to PDF so it would be easy to print.
- kccqzy 2 months ago
  
  You don't need PostScript for that. The PDF text commands are Tj and TJ, and rarely ' and ". They are easy to delete without going through PostScript. Tj means showing a simple text string. TJ means showing an array of strings possibly with space adjustments. ' means moving to the next line and showing a simple string. " means doing that and setting character spacing.
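
As a sketch of what deleting those operators looks like, here is a toy filter over an already-uncompressed content stream. It assumes literal strings with no nested or escaped parentheses and handles only `Tj`, `'`, and `TJ` (the `"` operator is left out); real PDFs usually Flate-compress their streams and may use hex strings or font-specific encodings, so treat `drop_text` and the sample stream as illustrative only.

```python
import re

# A literal string such as (Hello) followed by Tj or ', or an array such as
# [(Hel) -20 (lo)] followed by TJ.
SHOW_TEXT = re.compile(
    rb"\([^()]*\)\s*(?:Tj|')"
    rb"|\[[^\]]*\]\s*TJ"
)


def drop_text(stream, needle):
    """Delete text-showing operations whose operand contains `needle`."""
    def keep_or_drop(match):
        op = match.group(0)
        # Join the string pieces so text split across a TJ array still matches.
        text = b"".join(re.findall(rb"\(([^()]*)\)", op))
        return b"" if needle in text else op
    return SHOW_TEXT.sub(keep_or_drop, stream)


stream = (
    b"BT /F1 12 Tf 72 720 Td (Quarterly report) Tj ET\n"
    b"BT /F1 48 Tf 100 400 Td [(UNREG) -15 (ISTERED)] TJ ET\n"
)
cleaned = drop_text(stream, b"UNREGISTERED")
print(cleaned.decode("latin-1"))
```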
  
tor825gl 2 months ago

It's actually quite easy to open the pdf and see that there are several different elements per page to the document, eg the main text, an image, the footer, the title.
Randomly removing these by trial and error will usually quite easily allow you to find the watermark and nix it, with the advantage that even a sophisticated recipient will not be able to find out from the pdf file what the watermark was.

TacticalCoder 2 months ago

I then convert the image to grayscale only. Then I apply a filter so that only 16 colors are used. And I then adjust brightness/contrast so that "white is really white". It's all scripted: "screenshot to PDF". One of my oldest shell scripts.

16 shades of grey (not 50) is plenty enough for text to still be smooth.

I do it for several reasons, one of them being that I often take handwritten notes on official documents (which infuriates my wife btw) and sometimes later need to scan those documents and send them (local IRS / notary / bank / whatever). So I'll just scan, then fill a white rectangle over my handwritten notes. Another reason is paper printed on both sides: if the paper is thin or the ink is thick, the other side sometimes shows through in the scan.

I wonder how that'd work vs adversarial inputs: never really thought about it.
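
For readers curious what such a pipeline does per pixel, here is a pure-Python sketch. It is not the shell script described above; the `white_point` of 200, the Rec. 601 luma weights, and the order of the steps are all assumptions chosen for illustration.

```python
def to_gray(rgb):
    """Luma approximation using the common Rec. 601 weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def flatten_pixel(rgb, white_point=200, levels=16):
    """Grayscale, push near-white to pure white, then quantize to `levels` shades."""
    gray = to_gray(rgb)
    # Anything at or above white_point saturates to 255 ("white is really white").
    stretched = min(255, gray * 255 // white_point)
    step = 255 / (levels - 1)
    return round(round(stretched / step) * step)


def flatten_image(pixels, **kwargs):
    return [[flatten_pixel(p, **kwargs) for p in row] for row in pixels]


scan = [
    [(250, 248, 240), (30, 30, 35)],     # paper, ink
    [(215, 210, 205), (120, 118, 119)],  # faint show-through, mid-grey
]
clean = flatten_image(scan)
print(clean)
```

Saturating near-white and collapsing to 16 levels discards faint marks such as show-through from the back of the page, which is the effect described above.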

nubg 2 months ago

care to share the script?

tetha 2 months ago

Personally, I only trust an image manipulation tool to put down solid colored blocks, or something that does not involve the source pixels when deciding on the redacted pixel. Formats like PDF are just too complicated to trust.

reed1234 2 months ago

And even being this careful, if the opacity is slightly off it could be undone
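
A small sketch of why opacity matters, assuming a grayscale row and a simple alpha blend with black (the 0.95 figure and helper names are made up for illustration). A 95% opaque box looks black to the eye but still scales the source pixels, so a contrast stretch brings the pattern back; a fully opaque fill leaves nothing.

```python
def overlay_black(pixel, opacity):
    """Alpha-blend a black box over one grayscale pixel (0-255)."""
    return round(pixel * (1 - opacity))


def stretch(values):
    """Rescale so the brightest value becomes 255 (a crude 'contrast up')."""
    peak = max(values)
    return [0 if peak == 0 else round(v * 255 / peak) for v in values]


secret = [255, 0, 255, 255, 0, 0, 255, 0]  # white paper = 255, ink = 0

almost = [overlay_black(p, 0.95) for p in secret]  # looks black to the eye
solid = [overlay_black(p, 1.0) for p in secret]    # truly opaque

print(almost)
print(stretch(almost))  # the pattern reappears
print(stretch(solid))   # nothing to recover
```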

ge96 2 months ago

The one that was crazy to me is undoing a blur effect (based on its algo), so yeah I also will layer and screenshot something

crossroadsguy 2 months ago

This is what I do while sharing such images. I crop out those parts first and then take another screenshot. I do not even risk painting over and then take another screenshot. I have been doing this forever.

9dev 2 months ago

In practical terms, a more convenient way to achieve this is just printing the document to a PDF, which rasterises the visible layer into what the printer would see. Most pdf tools support this.

vanviegen 2 months ago

That seems like a dangerous approach. Though printer drivers do often use rasterization, especially when targeting cheap printers, many printers can render vector graphics and text as well. Print-to-PDF will often use the latter approach, unless of course the source program always rasterizes its output when sending it out to the printer driver, or the Print-to-PDF driver in use is particularly stupid.
OneMorePerson 2 months ago

You can, but I don't trust software for these types of niche but critical tasks hah. Next thing I know I'd be reading a headline about how "bug in print to PDF actually retains XYZ metadata"

prameshbajra 2 months ago

I feel the same and do the same.

amelius 2 months ago

Me too.