Comment by nobody9999

15 hours ago

>Sure you can. Search and somehow mark the text (underline or similar) to make keywords hard to miss. Then proceed with the manual print, expunge, scan process.

I suppose a global search/replace to mark text for redaction as an initial step might not be a bad idea, but if one needs to make sure it's correct, that's not enough.

Don't bother with soft copy at all. Print a copy and have multiple individuals manually make redactions to the same copy, each using a different color of ink.

Once that initial phase is complete, pair up people who didn't do the initial redactions to review the paper text with the extant redactions. Have them go through the documents together (each with their own copy of the same redactions), noting verbally and in ink both the existing redactions and any text that should be redacted but isn't.

That process could then be repeated with different people to ensure nothing was missed.

We used to call this "proofreading" in the context of reports and other documents provided as work product to clients. It looks really bad when the product for which you're charging five to six figures isn't correct.

The use case was different, but such a process would work just as well for something like redactions.

And yes, we had word processing and layout software which included search and replace. But if correctness is required, that's not good enough -- a word could be misspelled and missed by the search/replace, and there are a half dozen other ways an automated process could go wrong and either miss a redaction or redact something that shouldn't be.
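To make the misspelling point concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: the sample text, the names, and the `redact_exact` helper are my own assumptions, not how any particular redaction tool works.

```python
import re

def redact_exact(text: str, term: str, mask: str = "[REDACTED]") -> str:
    """Hypothetical helper: case-insensitive exact-string search/replace."""
    # re.escape ensures the term is matched literally, not as a pattern.
    return re.sub(re.escape(term), mask, text, flags=re.IGNORECASE)

# Invented sample text: one correct spelling, one typo, one abbreviation.
sample = "Payment sent to John Smith. Later memo: Jon Smith confirmed; J. Smith signed."

# Only the exact spelling is caught; the typo and the abbreviation survive.
missed = redact_exact(sample, "John Smith")
print(missed)

# Searching for the surname alone goes wrong the other way:
# it also eats part of an unrelated word.
overdone = redact_exact("Deliver to Smithfield Road.", "Smith")
print(overdone)
```

The first call leaves "Jon Smith" and "J. Smith" in the clear, and the second mangles "Smithfield". Neither failure announces itself, which is why a human still has to read the result.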

As for the time and attention required, I suppose that depends upon how important it is to get right.

Is such a process necessary for all documents? No.

That said, if correctness is a priority, four (or more) text processing engines (human brains, in this case), with some working in tandem and others working serially and independently to verify and correct any errors or omissions, make for an excellent process for ensuring the correctness of text.

I'd point out that the above process is one that's proven reliable over decades, even centuries -- and doesn't require exact strings or regular expressions.

Edit: Fixed prose ("other documents be provided" --> "other documents provided").