
Comment by serf

2 years ago

do you have any actionable suggestions aside from moderator-approved messages or turning the forum read-only?

information suppression is pretty tough work, especially when you're trying to suppress the unknown.

Train an ML model as a classified-document classifier and scan everything through that... now they just need to dig through their archives for the training set from all the past leaks.

  • Oh sure! Let me just pull up my training set of unredacted classified documents…

    Wait. I’ve got a knock on the door.

    • I mean, I was partially joking about the origin of the dataset, but after as many leaks as they have had, they could likely work with the DoD at this point to get a model that is acceptable to put in place.


Isn't there standard text on these? On a cover page and/or headers/footers?

I get that the content of the documents is not known, but I would think the structure of them is known and could be matched on. Perhaps specific phrases that are only likely to appear in military documents, or a classification level in a header/footer, or even some specific combination of font, font weight, line spacing and indentation.
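A rough sketch of the header/footer idea, in Python. Everything here is an assumption for illustration: the function name `looks_marked`, the `edge_lines` parameter, and the tiny list of banner markings are hypothetical, and a real filter would need a far longer (and possibly itself sensitive) list.

```python
import re

# Illustrative only: a tiny, hypothetical set of banner markings.
# Matches a line that consists solely of a classification level,
# optionally followed by "//"-separated control markings.
MARKING = re.compile(r"^\s*(TOP SECRET|SECRET|CONFIDENTIAL)\b(//[A-Z /-]+)*\s*$")

def looks_marked(text, edge_lines=3):
    """Return True if a banner-style marking appears near the top or bottom."""
    # Ignore blank lines, then look only at the first and last few lines,
    # which is where a header or footer would sit in extracted text.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    edges = lines[:edge_lines] + lines[-edge_lines:]
    return any(MARKING.match(ln) for ln in edges)
```

This only catches text that still carries a standard banner line; anything with the header and footer cropped out, retyped, or reworded passes straight through.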

They could maybe even slurp up the text from PDFs they blocked to prevent someone from posting similar plaintext. Probably not, though, because an endpoint that says whether something is or isn't classified is basically a classified document generator, given enough time or clever tricks.

Then just forward the reports to whatever country's military owns those docs (or let the company's government do that). I think War Thunder only needs to make a nominal effort; the various militaries of the world will take care of backing it up with a dire threat.

  • > Perhaps specific phrases that are only likely to appear in military documents, or a classification level in a header/footer, or even some specific combination of font, font weight, line spacing and indentation.

    Do you have any idea how difficult it would be to maintain a database of distinguishing sensitive marks on documents?

    Hell, some documents are so protected even the labeling is protected information.

    Infeasible.

    Edit: I don't mean infeasible or difficult from a technical standpoint. I mean procedurally, this isn't viable. You couldn't accomplish it in a way that wouldn't be so full of holes as to be functionally useless.

Probably the only thing they could do to fix this problem for good is to basically not include any modern equipment at all.

  • Even then, don't they have the type of userbase that would still keep arguing about equipment that is not in the game?