Comment by ertgbnm

2 days ago

> with AI-generated content excluded from pre-training.

> without distillation from third-party models

sounds like zero unless they are lying.

> with AI-generated content excluded from pre-training.

Though this is largely impossible these days, unless they pre-trained only on pre-AI-era data.

  • That could be. Just use pre-training for language understanding and let the post-training on synthetic data do the heavy lifting.

"how many of those shapes are rectangles?" "sounds like zero unless they are squares"

Adding "unless" to a statement makes it vacuous if the latter clause is weaker than the first clause. I find it hard to believe that a company willing to violate licenses would have scruples about lying about it.

  • Not vacuous, but tautological. The two are different: tautologies can actually be quite directly informative, whereas vacuous truths tend to be oblique.

    Also, “Microsoft is lying” is not a logically stronger statement, because they might be lying about something other than whether they distilled or trained on AI output.

  • > Adding "unless" to a statement makes it vacuous if the latter clause is weaker than the first clause

    I think that's the point. "How do I say they're lying without outright saying they're lying?"

    It's a common rhetorical trick.

    • Or the speaker is just not in the mood to argue with someone whose response will be, "You trust anything Microsoft says?"