Comment by sillyfluke

3 hours ago

I would suggest slightly adjusting your expectations by factoring in the difference between video training data and text training data. Due to compute and cost limitations, video training data is much less polluted with AI-generated slop than text is. Also, humans don't produce much biology- and physics-defying fictional video relative to the abundance, and ease of generation, of real-life video.

The main problem currently with LLM text is not that they create incoherent sentences; it's that what they purport to be statements of fact or general consensus often aren't, because they are bullshit machines that become better and more accurate bullshitters the more context-accurate data they are fed. AI videos may still have issues with "looking plausible," whereas LLM text currently has fewer issues with "sounding plausible" and more issues with "being correct" with respect to reality, to which they have no direct connection.

No one is penalizing an AI video generator for creating a scene that never happened in real life.