Comment by amelius

2 days ago

A dataset with only data from before 2024 will soon be worth billions.

2022. When ChatGPT first came out. https://arstechnica.com/ai/2025/06/why-one-man-is-archiving-...

  • I’ve already gotten into the habit of sticking “before:2022” in YT if what I’m looking for doesn’t need to be recent.

    The AI slop/astroturfing of YT is near complete.

    • I would say around 2023. I refuse to believe the slop propagated that fast.

      And there's more than enough content for one person to consume. Very little reason to consume content newer than 2023.

Synthetic data is already being embraced. Turns out you actually can create good training data with these models.