Comment by gessha

9 days ago

The Mechanical Turk, over and over again

https://en.wikipedia.org/wiki/Mechanical_Turk

The funny thing is you could probably make money on Amazon Mechanical Turk by hooking it up to an LLM. We’re at this weird limbo point in history where the fraud could go either way, depending on what you think you’re paying for…

  • Mechanical Turk exists because there is a line below which people are cheaper, even for massively parallel tasks.

    If an LLM really costs less for the level of tasks that are paid for on MT right now, I assume there would be a brief arbitrage period followed by a readjustment of that line (or just MT shutting down if it doesn't make sense anymore).

    • You're forgetting completion typically isn't binary.

      Take judging response pairs for DPO, for example: how do you ever prove someone used ChatGPT?

      ChatGPT is good enough to decide in a way that will feel internally consistent, and even if you ask MTurk users to provide their logic, ChatGPT can produce a convincing response. Eventually you're forced to start measuring noisy 2nd- and 3rd-order signals like "did the writing in their rationale sound like ChatGPT?"

      And what's especially tough is that this disproportionately affects hard-to-verify tasks, which are exactly the kinds of tasks you'd generally want MTurk for.

  • I was warned and then suspended from MTurk around a decade ago while testing a workflow for audio transcription that worked a little too well. Not sure if the policies are more flexible today, but there was a lot of low-hanging fruit back then.