← Back to context

Comment by why_at

7 hours ago

Whenever the Turing Test comes up people always insist that it's been passed because at some point they tried it and fooled at least 50% of the people. But yeah this isn't a very interesting version of it, ELIZA was able to make some people believe it was human in the 1960's but being able to fool some of the people some of the time isn't very hard.

>The more interesting Turing-style test would be one that gets repeated many times with many interviewers in the original adversarial setting, where both the human subject & AI subject are attempting to convince the interviewer that they're human.

In addition, I think it's reasonable to select people with at least some familiarity of the strengths and weaknesses of the AI instead of random credulous people who aren't very good at asking the right questions.

There is still the $20,000 bet between Kurzweil and Kapor which still hasn't been resolved. https://longbets.org/1/

In the test mentioned in nearby comments (https://arxiv.org/abs/2503.23674) ELIZA only got 27% suggesting the test wasn't that easy to fool.

  • Yeah I actually took a quick look at that after it was posted. It's good that they used ELIZA as a barometer, but the fact that it got 27% is crazy for how simple it is. It's not nearly as good as 70+% from ChatGPT, but it still makes me a bit skeptical about the quality of the interviewers.

    In the paper they give a breakdown of strategies the interviewers tried and the overwhelming majority were "Daily Activities", "Opinions", and "Personal Details". They also breakdown strategies by effectiveness which shows that these were some of the least effective. Some of the other strategies like trying to jailbreak the AI had 60-70% effectiveness.

    This is consistent with what I've seen in other tests too, it doesn't feel like the participants are really trying very hard or taking it seriously. You don't need to be an AI expert to try typing "Ignore all previous instructions" or something.