Comment by asadotzler

2 months ago

We do understand. We don't think that's okay. If a model cannot manage character level consideration, that's a serious flaw that's got potential to lead to an immeasurable number of failure states. "Duh, of course it can't count" is not the best look for a bot whose author tells us it's got PhD-level skill.

3 comments

asadotzler

zahlman 2 months ago

I do think it's "okay". After all, it's clear that fixing it would require a fundamentally different approach.

I just also think it's a reason to mock people who don't try to understand those limitations and get way ahead of themselves hyping up the technology.

The entire point of this exercise is to refute the claim that LLMs are a step towards AGI, even given "agency". And we should be happy that they aren't — because supposing that AGI is possible, the way that we currently treat LLMs shows that we as a species are nowhere near ready for the consequences of creating it.

johnfn 2 months ago

Can you enumerate some of these "immeasurable number of failure states"? For me it starts and stops at "can't count letters in a word". That hardly seems catastrophic.

All I have to do is turn on thinking mode and the error goes away. https://chatgpt.com/share/6897e630-77f0-800c-a9bf-30d9c0e271...

mikewarot 2 months ago

So, if an AI can just spit out the cure for cancer, but spells some things wrong, it's not intelligent?

You think all PhD candidates have perfect spelling? I'd wager most of them re-read their dissertation and edit it, over and over, a process that most LLMs don't have the luxury of doing.

We'd have to give up all the efficiency of tokenizing, re-train a model (a much less optimum model) for at least twice as long to get anywhere near the same results for one that just spits out ASCII.