Comment by TiredOfLife
1 year ago
The thing is, how the tokenizing works is about as relevant to the person asking the question as the name of the cat of the delivery guy who delivered the GPU that the LLM runs on.
How the tokenizer works explains why a model can't answer the question; the name of the cat doesn't explain anything.
This is Hacker News, we are usually interested in how things work.
Indeed, I appreciate the explanation; it is certainly both interesting and informative to me. But to somewhat echo the person you are replying to: if I wanted a boat, and you offer me a boat, and it doesn't float, the reasons for failure are perhaps full of interesting details, but the most important thing to focus on first is to make the boat float, or to stop offering it to people who need a boat.
To paraphrase how this thread started: someone was testing different boats to see whether they could simply float, and they couldn't. And the reply questioned the validity of testing boats on whether they can simply float.
At least this is how it sounds to me when I am told that our AI overlords can’t figure out how many Rs are in the word “strawberry”.
At some point you need to just accept the details and limitations of things. We do this all the time. Why does your calculator give only an approximate result? Why can't your car go backwards as fast as forwards? And so on. It sucks that everyone gets exposed to the relatively low-level implementation with LLMs (almost the raw model), but that's the reality today.
The test problem is emblematic of a type of synthetic query that can fail but has limited import in actual usage.
For instance, you could ask it for a JavaScript function to count any letter in any word, pass it "r" and "strawberry", and it would be far more useful.
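The kind of function described above is trivial for an LLM to produce correctly; a minimal sketch might look like this (the function name is illustrative):

```javascript
// Count occurrences of a letter in a word, case-insensitively.
function countLetter(word, letter) {
  let count = 0;
  for (const ch of word.toLowerCase()) {
    if (ch === letter.toLowerCase()) count++;
  }
  return count;
}

console.log(countLetter("strawberry", "r")); // 3
```

The point being: the model that stumbles when asked to count letters directly can still write code that counts them reliably.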
Having edge cases doesn't mean it's not useful. It is neither a free assistant nor a coder who doesn't expect a paycheck. At this stage it's a tool that you can build on.
To engage with the analogy. A propeller is very useful but it doesn't replace the boat or the Captain.
It is, however, a highly relevant thing to be aware of when evaluating an LLM for 'intelligence', which was the context this was brought up in.
Without looking at the word 'strawberry', or spelling it one letter at a time, can you rattle off how many letters are in the word off the top of your head? No? That is what we are asking the LLM to do.
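To make the tokenizer point concrete, here is an illustrative sketch. The token split and IDs below are made up for illustration; real tokenizers (e.g. BPE) vary by model, but the principle is the same: the model receives opaque token IDs, not characters.

```javascript
// Hypothetical subword split of "strawberry" (illustrative only).
const tokens = ["str", "aw", "berry"];

// What the model actually "sees": opaque integer IDs (made-up values).
const tokenIds = [496, 672, 15717];

// From tokenIds alone there is no way to count the r's. We can only
// count them here because we cheat and use the underlying strings,
// which the model never gets character-level access to.
const rCount = tokens.join("").split("").filter((c) => c === "r").length;
console.log(rCount); // 3
```

This is why "how many Rs are in strawberry" probes the input representation rather than reasoning ability.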