Comment by skeledrew (2 months ago), 23 comments: What's the systematic flaw?
lottin (2 months ago): The fact that it can't count.
skeledrew (2 months ago): That isn't a flaw though. Counting is orthogonal to the functioning of LLMs, which are merely completing patterns based on their training data and available context. If you want an LLM to count reliably, give it a tool.
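(Editor's note: as a concrete illustration of the "give it a tool" point above, here is a minimal Python sketch of the kind of deterministic letter-counting function one could expose to a model through a tool-calling interface. The function name, its signature, and the example word are illustrative assumptions, not any specific vendor's API.)

    # Minimal sketch: a deterministic counting tool an LLM could call
    # instead of guessing from tokens. Name and interface are assumptions.
    def count_letter(word: str, letter: str) -> int:
        """Return how many times `letter` occurs in `word` (case-insensitive)."""
        return word.lower().count(letter.lower())

    # The classic failure case: how many r's are in "strawberry"?
    print(count_letter("strawberry", "r"))  # prints 3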
mdp2021 (2 months ago): Anything articulate (hence possibly convincing) which could be «merely [guessing]» should either be locked out of consequential questions, or fixed. [17 replies →]
minimaxir (2 months ago): If an LLM can get IMO Gold but can't count, that's an issue.
lottin (2 months ago): I think the issue is that it was advertised as having PhD-level intelligence, while in fact it can't count the letters in a word.
utopcell (2 months ago): This particular LLM did not get an IMO Gold.