Comment by vidarh
2 months ago
It is clear it is not, given that we have examples of models that handle these cases.
I don't even know what you mean by "architecturally all checks are implemented and mandated". It suggests you may think these models work very differently to how they actually work.
> given that we have examples of models that handle
The suggestions come from the failures, not from the success stories.
> what you mean by "architecturally all checks are implemented and mandated"
That NN models have an explicit module which works as a conscious mind and does lucid ostensive reasoning ("pointing at things") that is reliably respected in their conclusions. That module must be stress-tested and proven reliable. Results-based success stories alone are not enough.
> you may think these models work very differently to how they actually work
I am interested in how they should work.
> The suggestions come from the failures, not from the success stories.
That thinking is flawed. The successes conclusively prove that the issue isn't systemic, because there is a solution.
> That NN models have an explicit module which works as a conscious mind and does lucid ostensive reasoning ("pointing at things") that is reliably respected in their conclusions.
Well, this isn't how LLMs work.
> That module must be stress-tested and proven reliable. Results-based success stories alone are not enough.
Humans aren't reliable. You're setting the bar at a level well beyond what is necessary, and almost certainly beyond what is possible.
> I am interested in how they should work.
We don't know how they should work, because we don't know what the optimal organisation is.
> The successes ... prove that the issue isn't systemic because there is a solution
The failures prove the possibility of a user not encountering said solution. The solution will have to be explicit, because we need to know whether it works (practically) and how it works (scientifically). And said solution will have to be convincing as covering all branches of the general problem, of which "not really counting" is just a hint - "not properly handling mental objects" is what we fear, the «suggestion of a systemic issue» I mentioned.
> Well, this isn't how LLMs work
Yes, and that is precisely the issue, because using an implementation of deliriousness as a tool is a problem. They must be fixed - we need the real thing.
> Humans aren't reliable. You're setting the bar at a level well beyond what is necessary
The flaws found in humans have proven nothing from the start ("My cousin speaks just like Eliza" // "Well, don't ask her then"; "The Nobel prize winner failed" // "And they still remain a better consultant than others", etc.).
We implement automated versions of qualities that are only incidentally found in humans - and that is simply because tools are created to enhance the problem-solving practices we already tackled with what we had.
And in this case (LLMs), there are qualities found in nature that are not yet there and must be implemented, lest our tools become implementations of psychiatric cases: foremost here, the conscious (as opposed to the intuitive unconscious).
> and almost certainly beyond what is possible
It's necessary. And I do not see what would justify doubts about the possibility (if only because we implemented the symbolic well before NNs, and because in early NNs the problem of implementing deterministic logic was crucial...). We are dealing with black boxes; we plainly have to understand them as required and perfect (complete) them.
> what the optimal organisation is
There are clear hints for that. The absence of a "complete" theory of mind is not a show-stopper - the features to be implemented are clear to us.
> It suggests you may think these models work very differently to how they actually work.
It suggests to me the opposite: that he thinks there can be no solution that doesn't involve externally policing the system (which it quite clearly needs anyway, to solve other problems with trusting the output).
Given that newer/bigger models handle it, we have a solution that doesn't require "externally policing the system", so that is clearly not the case.