Comment by handoflixue

2 months ago

https://claude.ai/share/e7fc2ea5-95a3-4a96-b0fa-c869fa8926e8

It's really not hard to get them to reach the correct answer on this class of problems. Want me to have it spell it backwards and strip out the vowels? I'll be surprised if you can find an example this model can't one shot.

(Can't see it now because of maintenance but of course I trust it - that some get it right is not the issue.)

> if you can find an example this model can't

Then we have a problem of understanding why some work and some do not, and we have a due diligence crucial problem of determining whether the class of issues indicated by the possibility of fault as shown by many models are fully overcome in the architectures of those which work, or whether the boundaries of the problem are just moved but still tainting other classes of results.