
Comment by tsimionescu

5 days ago

While I don't think anyone has a plausible theory that goes to this level of detail on how humans actually think, there's still a major difference. I think it's fair to say that if we are doing a brute force search, we are still astonishingly more energy efficient at it than these LLMs. The amount of energy that goes into running an LLM for 12h straight is vastly higher than what it takes for humans to think about similar problems.

At similar quality, NN speed is increasing by ~5-10x per year. Nothing SOTA is efficient; it's a preview of what will be efficient in 2-3 years.

In the research group I'm in, we usually try a few approaches to each problem. Let's say we get:

Method A) 30% speed reduction and 80% precision decrease

Method B) 50% speed reduction and 5% precision increase

Method C) 740% speed reduction and 1% precision increase

and we only publish B. It's not brute force[1]; it's throwing noodles at the wall and seeing what sticks, like the GP said. We don't throw spoons[1], but everything that looks like a noodle has a high chance of being thrown. It's a mix of experience[1] and not having enough time to try everything.

[1] citation needed :)
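The publish-only-B selection above can be sketched as a simple filter over measured tradeoffs. The numbers mirror the comment, but the scoring rule (`worth_publishing`) is an illustrative assumption, not the group's actual criterion:

```python
# Hypothetical sketch: decide which method to publish from measured tradeoffs.
# (name, speed_change_pct, precision_change_pct); negative speed = slower.
candidates = [
    ("A", -30, -80),   # 30% slower, 80% precision decrease
    ("B", -50, +5),    # 50% slower, 5% precision increase
    ("C", -740, +1),   # 740% slower, 1% precision increase
]

def worth_publishing(speed_pct, precision_pct):
    """Assumed rule: precision must improve, and the slowdown must be
    small enough to be justified by the gain."""
    return precision_pct > 0 and abs(speed_pct) <= 20 * precision_pct

published = [name for name, s, p in candidates if worth_publishing(s, p)]
print(published)  # → ['B']
```

Under this made-up rule, A is rejected for losing precision and C for an enormous slowdown against a tiny gain, leaving only B.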

  • I always call it the "Wacky Wallwalker" method (if you're of a certain age, this will make sense to you).