Comment by scott_w

1 year ago

I suspect the same thing. Rather than LLMs “learning to play chess,” they “learnt” to recognise a chess game and hand over instructions to a chess engine. If that’s the case, I don’t feel impressed at all.

This is exactly what I feel AI needs: a manager AI that then hands things off to specialized, more deterministic algorithms/machines.
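
To make the hand-off concrete, here is a minimal sketch of that manager/specialist split. Everything in it is hypothetical: classify_task() stands in for whatever recognises the task, and the specialist entries stand in for real engines.

    import re

    # Hypothetical "manager" routing sketch: recognise the task, then
    # dispatch to a deterministic specialist or fall back to the LLM.
    def classify_task(prompt: str) -> str:
        if re.match(r"^([rnbqkpRNBQKP1-8]+/){7}[rnbqkpRNBQKP1-8]+ [wb] ", prompt):
            return "chess"  # crude FEN sniff
        if re.fullmatch(r"[\d\s+\-*/().^]+", prompt):
            return "math"   # looks like plain arithmetic
        return "general"

    SPECIALISTS = {
        "chess": lambda p: "hand off to a chess engine",
        "math": lambda p: "hand off to a calculator/CAS",
    }

    def handle(prompt: str) -> str:
        fallback = lambda p: "answer with the LLM itself"
        return SPECIALISTS.get(classify_task(prompt), fallback)(prompt)

    print(handle("2 + 2 * 3"))  # hand off to a calculator/CAS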

  • Next thing, the "manager AIs" start stack ranking the specialized "worker AIs".

    And the worker AIs "evolve" to meet/exceed expectations only on tasks directly contributing to KPIs the manager AIs measure for - via the mechanism of discarding the "less fit to exceed KPIs".

    And some of the worker AIs who're trained on the recent/polluted internet happen to spit out prompt injection attacks that work against the manager AIs' rank-stacking metrics and dominate over "less fit" worker AIs. (Congratulations, we've evolved AI cancer!) These manager AIs start performing spectacularly badly compared to other non-cancerous manager AIs, and die or get killed off by the VCs paying for their datacenters.

    Competing manager AIs get trained, perhaps on newer HN posts discussing this emergent behavior of worker AIs, and start to down-rank any exceptionally performing worker AIs. The overall trend towards mediocrity becomes inevitable.

    Some greybeard writes some Perl and regexes that outcompete commercial manager AIs on pretty much every real-world task, while running on a 10-year-old laptop instead of a cluster of nuclear-powered AI datacenters all consuming a city's worth of fresh drinking water.

    Nobody in powerful positions cares. Humanity dies.

    • And the “comment of the year” award goes to…

      Sorry for the filler, but this is amazingly put and so true.

      We’ll get so many unintended consequences that run opposite to any worthy goals when it’s AIs talking to AIs in a few years.

  • Basically what Wolfram Alpha rolled out 15 years ago.

    It was impressive then, too.

    • It is good to see other people buttressing Stephen Wolfram's ego. It is extraordinarily heavy work and Stephen can't handle it all by himself.

  • While deterministic components may be a left-brain default, there is no reason that such delegate services couldn't be more specialized ANN models themselves. It would most likely vastly improve performance if they were evaluated in the same memory space using tensor connectivity. In the specific case of chess, it is helpful to remember that AlphaZero utilizes ANNs as well.
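
    A hedged sketch of the "same memory space, tensor connectivity" idea: a learned gate blending two small specialist sub-networks inside one graph (assumes PyTorch; the sizes are arbitrary, not a real architecture).

        import torch
        import torch.nn as nn

        class GatedSpecialists(nn.Module):
            # Two specialist sub-networks plus a gate, all evaluated in a
            # single tensor graph rather than via an external hand-off.
            def __init__(self, dim: int = 16):
                super().__init__()
                self.specialists = nn.ModuleList([nn.Linear(dim, dim) for _ in range(2)])
                self.gate = nn.Linear(dim, 2)

            def forward(self, x):
                w = torch.softmax(self.gate(x), dim=-1)                   # (batch, 2)
                outs = torch.stack([s(x) for s in self.specialists], -1)  # (batch, dim, 2)
                return (outs * w.unsqueeze(1)).sum(-1)                    # (batch, dim)

        y = GatedSpecialists()(torch.randn(4, 16))  # output shape (4, 16)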

  • Multi-agent LLMs are already a thing.

    • Somehow they're not in the limelight, and lack a well-known open-source runner implementation (like llama.cpp).

      Given the potential, they should be winning hands down; so where is that?

That's something completely different from what the OP suggests, and it would be a scandal if true (i.e. gpt-3.5-turbo-instruct actually using something else behind the scenes).

  • Ironically it's probably a lot closer to what a super-human AGI would look like in practice, compared to just an LLM alone.

    • Right. To me, this is the "agency" thing, that I still feel like is somewhat missing in contemporary AI, despite all the focus on "agents".

      If I tell an "agent", whether human or artificial, to win at chess, it is a good decision for that agent to decide to delegate that task to a system that is good at chess. This would be obvious to a human agent, so presumably it should be obvious to an AI as well.

      This isn't useful for AI researchers, I suppose, but it makes the AI more useful as a tool.

      (This may all be a good thing, as giving AIs true agency seems scary.)

  • The point of creating a service like this is for it to be useful, and if recognizing and handing off tasks to specialized agents isn't useful, I don't know what is.

    • If I were sold a product that can generically solve problems, I’d feel a bit ripped off if I’m told after purchase that I need to build my own problem solver and a way to recognise when to use it…

  • If they came out and said it, I don’t see the problem. LLMs aren’t the solution for a wide range of problems. They are a new tool, but not everything is a nail.

    I mean, it already hands off a wide range of tasks to Python… this would be no different.

TBH I think a good AI would have access to a Swiss army knife of tools and know how to use them. For a complicated math equation, for example, using a calculator is just smarter than doing it in your head.
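
As a sketch of that calculator hand-off (pure standard library; the expression is just an example), here is a deterministic arithmetic tool an AI could call instead of "doing it in its head":

    import ast
    import operator

    # Whitelisted operators only, so this stays a calculator, not an eval().
    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv,
           ast.Pow: operator.pow, ast.USub: operator.neg}

    def calculate(expr: str) -> float:
        def walk(node):
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            if isinstance(node, ast.BinOp):
                return OPS[type(node.op)](walk(node.left), walk(node.right))
            if isinstance(node, ast.UnaryOp):
                return OPS[type(node.op)](walk(node.operand))
            raise ValueError("not plain arithmetic")
        return walk(ast.parse(expr, mode="eval").body)

    print(calculate("3 * (2 + 4) ** 2"))  # 108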

  • We already have the chess "calculator", though. It's called Stockfish. I don't know why you'd ask a dictionary how to solve a math problem.
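
    For what it's worth, calling that chess "calculator" from Python is only a few lines with the python-chess library (assumes pip install chess and a Stockfish binary on your PATH):

        import chess
        import chess.engine

        # Ask Stockfish, the deterministic specialist, rather than the LLM.
        engine = chess.engine.SimpleEngine.popen_uci("stockfish")
        board = chess.Board()  # or chess.Board(fen) for any position
        info = engine.analyse(board, chess.engine.Limit(time=0.5))
        best = engine.play(board, chess.engine.Limit(time=0.5)).move
        print(info["score"], best)
        engine.quit()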

    • Chess might not be a great example, given that most people interested in analyzing chess moves probably know that chess engines exist. But it's easy to find examples where this approach would be very helpful.

      If I'm an undergrad doing a math assignment and want to check an answer, I may have no idea that symbolic algebra tools exist or how to use them. But if an all-purpose LLM gets a screenshot of a math equation and knows that its best option is to pass it along to one of those tools (see the sketch below), that's valuable to me even if it isn't valuable to a mathematician, who would have just cut out the LLM middleman and gone straight to the solver.

      There are probably a billion examples like this. I'd imagine lots of people are clueless that software exists which can help them with some problem they have, so an LLM would be helpful for discovery even if it's just acting as a pass-through.
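
      A minimal sketch of that math pass-through, using SymPy as the symbolic tool (the equation is just an example):

          import sympy as sp

          # The deterministic check the LLM could delegate to.
          x = sp.symbols("x")
          print(sp.solve(sp.Eq(x**2 - 5*x + 6, 0), x))  # [2, 3]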

    • A generalist AI with a "chatty" interface that delegates to specialized modules for specific problem-solving seems like a good system to me.

      "It looks like you're writing a letter" ;)

    • You take a picture of a chess board and send it to ChatGPT and it replies with the current evaluation and the best move/strategy for black and white.

Recognize a task and hand it over to a specialist engine? That sounds genuinely useful for an AI. Maybe I am missing something.

  • It's because this has been standard practice since the early days; there's nothing newsworthy in this at all.

  • How do you think AIs are (correctly) solving simple mathematical questions they have not been trained on directly? They hand them over to a specialist maths engine.

    • This is a relatively recent development (<3 months), at least for OpenAI, where the model will generate code to solve the math and use the response.
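
      A heavily hedged sketch of that loop: model_generated stands in for whatever code the model emits, and a real system would sandbox this step rather than shelling out directly.

          import subprocess
          import sys

          # Stand-in for code the model wrote to answer a math question.
          model_generated = "print(sum(i * i for i in range(10)))"
          run = subprocess.run([sys.executable, "-c", model_generated],
                               capture_output=True, text=True, timeout=5)
          print(run.stdout.strip())  # "285", fed back into the model's reply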

  • It is and would be useful, but it would also be quite a big lie: to the public, but more importantly to paying customers, and even more importantly to investors.

    • The problem is simply that the company has not been open about how it works, so we're all just speculating here.

  • Wasn't that the basis of computing and technology in general? Here is one tedious thing; let's have a specific tool that handles it instead of wasting time and effort. The fact is that properly using a tool takes training, and most current AI marketing hypes that you don't need any. Instead, hand the problem to a GPT and it will "magically" solve it.

  • If I were sold a general AI problem-solving system, I’d feel ripped off if I learned that I needed to build my own problem solver and hook it up after I’d paid my money…

That's not much different from a compiler being rigged to recognize a specific benchmark program and spit out a canned optimization.