Comment by quantumHazer

7 months ago

Cool, but don't get me wrong: isn't this essentially similar to Google's Co-Scientist, where multiple models sit in a loop, passing context back and forth and validating each other's outputs? At its core, it's still a system of LLMs, which is impressive in execution but not fundamentally new.

LLMs are undoubtedly useful for tasks like code "optimisation" and for detecting patterns or redundancies that humans might overlook, but this announcement feels like another polished, hypey blog post from Google.

What's also becoming increasingly confusing is their use of the "Alpha" branding. Originally, it was for breakthroughs like AlphaGo or AlphaFold, where there was a clear leap in performance and methodology. Now it's being applied to systems that, while sophisticated, don't really rise to the same level of impact.

edit: I missed the evaluator in my description, but an evaluation method is also applied in Co-Scientist:

"The AI co-scientist leverages test-time compute scaling to iteratively reason, evolve, and improve outputs. Key reasoning steps include self-play–based scientific debate for novel hypothesis generation, ranking tournaments for hypothesis comparison, and an "evolution" process for quality improvement."[0]

[0]: https://research.google/blog/accelerating-scientific-breakth...

They address this in the AlphaEvolve paper:

"While AI Co-Scientist represents scientific hypotheses and their evaluation criteria in natural language, AlphaEvolve focuses on evolving code, and directs evolution using programmatic evaluation functions. This choice enables us to substantially sidestep LLM hallucinations, which allows AlphaEvolve to carry on the evolution process for a large number of time steps."

  • If they ever do change their stance on that and give in to vibe coding, at least there is the opportunity to brilliantly rebrand as DeepVibe.

  • It is interesting how Google turned the tide in the GenAI race and now seems to be leading the pack, with not only great fundamental research but also interesting models and products. To what extent these remain niche nice-to-haves or become a sensation remains to be seen, but I hope that if they don't reach hype status, they might be released to the open-weights world.

    • People often forget that Google was behind MuZero, which IMO is the most important AI paper of the decade (not the Transformer one), because it effectively showed how models can learn how to search.

      For example, it makes much more sense to treat self-driving like a game, where the model learns the evolution of the surrounding environment and how its own actions affect it, and can MCTS its way to correct behavior (sketched below): once it learns the environment dynamics, it can internally simulate crashes and retrain itself.

      If this process is refined (namely, the functions that control the direction of training), you can pretty much start training a model on a dataset of the real world (sights, sounds, and physical interactions, as well as digital ones). As it learns the environment it can be refined further and further, until it can self-evolve its decision making and be truly considered "intelligent".
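
      To make that concrete, here's a rough, illustrative Python sketch of the MuZero-style loop I mean. The representation/dynamics/prediction functions are toy stubs standing in for the trained networks so it runs standalone; it is not the actual MuZero implementation:

        import math

        ACTIONS = (0, 1)  # toy discrete action space

        def representation(observation):
            # h: real observation -> latent state (identity stub)
            return observation

        def dynamics(state, action):
            # g: (latent, action) -> (next latent, reward); a toy stub for
            # the *learned* model -- no real simulator is ever called
            next_state = (state * 31 + action + 7) % 97
            reward = 1.0 if next_state % 2 == 0 else -1.0
            return next_state, reward

        def prediction(state):
            # f: latent -> (policy prior, value estimate); uniform/zero stub
            return {a: 1.0 / len(ACTIONS) for a in ACTIONS}, 0.0

        class Node:
            def __init__(self, prior):
                self.prior, self.visits, self.value_sum = prior, 0, 0.0
                self.state, self.reward, self.children = None, 0.0, {}

            def value(self):
                return self.value_sum / self.visits if self.visits else 0.0

        def puct(parent, child, c=1.25):
            # AlphaZero/MuZero selection rule
            return child.value() + c * child.prior * math.sqrt(parent.visits) / (1 + child.visits)

        def expand(node):
            priors, value = prediction(node.state)
            node.children = {a: Node(p) for a, p in priors.items()}
            return value

        def plan(observation, simulations=50, gamma=0.99):
            root = Node(1.0)
            root.state = representation(observation)
            expand(root)
            for _ in range(simulations):
                node, path, action = root, [root], None
                while node.children:  # select down to an unexpanded leaf
                    action, node = max(node.children.items(),
                                       key=lambda kv: puct(path[-1], kv[1]))
                    path.append(node)
                # unroll the learned dynamics one step to expand the leaf
                node.state, node.reward = dynamics(path[-2].state, action)
                value = expand(node)
                for n in reversed(path):  # back up the discounted value
                    n.visits += 1
                    n.value_sum += value
                    value = n.reward + gamma * value
            # act on the most-visited root action
            return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

        print("planned action:", plan(observation=3))

      The point is that dynamics() is a learned network, not a real simulator: the tree search rehearses outcomes, crashes included, entirely inside the model.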

    • > It is interesting how Google turned the tide in the GenAI race and now seems to be leading the pack

      I think this is perhaps due to Google combining Google Brain and DeepMind, and putting Demis Hassabis at the helm?

      I agree, Google is very much leading the pack in AI now. My worry is that they have recently said they are less inclined to release research into the open if they think it will give their competition a leg up. Demis is more scientist than businessman, so perhaps there's hope that he will be willing to continue releasing research.

Few things are more Google than having two distinct teams building two distinct products that are essentially the same thing.