Comment by cubefox
7 hours ago
Part of the problem is that the LLM isn't generating the image itself; it repeatedly prompts a separate image-editing diffusion model to make edits. The Gemini reasoning summary shows part of this. The style of some of the images also makes it clear that the underlying diffusion model is derived from Imagen 4.
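
A toy sketch of that kind of loop, purely illustrative: `plan_edits`, `apply_edit`, and the `Image` class are made-up stand-ins, not anything Gemini or Imagen actually exposes, and the "image" here is just a list of the edits applied so far.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    # Stand-in for pixel data: only records which edit prompts were applied.
    history: list[str] = field(default_factory=list)


def plan_edits(request: str) -> list[str]:
    """Hypothetical LLM step: turn the user request into edit prompts."""
    # A real LLM would reason about the request; this toy just splits on commas.
    return [part.strip() for part in request.split(",") if part.strip()]


def apply_edit(image: Image, prompt: str) -> Image:
    """Hypothetical edit-model step: return a new image for one prompt."""
    return Image(history=image.history + [prompt])


def generate(request: str) -> Image:
    image = Image()
    # The LLM never produces pixels here; it only steers the edit model with text.
    for prompt in plan_edits(request):
        image = apply_edit(image, prompt)
    return image


result = generate("draw a clock, set hands to 10:10, add roman numerals")
print(result.history)
```

The point of the sketch is the division of labor: whatever the planner intends has to survive being squeezed into a text prompt for the second model, so details can get lost at each step.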