Comment by BoorishBears

8 hours ago

At the risk of sounding unfairly stubborn about something I'm not that familiar with: if they've been at it for 2 years, I'm imagining a very different (and much more difficult) pipeline than fine-tuning an image model with an LLM backbone.

Having a full-sized LLM behind the generations enables a massive jump in understanding here: https://ghost.oxen.ai/fine-tuned-qwen-image-edit-vs-nano-ban...