Comment by ttul
3 days ago
Prompt understanding will only ever be as good as the text embeddings fed into the model's input. Google's hardware can host massive language models that will never run on your desktop GPU. By contrast, Flux and its kin have to make do with relatively small text encoders (Qwen Image uses a 7B-param LLM).