Comment by AuryGlenz
23 days ago
Quality is increasing, but these small models have very little knowledge compared to their big brothers (Qwen Image/Full size Flux 2). As in characters, artists, specific items, etc.
Agreed - given what the Tongyi-MAI Lab accomplished with a 6B model, I would love to see what they could do with something larger - somewhere in the 15-20B range, between these smaller models (ZiT, Klein) and the significantly larger ones (Flux.2 dev).
I smell the bias-variance tradeoff. By underfitting more, they get closer to the degenerate case of a model that only knows one perfect photo.
That's what LoRAs are for.
And small models are also much easier to fine-tune than large ones.
I hate that excuse. I want the model to know who the Paw Patrol are without either hunting down a LoRA (which probably won't exist, since the ones that do are mostly porn) or having to build a dataset, tag it, and train one myself.