Comment by throwaway314155

4 days ago

> as opposed to the weird choice to use CLIP and T5 in the original FLUX

This method was used in tons of image generation models. Not saying it's superior or even a good idea, but it definitely wasn't "weird".

Considering how little benefit (sometimes even negative) it provided in most of them, compared to just using the biggest encoder model and feeding a null prompt to the rest, it's actually pretty weird that people kept doing it. That goes for most of the multi-encoder models, not just the ones using the specific CLIP and T5 combination Flux.1 did.