Comment by cheald

3 months ago

I do (same username), but I haven't published any of this (and in fact my GitHub has sadly languished lately); I keep working on it with the intent to publish eventually. The big problem with models like this is that the training dynamics have so many degrees of freedom that every time I get close to something I want to publish, I end up chasing down another set of rabbit holes.

https://gist.github.com/cheald/7d9a436b3f23f27b8d543d805b77f... - here's a quick dump of my SVDLora module, though. I wrote it for use in OneTrainer, but it should be easy enough to adapt to other frameworks. If you want to try it out, I'd love to hear what you find.

2 comments

ttul 3 months ago

This is super cool work. I’ve built some new sampling techniques for flow matching models that encourage the model to take a “second look” by rewinding sampling to a midpoint and then running the clock forward again. This worked really well with diffusion models (pre-DiT models like SDXL), and I was curious whether it would work with flow matching models like Qwen Image. Yes, it does, but the design is different because flow matching models aren’t denoising pixels so much as following a vector field at each step, like a ship being pushed by the wind.
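For readers who want something concrete, here is a rough toy sketch of what a "rewind to a midpoint and run forward again" sampler could look like for a flow matching model. It is not ttul's actual method, which isn't published in this thread. Everything here is an assumption for illustration: the convention that t=1 is pure noise and t=0 is data, the re-noising rule used for the rewind, the hypothetical `velocity` function standing in for a trained model, and the parameter names `t_rewind_from` and `t_rewind_to`.

```python
import random

def velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity field.

    Assumes the linear path x_t = (1 - t) * x0 + t * noise, for which the
    ideal velocity toward a single known x0 (`target`) is (x_t - x0) / t.
    """
    t = max(t, 1e-6)  # guard against division by zero near t = 0
    return [(xi - ti) / t for xi, ti in zip(x, target)]

def euler_segment(x, t_start, t_end, steps, target):
    """Follow the vector field from t_start to t_end with fixed Euler steps."""
    dt = (t_end - t_start) / steps  # negative when moving from noise to data
    t = t_start
    for _ in range(steps):
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x

def sample_with_rewind(target, t_rewind_from=0.3, t_rewind_to=0.6,
                       steps=10, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise at t = 1

    # First pass: integrate from t = 1 down to t_rewind_from.
    x = euler_segment(x, 1.0, t_rewind_from, steps, target)

    # Estimate the clean sample implied by the current state: x0 = x_t - t * v.
    v = velocity(x, t_rewind_from, target)
    x0_hat = [xi - t_rewind_from * vi for xi, vi in zip(x, v)]

    # Rewind (assumed rule): re-noise the estimate back up to t_rewind_to
    # using fresh noise, so the second pass follows a different trajectory.
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    x = [(1.0 - t_rewind_to) * c + t_rewind_to * n
         for c, n in zip(x0_hat, noise)]

    # Second pass: run the clock forward again, from t_rewind_to to t = 0.
    return euler_segment(x, t_rewind_to, 0.0, steps, target)

result = sample_with_rewind([1.0, -2.0, 0.5])
```

With a real model the velocity would come from the network rather than a known target, so the second pass would generally land somewhere different from the first; in this toy both passes converge to `target` by construction.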

cheald 3 months ago

Neat! Is that published anywhere?
It seems conceptually related to DDPM/ancestral sampling, no? Except those just add noise to the intermediate latent to simulate a "trajectory jump". How does your method compare?