Comment by woolion

2 months ago

I didn't go very far with my own benchmarks because my results were just so bad. But for example, here's a piece of line art that I asked the models to color (I can't remember the prompt, I didn't take notes).

https://woolion.art/assets/img/ai/ai_editing.webp

In order: the original, ChatGPT, Flux.

Still, you can see that ChatGPT just throws everything out and doesn't make even a minimal attempt at respecting the style. Flux is quite bad, but it follows the design much more closely (although it gets completely confused by it), so it seems that with a whole lot of work you could get something out of it.

3 comments

vunderba 2 months ago

Yeah so NOVEL style transfer without the use of a trained LoRA is, to my knowledge, still a relatively unsolved problem. Even in SOTA models like Nano Banana Pro, if you attach several images with a distinct artistic style that is outside of its training data and use a prompt such as:

"Using the attached images as stylistic references, create an image of X"

it falls down pretty hard.

https://imgur.com/a/o3htsKn

woolion 2 months ago
I'm pretty sure that at least one model advertised that this would work. I also think your example was in the training data at some point, at least, but I suspect these styles get more or less pruned when the models are steered towards "aesthetically pleasing" outputs, which are often used as benchmarks. Thanks for the replies, it's quite informative.
- vunderba 2 months ago
  
  Sure! That image was pretty zoomed out, so I've gone ahead and attached some of the reference images in greater detail:
  https://imgur.com/a/failed-style-transfer-nb-pro-o3htsKn
  Now you should be able to see that the generated image is stylistically not even close to the references (which are early works by Yoichi Kotabe). Pay careful attention to the characters.
  With locally hostable models, you can try things like Reference/Shuffle ControlNets, but that's not always successful either.