Comment by cannoneyed
1 day ago
That’s exactly right. The fine-tuned Qwen model was able to generate seamless pixels most of the time, but you can find lots of places around the map where it failed.
More interestingly, not even the biggest, smartest image models can tell whether a seam exists (likely due to the way they represent image tokens internally).
Did you ever consider using something like https://github.com/jenissimo/unfake.js/ in your process, to make the output closer to proper pixel art?
Maybe to process the Nano-Banana-generated dataset before fine-tuning, and then also to fix the generated Qwen output?
I'm curious why you didn't generate new tiles one at a time and simply expand the input area on the sides with already-generated neighbors. It looks like your infill model doesn't really care about tile sizes, and I doubt it really needs full adjacent tiles to match style. Why 2x2 tile inputs rather than, say, generating new tiles one at a time with 50px of bordering tile added on each side that already has a pixel-art neighbor?
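To make the question concrete, here is a rough sketch of the one-tile-at-a-time idea: compute the crop window for a single tile plus a 50px margin, and record which margin strips already hold pixel art. This is only an illustration of the suggestion, not the author's pipeline; the 512px tile size, the `infill_window` name, and the grid-coordinate scheme are all assumptions, and diagonal (corner) neighbors are ignored for brevity.

```python
TILE = 512    # hypothetical tile size in pixels (not stated in the thread)
BORDER = 50   # context strip width suggested in the comment


def infill_window(tx, ty, generated, tile=TILE, border=BORDER):
    """Return the crop box for tile (tx, ty) and the border strips with known pixels.

    `generated` is a set of (tx, ty) grid coordinates already rendered as pixel art.
    Boxes are (left, top, right, bottom) in map pixel coordinates.
    """
    left, top = tx * tile, ty * tile
    right, bottom = left + tile, top + tile
    # Full input window: the tile itself plus a margin on every side.
    window = (left - border, top - border, right + border, bottom + border)

    # Only strips that touch an already-generated neighbor carry usable context.
    strips = {}
    if (tx - 1, ty) in generated:
        strips["west"] = (left - border, top, left, bottom)
    if (tx + 1, ty) in generated:
        strips["east"] = (right, top, right + border, bottom)
    if (tx, ty - 1) in generated:
        strips["north"] = (left, top - border, right, top)
    if (tx, ty + 1) in generated:
        strips["south"] = (left, bottom, right, bottom + border)
    return window, strips


# Tile (2, 3) with pixel-art neighbors to the west and north only.
window, strips = infill_window(2, 3, {(1, 3), (2, 2)})
```

Everything inside `window` but outside the returned strips would be masked for the model to fill in, so a tile with fewer than four finished neighbors simply gets fewer context strips.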
Yeah, I actually did that quite a bit too. I didn't want to get too bogged down in the nitty-gritty of the tiling algorithm because it's actually quite difficult to communicate in writing (which probably contributed to it being hard to get AI to implement).
The issue is that the overall style was not consistent from tile to tile, so you'd see some drift, particularly in color, and it shows in quite a few places on the map.
Have you tried constraining the color palette by post-processing?
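For example, one simple form of palette post-processing is to snap every pixel to its nearest entry in a fixed palette. The sketch below assumes plain Euclidean distance in RGB and uses a made-up four-color palette; `nearest_color`, `snap_to_palette`, and `PALETTE` are illustrative names, not part of any tool mentioned in this thread.

```python
def nearest_color(pixel, palette):
    """Return the palette entry with the smallest squared RGB distance to pixel."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


def snap_to_palette(image, palette):
    """Map every pixel of a row-major image (rows of RGB tuples) onto the palette."""
    return [[nearest_color(px, palette) for px in row] for row in image]


# Hypothetical shared palette applied to every tile after generation.
PALETTE = [(0, 0, 0), (255, 255, 255), (200, 60, 40), (60, 120, 200)]

# A 2x2 "tile" whose colors have drifted slightly from the palette.
tile = [
    [(10, 5, 0), (250, 250, 240)],
    [(190, 70, 50), (70, 110, 190)],
]
snapped = snap_to_palette(tile, PALETTE)
```

Because every tile is mapped onto the same palette, small color drift between neighboring tiles collapses to identical values, though larger drift could still land on a different palette entry.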
Oh, that makes sense, thanks for explaining! And thanks for sharing your process and result! It's interesting to see how you did it, and looking at the map really tickles my nostalgia.
There would have to be some tiles which don't have all four neighbors generated yet.