Great work! The fact that it captures "long-range order" seemingly perfectly is something not many have been able to do before! And the "collapse" visualization is great fun to watch.
But is your algorithm really qualitatively all that different from previous search methods (e.g. Efros and Leung), if you are still (uniform random?) sampling over the input distribution of patches?
I notice also your input textures tend to have sharp boundaries (as is common in pixel art). It would be interesting to see the results when the input has a lot of "noise", such as high def images of rocks or clouds ;)
While I still prefer search methods because they are easy to implement (and give better results), "deep" methods are definitely gaining some ground:
Deep Textures
http://bethgelab.org/deeptextures/
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
https://arxiv.org/abs/1601.04589
Thanks!
Efros and Leung's method doesn't satisfy the (C1) condition. The closest previous work is Paul Merrell's model synthesis.
WFC and texture synthesis serve similar purposes: they produce images similar to the input image. However, the definition of what is "similar" is different in each case. If you have a high-def input with noise (like realistic rocks and clouds), then you really want to use texture synthesis methods. If you have an indexed image with few colors and you want to capture... something like the inner rules of that image and long-range correlations (if you have the output of a cellular automaton, for example, or a dungeon), then you want to use WFC-like methods.
Btw, I have classic texture synthesis algos in a different repo: https://github.com/mxgmn/SynTex
> something like the inner rules of that image and long range correlations
I assume that if you feed WFC a large input image, it just thinks of that as a very complex set of rules that are harder to satisfy than those of a small input?
Is there a way, then, to instead train the WFC algorithm on a large corpus of small, similar samples, such that it can try to derive the rules common to all the inputs in the corpus, and produce one image that "really" fits the rules, rather than just the ephemeral quirks from an individual sample?
Would there be, for example, a way to train WFC to produce outputs matching the level-design "aesthetic" of a given game, rather than just "continuing" a particular level?
predating Efros & Leung by many years: http://draves.org/fuse/
Thanks for the reply!
So what I'm really after is a way to iteratively "perturb" the sample in such a way that the resulting texture will smoothly reflect those changes.
Take a look at this rough illustration:
Op-Art Texture Synthesis
http://imgur.com/a/1IKN3
You mentioned rotation and mirroring type bitmap ops in your algorithm. But how do we precisely apply a series of dynamic filters to the input to "evolve" the patch? Even using overlapping kernels it seems quite intensive! And prone to artefacts...
Didn't know where else to contact you, but I made a straight [Java port](https://github.com/Aqwis/JavaWaveFunctionCollapse) just as a fun exercise. Feel free to add it to your README.md if you want.
how did you come to understand all of this? are you an academic?
This is great!
It would be interesting to apply this concept to wavelets (instead of pixels or voxels) in order to work on real-life pictures.
Also, 3 dimensions as in X, Y and time, to work on animated GIFs. Think about an infinite, never repeating Blue Ball Machine! http://m.imgur.com/5Flh68G
This is crazy, and I think it hints at the possibility of universe creation:
you start from a finite pattern, which becomes the 'rules' of your created universe. Then, using this wave function collapse algorithm you expand it into an infinity where the possibilities are endless within the constraints of those generator rules
If you enjoy pondering on that, you may enjoy reading Permutation City [1] by Greg Egan.
[1] http://www.goodreads.com/book/show/156784.Permutation_City
Premise of New Kind of Science, basically.
Fantastic stuff. I love it! As with most machine learning stuff and these very cool ideas (which I am very hazy on, especially their categorizations), I immediately want to use it for our lab's brain work.
We have a LOT of 3-D images (each voxel is ~200 nm × 200 nm × ~600 nm, for a lot of 1028x1028 .tiff images all stitched together) and would love to feed these images BACKWARDS into this. I.e. we have the 'far field', and we want the 'elements' that make it up. Say we have a large amount of data on the pituitary gland, and we want to compare the 'elements' of the pituitary to the 'elements' of the hippocampus. We know a lot of these differences, but there may be 'something else' in there that we humans are not seeing. This may be of great use to us for disease pathology like Lewy body disease precursors.
Sounds like an altogether easier problem, assuming you don't care to compensate for noise in your data somehow. Supposing you want to identify all distinct m×n×k-sized "elements", simply use some appropriate rolling hash[1] (i.e. a hash of a window that you can update in constant time as you slide the window) as a key mapping to a list of "elements" you have seen so far with that hash, and only do a pointwise comparison (to see if you have seen that exact pattern before) when the hashes match. Assuming you don't have too many distinct elements, this should give you performance close to linear in the size of your data.
[1] https://en.wikipedia.org/wiki/Rolling_hash
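A minimal sketch of that scheme in 1D (the bookkeeping and names here are mine, not from any library; the 3D case would roll the hash along each axis in turn):

```python
# Polynomial rolling hash over sliding windows, with an exact pointwise
# check only when hashes collide.
def distinct_windows(data, k, base=257, mod=(1 << 61) - 1):
    """Return every distinct length-k window in `data` as a tuple."""
    seen = {}                 # hash -> distinct windows seen with that hash
    topk = pow(base, k, mod)  # weight of the element leaving the window
    h = 0
    for i, x in enumerate(data):
        h = (h * base + x) % mod                 # push the new element in
        if i >= k:
            h = (h - data[i - k] * topk) % mod   # pop the old one out
        if i >= k - 1:
            window = tuple(data[i - k + 1 : i + 1])
            bucket = seen.setdefault(h, [])
            if window not in bucket:  # exact check only on a hash match
                bucket.append(window)
    return [w for ws in seen.values() for w in ws]
```

Correctness doesn't depend on the hash being collision-free (colliding windows still get compared pointwise); only the near-linear running time does.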
Whew, now that was a Wikipedia binge. Thanks for the jumping-off point! Unfortunately, our data is VERY noisy. We can do some techniques to smooth the data, but the unfortunate part is that the things that matter in biology are under the diffraction limit of light. Inherently, what we want out of the data will then be noisy. Gestalt structures are less noisy and these techniques can help with that (think using a fiber-optic read-out as you go into the brain for surgery so you know what region you are in), at least I think. Also, the 'elements' of our data set are unknown, but likely very large. Maybe in the 100s to 1000s, not 5 or 10. It turns out the brain is complicated, who knew?!
Still, thanks a ton for this info! I think it can really help with some computational bio stuff another lab is working on here (viral similarity in genes in your DNA)
Nicely done! Your 3d extensions could be further extended (performance notwithstanding) to create some pretty cool procedurally generated VR experiences. Thanks for sharing.
Brilliant!
Don't really understand the technique, but would like your thoughts on if it's possible to give a Penrose tile set as a seed and see if aperiodic order is generated.
Lovin' it!
Thanks!
I'm not sure, but I think that Penrose tilesets are what I call "easy": you can't run into a situation where you can't place a new tile. It would be great if someone here could confirm or deny this.
So if this is the case, then Penrose tilesets are not interesting to WFC, because you can produce arbitrary tilings with much simpler algorithms.
Right now though WFC is only working with square tiles, but it's not hard to generalize it to arbitrary shapes. Paul F. Harrison made a tiling program that supports hex tiles: http://logarithmic.net/pfh/ghost-diagrams See also the relevant paragraph in the readme (just search the word "easy").
Agreed. The best way to find 'complexity' and perhaps aperiodicity is by using contradictory rules.
source: my own quasicrystal simulations (http://www.nature.com/nmat/journal/v14/n1/extref/nmat4152-s2...)
Penrose tiles are "easy" on an unbounded canvas, but I'm pretty sure they're a 100%-probable "contradiction" (because they're aperiodic) on a bounded toroidal canvas.
I think you're right about Penrose tiles being easy. One could, however, potentially use this with polygonal tiles to find their Heesch numbers and maybe even make some advancements in solving some of the many unsolved problems associated with Heesch tiling.
Maybe it could be interesting to place Penrose tiles with the simple algorithm, but color them with your algorithm just like you're currently coloring squares.
You need a backtracking algorithm to do Penrose tiles, so yeah, you can get stuck.
As I understand it: this treats image generation as constraint satisfaction. The constraints are that each NxN patch appears in the source image. The satisfaction method is arc consistency https://en.wikipedia.org/wiki/AC-3_algorithm, except, when that settles down prematurely, pick the least-constrained patch and make a random valid choice, then continue. (If this leads to getting stuck, then give up instead of backtracking.)
Is that the idea? The description wasn't completely clear to me.
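For what it's worth, here is a toy sketch of the loop I described, on a 1D strip (the `allowed` adjacency test and all the names are my own invention, not the repo's API):

```python
import random

def propagate(domains, allowed):
    """AC-3-style filtering over left/right neighbours until a fixed point."""
    changed = True
    while changed:
        changed = False
        for j in range(len(domains) - 1):
            left = {a for a in domains[j]
                    if any(allowed(a, b) for b in domains[j + 1])}
            right = {b for b in domains[j + 1]
                     if any(allowed(a, b) for a in domains[j])}
            if left != domains[j] or right != domains[j + 1]:
                domains[j], domains[j + 1] = left, right
                changed = True

def collapse(n, tiles, allowed, seed=0):
    """Observe the most constrained undecided cell, propagate, repeat."""
    rng = random.Random(seed)
    domains = [set(tiles) for _ in range(n)]
    while True:
        undecided = [i for i, d in enumerate(domains) if len(d) > 1]
        if not undecided:
            return [next(iter(d)) for d in domains]
        i = min(undecided, key=lambda i: len(domains[i]))
        domains[i] = {rng.choice(sorted(domains[i]))}  # random valid choice
        propagate(domains, allowed)
        if any(not d for d in domains):
            raise RuntimeError("contradiction: give up, no backtracking")
```

E.g. with tiles `"ab"` and `allowed = lambda a, b: a != b`, the output strictly alternates; the real algorithm also weights the random choice by input frequency.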
Yes, (C1) is a constraint problem. But we also want to satisfy (C2) as closely as possible, otherwise we could have just colored some outputs in a single color.
Yes, I took that to mean, when you're making a random valid choice, the weights come from the input distribution.
I don't understand about "a single color" unless you mean setting all output pixels to the same color, which would only satisfy the constraints if the input has an NxN patch all one color.
I hope my comment didn't seem to imply that the work seemed unoriginal. I don't think that; I wanted to check my reading.
(I should've said most-constrained)
Mindblowing... could this technique also be applied to music?
It should be; at least at the audio level it would work. It would make a nice VST :)
Something similar does exist: http://labs.echonest.com/Uploader/index.html
It detects places in songs where the sound is similar, and can jump between these places in the track to make a seamless unending song.
I doubt it, because music is 1-dimensional and for 1-dimensional arrays WFC is just a Markov chain.
There are three-dimensional views of music, e.g. time-frequency-amplitude. See https://en.wikipedia.org/wiki/Spectrogram
How is music 1-dimensional?
Perhaps possible to slice the music wave data into frames, perform FFT to get a spectrogram and use that as tiles?
Amazing. Make sure to scroll through the whole README to see voxel models. Rendered using http://ephtracy.github.io/index.html which is in itself pretty cool too.
Excellent! I especially like the example third from the top (black/white "rooms"). This would be a great tool to generate maps for roguelikes in a more "organic" fashion. Not only that, but the bitmap source you use for input would provide wildly different results.
Given that and the simplicity of just providing a bitmap as input, this could be adapted well to provide customization to the player as well.
I don't have much to contribute to the conversation on this other than that it's incredibly cool.
My thoughts exactly. I will definitely read through this to understand it, but my first impression was "This is so fucking neat!"
I'm missing a lot here. How does this equate to wave function collapse?
> with the help of ideas from quantum mechanics

So it doesn't do the actual quantum mechanics, but it was inspired by QM.
Yea, my only disagreement here is the name.
This requires no more up front work than a Markov-based technique, but produces results on par with an L-grammar. It may be generally applicable to the problem domains of both!
This reminds me of the quite powerful PatchMatch algorithm [1]. Is it related?
Also note how PatchMatch can work with constraints.
[1] http://vis.berkeley.edu/courses/cs294-69-fa11/wiki/images/1/...
PatchMatch is an algorithm to quickly... match similar patches in an image, it is used in a lot of texture synthesis algos. See my answer to fitzwatermellow for the difference between texture synthesis and WFC. So yes, it's related.
Photoshop's implementation of PatchMatch handles constraints perfectly, yes.
You should add some gifs with a lower frame rate. I had to download the one you had and walk through frame by frame. Absolutely beautiful.
Thanks!
Yeah, you are right, I'll upload slower gifs. Right now the YouTube video has the slowest playback speed; in fact, it has segments with no frame-skipping at all: https://youtu.be/DOQTr2Xmlz0
lower framerate, or lower playback speed?
Probably meant playback speed. Probably you knew this.
That's incredible!
This could be great for generating maps in a sprite based strategy game!
As a person currently rewriting the mapgen algorithm for their Dwarf-Fortress-like game, I'm very excited to experiment with this this weekend =D
Cool, got anything to show? :)
This could be extra awesome if it could be made small and fast enough to be able to execute as a fragment shader.
Well, right now it is not fast at all. :) But I plan to think about the problem of generating pixel shaders from examples in the future.
This is great. Someone should make a Minecraft plugin.
The way I see it, this would be the perfect way to produce semi-altered terrain sprites automatically. Right now, the textures are pseudo-randomly rotated for things like dirt blocks, grass tops, and stone. [1] However, with this you could create a multitude of different textures based on a single source.
Plus, dungeons could be a lot more interesting. It'd be interesting to see what could be done with it...
[1] - http://i.imgur.com/UJeLy.png
I don't have much to contribute other than to say this is really amazing, and I want to throw all kinds of things at it and see what happens. Pardon my ignorance of how this works, does this algorithm have anything to do with symmetry breaking?
Thanks!
No, not really. ConvChain though is related to symmetry breaking, the same way as MCMC simulation of the Ising model is https://github.com/mxgmn/ConvChain
ok thanks. The ConvChain project is also very interesting.
Can you feed it something other than bitmaps? Like its own source code?
Source code is a 1-dimensional array. For 1-dimensional arrays WFC is just a Markov chain. 2 and higher dimensional arrays are much more interesting because they have cycles, and there is no canonical way to generalize Markov chains to higher dimensions.
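To make that concrete, here is a hedged sketch (the function name and the cyclic-sample trick are mine) of what 1D "WFC" with length-N windows boils down to, for N ≥ 2: an order-(N-1) Markov chain trained on the sample.

```python
import random
from collections import defaultdict

def markov_1d(sample, n, length, seed=0):
    """Order-(n-1) Markov chain over a cyclic 1D sample (requires n >= 2)."""
    rng = random.Random(seed)
    ext = sample + sample[: n - 1]   # treat the sample as cyclic, so every
    table = defaultdict(list)        # observed context has a successor
    for i in range(len(sample)):
        table[ext[i : i + n - 1]].append(ext[i + n - 1])
    out = sample[: n - 1]            # start from a context seen in the input
    while len(out) < length:
        out += rng.choice(table[out[-(n - 1) :]])
    return out
```

Every length-n window of the output then occurs somewhere in the cyclic sample, which is exactly the 1D version of the (C1) condition, and sampling successors by frequency gives (C2) for free.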
This will be abstract, but you seem to know your abstract algebra -- is it possible to do this kind of thing with graphs? It should be, right? And we all know code can be constructed with graphs, so… voila, you can generate code, no?
Just to make sure I understand, if I were to use 1D WFC with 1xN tiles, would it be the same as an (N-1)th order Markov chain? Or would it be a 1st-order Markov chain with (N-1) simultaneous outputs?
What about code in a bitmap-based language like Piet?
https://en.wikipedia.org/wiki/Esoteric_programming_language#...
On that tangent, it could be useful for fuzzing inputs to image analysis algorithms.
The README says it uses real-valued probability amplitudes, not complex-valued wavefunction. Could the simulation be done with complex numbers and if so how would the results differ?
We need to interpret those coefficients somehow. Real coefficients can be interpreted as mixing of colors, but for complex ones I don't see a good interpretation.
Very cool, might try do something like this in a game I've been thinking about writing.
Can this approach be used to generate "missing parts" in unfinished song? If composer has few good parts but is too lazy to finish the whole piece, for example? Or to "extend" Moonlight for example?
There are special approaches to generating music. The ones with the best quality-to-complexity ratio that I know of are Markov constraints https://www.youtube.com/watch?v=buXqNqBFd6E and WaveNet. I don't think WFC offers something useful and new for the generation of music.
Very cool algorithm, and impressive visual results. Thanks for sharing.
Extremely nice for creating procedural maps and other stuff. Really nice.
I love that one of the files is "Stuff.cs".
Super neat visuals and I can't wait to play with this, as someone who knows nothing about this kind of "stuff".
The circuit board example looks alarmingly convincing, even though it's obvious we're not looking at an actual circuit...
Amazing. If this is fast enough, I imagine it could do wonders for games (terrain and background generation).
Very impressive. I have no use for your work whatsoever yet would love to dive into it further.
WOW, this is amazing. Really impressed seeing how it can adapt to multiple patterns.
This is amazing, and could have very cool applications in procedural generation.
Cool. I wonder what the results are on a high resolution image.
Google maps could maybe be interesting. Create alternate Earth maps to use... somehow... logistics? climate?
I wonder too =). But it'll run like forever on a high-res image. For a high-res image you want to use something like texture synthesis; see my reply to fitzwatermellow for more.
I don't quite get it, but damn it's awesome!
This is pretty awesome, but there is no LICENSE file so I'm assuming no one is allowed to use it in their own projects.
The source files say it's distributed under the MIT license.
I'm not experienced with license law, but people told me that it's better to have the license text in the source files themselves, because I have samples in the repo that I have no idea who holds the rights to.
The license is MIT.
I opened this ticket: https://github.com/mxgmn/WaveFunctionCollapse/issues/2
Thanks for posting this! Really exciting work!
What language is this?
How can I compile / run it?
It's C#. I'm not familiar with it, which is why I was surprised that after downloading Visual Studio Tools, I opened the Developer Command Prompt, ran `csc *.cs`, and was left with one nice executable, Main.exe.
Wow!