Comment by Lerc
14 hours ago
This will be the future of a class of 3D game. The prompt may not be text, however.
An input of a kind of schematic representation of what the designer wants would be better. It may resemble a storyboard or a collection of organised notes that large projects tend to already use.
Fully generative could probably do some cool things, but people will still want to bring their personal vision to life.
Curious, why wouldn't the future be a full world model like Google's Genie? It just renders every pixel so someone could still make their vision come to life via a prompt too.
It could be done that way, but you are spending parameters managing the fact that the output changes completely with a change in view position or orientation. An observer-independent model only has to manage changes to things that are actually changing in the world.
Since you can view Gaussian splats from any POV, you end up generating an output that is closer to a representation of the world itself than to the projection a single observer sees.
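To make that concrete, here is a toy sketch of the difference (my own illustration, not how Genie or any real splat renderer works). `WORLD_POINTS`, `project` and the camera parameters are all hypothetical names, and the "splats" are reduced to bare 3D points run through a simple pinhole camera. The point is only that moving the camera changes every projected coordinate while the world data stays exactly the same.

```python
import math

# Hypothetical world state: centres of a few "splats" in world coordinates.
WORLD_POINTS = [(0.0, 0.0, 5.0), (1.0, 0.5, 6.0), (-1.0, -0.5, 4.0)]


def project(points, cam_pos, yaw, focal=1.0):
    """Pinhole projection for a camera at cam_pos, rotated by yaw about the Y axis.

    With yaw = 0 the camera looks along +Z. Points behind the camera are dropped.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    pixels = []
    for x, y, z in points:
        # Translate into camera-centred coordinates.
        dx, dy, dz = x - cam_pos[0], y - cam_pos[1], z - cam_pos[2]
        # Rotate world axes into camera axes (inverse of the camera's yaw).
        cx = c * dx - s * dz
        cz = s * dx + c * dz
        if cz <= 0:
            continue  # behind the camera, not visible
        pixels.append((focal * cx / cz, focal * dy / cz))
    return pixels


world_before = list(WORLD_POINTS)

# Same static world, two different observers.
view_a = project(WORLD_POINTS, cam_pos=(0.0, 0.0, 0.0), yaw=0.0)
view_b = project(WORLD_POINTS, cam_pos=(0.5, 0.0, 0.0), yaw=0.1)

# Count how many projected points moved between the two views.
moved = sum(1 for a, b in zip(view_a, view_b) if a != b)
print("projected points that changed:", moved, "of", len(view_a))
print("world representation changed:", WORLD_POINTS != world_before)
```

A model that predicts pixels directly has to account for everything that `project` does on every camera move. A model that outputs something like `WORLD_POINTS` only has to update when the world itself changes.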
Yeah, when you describe that, I picture Wave Function Collapse to generate a map schematic... And then a text prompt, and some style photos the designers want it to match.