Comment by noman-land
9 hours ago
Really amazing video. Unfortunately this article is like 60% over my head. Regardless, I actually love reading jargon-filled statements like this that are totally normal to the initiated but are completely inscrutable to outsiders.
"That data was then brought into Houdini, where the post production team used CG Nomads GSOPs for manipulation and sequencing, and OTOY’s OctaneRender for final rendering. Thanks to this combination, the production team was also able to relight the splats."
Hi, I'm one of the creators of GSOPs for SideFX Houdini.
The gist is that Gaussian splats can replicate reality quite effectively with many 3D ellipsoids (stored as a type of point cloud). Houdini is software that excels at manipulating vast numbers of points, and renderers (such as Octane) can now leverage this type of data to integrate with traditional computer graphics primitives, lights, and techniques.
Can you put "Gaussian splats" in some kind of real world metaphor so I can understand what it means? Either that or explain why "Gaussian" and why "splat".
I am vaguely aware of stuff like Gaussian blur in Photoshop. But I never really knew what it does.
Sure!
Gaussian splatting is a bit like photogrammetry. That is, you can record video or take photos of an object or environment from many angles and reproduce it in 3D. Gaussians have the capability to "fade" their opacity based on a Gaussian distribution. This allows them to blend together in a seamless fashion.
The splatting process is achieved by using gradient descent from each camera/image pair to optimize these ellipsoids (Gaussians) such that they reproduce the original inputs as closely as possible. Given enough imagery and sufficient camera alignment, performed using Structure from Motion, you can faithfully reproduce the entire space.
Read more here: https://towardsdatascience.com/a-comprehensive-overview-of-g....
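To make the gradient descent part concrete, here is a deliberately tiny sketch in Python. It is not how GSOPs or any real splatting trainer works: real pipelines optimize many 3D Gaussians (position, shape, opacity, color) against photos through a differentiable renderer. This toy fits the center and amplitude of a single 1D Gaussian to a row of target "pixel" values. The function names, learning rate, and step count are all assumptions chosen for illustration.

```python
import math

def gaussian(x, center, amplitude, sigma=1.0):
    """Height of a 1D Gaussian bump at position x."""
    return amplitude * math.exp(-0.5 * ((x - center) / sigma) ** 2)

def fit_gaussian(xs, target, center, amplitude, sigma=1.0, lr=2.0, steps=1000):
    """Plain gradient descent on the mean squared error between the
    rendered bump and the target samples (sigma is held fixed)."""
    n = len(xs)
    for _ in range(steps):
        grad_center = 0.0
        grad_amplitude = 0.0
        for x, t in zip(xs, target):
            shape = math.exp(-0.5 * ((x - center) / sigma) ** 2)
            error = amplitude * shape - t
            # Derivatives of the squared error with respect to each parameter
            grad_amplitude += 2.0 * error * shape / n
            grad_center += 2.0 * error * amplitude * shape * (x - center) / sigma ** 2 / n
        center -= lr * grad_center
        amplitude -= lr * grad_amplitude
    return center, amplitude

# "Ground truth" the optimizer never sees directly: center 2.0, amplitude 0.7
xs = [i * 0.25 for i in range(-20, 41)]
target = [gaussian(x, center=2.0, amplitude=0.7) for x in xs]

# Start from a wrong guess and let gradient descent pull it toward the target
center, amplitude = fit_gaussian(xs, target, center=0.5, amplitude=0.3)
print(round(center, 3), round(amplitude, 3))
```

The fitted values should land close to the ones used to generate the target. Splatting applies the same idea at a much larger scale, with images as the targets.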
> I am vaguely aware of stuff like Gaussian blur on Photoshop. But I never really knew what it does.
Blurring is a convolution or filter operation. You take a small patch of the image (say, 5x5 pixels) and you convolve it with another fixed matrix, called a kernel. Convolution here means: multiply element-wise and sum. You replace the center pixel with the result.
https://en.wikipedia.org/wiki/Box_blur is the simplest kernel: all ones, divided by the number of entries in the kernel. Every pixel becomes the average of itself and its neighbors, which looks blurry. Gaussian blur is calculated in an identical way, but the matrix elements follow the "height" of a 2D Gaussian with some amplitude. This gives a smoother-looking result, as farther pixels have less influence. The bigger the kernel, the blurrier the result. There are a lot of these basic operations:
https://en.wikipedia.org/wiki/Kernel_(image_processing)
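The box and Gaussian kernels described above can be sketched in a few lines of plain Python. This is only an illustration, not how Photoshop or any particular library implements blur. The helper names are made up, edges are handled by clamping (one of several possible choices), and the kernel is not flipped, which makes no difference for symmetric kernels like these.

```python
import math

def box_kernel(size):
    """All entries equal; they sum to 1 so overall brightness is preserved."""
    weight = 1.0 / (size * size)
    return [[weight] * size for _ in range(size)]

def gaussian_kernel(size, sigma):
    """Entries follow the height of a 2D Gaussian centered on the middle cell."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def apply_kernel(image, kernel):
    """For every pixel: multiply the surrounding patch by the kernel
    element-wise and sum. Out-of-range neighbors are clamped to the edge."""
    h, w = len(image), len(image[0])
    size = len(kernel)
    half = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(size):
                for kx in range(size):
                    sy = min(max(y + ky - half, 0), h - 1)
                    sx = min(max(x + kx - half, 0), w - 1)
                    acc += image[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out

# A single bright pixel in the middle of a dark 5x5 image
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0

box = apply_kernel(image, box_kernel(3))
gauss = apply_kernel(image, gaussian_kernel(3, 1.0))
```

With the box kernel the bright pixel spreads evenly over its 3x3 neighborhood. With the Gaussian kernel the center keeps the most brightness, the direct neighbors get less, and the diagonal neighbors get the least.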
If you see "Gaussian", it implies the distribution is used somewhere in the process, but splatting and image kernels are very different operations.
For what it's worth I don't think the Wikipedia article on Gaussian Blur is particularly accessible.
> explain why "Gaussian" and why "splat".
Happily. Gaussian splats are a technique for 3D images, related to point clouds. They do the same job (take a 3D capture of reality and generate pictures later from any point of view "close enough" to the original).
The key idea is that instead of a bunch of points, it stores a bunch of semi-transparent blobs, or "splats". The transparency increases quickly with distance from the center of each blob, following a normal distribution, also known as the "Gaussian distribution".
Hence, "Gaussian splats".
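As a rough sketch of that falloff, here is a simplified 1D version in Python. Real splats are 3D ellipsoids projected onto the screen, so treat this only as an illustration of the idea. The function names and the (center, sigma, peak opacity, color) tuple layout are assumptions made up for this example.

```python
import math

def splat_alpha(distance, peak_opacity, sigma):
    """Opacity of one splat at a given distance from its center.
    It is highest at the center and fades following a Gaussian curve."""
    return peak_opacity * math.exp(-0.5 * (distance / sigma) ** 2)

def composite(splats, x):
    """Blend 1D splats at position x, nearest splat first.
    Each splat is (center, sigma, peak_opacity, color), with color as a
    single gray value. Whatever light one splat lets through
    (the transmittance) is what the splats behind it can still contribute."""
    color = 0.0
    transmittance = 1.0
    for center, sigma, peak_opacity, splat_color in splats:
        alpha = splat_alpha(abs(x - center), peak_opacity, sigma)
        color += transmittance * alpha * splat_color
        transmittance *= 1.0 - alpha
    return color

# A white splat in front of a dimmer gray one, overlapping in the middle
splats = [(0.0, 1.0, 0.8, 1.0), (1.5, 1.0, 0.8, 0.4)]
for x in (0.0, 0.75, 1.5):
    print(x, round(composite(splats, x), 3))
```

Because each splat fades out smoothly rather than ending at a hard edge, neighboring splats blend into each other instead of showing visible seams.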
How can you expect someone to tailor a custom explanation, when they don’t know your level of mathematical understanding, or even your level of curiosity? You don’t know what a Gaussian blur does; do you know what a Gaussian is? How deeply do you want to understand?
If you’re curious start with the Wikipedia article and use an LLM to help you understand the parts that don’t make sense. Or just ask the LLM to provide a summary at the desired level of detail.
My bad! I am the author. Gaussian splatting allows you to take a series of normal 2D images or a video and reconstruct very lifelike 3D from it. It’s a type of radiance field, like NeRFs or voxel-based methods like Plenoxels!
Corridor has done some great stuff with Gaussian Splats, I recommend this video for a primer!
https://youtube.com/watch?v=cetf0qTZ04Y
Reminds me of Kurtwood Smith’s piping sales pitch in The Patriot