Comment by darhodester

21 days ago

Hi,

I'm David Rhodes, Co-founder of CG Nomads, developer of GSOPs (Gaussian Splatting Operators) for SideFX Houdini. GSOPs was used in combination with OTOY OctaneRender to produce this music video.

If you're interested in the technology and its capabilities, learn more at https://www.cgnomads.com/ or AMA.

Try GSOPs yourself: https://github.com/cgnomads/GSOPs (example content included).

I’m fascinated by the aesthetic of this technique. I remember early versions that were completely glitched out and presented 3D clouds of noise and fragments to traverse. I’m curious whether you have any thoughts about creatively ‘abusing’ this tech? Perhaps misaligning things somehow, or using some wrong inputs.

  • There's a ton of fun tricks you can perform with Gaussian splatting!

    You're right that you can intentionally under-construct your scenes; this can create a dream-like effect.

    It's also possible to stylize your Gaussian splats to produce NPR effects. Check out David Lisser's amazing work: https://davidlisser.co.uk/Surface-Tension.

    Additionally, you can intentionally introduce view-dependent ghosting artifacts. In other words, if you take images from a certain angle that contain an object, and remove that object for other views, it can produce a lenticular/holographic effect.

    • Y'all did such a good job with this. It captivated HN and was the top post for the entire day, and will probably last for much of tomorrow.

      If you don't know already, you need to leverage this. HN is one of the biggest channels of engineers and venture capitalists on the internet. It's almost pure signal (minus some grumpy engineer grumblings - we're a grouchy lot sometimes).

      Post your contact info here. You might get business inquiries. If you've got any special software or process in what you do, there might be "venture scale" business opportunities that come your way. Certainly clients, but potentially much more.

      (I'd certainly like to get in touch!)

      --

      edit: Since I'm commenting here, I'll expand on my thoughts. I've been rate limited all day long, and I don't know if I can post another response.

      I believe volumetric is going to be huge for creative work in the coming years.

      Gaussian splats are a huge improvement over point clouds and NeRFs in terms of accessibility and rendering, but the field has so many potential ways to evolve.

      I was always in love with Intel's "volume", but it was impractical [1, 2] and got shut down. Their demos are still impressive, especially from an equipment POV, but A$AP Rocky's music video is technically superior.

      During the pandemic, to get over my lack of in-person filmmaking, I wrote Unreal Engine shaders to combine the output of several Kinect point clouds [3] to build my own lightweight version inspired by what Intel was doing. The VGA resolution of consumer volumetric hardware was a pain, and I was faced with either FPGA solutions for higher real-time resolution or going 100% offline.

      World Labs and Apple are doing exciting work with image-to-Gaussian models [4, 5], and World Labs created the fantastic Spark library [6] for viewing them.

      I've been leveraging splats to do controllable image gen and video generation [7], where they're extremely useful for consistent sets and props between shots.

      I think the next steps for Gaussian splats are good editing tools, segmenting, physics, etc. The generative models are showing a lot of promise too. The Hunyuan team is supposedly working on a generative Gaussian model.

      [1] https://www.youtube.com/watch?v=24Y4zby6tmo (film)

      [2] https://www.youtube.com/watch?v=4NJUiBZVx5c (hardware)

      [3] https://www.twitch.tv/videos/969978954?collection=02RSMb5adR...

      [4] https://www.worldlabs.ai/blog/marble-world-model

      [5] https://machinelearning.apple.com/research/sharp-monocular-v...

      [6] https://sparkjs.dev/

      [7] https://github.com/storytold/artcraft (in action: https://www.youtube.com/watch?v=iD999naQq9A or https://www.youtube.com/watch?v=f8L4_ot1bQA )

I remember splatting being introduced as a way to capture real-life scenes, but one of the links you have provided in this discussion seems to have used a traditional polygon mesh scene as training input for the splat model. How common is this, and why would one do it that way over e.g. vertex shader effects that give the mesh a splatty aesthetic?

  • Yes, it's quite trivial to convert traditional CG to Gaussian splats. We can render our scenes/objects just as we would capture physical spaces. The additional benefit of using synthetic data is 100% accurate camera poses (alignment), which means the structure-from-motion (SfM) step can be bypassed entirely (see the sketch at the end of this reply).

    It's also possible to splat from textured meshes directly; see: https://github.com/electronicarts/mesh2splat. This approach yields high-quality, PBR-compatible splats, but it is not quite as efficient as a traditional training workflow. It will likely become mainstream in third-party render engines going forward.

    Why do this?

    1. Consistent, streamlined visuals across a massive ecosystem, including content creation tools, the web, and XR headsets.

    2. High-fidelity, compressed visuals. With SOGs compression, splats are going to become the dominant 3D representation on the web (see https://superspl.at).

    3. E-commerce (product visualizations, tours, real estate, etc.).

    4. Virtual production (replace green screens with giant LED walls).

    5. View-dependent effects without (traditional) shaders or lighting.

    It's not just about the aesthetic, it's also about interoperability, ease of use, and the entire ecosystem.
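
    For anyone curious what "bypassing SfM" looks like in practice, here is a minimal sketch (my own illustration, not GSOPs code) that writes known synthetic camera poses into COLMAP's text layout, which common 3DGS training pipelines can read directly. The single shared PINHOLE camera and the exact file layout are assumptions on my part, so double-check conventions against your trainer of choice.

      # Minimal sketch: export known (synthetic) camera poses in COLMAP's text
      # format so a 3DGS trainer can skip structure from motion entirely.
      # Assumes COLMAP's world-to-camera convention (x_cam = R @ x_world + t)
      # and quaternions stored as (qw, qx, qy, qz).
      import os
      import numpy as np

      def rotmat_to_quat(R):
          """Convert a 3x3 rotation matrix to a (qw, qx, qy, qz) quaternion."""
          qw = np.sqrt(max(0.0, 1.0 + R[0, 0] + R[1, 1] + R[2, 2])) / 2.0
          qx = np.sqrt(max(0.0, 1.0 + R[0, 0] - R[1, 1] - R[2, 2])) / 2.0
          qy = np.sqrt(max(0.0, 1.0 - R[0, 0] + R[1, 1] - R[2, 2])) / 2.0
          qz = np.sqrt(max(0.0, 1.0 - R[0, 0] - R[1, 1] + R[2, 2])) / 2.0
          qx = np.copysign(qx, R[2, 1] - R[1, 2])
          qy = np.copysign(qy, R[0, 2] - R[2, 0])
          qz = np.copysign(qz, R[1, 0] - R[0, 1])
          return qw, qx, qy, qz

      def write_colmap_text(poses, fx, fy, cx, cy, width, height, out_dir="sparse/0"):
          """poses: list of (image_name, R_world2cam (3x3), t_world2cam (3,)) tuples."""
          os.makedirs(out_dir, exist_ok=True)
          with open(f"{out_dir}/cameras.txt", "w") as f:
              f.write(f"1 PINHOLE {width} {height} {fx} {fy} {cx} {cy}\n")
          with open(f"{out_dir}/images.txt", "w") as f:
              for i, (name, R, t) in enumerate(poses, start=1):
                  qw, qx, qy, qz = rotmat_to_quat(R)
                  f.write(f"{i} {qw} {qx} {qy} {qz} {t[0]} {t[1]} {t[2]} 1 {name}\n")
                  f.write("\n")  # empty keypoint line expected after each image entry
          # points3D.txt can be left empty or seeded from the source mesh.
          open(f"{out_dir}/points3D.txt", "w").close()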

From the article:

>Evercoast deployed a 56 camera RGB-D array

Do you know which depth cameras they used?

  • We (Evercoast) used 56 RealSense D455s. Our software can run with any camera input, from depth cameras to machine vision to cinema REDs. But for this, RealSense did the job. The higher end the camera, the more expensive and time consuming everything is. We have a cloud platform to scale rendering, but it’s still overall more costly (time and money) to use high res. We’ve worked hard to make even low res data look awesome. And if you look at the aesthetic of the video (90s MTV), we didn’t need 4K/6K/8K renders.

    • You may have explained this elsewhere, but if not: what kind of post-processing did you do to upscale or refine the RealSense video?

      Can you add any interesting details on the benchmarking done against the RED camera rig?

  • Couldn’t you just use iPhone Pros for this? I developed an app specifically for photogrammetry capture using AR and the depth sensor, as it seemed like a cheap alternative.

    EDIT: I realize a phone is not on the same level as a RED camera, but I just saw iPhones as a massively cheaper option than the alternatives in the field I worked in.

    • ASAP Rocky has a fervent fanbase that has been anticipating this album. So I'm assuming that whatever record label he's signed to gave him the budget.

      And when I think back to another iconic hip hop video (iconic within that genre) where they used practical effects and military helicopters chasing speedboats in the waters off of Santa Monica...I bet they had change to spare.

    • A single camera only captures the side of the object facing the camera. Knowing how far away the camera-facing side of a Rubik's Cube is helps if you're making educated guesses (novel view synthesis), but it won't solve the problem of actually photographing the back side.

      A cube usually has six sides, which means you need at least six iPhones around an object to capture all of them and then be able to move around it freely. At that point you might as well look at open-source alternatives rather than relying on Apple's surprise boxes.

      If your subject is static, such as a building, then of course you can wave a single iPhone around it and get a result comparable to more expensive rigs.

Hi David, have you looked into alternatives to 3DGS like https://meshsplatting.github.io/ that promise better results and faster training?

  • I have. Personally, I'm a big fan of hybrid representations like this. An underlying mesh helps with relighting, deformation, and effective editing operations (a mesh is a sparse node graph for an otherwise unstructured set of data).

    However, surface-based constraints can prevent thin structures (hair/fur) from reconstructing as well as they do with vanilla 3DGS. They might also inhibit certain reflections and transparency from being reconstructed as accurately.

Random question, since I see your username is green.

How did you find out this was posted here?

Also, great work!

  • My friend and colleague shared a link with me. Pretty cool to see this trending here. I'm very passionate about Gaussian splatting and developing tools for creatives.

    And thank you!

I've been mesmerized by the visuals of Gaussian splatting for a while now. Congratulations on your great work!

Do you have any benchmarks on the geometric precision of these reproductions?

  • Thank you!

    Geometric analysis for Gaussian splatting is a bit like comparing apples and oranges. Gaussian splats are not really discrete geometry, and their power lies in overlapping semi-transparent blobs. In other words, their benefit is as a radiance field and not as a surface representation.

    However, assuming good camera alignment and real-world scale enforced at the capture and alignment steps, the splats should match real-world units quite closely (mm to cm accuracy). See: https://www.xgrids.com/intl?page=geomatics.
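
    If you do want a number for a specific capture, a common sanity check is to compare the splat centers against a reference scan of the same space. A minimal sketch of that check follows (my own illustration with hypothetical file names, assuming the plyfile and scipy packages and that both point sets are already co-registered in the same real-world units):

      # Minimal sketch: nearest-neighbor distances from Gaussian splat centers
      # to a reference scan (e.g., LiDAR). Hypothetical inputs; assumes both
      # point sets are co-registered and expressed in meters.
      import numpy as np
      from plyfile import PlyData
      from scipy.spatial import cKDTree

      def load_xyz(path):
          v = PlyData.read(path)["vertex"]
          return np.column_stack([v["x"], v["y"], v["z"]]).astype(np.float64)

      splat_centers = load_xyz("scene_splats.ply")    # Gaussian means from the trained splat
      reference = load_xyz("lidar_reference.ply")     # ground-truth scan of the same space

      dists, _ = cKDTree(reference).query(splat_centers, k=1)
      print(f"median error: {np.median(dists) * 1000:.1f} mm, "
            f"95th percentile: {np.percentile(dists, 95) * 1000:.1f} mm")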

Nice work.

I can see that relighting is still a work in progress, as the virtual spotlights tend to look flat and fake. I understand that you are just brightening the splats that fall inside the spotlight cone and darkening the ones hidden behind lots of splats.

Do you know if there are plans for Gaussian splats to capture unlit albedo, roughness, and metalness, so we can relight in a more realistic manner?

Also, environment radiosity doesn't seem to translate to the splats, am I right?

Thanks

  • Thank you!

    There are many ways to relight Gaussian splats. However, the highest quality results currently come from raytracing/path tracing render engines (such as Octane and VRay), with 2D diffusion models in second place. Relighting with GSOPs nodes does not yield the same quality, but it can be baked into the model and exported elsewhere; it is the only approach that stores the relit information in the original splat scene. (A crude sketch of the cone-falloff idea from your question appears at the end of this reply.)

    That said, you are correct that in order to relight more accurately, we need material properties encoded in the splats as well. I believe this will come sooner rather than later, via inverse rendering and material decomposition, or technology like Beeble Switchlight (https://beeble.ai). This data can ultimately be predicted from multiple views and trained into the splats.

    "Also, environment radiosity doesnt seem to translate to the splats, am I right?"

    Splats do not have their own radiosity in that sense, but if you have a virtual environment, its radiosity can be translated to the splats.
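
    To make the "brighter inside the cone" idea from your question concrete, here is a crude sketch of baking a spotlight falloff into splat colors by scaling the zeroth-order (DC) spherical harmonic term per splat. This is my own simplification for illustration, not how the GSOPs relight nodes work, and it ignores occlusion and shadowing entirely:

      # Crude sketch: bake a spotlight falloff into splat colors by scaling the
      # DC spherical-harmonic coefficients per splat. A simplification for
      # illustration only; no occlusion, shadowing, or view dependence.
      import numpy as np

      def bake_spotlight(centers, f_dc, light_pos, light_dir, cone_angle_deg,
                         inside_gain=1.6, outside_gain=0.5):
          """centers: (N, 3) splat positions; f_dc: (N, 3) DC SH coefficients."""
          light_dir = light_dir / np.linalg.norm(light_dir)
          to_splat = centers - light_pos
          to_splat /= np.linalg.norm(to_splat, axis=1, keepdims=True)
          cos_angle = to_splat @ light_dir          # cosine of angle to the cone axis
          cos_cutoff = np.cos(np.radians(cone_angle_deg))
          # Blend smoothly between the outside and inside gains near the cone edge.
          t = np.clip((cos_angle - cos_cutoff) / (1.0 - cos_cutoff), 0.0, 1.0)
          gain = outside_gain + (inside_gain - outside_gain) * t
          return f_dc * gain[:, None]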

  • Back in 2001 I was the math consultant for "A Beautiful Mind". One spends a lot of time waiting on a film set. Eventually one wonders why.

    The majority of wait time was the cinematographer lighting each scene. I imagined a workflow where secondary digital cameras captured 3D information, and all lighting took place in post production. Film productions hemorrhage money by the second; this would be a massive cost saving.

    I described this idea to a venture capitalist friend, who concluded one already needed to be a player to pull this off. I mentioned this to an acquaintance at Pixar (a logical player) and they went silent.

    Still, we don't shoot movies this way. Not there yet...