Comment by moregrist

18 hours ago

> My point is not that hard to understand.

Have you done any serious graphics programming? Even at the OpenGL 1.x level? What you’re saying just doesn’t make sense.

Just because you’re rotating and translating things in 3-space doesn’t negate that you have a stack of transforms that relate a point in world space to one on screen space and you want to be able to project from one to the other.

Nor does it make it any easier when you need to think about how to stack transforms to achieve effects like rendering a mirror.
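As an illustration of the transform stack described above, here is a minimal sketch that takes a world-space point to screen space. It assumes OpenGL-style conventions (camera looking down -Z, normalized device coordinates in [-1, 1], column vectors). The helper names `translation`, `perspective`, and `world_to_screen` are hypothetical and use plain Python lists, not any real graphics API.

```python
import math

def matmul(a, b):
    # 4x4 matrix product a @ b
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    # 4x4 matrix times a 4-component column vector
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fov_y_deg, aspect, near, far):
    # OpenGL-style perspective projection (assumed convention)
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def world_to_screen(point, view, proj, width, height):
    # Stack the transforms: projection applied after view
    clip = transform(matmul(proj, view), [point[0], point[1], point[2], 1.0])
    # Perspective divide: clip space -> normalized device coordinates
    ndc = [c / clip[3] for c in clip[:3]]
    # Viewport mapping: NDC -> pixels, with y flipped so the origin is top-left
    x = (ndc[0] + 1.0) * 0.5 * width
    y = (1.0 - ndc[1]) * 0.5 * height
    return x, y

# Camera sits at (0, 0, 5) looking toward the origin, so the view matrix
# moves the world by (0, 0, -5).
view = translation(0.0, 0.0, -5.0)
proj = perspective(90.0, 800 / 600, 0.1, 100.0)
print(world_to_screen((0.0, 0.0, 0.0), view, proj, 800, 600))  # the screen center
```

Under the same assumptions, an effect like a mirror amounts to putting one more matrix (a reflection about the mirror plane) into that stack before the view transform.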

I honed a lot of useful practical skill with linear algebra trying to get graphics to do what I wanted. And I say this as someone who’s spent the bulk of my career using linear algebra in the context of quantum mechanics, physical simulation, and ML-adjacent areas.

> negate that you have a stack of transforms that relate a point in world space to one on screen space and you want to be able to project from one to the other.

no it doesn't "negate", it's all completely orthogonal (pun) or irrelevant. like for real just please take a look at

https://docs.pytorch.org/docs/2.12/nn.html

and tell me which operators you're imagining have any resemblance with typical graphics linear algebra.

like when you people make such claims do you really have anything concrete in mind or just hype?

  • > tell me which operators you’re imagining have any resemblance with typical graphics linear algebra

    FWIW, since it seems like you’re unaware: most of those are used in graphics in general, and have been used since long before Torch existed. Convolution’s extremely common. Pooling is just a type of image resampling to graphics people. Non-linear activations are just response functions, which graphics people use for colors, for example, and also in volume rendering. Normalization, linear, distance, vision, and shuffle layers are all absolutely standard, common operations in graphics, on everything from images to meshes to volumes to matrices.

    BTW, most of those Torch layers aren’t “linear algebra” per se, they are just convenient building blocks for neural networks, many of which are also convenient building blocks for graphics… and for similar reasons.

    Was your point implicitly limited to rotations or a raster pipeline’s model-view-projection matrix? That certainly does not amount to all “graphics”, right?

    > Graphics in no way, shape, or form prepares you for ML. I don't understand why this is so controversial.

    This isn’t really controversial, it’s just not particularly true as stated. Graphics is much more than 3d rotation matrices, and doing real modern graphics involves all kinds of linear algebra, with an immense amount of overlap with the linear algebra that ML and computer vision use.

    Perhaps missing from this conversation is any thoughtful consideration of the history of today’s ML and the cross-pollination between the fields we call graphics, vision, and ML. The implicit assumption you seem to be making, that they are distinct fields without a shared history, co-development, or a shared foundation, is not a good assumption.

    I personally know enough ex-graphics people who transitioned to ML, were well prepared by graphics, and are wildly successful in ML that, from my perspective, your claim sounds somewhat ignorant of what’s happened and is happening in both graphics and ML, for what it’s worth.
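The claim above that pooling is a kind of image resampling can be checked with a small sketch. It assumes a grayscale image stored as a list of rows, and the function names (`conv2d`, `avg_pool_2x2`, `downsample_2x`) are hypothetical, not taken from Torch or any graphics library. Average pooling is written as a strided convolution with a box filter and compared against a direct 2x box downsample.

```python
def conv2d(image, kernel, stride=1):
    # "Valid" sliding-window filter (cross-correlation, as ML frameworks
    # usually define convolution); identical to true convolution for the
    # symmetric box kernel used below.
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, len(image) - kh + 1, stride):
        row = []
        for c in range(0, len(image[0]) - kw + 1, stride):
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def avg_pool_2x2(image):
    # ML view: 2x2 average pooling is a stride-2 pass of a box filter
    box = [[0.25, 0.25], [0.25, 0.25]]
    return conv2d(image, box, stride=2)

def downsample_2x(image):
    # Graphics view: halve the resolution by averaging each 2x2 pixel block
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, len(image[0]) - 1, 2)]
            for r in range(0, len(image) - 1, 2)]

image = [[0.0, 1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0, 7.0],
         [8.0, 9.0, 10.0, 11.0],
         [12.0, 13.0, 14.0, 15.0]]

print(avg_pool_2x2(image))   # [[2.5, 4.5], [10.5, 12.5]]
print(downsample_2x(image))  # [[2.5, 4.5], [10.5, 12.5]]
```

Both functions compute the same thing here. Only the vocabulary differs between the two fields.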