
Comment by noosphr

13 hours ago

Racket is an amazing language for prototyping ideas that you don't understand yet.

At $dayjob I'm using it to test what novel geometries of deep learning models would look like. Being able to redefine any part of the stack for any reason is a superpower you don't know you need until you do.

A great place to start is The Little Learner, which holds your hand until you get opinionated about what the underlying primitives should look like. E.g. what if we used a sparse tensor representation?
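To make the sparse-representation question concrete, here is a minimal sketch of what "vector addition" could look like once zeros are no longer stored. It is written in Python rather than Racket purely for illustration, and the names `SparseVec`, `sparse_add`, and `to_dense` are hypothetical, not taken from The Little Learner or any library.

```python
from typing import Dict, List

# Assumption: a sparse vector is a mapping {index: value}; missing indices are zero.
SparseVec = Dict[int, float]


def sparse_add(a: SparseVec, b: SparseVec) -> SparseVec:
    """Add two sparse vectors, dropping entries that cancel to exactly zero."""
    out = dict(a)  # copy so the inputs are not mutated
    for i, v in b.items():
        s = out.get(i, 0.0) + v
        if s == 0.0:
            out.pop(i, None)  # keep the representation sparse
        else:
            out[i] = s
    return out


def to_dense(v: SparseVec, size: int) -> List[float]:
    """Expand a sparse vector into a dense list of the given length."""
    return [v.get(i, 0.0) for i in range(size)]
```

Even this toy version forces a design decision (should entries that sum to zero be dropped or kept?), which is the kind of question about primitives that only comes up once you write them yourself.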

That sounds kind of amazing. But you're not actually doing the machine learning in Racket, are you? Is your Racket code generating other code like PyTorch?

  • I'm doing the learning in Racket because the bottleneck is human understanding.

    That MNIST takes 30 minutes per epoch isn't a worry when I don't even know what vector addition should look like.

    • This is a complete tangent, but since you mentioned MNIST: I accidentally discovered Tsetlin machines this week when someone on r/Julia asked if anyone with an AMD GPU could run the benchmark in their package, Tsetlin.jl. I've got an AMD GPU, so I was happy to oblige. Then I looked at what the benchmark was doing: it was training an MNIST classifier to 98% accuracy in 9 seconds, which seemed like a couple of orders of magnitude too fast. I was flabbergasted and wondered what the heck this thing was, and that's when I learned about Tsetlin machines. I went on (with the help of Claude) to implement one on an FPGA and was flabbergasted again when it took only 2k LUTs to implement a Tsetlin machine for MNIST classification in hardware.


    • > I don't even know what vector addition should look like.

      I think you're trying to imply you're inventing something new and Racket enables you to explore... But what I read (as someone with a PhD in deep learning who has worked on sparsity) is that you actually don't know the prior art and you're using Racket as an excuse to reinvent a whole bunch of stuff that already exists in plenty of mature libraries in more mundane languages (including Python/PyTorch). Which is of course fine for personal growth, but please don't oversell Racket as a "superpower": I can manipulate any part of my stack too, because it's all written in C++.
