Comment by ssivark

12 hours ago

Interesting question. I don't know much about WebGPU, but I'd posit (heh!) that the GPU on client devices doesn't matter too much, since folks will likely be working over the network anyways (cloud-based IDE, coding agent connected to a cloud-hosted LLM, etc), and we also have innovations like Modal, which allows serverless lambdas for GPUs.

As long as silicon is scarce it would make sense to hoard it and rent it out (pricing as a means of managing scarcity); if we end up in a scenario where silicon is plentiful, then everyone would have powerful local GPUs, using local AI models, etc.


akst 10 hours ago

I guess in my mind I was thinking of use cases other than AI, like statistical or hierarchical scientific models, simulations, or ETL work. I also don't know if some of the econometricians I know with a less technical background would even know how to get set up with AWS. More broadly, I feel there are enough folks doing data work in a non-tech field who know how to use Python or R or Matlab to do their modelling but likely aren't comfortable with cloud infrastructure, and who might have an Apple laptop with Apple silicon that could improve their iteration loop. Folks in AI are probably more comfortable with a cloud solution.

There are aspects of data science which are iterative, where you're repeatedly running similar computations with different inputs. I think there's some value in shaving off time between iterations.

In my case I have a temporal geospatial dataset with 20+ million properties for each month over several years, each with various attributes. It's in a nonprofessional setting, and the main motivator for most of my decisions is "because I can and I think it would be fun and I have a decent enough GPU". While I could probably chuck it on a cluster, I'd like to avoid that if I can help it, and an optimisation done on my local machine would still pay off if I did end up setting up a cluster. There's quite a bit of ETL preprocessing work before I load it into the database, and I think there are portions that might be doable on the GPU. But it's more the computations I'd like to do on the dataset before generating visualisations where I think I could reduce the wait time per iteration when processing for plots, ideally to the point where iterations become interactive. There are enough linear operations that you could get some wins with a GPU implementation.

I am keen to see how far I'll get, but worst case scenario I learn a lot, and I'm sure those learnings will be transferrable to other GPU experiments.

ssivark 4 hours ago

TBC, I too did not really mean "AI" (as in LLMs and the like), which is often hosted/served with a very convenient interface. I do include more bespoke statistical / mathematical models -- be it hierarchical, Bayesian, whatever.

Since AWS/etc are quite complicated, there is now a swarm of startups trying to make it easier to take baby steps into the cloud (e.g. Modal, Runpod, etc) and make it very easy for the user to get a small slice of that GPU pie. These services have drastically simpler server-side APIs and usage patterns, including "serverless" GPUs from Modal, where you can just "submit jobs" from a Python API without really having to manage containers. On the client side, you have LLM coding agents that are the next evolution in UI frontends -- and they're beginning to make it much, much easier to write bespoke code to interact with these backends.

To make it abundantly clear what target audience I'm referring to: I imagine they are still mostly using sklearn (or equivalents in other languages) and gradient boosting with Jupyter notebooks, and are still somewhat mystified by modern deep learning and stuff. Or maybe those who are more mathematically sophisticated but not sophisticated in software engineering (e.g. statisticians / econometricians).

To inspire you with a really concrete example: since Modal has a well-documented API, it should be quite straightforward ("soon", if not already) for any data scientist to use one of the CLI coding agents and
1. Implement a decent GPU-friendly version of whatever model they want to try (as long as it's not far from the training distribution, i.e. not some esoteric idea which is nothing like prior art)
2. Whip up a quick system to interface with some service provider, wrap that model up and submit a job, and fetch (and even interpret) results.
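
The shape of those two steps can be sketched in plain Python. To be clear, this is only a local stand-in and not Modal's or Runpod's actual API: `fit_model`, `submit_job`, and the thread pool are hypothetical names picked for illustration, and a real provider's client would take the place of the executor. The point is just the pattern: wrap the model in a function, submit it as a job, fetch the result.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from statistics import fmean


def fit_model(xs: list[float], ys: list[float]) -> dict[str, float]:
    """Hypothetical model: ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"intercept": y_bar - slope * x_bar, "slope": slope}


# Local stand-in for a remote backend. Assumption: a hosted service would
# run the function on its own hardware instead of on a local thread.
_backend = ThreadPoolExecutor(max_workers=2)


def submit_job(fn, *args) -> Future:
    """Hypothetical helper: hand a function and its inputs to the backend."""
    return _backend.submit(fn, *args)


if __name__ == "__main__":
    job = submit_job(fit_model, [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    # result() blocks until the job finishes, then returns its output.
    print(job.result())
```

With a hosted service, only `submit_job` and the backend would change; the model function and the "submit, then fetch" calling code could stay roughly as they are.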
----
In case you haven't tried one of these new-fangled coding agents, I strongly encourage you to try one out (even if it's just something on the free tier, e.g. gemini-cli). If you have and they aren't quite good enough to solve your problem, tough luck for now... I anticipate their usability will improve substantially every few months.