Comment by akst
20 hours ago
I know "next-generation" is just SEO slop, but I'm going to hyperfixate on this for a moment (so feel free to ignore if you're actually interested in Positron).
I think the future of data science will likely be something else, given the advent of WebGPU[1] (which isn't just a web technology), the current quality and availability of GPUs in end-user devices, and the fact that a lot of data computation clearly stands to benefit from this.
The real next generation of data science tools will likely be GPU-first, keeping as much work on the GPU as possible. I definitely think we'll eventually see new languages emerge that abstract much of the overhead of batching work, but also force people to explicitly consider when they're writing code that simply won't run well on a GPU, such as sequential operations that are nonlinear or nonassociative/noncommutative (processing an ordered block of text, for example).
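To make the associativity point concrete, here's a minimal sketch (plain Python, no GPU involved): an associative reduction can be split into chunks, computed independently (say, one per GPU workgroup), and recombined, while a nonassociative fold cannot.

```python
from functools import reduce
from operator import sub

data = [10, 1, 2, 3, 4]

# Addition is associative and commutative: per-chunk partial sums
# (imagine one GPU workgroup per chunk) recombine to the right answer.
partials = [sum(data[:3]), sum(data[3:])]
assert sum(partials) == sum(data)

# Subtraction is not associative: folding the chunks independently and
# then combining gives a different answer from a strictly sequential fold.
left_fold = reduce(sub, data)  # ((((10-1)-2)-3)-4) = 0
chunked = reduce(sub, [reduce(sub, data[:3]), reduce(sub, data[3:])])  # 7 - (-1) = 8
```

The sequential fold has to stay on the CPU (or in a single GPU thread); only the associative case parallelizes cleanly.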
I think WebGPU is going to make this a lot easier.
That said, I'd imagine that for larger compute workloads people will continue to stick with large CUDA clusters, as they have more functionality and handle a wider variety of workloads. But on end-user devices there's an opportunity to create tools that let data scientists do this kind of work more trivially when they fit their models and process their datasets.
[1] Other compute APIs have existed in the past, but WebGPU might be the most successful attempt yet to provide a portable (and more accessible) way to write general GPU compute code. I've seen people say WebGPU is hard, but having given it a go (without libraries), I don't think that's really true: compared to OpenGL there are no longer specialised APIs for loading data into uniforms; everything is just a buffer. I wonder if the perception has more to do with the lack of non-JS bindings for use outside the browser/Node, or with the fact that you're forced to consider the memory layout of anything you're loading into the GPU from the start (something that can be abstracted and generalised). In my experience, after my first attempt at writing a compute shader, it's fairly simple. Stuff that was always complicated in rendering, like text, is still complicated, but at least it's not a state-based API like WebGL/OpenGL.
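The "everything is just a buffer, but you must think about memory layout" point can be illustrated without touching a GPU. A sketch of what host-side packing looks like, assuming a hypothetical WGSL struct `Particle { position: vec3<f32>, mass: f32 }` (in WGSL, `vec3<f32>` occupies 12 bytes but is 16-byte aligned, so a trailing `f32` fills the padding):

```python
import struct

# Hypothetical particle records mirroring the WGSL-side declaration above.
# Each element packs to 16 bytes: 12 for the vec3, 4 for the f32 mass.
particles = [((0.0, 1.0, 2.0), 5.0), ((3.0, 4.0, 5.0), 7.0)]

buf = bytearray()
for (x, y, z), mass in particles:
    # little-endian floats, laid out exactly as the shader will read them
    buf += struct.pack("<4f", x, y, z, mass)

assert len(buf) == 16 * len(particles)  # size the GPU buffer allocation expects
```

This is the kind of bookkeeping that libraries can abstract away, but writing it by hand once makes the buffer model fairly unmysterious.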
It's worth considering what next-gen really would be, but VS Code and its forks will probably dominate for the time being. I recall Steve Yegge predicting, around 2008 or so, that the next IDE to beat would be the web browser. That wasn't the reality then, and it took about 10-15 years to actually happen, even though there were earlier shots at it, like Atom.
I guess my mind wasn't so much on the editor, but that was what the article was about, and I don't disagree.
Interesting question. I don't know much about WebGPU, but I'd posit (heh!) that the GPU on the client devices doesn't matter too much since folks will likely be working over the network anyways (cloud-based IDE, coding agent connected to cloud-hosted LLM, etc) and we also have innovations like Modal which allow serverless lambdas for GPUs.
As long as silicon is scarce it would make sense to hoard it and rent it out (pricing as a means of managing scarcity); if we end up in a scenario where silicon is plentiful, then everyone would have powerful local GPUs, using local AI models, etc.
I guess in my mind I was thinking of use cases other than AI: statistical or hierarchical scientific models, simulations, or ETL work. I also don't know if some of the econometricians I know with a less technical background would even know how to get set up with AWS. More broadly, I feel there are enough folks doing data work in non-tech fields who know how to use Python or R or MATLAB for their modelling but likely aren't comfortable with cloud infrastructure, yet might have an Apple laptop with Apple silicon that could improve their iteration loop. Folks in AI are probably more comfortable with a cloud solution.
There are aspects of data science that are iterative, where you're repeatedly running similar computations with different inputs, so I think there's some value in shaving time off each iteration.
In my case I have a temporal geospatial dataset with 20+ million properties for each month over several years, each with various attributes. It's in a nonprofessional setting, and the main motivator for most of my decisions is "because I can, I think it would be fun, and I have a decent enough GPU". While I could probably chuck it on a cluster, I'd like to avoid that if I can help it, and an optimisation done on my local machine would still pay off if I did end up setting up a cluster. There's quite a bit of ETL preprocessing work before I load it into the database, and I think there are portions that might be doable on the GPU. But it's more the computations I'd like to run on the dataset before generating visualisations where I think I could reduce the iteration wait time, ideally to the point where iterating becomes interactive. There are enough linear operations that you could get some wins with a GPU implementation.
I am keen to see how far I'll get, but in the worst case I learn a lot, and I'm sure those learnings will be transferable to other GPU experiments.
TBC, I too did not really mean "AI" (as in LLMs and the like) which is often hosted/served with a very convenient interface. I do include more bespoke statistical / mathematical models -- be it hierarchical, Bayesian, whatever.
Since AWS and the like are quite complicated, there is now a swarm of startups trying to make it easier to take baby steps into the cloud (e.g. Modal, Runpod) and get a small slice of the GPU pie. These services have drastically simpler server-side APIs and usage patterns, including "serverless" GPUs from Modal, where you can just submit jobs from a Python API without really having to manage containers. On the client side, LLM coding agents are the next evolution in UI frontends, and they're beginning to make it much, much easier to write bespoke code to interact with these backends. To make it abundantly clear which target audience I'm referring to: I imagine they are still mostly using sklearn (or equivalents in other languages) and gradient boosting with Jupyter notebooks, still somewhat mystified by modern deep learning. Or maybe those who are mathematically sophisticated but not software-engineering sophisticated (e.g. statisticians/econometricians).
To inspire you with a really concrete example: since Modal has a well-documented API, it should be quite straightforward ("soon", if not already) for any data scientist to use one of the CLI coding agents and:
1. Implement a decent GPU-friendly version of whatever model they want to try (as long as it's not far from the training distribution i.e. not some esoteric idea which is nothing like prior art)
2. Whip up a quick system to interface with some service provider, wrap that model up and submit a job, and fetch (and even interpret) results.
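The submit-and-fetch loop in step 2 can be sketched generically. This is a local stand-in only: a thread pool plays the role of the remote provider, and every name here (`fit_model`, `submit_jobs`, the toy scoring) is hypothetical, not Modal's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def fit_model(params):
    # Stand-in for a GPU-friendly model fit that a provider would run
    # remotely; here it's just a toy scoring computation.
    return {"params": params, "score": sum(v * v for v in params.values())}

def submit_jobs(param_grid):
    # The submit-and-fetch pattern: one job per parameter set, then
    # gather results. With a serverless GPU provider, the pool submit
    # would be replaced by that provider's remote-call mechanism.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fit_model, p) for p in param_grid]
        return [f.result() for f in futures]

grid = [{"a": 1.0, "b": 2.0}, {"a": 3.0, "b": 0.5}]
results = submit_jobs(grid)
best = max(results, key=lambda r: r["score"])
```

The point is that the client-side scaffolding is small enough that a coding agent could plausibly generate the real-provider version of it.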
----
In case you haven't tried one of these new-fangled coding agents, I strongly encourage you to try one out (even if it's just something on the free tier eg. gemini-cli). In case you have and they aren't quite good enough to solve your problem, tough luck for now... I anticipate their usability will improve substantially every few months.
check out the RAPIDS ecosystem from 2018 or so :)
This looks interesting, thanks for sharing.