Comment by lvl155
12 days ago
Went from pandas to polars to duckdb. As mentioned elsewhere, SQL is the most readable for me, and an LLM does most of the coding on my end (quant). So I need it at the most readable and rudimentary/step-wise level.
OT, but I can’t imagine data science being a job category for too long. It’s got to be one of the first to go in AI age especially since the market is so saturated with mediocre talents.
As a long-time DS I sadly feel we filled the field with people who don’t do any actual data science or engineering. A lot of it is glorified BI users who at most pull some averages and run half-baked A/B tests.
I don’t think the field will go away with AI. Frankly, with LLMs I’ve automated that bottom 80% of queries I used to have to do for other users, and now I just focus on actual hard problems.
That “build a self-serve dashboard” or number-fetching work is now handled by an agentic tool I built.
But the real meat of “my business specializes in X, we need models to do this well” has not yet been replaceable. I think most hard DS work is internal so isn’t in training sets (yet).
Even before LLMs, Data Science was being replaced by more specialization, IME.
Data Engineers took over the plumbing once they moved on from Scala and Spark. ML Engineers took over the modeling (and LLMs are now killing this job too, as it’s rare to need model training outside of big labs). Data analysts have to know SQL and Python these days, and most DS are now just this, but with a nicer title and higher pay.
Once upon a time I thought DS would be much more about deeper statistics and causal inference, but those have proven to be rare, niche needs outside soft science academia.
Reading a comment like this makes me realize how broad the title “Data Scientist” is, especially this tidbit:
> as it’s rare to need model training outside of big labs
Do you think there are pre-trained models for e.g. process optimization for the primary metallurgy process for steel manufacturing? Industrial engineers don’t know anything about machine learning (by trade), and there are companies that bring specialized Data Science know-how to that industry to improve processes using modern data-driven methods, especially model building.
It’s almost like 99% of comments on this topic think that DS begins at image classification and ends at LLMs, with maybe a little bit of landing page A/B testing or something. Wild.
> Once upon a time I thought DS would be much more about deeper statistics and causal inference, but those have proven to be rare, niche needs outside soft science academia.
This is my entire career lol.
> It’s got to be one of the first to go in AI age especially since the market is so saturated with mediocre talents.
Depends on what you mean by “to go”. Responsibilities swallowed by peers? Sure, and new job titles might pop up like Research & Development Engineer or something.
The discipline of creating automated systems to extract insights from data to create business value? I can’t really see that going anywhere. I mean, why tf would we be building so many data centers if there’s no value in the data they’re storing?
> It’s got to be one of the first to go in AI age especially since the market is so saturated with mediocre talents.
This is interesting. I wanted to dig into it a little since I am not sure I am following the logic of that statement.
Do you mean that AI would take over the field, because by default most people there are already not producing anything that a simple 'talk to data' LLM won't deliver?
Not GP, but as a data engineer who has worked with data scientists for 20 years, I think the assessment is unfortunately true.
I used to work on teams where DS would put a ton of time into building quality models, gating production with defensible metrics. Now, my DS counterparts are writing prompts and calling it a day. I'm not at all convinced that the results are better, but I guess if you don't spend time (=money) on the work, it's hard to argue with the ROI?
In what field do you work?
> writing prompts and calling it a day
What does this mean? They’re not creating pull requests and maintaining learning / analytics systems?
This kind of vagueposting gets on my nerves.