Comment by mynameisash
4 hours ago
> They’re not creating pull requests and maintaining learning / analytics systems?
Sure, they check prompts into git. And there are a few notebooks that have been written and deployed, but most of that is collecting data and handing it off to ChatGPT. No, they're not maintaining learning/analytics systems. My team builds our data processing pipelines, and we support everything in production.
> This kind of vagueposting gets on my nerves.
What is vague about my comment?
Whereas in the past, the DS teams I worked with would do feature engineering and rigorous model evaluation, with retraining triggered by various criteria, now I'm seeing teams get lazy and say, "We'll let the LLM handle it. It can deal with unstructured data, and we can feed it new data without any additional work on our part." So they write a prompt and not much more.
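To make the contrast concrete, here is a minimal sketch of the kind of loop I mean: monitor accuracy on fresh data and retrain when it degrades. Everything here is illustrative (the toy threshold "model", synthetic data, and the 0.95 cutoff are my own stand-ins, not anyone's production system):

```python
# Illustrative sketch only: a trivial threshold "model" with an
# evaluate-and-retrain guardrail, the pattern a prompt-only pipeline skips.
import random

random.seed(0)

def make_batch(n=200, drift=0.0):
    """Synthetic sensor readings; label is 1 when the reading exceeds a true cutoff."""
    batch = []
    for _ in range(n):
        x = random.gauss(5.0 + drift, 2.0)
        batch.append((x, 1 if x > 5.0 + drift else 0))
    return batch

def train(batch):
    """'Feature engineering' here is trivial: fit a single decision threshold."""
    pos = [x for x, y in batch if y == 1]
    neg = [x for x, y in batch if y == 0]
    return (min(pos) + max(neg)) / 2  # midpoint between the classes

def evaluate(threshold, batch):
    correct = sum((x > threshold) == bool(y) for x, y in batch)
    return correct / len(batch)

threshold = train(make_batch())
for month, drift in enumerate([0.0, 0.2, 1.5]):
    batch = make_batch(drift=drift)
    acc = evaluate(threshold, batch)
    if acc < 0.95:  # retraining criterion: accuracy degraded on new data
        threshold = train(batch)
        acc = evaluate(threshold, batch)
    print(f"month {month}: accuracy {acc:.2f}")
```

The point isn't the toy model; it's that the evaluation step produces a number you can act on, which a "just prompt the LLM" pipeline never does.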
I have never heard of this. What kind of insights are being generated? What kind of data? Am I unaware that we’re at the point that I can give a CSV of e.g. industrial measurement data to an LLM and it provides reliable and repeatable output? Are people making decisions based on the LLM output? Do the people making those decisions based on that output know that it might be completely hallucinated and the only response they’ll get from the “Data Scientists” is a shoulder shrug?
So many questions. That's why I called it vague. I don't know how any data scientist could read this and not have a million follow-up questions. Is this offline learning? Online learning? What are the guardrails? Are there guardrails? Mostly, wtf?