Comment by XenophileJKO

17 hours ago

That is like saying training TensorFlow models is just calling some APIs.

Actually making a system like this work seems easy, but it really isn't.

(Though with the CURRENT generation or two of models it has gotten "pretty easy" I think. Before that, not so much.)

No idea about training TensorFlow models - is it super complex or is it just calling a couple of APIs? Langchain is literally calling an API. Maybe you need to get good with prompting or whatever, but I don't see where the complexity lies. Please let me know.
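
For what it's worth, the "just calling an API" part looks roughly like this. A minimal sketch, assuming the langchain-openai package is installed and OPENAI_API_KEY is set in the environment; exact import paths vary across Langchain versions, and the model name is illustrative:

    # Minimal Langchain sketch: a prompt template piped into a chat model.
    # Assumes langchain-core + langchain-openai; import paths vary by version.
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI

    prompt = ChatPromptTemplate.from_template(
        "Summarize this in one sentence: {text}"
    )
    llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model name
    chain = prompt | llm  # composition: string formatting plus one HTTP call

    result = chain.invoke({"text": "Langchain composes prompts and LLM calls."})
    print(result.content)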

  • Having used both TensorFlow (though I expect they mean PyTorch, which is way more popular and which I have also used) and Langchain, they are nothing alike.

    The ML frameworks are much closer to implementing the mathematics of neural networks; there are some abstractions, but you are working near the linear algebra level, and that requires an understanding of the underlying theory (see the sketch at the end of this comment).

    Langchain is a suite of convenience functions for composing prompts to LLMs. I wouldn’t say there is any real domain knowledge one needs in order to use it. There is a learning curve, but it’s about learning the different components rather than learning a whole new academic discipline.
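
    To make the contrast concrete, here is roughly what working near the linear algebra level looks like in PyTorch. A minimal sketch with made-up data, fitting a linear model by hand with autograd:

        # Framework-level work in PyTorch: you manipulate tensors and
        # gradients directly, which is why the underlying theory matters.
        import torch

        # Made-up data: 100 samples, 3 features, a linear target plus noise.
        X = torch.randn(100, 3)
        true_w = torch.tensor([1.5, -2.0, 0.5])
        y = X @ true_w + 0.1 * torch.randn(100)

        # Parameters optimized by hand with gradient descent.
        w = torch.zeros(3, requires_grad=True)

        for step in range(200):
            loss = ((X @ w - y) ** 2).mean()  # mean squared error
            loss.backward()                   # autograd computes dloss/dw
            with torch.no_grad():
                w -= 0.1 * w.grad             # one gradient descent step
                w.grad.zero_()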

    • You're right, none of these new tools are disciplines. They are vendor-specific approaches that are very recent. That's part of my overall point. Who is out there with 2+ years of very narrow tooling experience at another company, at a senior level, and available to a rando startup (or desperate enterprise looking for bolt-on AI features) at a fraction of the pay? Not many, I'm sure. We can level up, do training, and maybe stand up a demo project. But that won't satisfy an ATS scan. It's unrealistic.

    • There's a big difference between building an ML framework like TensorFlow or PyTorch (I built a Lua Torch-like one in C++ myself) and just using it to build/train a model.

      Building the model may range from very simple, if you are just recreating a standard architecture, to a full research endeavor, if you are designing something completely new.

      The difficulty/complexity of then training the model depends on what it is. For something simple like a CNN for image recognition, it's really just a matter of selecting a few hyperparameters and letting it rip (see the sketch at the end of this comment). At the other end of the spectrum you've got LLMs, where training (and coping with instabilities) is something of a black art, RL training is completely different from pre-training, and there is also the issue of designing/discovering a pre/mid/post-training curriculum.

      But anyway, depending on the model, the actual training part can be very simple and doesn't require much knowledge of what's going on under the hood.
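
      As a sketch of that simple end of the spectrum: a toy CNN training loop in PyTorch, with random tensors standing in for a real image dataset and illustrative (not tuned) hyperparameters:

          import torch
          import torch.nn as nn

          # Tiny CNN for 28x28 single-channel images, 10 classes.
          model = nn.Sequential(
              nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
              nn.MaxPool2d(2),             # 28x28 -> 14x14
              nn.Flatten(),
              nn.Linear(8 * 14 * 14, 10),
          )
          opt = torch.optim.SGD(model.parameters(), lr=0.01)  # the hyperparameters
          loss_fn = nn.CrossEntropyLoss()

          for step in range(100):
              images = torch.randn(32, 1, 28, 28)   # stand-in for a real batch
              labels = torch.randint(0, 10, (32,))  # stand-in labels
              opt.zero_grad()
              loss = loss_fn(model(images), labels)
              loss.backward()  # the framework handles the calculus
              opt.step()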