Comment by merb

1 month ago

Yeah, like asking them to use tailwindcss.

Most LLMs actually fail that task, even in agent mode, and there is a really simple reason for that: tailwindcss changed its packages and syntax.

And this is basically the kind of test that should be focused on: change things and see if the LLM can find a solution on its own. (...it can't)

And if I take my regular, ordinary commuter car off the paved road and onto the dirt, I get stuck in the mud. That doesn't mean the whole concept of cars is worthless; instead, we paved roads all over the world. But for some reason with LLMs, the attitude is that their being unable to go offroad means everyone's totally deluded and we should give up on the whole idea.

  • I'm not against LLMs. I'm just not a fan of people who say we'll have AGI/the singularity soon. I basically dropped Google for searching for things about code, because even if the LLM fails to get stuff right, I can ask for the doc source and force it to give me a link or the exact example/wording from the docs.

    But using it correctly means that junior developers especially face a much higher barrier to entry.

  • I don't think your analogy works for the tailwind situation, and there is no "whole idea" to give up on anyway. People will still be researching this hyper-complicated matrix multiplication thing, i.e. LLMs, for a very long time.

    Personally, I see the tailwind example as an argument against one specific use case: LLM-assisted/driven coding, which I also believe is the best shot LLMs have at being actually productive in a non-academic setting.

    If I have a super-nice RL-ed (or even RLHF-ed) coding model & weights that's working for me (in whatever sense of the word "working"), and changing some function names will actually f* it up badly, then that is very much not good. I hope I will never ever have to work with a "programmer" who is super-reluctant to reorganize the code just to protect their pet LLM.

How do they do if you include the updated docs in the context?

  • You would need to remove the older docs first, and even then it will hallucinate. Forcing the LLM to open the doc webpage produces some hallucinations as well. The more context you provide, the worse it gets. And tbf, before the change, most LLMs could migrate Bootstrap to tailwindcss v3 without too much trouble (of course they failed to change tags when CSS classes were built from multiple strings, but that's fine). And I tried a lot of models. It just broke from one week to the next.

    • Older docs are there forever. What it needs is more training data with the new APIs. Actually, because the older docs are there, you can ask it to update some old code to newer versions automatically.

      Point is that it needs enough examples with a newer version. Also, reasoning models are pretty good at spotting which version they are using.

      (Tested not with tailwind, but with some other JS libs.)