Comment by kstenerud
5 days ago
I haven't done any CSS/HTML/JS level work with Claude yet. I've mainly been using it for systems level stuff.
LLMs have traditionally had problems with visual rendering (the good ol' pelican on the bicycle test). I wonder if this is more of the same?
In this case, the visual display was fine -- I was instructing it to fix bad code from a previous round that happened to deliver the right results.
Like I said, this is just an example that happens to be CSS. I see this stuff daily, if not hourly.
That's interesting. As I said, I haven't tried using LLMs at this level, although I'm about to embark on some this week.
What I've found helps (at least at the other layers) is to have principles documents and standards documents for the AI to reference when it's modifying code. Principles documents describe the why, and standards documents describe the how.
So for example a few parts from my initial CSS-standards.md (still needs a lot of revision):
Yeah, I have those, but it's still pretty hit and miss, and obviously, it ends up being a game of whack-a-mole for everything I find.
I don't mean to overstate the importance of these little errors, just to say that agents do plenty of dumb stuff, even today, and the people who say otherwise are selling something or (hot take incoming) some combination of stupid, lazy, and/or delusional.
Great example.
Just IME, the quality of the prompt often significantly affects whether it does bad stuff like your example. It's not easy by any stretch and I'm still getting there, but I'm up to a couple dozen or so "Agent Instructions" in my CLAUDE.md files for various projects. They have to say things like "when doing TDD, don't write tests to verify bug fixes in tests" because the agent is really good at following things literally. I am sure it will continue to improve, but until then every project needs some band-aid instructions like that.