Comment by timr

6 days ago

While reading this thread, I literally just caught an agent putting in the following CSS selector in a rule:

> .row > div > div, .alert

This is fairly simple CSS, not multi-threaded systems development. A bar low enough that you could trip over it. I catch this kind of stuff all the time (literally every run), but only because I read every line. Most of it wouldn't be the end of the world for any particular task, but would eventually result in a complete mess.

I think the people doing the heaviest breathing around the elimination of programmers either aren't very good at programming, or they're not paying close attention. Or they're hyping their book.

kstenerud 5 days ago

I haven't done any CSS/HTML/JS level work with Claude yet. I've mainly been using it for systems level stuff.

LLMs have traditionally had problems with visual rendering (the good ol' pelican on the bicycle test). I wonder if this is more of the same?

timr 5 days ago

In this case, the visual display was fine -- I was instructing it to fix bad code from a previous round that happened to deliver the right results.

Like I said, this is just an example that happens to be CSS. I see this stuff daily, if not hourly.

kstenerud 5 days ago

That's interesting. As I said I haven't tried using LLMs at this level, although I'm about to embark on some this week.

What I've found helps (at least at the other layers) is to have principles documents and standards documents for the AI to reference when it's modifying code. Principles documents describe the why, and standards documents describe the how.

So for example a few parts from my initial CSS-standards.md (still needs a lot of revision):

    ## Utility-first discipline

    **Raw utilities everywhere by default. Never `@apply` for "components."** `@apply` exists only for
    true low-level primitives that can't live in a template (e.g., `prose` overrides, embedded
    third-party widget shells).

    Wathan's stated position: extract only on "worrisome duplication." The Tailwind team explicitly
    describes `@apply` as a tool you reach for after first reaching for templates. **Premature CSS
    abstraction is the failure mode.**

    ## Spacing

    Use only the default scale (`0, 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12, 16, 24…`). **Never `p-[13px]`.** If
    you need a value, change the scale in `@theme`:

    ```css
    @theme {
      --spacing: 0.25rem;
    }
    ```

    v4 uses a single `--spacing` multiplier; everything derives from it.

    ## Anti-patterns (banned)

    - **`!` important prefix** (`!bg-red-500`). Fix specificity properly.
    - **Arbitrary values for colour** (`bg-[#1da1f2]`). Define in `@theme`.
    - **Arbitrary pixel offsets** as default (`top-[3px]`). Use the spacing scale. Tolerated only as
      rare one-offs.
    - **Nested custom CSS more than one level deep.**
    - **`@apply` for any class that wraps fewer than ~5 utilities** or appears in fewer than ~3
      templates.
    - **Dynamic class string interpolation** (`text-${level}-500`) — purger can't see these.
    - **Custom breakpoints in v1.**
    - **Inline `<style>` blocks.** All CSS goes through `assets/css/app.css`.
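
To make the interpolation rule concrete, here is a minimal sketch of the usual workaround: map each value to a complete, literal class name so a build step that scans source text can see every class. The `Level` type, the `TEXT_CLASS` map, and the specific colour classes are hypothetical names chosen for illustration, not taken from the standards document above.

```typescript
// Hypothetical severity levels, for illustration only.
type Level = "info" | "warning" | "danger";

// Every class name is written out as a complete literal string, so a tool
// that scans source files for class names can find all three.
// Contrast with `text-${level}-500`, where no complete class name ever
// appears in the source text.
const TEXT_CLASS: Record<Level, string> = {
  info: "text-blue-500",
  warning: "text-yellow-500",
  danger: "text-red-500",
};

// Look up the class for a level instead of building the string at runtime.
function textClass(level: Level): string {
  return TEXT_CLASS[level];
}
```

The lookup is slightly more verbose than a template string, but adding a new level now forces a new literal class name into the source, which is the property the rule is after.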

freedomben 5 days ago

Great example.
Just IME, the quality of the prompt often significantly affects whether it does bad stuff like your example. It's not easy by any stretch and I'm still getting there, but I'm up to a couple dozen or so "Agent Instructions" in my CLAUDE.md files for various projects that have to say things like: "when doing TDD, don't write tests to verify bug fixes in tests" because the agent is really good at following things literally. I am sure it will continue to improve, but until then every project needs some bandaid things like that.

habinero 5 days ago

> I think the people doing the heaviest breathing around the elimination of programmers either aren't very good at programming, or they're not paying close attention.

Yeah, absolutely. People think you're picking on, like, code formatting and no, dawg, your code doesn't do what you think it does, or it only handles the happiest of happy paths.

I do find it funny when people get mad about you critiquing their AI project. You didn't even write it, dude.

sjagauanbdvva 5 days ago

Or they don’t know CSS.

Amazing how the LLM is godly with things I don’t understand, and falls over completely when it works in my domain… I wonder why that is /s

timr 5 days ago

Yes, it's a mystery, isn't it?

Specifically for CSS, these bots really want to just barf out tailwind-style crap. If you deviate even slightly from the standards and practices of the modal front-end developer, you quickly see how these things are brittle, and no amount of prompting and cajoling will truly affect their behavior. In this case, you're kind of seeing the downstream effects of saying "no, do NOT do tailwind, make actual CSS with actual semantic class names please and thank you."

Perhaps ironically, this results in the quality of output I might expect if I had prompted a right-out-of-bootcamp coder to do the same. (But at least it doesn't whine about it!)

- maxsilver 5 days ago
  
  > these bots really want to just barf out tailwind-style crap.
  I get it. The LLMs struggle most with state. They don’t have a real fix for that yet. People generally compensate by shoving everything into context, and making the context window as large as possible, which half-works.
  Tailwind happens to be a “stateless” CSS framework. Nothing uses anything else, nothing is shared, nothing is reused, nothing stacks. It’s super easy to write, since you don’t have to worry about anything else, and the styles are all duplicated dynamically and ‘compiled’, to the point that you can copy-and-paste an HTML block with tailwindcss classes from anywhere into your site and it mostly ‘works’.
  ---
  Tailwind is uniquely suited for LLM use, because the problem Tailwind solves is the problem juniors (and now, LLMs) struggle with most. An LLM can happily write up a bunch of styles, without knowing any of the rest of the project state, and if it’s tailwind, it will mostly sort-of work.
  It just also happens to be bad practice: this style of development is the exact thing we told everyone not to do for two decades. (“Inline styles are bad! Duplicate styles everywhere is bad! It’s bloated, it’s inefficient. It’s the mark of inexperienced front end. Don’t inline styles. Unless it’s a tailwindcss class, you can inline those styles, they get a pass I guess”.)
  We used to measure our JS and CSS in kilobytes; by 2011 standards this would be “far too bloated for production use”. For the old-timers, it can be hard to grapple with the idea that we’re intentionally doing ‘worse’ front-end now. The calculation changes when half your content/styles/front-end is LLM-generated, and therefore completely disposable. Very “they don’t make them like they used to” vibes.
  