Comment by kstenerud

5 days ago

I haven't done any CSS/HTML/JS level work with Claude yet. I've mainly been using it for systems level stuff.

LLMs have traditionally had problems with visual rendering (the good ol' pelican on the bicycle test). I wonder if this is more of the same?

In this case, the visual display was fine -- I was instructing it to fix bad code from a previous round that happened to deliver the right results.

Like I said, this is just an example that happens to be CSS. I see this stuff daily, if not hourly.

  • That's interesting. As I said, I haven't tried using LLMs at this level, although I'm about to start on some this week.

    What I've found helps (at least at the other layers) is to have principles documents and standards documents for the AI to reference when it's modifying code. Principles documents describe the why, and standards documents describe the how.

    So for example a few parts from my initial CSS-standards.md (still needs a lot of revision):

        ## Utility-first discipline
    
        **Raw utilities everywhere by default. Never `@apply` for "components."** `@apply` exists only for
        true low-level primitives that can't live in a template (e.g., `prose` overrides, embedded
        third-party widget shells).
    
        Wathan's stated position: extract only on "worrisome duplication." The Tailwind team explicitly
        describes `@apply` as a tool you reach for after first reaching for templates. **Premature CSS
        abstraction is the failure mode.**
    
        ## Spacing
    
        Use only the default scale (`0, 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12, 16, 24…`). **Never `p-[13px]`.** If
        you need a value, change the scale in `@theme`:
    
        ```css
        @theme {
          --spacing: 0.25rem;
        }
        ```
    
        v4 uses a single `--spacing` multiplier; everything derives from it.
    
        ## Anti-patterns (banned)
    
        - **`!` important prefix** (`!bg-red-500`). Fix specificity properly.
        - **Arbitrary values for colour** (`bg-[#1da1f2]`). Define in `@theme`.
        - **Arbitrary pixel offsets** as default (`top-[3px]`). Use the spacing scale. Tolerated only as
          rare one-offs.
        - **Nested custom CSS more than one level deep.**
        - **`@apply` for any class that wraps fewer than ~5 utilities** or appears in fewer than ~3
          templates.
        - **Dynamic class string interpolation** (`text-${level}-500`). Tailwind scans source files as plain text, so it never sees the assembled class name.
        - **Custom breakpoints in v1.**
        - **Inline `<style>` blocks.** All CSS goes through `assets/css/app.css`.
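
    For the interpolation item, the usual workaround is a static lookup so that every class name appears in the source as a complete literal. A minimal sketch, assuming a hypothetical `Level` type and colour choices that are purely illustrative (none of these names come from the standards doc):

    ```typescript
    // Hypothetical severity levels, used only for illustration.
    type Level = "info" | "warning" | "error";

    // Each value is a full class name written out literally, so a plain-text
    // scan of this file can find it. Nothing is assembled at runtime.
    const LEVEL_TEXT_CLASS: Record<Level, string> = {
      info: "text-blue-500",
      warning: "text-amber-500",
      error: "text-red-500",
    };

    // Map a level to its class instead of building `text-${level}-500`.
    function levelTextClass(level: Level): string {
      return LEVEL_TEXT_CLASS[level];
    }
    ```

    The `Record<Level, string>` type also means that adding a new level without a matching class is a compile error rather than a silently missing style.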

    • Yeah, I have those, but it's still pretty hit and miss, and obviously, it ends up being a game of whack-a-mole for everything I find.

      I don't mean to over-state the importance of these little errors, just to say that agents do plenty of dumb stuff, even today, and the people who say otherwise are selling something or (hot take incoming) some combination of stupid, lazy and/or delusional.

  • Great example.

    Just IME, the quality of the prompt often significantly affects whether it does bad stuff like your example. It's not easy by any stretch and I'm still getting there, but I'm up to a couple dozen or so "Agent Instructions" in my CLAUDE.md files across various projects, with entries like "when doing TDD, don't write tests to verify bug fixes in tests", because the agent is really good at following things literally. I'm sure it will continue to improve, but until then every project needs some band-aid rules like that.