Comment by tmhrtly
6 hours ago
My concern here is that by gravitating to HTML you lose the ability for a human (you!) to easily co-author the document with the LLM. If it’s just an explainer for your consumption, that’s not a concern - but if it’s a spec sheet for something more complex, I deeply value being able to dive in and edit what is produced for me. With a HTML doc it is much harder to do that than with MD.
Now of course you could just reprompt your LLM to change the HTML - but when I already have a clear idea of what I want to say in my head, that’s just another roadblock in the way.
If this pattern becomes more common I suspect human/LLM co-creation will further dwindle in favour of just delegating voice, tone and content choice to the LLM. I was surprised not to see this concern in the blog post’s FAQ.
I actually think there is a second level to this. Yes HTML will get you most anywhere, but I found that letting the LLM define its own language is also unreasonably effective.
Currently working on a dumb little mobile game with isometric view and sound:
- told Codex to write a tool that lets it place blocks in a prepared three.js document and have chromium dev tools take a screenshot. It made up a little JSON structure that defines blocks / colors and some other effects and it outputs 2.5d tilesets.
- told it to create a uv python script that would let it define sounds and music, and it made a yaml format that lets it create noises.
We completely shot past the svg pelican test. Codex has created both perfectly adequate prototype art of soldiers/knights/priests as well as a prototype soundtrack.
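For readers wondering what such a made-up block format might look like, here is a minimal sketch. The field names (`blocks`, `x`, `y`, `z`, `color`), the tile size, and both helper functions are assumptions for illustration, not the schema Codex actually produced. It parses the JSON and projects each block to 2D screen coordinates using a common 2:1 isometric projection.

```python
import json

# Hypothetical scene format; the field names are assumptions, not the real schema.
SCENE_JSON = """
{
  "blocks": [
    {"x": 0, "y": 0, "z": 0, "color": "#6b8e23"},
    {"x": 1, "y": 0, "z": 0, "color": "#6b8e23"},
    {"x": 0, "y": 1, "z": 0, "color": "#8b4513"}
  ]
}
"""

TILE_W, TILE_H = 64, 32  # assumed 2:1 isometric tile size in pixels


def to_screen(x, y, z):
    """Project grid coordinates (y is height) to 2D screen coordinates."""
    sx = (x - z) * TILE_W // 2
    sy = (x + z) * TILE_H // 2 - y * TILE_H
    return sx, sy


def load_blocks(text):
    """Parse the scene and return (screen_x, screen_y, color) tuples in draw order."""
    blocks = json.loads(text)["blocks"]
    # Painter's order: far blocks first, lower blocks before higher ones.
    blocks.sort(key=lambda b: (b["x"] + b["z"], b["y"]))
    return [(*to_screen(b["x"], b["y"], b["z"]), b["color"]) for b in blocks]
```

The appeal of a format like this is that the model only has to emit a few small integers per block, and the tool does the projection and rendering.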
I’ve started using HTML for reports recently. But I always use a markdown file as an intermediate and tell the LLM to generate a fancier version of it with SVG for graphs/pictures based on tables in the markdown.
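A rough sketch of that Markdown-to-SVG step, assuming a simple two-column table with a numeric second column. The sample table, function names, and chart dimensions are all made up for illustration; in practice the LLM would generate something fancier.

```python
from html import escape

MD_TABLE = """\
| Month | Sales |
|-------|-------|
| Jan   | 10    |
| Feb   | 25    |
| Mar   | 15    |
"""


def parse_md_table(text):
    """Return (label, value) rows from a two-column Markdown table."""
    rows = []
    for line in text.strip().splitlines()[2:]:  # skip header and separator rows
        label, value = [cell.strip() for cell in line.strip().strip("|").split("|")]
        rows.append((label, float(value)))
    return rows


def bar_chart_svg(rows, bar_w=40, gap=10, height=100):
    """Render rows as a minimal SVG bar chart scaled to the largest value."""
    peak = max(value for _, value in rows)
    width = len(rows) * (bar_w + gap) + gap
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height + 20}">']
    for i, (label, value) in enumerate(rows):
        h = round(value / peak * height)
        x = gap + i * (bar_w + gap)
        parts.append(f'<rect x="{x}" y="{height - h}" width="{bar_w}" height="{h}" fill="steelblue"/>')
        parts.append(f'<text x="{x}" y="{height + 15}">{escape(label)}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


svg = bar_chart_svg(parse_md_table(MD_TABLE))
```

The Markdown file stays the editable source of truth, and the SVG is a throwaway rendering that can be regenerated whenever the table changes.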
We have been authoring HTML by hand for decades with ease. Text editors are very good at it, and many have commands to auto-wrap, auto-close etc. Reading and writing is simple.
Templated though, not manually writing it out for every blog post say. I think GP means it just has more friction as a writing format than markdown for example.
No, literally manually typing out HTML tags and everything. Many of us did it so much things like Emmet (https://emmet.io/) were invented and used so we could hammer out full HTML documents even faster.
Even after React became popular, people are still manually typing out HTML elements, although they call it "JSX" instead, but in reality it's just HTML.
My first blog on the internet literally was a bunch of .html files, where my post "template" was the first post copy-pasted when you wanted to make a new post. Changing the design involved changing the same thing across all files.
Oh my sweet summer child…
You have been authoring HTML by hand for decades. Not every SWE is a FE dev.
I learned HTML 20+ years ago in high school.
I did not go to a front end high school.
Java engineers write lots of HTML in java docs:)
People have been authoring html by hand for a long time before the specialization to Frontend dev even existed...
Most front end devs can’t get HTML right either.
>We have been authoring HTML by hand for decades with ease
No, we've been generating it with templates or authoring templates.
Authoring HTML by hand is a very early 2000s thing to do.
Are you a FE webdev that doesn't regularly author HTML by hand?
I suppose that only applies if you constrain yourself to a raw teletypewriter emulator… in any proper coding environment, editing HTML should be absolutely simple - even an embedded WYSIWYG editor would be an option if rich model output is where we are heading.
A counter argument would be that all programming languages of the last decades have been plain text based. No other more structured format has ever gained traction even though modern editors could be argued to be able to support that easily. Turns out, it doesn't actually work that way.
HTML is plain text based at the same level as any programming language I can think of.
But we’re not even dealing with a programming language in any classical sense here. Interacting with an LLM coding system is a multi-mode communication system with on-demand, purpose-generated ephemeral UI. That doesn’t fit any of the established categories, so I think carrying over constraints from them doesn’t make sense either.
Most people edit documents in Microsoft Word, though, so it doesn’t seem too far-fetched that LLM content would be edited similarly, especially as more and more non-programmers use it.
It highlights the extremes to which the Anthropic team adopts LLMs in their workflow.
I think most of us live somewhere in the middle, using the right tool / output for the job.
Is HTML really that much worse to edit than MD?
Markdown is essentially just syntactic sugar for HTML[0], so yes it was made to be easier to edit than HTML.
[0]: https://spec.commonmark.org/0.31.2/#html-blocks
It’s a bit easier yeah but there’s not much in it.
Let’s see…
It depends what we mean I guess, isn’t Markdown supposed to allow [hx]ml tags anyway if users need them? Then it’s more about asking the LLM to generate Markdown with this in mind, and preferring to view reports rendered in the browser of choice.
1. I believe many applications that use markdown allow html. Others don't due to security/rendering issues.
2. One of the limiting factors of LLM is context. An html table takes up way more tokens than a markdown table. Especially if it's a WYSIWYG editor that has all kinds of css and <span> tags just for fun.
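To put a rough number on point 2, the sketch below renders the same small table both ways and compares lengths. Character count is only a crude proxy for tokens (real tokenizers differ, and no tokenizer is used here), and the sample rows and helper names are invented for the example.

```python
ROWS = [("Name", "Qty"), ("apples", "3"), ("pears", "12")]


def to_markdown(rows):
    """Render rows (first row is the header) as a Markdown pipe table."""
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "|" + "|".join("---" for _ in header) + "|"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def to_html(rows):
    """Render the same rows as a bare HTML table with no attributes or styling."""
    header, *body = rows
    head = "<tr>" + "".join(f"<th>{cell}</th>" for cell in header) + "</tr>"
    rest = "".join(
        "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>" for row in body
    )
    return f"<table>{head}{rest}</table>"


md_text, html_text = to_markdown(ROWS), to_html(ROWS)
print(len(md_text), len(html_text))  # character counts, not token counts
```

Even this bare HTML, with no classes, inline styles, or wrapper spans, comes out noticeably longer than the Markdown version, and WYSIWYG-generated markup only widens the gap.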
Yes that’s the case. And as Anthropic staff, the author has an incentive to promote workflows that require an agent to interact with text documents.
HTML is by far simpler than Markdown.
I've yet to see Anthropic promote any sort of token optimization strategy to its users - they always assume we all have infinite inference.
"No bread? Let them eat cake!"
Not sure how you use CC, but the last 6 months have felt like significant optimization efforts to me. Last year Claude would just read and edit files, now it's all kinds of basic tool gymnastics with grep/awk/sed/etc to narrowly slice and avoid token-heavy reads. Resuming a session that isn't even that large gets a scary prompt about using a significant portion of your token budget if you continue without compacting.
To me it feels like a worse experience, and they probably feel it too, but it makes sense from an optimization perspective. I've probably learned some shell tricks, but I'm also going blind from watching Claude try dozens of variations of some multi-line chained and piped wall of bash nightmare, instead of just reading a few files.
I've noticed that's changed over the past month or so. Claude-code used to happily pipe build commands straight into context, but recently it's been running them as background tasks that pipe to file, and it'll search and do partial reads on the output instead.
It also gives tips on reducing context size when you run /context .
Presumably they are actually starting to feel the pinch on inference costs themselves with what still feels like a fairly generous max plan.
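The general pattern described here (send build output to a file, then search it instead of reading all of it) can be sketched like this. It is not how Claude Code is implemented; `run_to_log` and `grep_log` are hypothetical helpers, and the "build" is a stand-in Python one-liner.

```python
import re
import subprocess
import sys
import tempfile


def run_to_log(cmd):
    """Run cmd with stdout and stderr sent to a temp file; return (exit code, log path)."""
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode, log.name


def grep_log(path, pattern, limit=20):
    """Return at most `limit` matching lines, so only a slice of the log is read back."""
    rx = re.compile(pattern)
    hits = []
    with open(path, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if rx.search(line):
                hits.append(f"{number}: {line.rstrip()}")
                if len(hits) >= limit:
                    break
    return hits


# Stand-in for a noisy build command (assumption: any real build tool would go here).
fake_build = [sys.executable, "-c",
              "print('ok'); print('error: missing semicolon'); print('ok')"]
code, log_path = run_to_log(fake_build)
errors = grep_log(log_path, r"error")
```

Only the matching lines would end up in the model's context, while the full log stays on disk for a later partial read if needed.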
Nah they do. They push Sonnet pretty hard rather than Opus for most tasks.
Also: https://platform.claude.com/docs/en/agents-and-tools/tool-us...
Makes sense for actual devs. For non-devs who'd just edit docs via LLMs anyway (myself), I can't imagine it'd introduce much friction.
HTML is super human readable if you stick to a subset of it.
It's arguably even more readable.
<b>bold</b> <i>italic</i> <u>underline</u>
I can never remember how many stars and ticks correspond to what in markdown.
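For anyone who does want the star counts spelled out, here is a small sketch that maps that HTML subset to Markdown emphasis. The function name and tag table are made up for the example, and it only handles bare tags without attributes. Note that `<u>` is left untouched, since CommonMark has no underline syntax.

```python
import re

# Tag -> Markdown delimiter. <u> is deliberately absent: CommonMark has no underline.
DELIMS = {"b": "**", "strong": "**", "i": "*", "em": "*"}


def html_subset_to_markdown(text):
    """Replace bare <b>/<strong>/<i>/<em> tags (no attributes) with Markdown emphasis."""
    def swap(match):
        return DELIMS[match.group(1).lower()]
    return re.sub(r"</?(b|strong|i|em)>", swap, text, flags=re.IGNORECASE)


converted = html_subset_to_markdown("<b>bold</b> <i>italic</i> <u>underline</u>")
```

So bold is two stars, italic is one, and underline has to stay as inline HTML.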
Yes, and you can always embed HTML in Markdown for <script>, <style>, <svg>, and other tags that cannot be coded in Markdown.