Comment by simonw

1 month ago

One of the big open questions for me right now concerns how library dependencies are used.

Most of the big ones are things like skia, harfbuzz, wgpu - all totally reasonable IMO.

The two that stand out for me as more notable are html5ever for parsing HTML and taffy for handling CSS grids and flexbox - that's vendored with an explanation of some minor changes here: https://github.com/wilsonzlin/fastrender/blob/19bf1036105d4e...

Taffy is a solid library choice, but it's probably the most robust ammunition for anyone who wants to argue that this shouldn't count as a "from scratch" rendering engine.

I don't think it detracts much if at all from FastRender as an example of what an army of coding agents can help a single engineer achieve in a few weeks of work.


sealeck 1 month ago

I think the other question is how far away this is from a "working" browser. It isn't impossible to render a meaningful subset of HTML (especially when you use external libraries to handle a lot of this). The real difficulty is doing this (a) quickly, (b) correctly and (c) securely. All of those are very hard problems, and also quite tricky to verify.

I think this kind of approach is interesting, but it's a bit sad that Cursor didn't discuss how they close the feedback loop: testing/verification. As generating code becomes cheaper, I think effort will shift to how we can more cheaply and reliably determine whether an arbitrary piece of code meets a desired specification. For example, did they use https://web-platform-tests.org/, fuzz testing (e.g. feed in random webpages and inform the LLM when the fuzzer finds crashes), etc.? I would imagine truly scaling long-running autonomous coding would put an emphasis on this.

Of course Cursor may well have done this, but it wasn't super deeply discussed in their blog post.
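
A minimal sketch of the fuzz-and-report loop described above, to make the idea concrete. Everything here is an assumption for illustration: `toy_render` is a hypothetical stand-in for an engine entry point, and `random_html` and `fuzz` are made-up helper names. Nothing in it reflects how Cursor or FastRender actually test.

```python
import random
import traceback

TAGS = ["div", "span", "p", "table", "td", "ul", "li"]


def random_html(rng, depth=0):
    """Build a small random (possibly malformed) HTML fragment."""
    if depth > 3 or rng.random() < 0.3:
        return rng.choice(["text", "&amp;", "<", ""])
    tag = rng.choice(TAGS)
    children = "".join(
        random_html(rng, depth + 1) for _ in range(rng.randint(0, 3))
    )
    # Occasionally omit the closing tag to exercise error recovery.
    close = f"</{tag}>" if rng.random() < 0.8 else ""
    return f"<{tag}>{children}{close}"


def toy_render(html):
    """Hypothetical engine under test: raises on unbalanced <div> tags."""
    if html.count("<div>") != html.count("</div>"):
        raise ValueError("unbalanced <div>")
    return len(html)


def fuzz(render, iterations=200, seed=0):
    """Call render on random pages and collect one report per crash."""
    rng = random.Random(seed)  # seeded so every crash is reproducible
    reports = []
    for i in range(iterations):
        page = random_html(rng)
        try:
            render(page)
        except Exception:
            reports.append(
                {"case": i, "input": page, "trace": traceback.format_exc()}
            )
    return reports
```

Each report carries the failing input and the stack trace, which is the kind of text that could be handed back to a coding agent as its next prompt. A real setup would more likely drive the actual engine binary and a coverage-guided fuzzer rather than a purely random generator like this one.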

I really enjoy reading your blog and it would be super cool to see you look at approaches people have to ensuring that LLM-produced code is reliable/correct.

simonw 1 month ago
Yeah, I'm hoping they publish a lot more about this project! It deserves way more than the few sentences they've shared about it so far.
- cousinbryce 1 month ago
  
  I’m interested to see how much more they know about the project
polyglotfacto 1 month ago
I think the current approach is simply not scalable to a working browser ever.
To leverage AI to build a working browser you would imo need the following:
- A team of humans with some good ideas on how to improve on existing web engines.
- A clear architectural story written not by agents but by humans. Architecture does not mean high-level diagrams only. At each level of abstraction, you need humans to decide what makes sense and only use the agent to bang out slight variations.
- A modular and human-overseen agentic loop approach: one agent can keep running to try to fix a specific CSS feature (like grid), with a human expert reviewing the work at some interval (not sure how fine-grained it should be). This is actually very similar to running an open-source project: you have code owners and a modular review process, not just an army of contributors committing whatever they want. And a "judge agent" is not the same thing as a human code owner acting as reviewer.
Example on how not to do it: https://github.com/wilsonzlin/fastrender/blob/19bf1036105d4e...
This rendering loop architecture makes zero sense, and it does not implement web standards.
> in the HTML Standard, requestAnimationFrame is part of the frame rendering steps (“update the rendering”), which occur after running a task and performing a microtask checkpoint
> requestAnimationFrame callbacks run on the frame schedule, not as normal tasks.
This is BS: "update the rendering" is specified as just another task, which means it needs to be followed by a microtask checkpoint. See https://html.spec.whatwg.org/multipage/#event-loop-processin...
Following the spec doesn't mean you cannot optimize rendering tasks in some way vs other tasks in your implementation, but the above is not that, it's classic AI bs.
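
To make the ordering argument concrete, here is a toy model of the reading given above: "update the rendering" is queued like any other task, and every task is followed by a microtask checkpoint. This is a sketch under stated assumptions, not the HTML Standard's algorithm and not FastRender's code. `ToyEventLoop` and all of its method names are hypothetical, and the checkpoint after each animation frame callback is my own modeling of how browsers drain microtasks once a callback returns.

```python
from collections import deque


class ToyEventLoop:
    """Toy event loop: rendering is just another task (hypothetical model)."""

    def __init__(self):
        self.tasks = deque()
        self.microtasks = deque()
        self.raf_callbacks = []
        self.log = []

    def queue_task(self, fn):
        self.tasks.append(fn)

    def queue_microtask(self, fn):
        self.microtasks.append(fn)

    def request_animation_frame(self, fn):
        self.raf_callbacks.append(fn)

    def queue_rendering_task(self):
        # The rendering update goes on the same queue as ordinary tasks.
        self.queue_task(self._update_the_rendering)

    def _microtask_checkpoint(self):
        while self.microtasks:
            self.microtasks.popleft()()

    def _update_the_rendering(self):
        # Run only the callbacks registered before this frame started.
        callbacks, self.raf_callbacks = self.raf_callbacks, []
        for callback in callbacks:
            callback()
            self._microtask_checkpoint()  # assumption: drain after each callback
        self.log.append("paint")

    def run(self):
        while self.tasks:
            self.tasks.popleft()()
            self._microtask_checkpoint()  # applies to the rendering task too


def demo():
    loop = ToyEventLoop()

    def frame():
        loop.log.append("raf")
        loop.queue_microtask(lambda: loop.log.append("microtask after raf"))

    def task():
        loop.log.append("task")
        loop.queue_microtask(lambda: loop.log.append("microtask after task"))
        loop.request_animation_frame(frame)

    loop.queue_task(task)
    loop.queue_rendering_task()
    loop.run()
    return loop.log
```

Calling `demo()` returns `["task", "microtask after task", "raf", "microtask after raf", "paint"]`. The point of the sketch is that `run` has no special case for rendering: an implementation is still free to decide when to queue the rendering task, which is where frame-rate scheduling would live.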
Understanding Web standards and translating them into an implementation requires human judgement.
Don't use an agent to draft your architecture; an expert in web standards with an interest in agentic coding is what is required.
Message to Cursor CEO: next time, instead of lighting up those millions on fire, reach out to me first: https://github.com/gterzian
- ontouchstart 1 month ago
  
  How much effort would it take GenAI to write a browser/engine from scratch for GenAI to consume (and generate) all the web artifacts generated by humans and GenAI? (This only needs to work in headless CI.)
  How much effort would it take for a group of humans to do it?
  

mwcampbell 1 month ago

I was gratified to learn that the project used my own AccessKit for accessibility (or at least attempted to; I haven't verified if it actually works at all; I doubt it)... then horrified to learn that it used a version that's over 2 years old.

embedding-shape 1 month ago

For me, the biggest open question is currently "How autonomous is 'autonomous'?" because the commits make it clear there were multiple actors involved in contributing to the repository, and the timing/merges make it seem like a human might have been involved with choosing what to merge (but hard to know 100%) and also making smaller commits of their own. I'm really curious to understand what exactly "It ran uninterrupted for one week" means, which was one of Cursor's claims.

I've reached out to the engineer who seemed to have run the experiment, who hopefully can shed some more light on it and (hopefully) my update to https://news.ycombinator.com/item?id=46646777 will include the replies and more investigations.

shubhamjain 1 month ago

Why attempt something that has an abundant number of libraries to pick and choose from? To me, however impressive it is, 'browser built from scratch' simply overstates it. Why not attempt something like a 3D game, where it's hard to find open source code to use?

Banditoz 1 month ago

Is something like a 3D game engine even hard to find source code for? There's gotta be lots of examples/implementations scattered around.
cheevly 1 month ago
Assets are very hard to produce and largely unsolved by AI at the moment.
- fulafel 1 month ago
  
  There are AI-based 3D asset generation tools around. For example: https://www.meshy.ai/ https://hyper3d.ai/ https://www.sloyd.ai/
  
- qingcharles 1 month ago
  
  This is definitely correct. I had a dream about a new video game the other day, woke up and Gemini one-shotted the game, but the characters are janky as hell because it has made them from whole cloth.
  What it should have been willing to do is go off and look for free external assets on the Web that it could download and integrate.
XenophileJKO 1 month ago

There are a lot of examples out there. Funny that you mention this. I literally just last night started a "play" project having Claude Code build a 3D WebAssembly/WebGL game using no frameworks. It did it, but it isn't fun yet.
I think the current models are at a capability level that could create a decent 3D game. The challenges are creating graphic assets and debugging/QA. The debugging problem is that you need to figure out a good harness to let the model understand when something is working, or how it is failing.
fulafel 1 month ago

There are many open source ones around.
Also, graphics acceleration makes it hard to do from scratch rather than using the 3D APIs, but I guess you could in principle go bare iron on hardware that has published specs, such as AMD, or just do software-only rendering.

janoelze 1 month ago

Any views on the nature of "maintainability" shifting now? If a fleet of agents demonstrated the ability to bootstrap a project like that, would that be enough indication to you that orchestration would be able to carry the code base forward? I've seen fully llm'd codebases hit a certain critical weight where agents struggled to maintain coherent feature development, keeping patterns aligned, as well as spiralling into quick fixes.

simonw 1 month ago
Almost no idea at all. Coding agents are messing with all 25+ years of my existing intuitions about what features cost to build and maintain.
Features that I'd normally never have considered building because they weren't worth the added time and complexity are now just a few well-structured prompts away.
But how much will it cost to maintain those features in the future? So far the answer appears to be a whole lot less than I would previously budget for, but I don't have any code more than a few months old that was built ~100% by coding agents, so it's way too early to judge how maintenance is going to work over a longer time period.
- htrp 1 month ago
  
  I'm seeing a lot of duplication in our AI coded repos that is getting to the point of being problematic to maintain.
- visarga 1 month ago
  
  > But how much will it cost to maintain those features in the future?
  Very little if they have good specs and tests.
brianjeong 1 month ago

I think there's a somewhat valid perspective that the (N+1)th model can simply clean up the previous model's mess.
Essentially a bet that the rate of model improvement is going to be faster than the rate of decay from bad coding.
Now, this hurts me personally to see, as someone who actually enjoys having quality code, but I don't see why it doesn't have a decent chance of holding.
Deevian 1 month ago

They demonstrated the ability to bootstrap... "something". There's no maintainability to the output of the experiment.

teaearlgraycold 1 month ago

It looks like JS execution is outsourced to QuickJS?

simonw 1 month ago

No, it has its own JS implementation: https://news.ycombinator.com/item?id=46650998