Comment by nicoburns
12 hours ago
This table is informative as to exactly what lightpanda is: https://lightpanda.io/blog/posts/what-is-a-true-headless-bro...
TL;DR: It does the following:
- Fetch HTML over the network
- Parse HTML into a DOM tree
- Fetch and execute JavaScript that manipulates the DOM
But not the following:
- Fetch and parse CSS to apply styling rules
- Calculate layout
- Fetch images and fonts for display
- Paint pixels to render the visual result
- Composite layers for smooth scrolling and animations
So it's effectively a net+DOM+script-only browser with no style/layout/paint.
---
Definitely fun for me to watch as someone who is making a lightweight browser engine with a different set of trade-offs (net+DOM+style/layout/paint-only with no script)
When I was working before on something that used headless browser agents, the ability to do a screenshot (or even a recording) was really great for debugging... so I am not sure about the "no paint". But hey everything in life is a trade-off.
Really depends on what you want to do with the agents. Just yesterday I was looking for something like this for our web access MCP server[0]. The only thing that it needs to do is visit a website and get the content (with JS support, as it's expected that most pages today use JS), and then convert that to e.g. Markdown.
I'm not too happy with the fact that Chrome is one of our memory-hungriest parts of all the MCP servers we have in use. The only thing that exceeds that in our whole stack is the Clickhouse shard, which comes with Langfuse. Especially if you are looking to build a "deep research" feature that may access a few hundreds of webpages in a short timeframe, having a lightweight alternative like Lightpanda can make quite the difference.
[0]: https://github.com/EratoLab/web-access-mcp
yeah I feel the same, I think even having a screenshot of part of rendered page or full page can be useful even for machines considering how heavy those HTML can be to parse and expensive for LLM context. Sometimes (sub)screenshot is just a better kind of compression
Yes HTML is too heavy and too expensive for LLM. We are working on a text-based format more suitable for AI.
2 replies →
> So it's effectively a net+DOM+script-only browser with no style/layout/paint.
> ---
> Definitely fun for me to watch as someone who is making a lightweight browser engine with a different set of trade-offs (net+DOM+style/layout/paint-only with no script)
Both projects (Lightpanda, DioxusLabs/blitz) sound very interesting to me. What do you think about rendering patterns that require both script+layout for rendering, e.g. virtual scrolling of large tables?
What would be a good pattern to make virtual scrolling work with Lightpanda or Blitz?
For scrolling, when using Intersection Observer, we currently assume all elements are visible. So, if you register an observer, we will dispatch an entry indicating an intersection with a ratio of 1.0.
So Blitz does technically have scripting, it's just Rust scripting rather than JavaScript scripting. So the plan for virtual scrolling would likely be to implement it in Rust.
If your aim is to render a UI (ala Electron/Flutter) then we have a React-style framework (Dioxus) that runs on top of Blitz, and allows you access to the low-level Rust API of the DOM for advanced use cases (although it's still a WIP and this API is a bit rough atm). I'm also hoping to eventually have a built-in `RecyclerView`-like widget for this (that can bypass the style/layout systems for much more efficient virtual scrolling).
Thanks! But I meant JS based virtual scrolling in web pages. E.g. dynamic data tables that only render the part of the table that fits in the viewport.