Comment by anerli
1 day ago
I think depends a lot on how much you value your own time, since its quite time consuming to write and update playwright scripts. It's gonna save you developer hours to write automations using natural language rather than messing around with and fixing selectors. It's also able to handle tasks that playwright wouldn't be able to do at all - like extracting structured data from a messy/ambiguous DOM and adapting automatically to changing situations.
You can also use cheaper models depending on your needs, for example Qwen 2.5 VL 72B is pretty affordable and works pretty well for most situations.
But we can use an LLM to write that script though and give that agent access to a browser to find DOM selectors etc. And than we have a stable script where we, if needed, manually can fix any LLM bugs just once…? I’m sure there are use cases with messy selectors as you say, but for me it feels like most cases are better covered by generating scripts.
Yeah we've though about this approach a lot - but the problem is if your final program is a brittle script, you're gonna need a way to fix it again often - and then you're still depending on recurrently using LLMs/agents. So we think its better to have the program itself be resilient to change instead of you/your LLM assistant having to constantly ensure the program is working.
I wonder if a nice middle ground would be: - recording the playwright behind the scenes and storing - trying that as a “happy path” first attempt to see if it passes - if it doesn’t pass, rebuilding it with the AI and vision models
Best of both worlds. The playwright is more of a cache than a test
1 reply →
Are you sure? Couldnt you just just go back to the LLM if the script breaks? Pages changes but not that often in general.
It seems like a hybrid approach would scale better and be significantly cheaper.
1 reply →
I think you are overstating. Just use Playwright codegen. No need for manual test writing, or at least 90% can get generated. Still 10x faster and cheaper.