Comment by hackgician
9 days ago
The purpose of using Playwright is to basically write deterministic workflows in deterministic automation code. We have basic prompt caching right now that works if the DOM doesn't change (as you mention), but also the best way to reduce token cost is to reduce reliance on AI itself. You have the most control over how much you want to rely on AI v. how much you want to write repeatable Playwright code.
That seems like a pretty tough sell over bare playwright. Unless the UI is constantly changing, the cost of verifying tests are still successful seems like it would eclipse the cost of an engineer maintaining the test pretty quickly.
Some minimal model that could be run locally and specifically tuned for this purpose might be pretty fruitful here compared to delegating out to expensive APIs.
I think a hybrid solution where you use AI if the X path fails or if the test as a whole fails would be ideal. Then cache the results and use them until it fails again.
Definitely a very interesting problem we're trying to dig deep into. We'd welcome any PRs here as well from the community :)