Comment by p0deje

1 year ago

I'm working on Alumnium (https://alumnium.ai). It's an open-source library to simplify web application testing with Selenium/Playwright.

I aim to create a stable and affordable tool that allows me to eliminate most of the support code I write for web tests (page objects, locators, etc.) and replace it with human-readable actions and assertions. These actions and assertions are then translated by an LLM into browser instructions. The tool, however, should still leverage all existing infrastructure (test runner, CI/CD, Selenium infrastructure).

So far, it's working well on simple websites (e.g., a calculator, TodoMVC), and I'm currently working on scaling it to large web applications.

Pretty cool. I built my own framework to do something similar very recently.

Microsoft OmniParser and Claude Computer Use alone can take you very far in testing almost anything.

  • I experimented with Computer Use and even though it's pretty cool, I ended up not using it for 2 main reasons:

    1. It's unreasonably expensive. A single test "2+2=4" for a web calculator costs around $0.15. I run roughly 1k tests per month on CI and I don't want to spend $150 on those. The approach I took with Alumnium costs me $4 per month for the same number of tests.

    2. It tries too hard to make the test pass even when it's not possible. When I intentionally introduced bugs in applications, Computer Use sometimes pretended everything was fine and marked the test as passed. Alumnium, on the other hand, attempts to fail as early as possible.
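
For reference, the cost figures in point 1 work out as below. This is only a back-of-the-envelope check using the numbers quoted above; the variable names are made up for illustration and nothing here comes from Alumnium itself.

```python
from decimal import Decimal

# Figures quoted in the comment above (approximate, per the author).
tests_per_month = 1000
computer_use_cost_per_test = Decimal("0.15")   # dollars per test
alumnium_cost_per_month = Decimal("4")         # dollars per month

# Monthly bill if every test went through Computer Use.
computer_use_cost_per_month = computer_use_cost_per_test * tests_per_month

# Implied per-test cost of the cheaper approach.
alumnium_cost_per_test = alumnium_cost_per_month / tests_per_month

# How many times cheaper the second approach is for the same test count.
cost_ratio = computer_use_cost_per_month / alumnium_cost_per_month

print(computer_use_cost_per_month, alumnium_cost_per_test, cost_ratio)
```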

    • For the 1st point, I generate a script with hashed checkpoints, so the next run is fully automated and the AI is invoked only when something changes in the UI. I make this possible by wrapping the Playwright library in a proxy so I can intercept every method. Users use Playwright like they always have, but with one extra method called act.

      OmniParser lets you split the UI into sections to hash, so you only watch for the changes that are relevant.

      For 2, can you give some examples?
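
A rough sketch of the proxy-plus-hashed-checkpoint idea described in this comment, for readers who want something concrete. It is not the commenter's actual code and does not use the real Playwright API: `FakePage`, `plan_steps`, `take_snapshot`, and `CachingProxy` are all hypothetical stand-ins, and the "AI" is a stub that returns a fixed plan. The assumption is that a plan can be cached under a hash of the relevant UI snapshot and replayed until that hash changes.

```python
import hashlib


class FakePage:
    """Hypothetical stand-in for a browser page (not the Playwright API)."""

    def __init__(self):
        self.html = "<button id='add'>Add</button>"
        self.log = []

    def click(self, selector):
        self.log.append(("click", selector))

    def content(self):
        return self.html


def take_snapshot(page):
    # Hypothetical: return the part of the UI worth watching for changes.
    return page.content()


def plan_steps(goal, snapshot):
    # Hypothetical stub for the expensive AI call; returns (method, args) steps.
    return [("click", ("#add",))]


class CachingProxy:
    """Forwards every attribute to the wrapped page and adds act()."""

    def __init__(self, page, planner, snapshotter):
        self._page = page
        self._planner = planner
        self._snapshotter = snapshotter
        self._cache = {}          # (goal, ui_hash) -> list of steps
        self.planner_calls = 0    # counts how often the "AI" was invoked

    def __getattr__(self, name):
        # Only reached for names not defined on the proxy itself.
        return getattr(self._page, name)

    def act(self, goal):
        snapshot = self._snapshotter(self._page)
        ui_hash = hashlib.sha256(snapshot.encode("utf-8")).hexdigest()
        key = (goal, ui_hash)
        if key not in self._cache:
            # Checkpoint missing or UI changed: fall back to the planner.
            self.planner_calls += 1
            self._cache[key] = self._planner(goal, snapshot)
        # Replay the cached steps against the real page object.
        for method, args in self._cache[key]:
            getattr(self._page, method)(*args)


page = CachingProxy(FakePage(), plan_steps, take_snapshot)
page.act("add an item")   # first run: planner is called
page.act("add an item")   # same UI hash: replayed from cache
page.click("#add")        # ordinary methods pass straight through
```

In this sketch the planner runs again only when the snapshot hash changes, which is where the cost saving would come from. Hashing a narrower slice of the UI in `take_snapshot` would keep unrelated changes from invalidating the cache.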
