Comment by skydhash

2 months ago

I think you only need something like `jsdom` to have the core API available. The DOM itself is just a tree structure with special nodes. Most APIs are optional, and you can provide stubs if you're targeting specific websites. It's not POSIX-level.
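To make the "tree with special nodes, plus stubs" idea concrete, here is a rough sketch in Python using only the standard library's `html.parser`. It is not `jsdom` and not a real DOM implementation; `Node`, `TreeBuilder`, `parse`, and the method names are all made up for illustration, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Elements that never have children, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """Hypothetical minimal DOM node: a tag, attributes, children, and text."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or {})
        self.parent = parent
        self.children = []
        self.text = ""  # only meaningful for "#text" nodes

    def get_elements_by_tag_name(self, name):
        # Depth-first walk, returning matches in document order.
        found = []
        for child in self.children:
            if child.tag == name:
                found.append(child)
            found.extend(child.get_elements_by_tag_name(name))
        return found

    @property
    def text_content(self):
        if self.tag == "#text":
            return self.text
        return "".join(child.text_content for child in self.children)

    def add_event_listener(self, event, handler):
        # Stub: accepted and ignored, for APIs the target site calls
        # but the scraper does not care about.
        return None


class TreeBuilder(HTMLParser):
    """Builds a Node tree from the parser's start/end/data callbacks."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        text = Node("#text", parent=self.current)
        text.text = data
        self.current.children.append(text)


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Usage would look something like `parse(page).get_elements_by_tag_name("a")` and then reading `attrs["href"]` off each node. Anything a particular site's scripts expect beyond that would need its own stub, which is where a real library like `jsdom` saves the work.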

I would like to know more about this. I had some web scrapers in Perl but they no longer work. :(

  • The state of the art now is to remote-control a real browser. That defeats all the not-a-real-browser checks, and you can even click on the Cloudflare captchas.