← Back to context

Comment by Uptrenda

3 years ago

Anyone who has ever pulled a website from a script knows the pain that is Javascript. Normally you want to just get some text and work out the API actions but a lot of sites use horribly obfuscated Javascript -- either because that's what modern web development is (lolz) -- or because its part of their 'security.' That means if you want to write browser-based bots properly -- you ought to use a browser. There are special browsers that run 'headlessly' or are designed mostly for bot use. Like https://www.selenium.dev/ which plugs into a few different 'browser engines.'

But now you have another problem. Your simple script goes from being small, simple, self-contained, and elegant gem, to requiring a full browser, specialized drivers, and/or daemons running just to work. If you're using something like Python you just frankly don't have very good packaging. So it's hard to string together all that into a solution and have it magically work for everyone. What YouTube-dl have done is good engineering. Even though it's not a full JS interpreter: they've kept their software lean, self-contained, and easier to use.

Just npm install puppeteer.

  • Puppeteer is cool, but it's exactly what OP is warning against: it's a full browser that is downloaded and run through npm. It's remarkably well packaged, but still far more error prone than a simple HTTP request, and far more likely to break on its own just with the passage of time.

    • Yes, but:

      ”Your simple script goes from being small, simple, self-contained, and elegant gem, to requiring a full browser, specialized drivers, and/or daemons running just to work”

      Complex problems cannot be solved by simple scripts, but they can be abstracted away to vendor libraries when/if they are well maintained, such as in this case. While it can break with time, at least someone else fixes it for you.

    • There's also puppeteer-core which lets you use your own (Google Chrome) browser and if your own browser is broken then you're having bigger problems than youtube-dl not working.