Comment by dmkolobov

2 months ago

Ooh blast from the past!

At a previous company we moved off of wkhtmltopdf to a Node.js service which received static HTML and rendered it to PDF using PhantomJS. These days you'd probably use Puppeteer.

The trick was keeping the page context open to avoid chrome startup costs and recreating `page`. The node service would initialize a page object once with a script inside which would communicate with the server via a named Linux pipe. Then, for each request:

1. node service sends the static html to the page over the pipe

2. the page script receives the html from the pipe, inserts it into the DOM, and sends an “ack” back over the pipe

3. the node service receives the “ack” and calls the pdf rendering method on the page.

I don’t remember why we chose the pipe method: I’m sure there’s a better way to pass data to headless contexts these days.
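For anyone trying this today, here is a rough sketch of the same keep-one-page-open loop. It is not what we ran: it assumes a Puppeteer-style page, where `setContent` stands in for the pipe and the "ack", and `pdf` is the render call. `PageLike` and `PdfRenderer` are hypothetical names made up for this sketch. Requests are queued so two renders never touch the shared page at the same time.

```typescript
// Hypothetical interface: the subset of a Puppeteer-style page used here.
// Puppeteer's Page has setContent() and pdf(); the names PageLike and
// PdfRenderer are invented for this sketch.
interface PageLike {
  setContent(html: string): Promise<void>;
  pdf(): Promise<Uint8Array>;
}

// Keeps one long-lived page and renders requests on it one at a time.
class PdfRenderer {
  private readonly page: PageLike;
  // Tail of the job queue; each new job chains onto it.
  private tail: Promise<unknown> = Promise.resolve();

  constructor(page: PageLike) {
    this.page = page;
  }

  render(html: string): Promise<Uint8Array> {
    const job = this.tail.then(async () => {
      // Steps 1 and 2: hand the HTML to the page and wait until it is loaded.
      await this.page.setContent(html);
      // Step 3: render the current DOM to PDF.
      return this.page.pdf();
    });
    // Swallow failures on the queue tail so one bad job does not block later ones.
    this.tail = job.catch(() => undefined);
    return job;
  }
}
```

With real Puppeteer you would presumably create the page once at startup (`puppeteer.launch()`, then `browser.newPage()`) and pass it to `new PdfRenderer(page)`, so each request only pays for `setContent` and `pdf`.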

The whole thing was super fast (~20ms) compared to wkhtmltopdf, which took at least 30 seconds for us, and would more often than not just time out.

sshine 2 months ago

Sounds like fun considering how real the problem is.

dmkolobov 2 months ago
It was!
I remember the afternoon I had the idea: it was beer Friday, and it took a few hours to write up a basic prototype that rendered a PDF in a few hundred milliseconds. That was the first time I’d written a 100x speed improvement. Felt like a real rush.
- mherrmann 2 months ago
  
  Congratulations. Doesn't this approach make so much more sense than writing a browser engine from scratch?
  