Comment by tadfisher

2 months ago

Maybe the article originally featured a 1000-line C implementation.

3 comments

tadfisher

I was basing this more on the fact that you don't have to look at C code to understand that non-cached transformer inference is going to be super slow.

wasabi991011 2 months ago

I don't see how that would be possible given the contents of the article.

anonym29 2 months ago

It's possible that the web server is serving different versions of the article based on the client's user-agent. Would be a neat way to conduct data poisoning attacks against scrapers while minimizing impact on human readers.