Comment by Vampiero

6 months ago

250/s is still nothing compared to an actual NLP pipeline that takes a few ms per item, especially since you can parallelize that too.

I know it's hard to understand, but you can achieve a throughput that is a few orders of magnitude higher.
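
A rough back-of-the-envelope sketch of the arithmetic behind that claim. The 2 ms latency and the 32 workers are assumed numbers picked for illustration, not measurements, and the model assumes items are independent and scale perfectly across workers, which real pipelines only approximate.

```python
def throughput(latency_ms: float, workers: int = 1) -> float:
    """Items per second for a given per-item latency.

    Assumes independent items and perfect scaling across workers
    (an idealization; real pipelines lose some of this to overhead).
    """
    return workers * 1000.0 / latency_ms


BASELINE = 250.0  # items/s, the figure quoted in the comment

# Hypothetical numbers: 2 ms per item, 32 parallel workers.
single = throughput(2.0)
parallel = throughput(2.0, workers=32)

print(f"single worker: {single:.0f} items/s")
print(f"32 workers:    {parallel:.0f} items/s ({parallel / BASELINE:.0f}x the baseline)")
```

How far past the baseline you actually get depends on how many workers you can throw at it and how well the workload scales.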