jazzyjackson 2 months ago
OP said 2 seconds as if that wasn't an eternity...
gbnwl 2 months ago
But then they said 250/second when running multiple inferences? Again, I don't know if their assertions about running multiple inferences are correct, but why focus on the wrong number instead of addressing the actual claim?
Vampiero 2 months ago
250/s is still nothing compared to an actual NLP pipeline that takes a few ms per iteration, because you can parallelize that too.
I know it's hard to understand, but you can achieve a throughput that is a few orders of magnitude higher.
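To make the parallelization claim concrete, here is a minimal sketch (not from the thread) of how per-item latency and worker count combine into throughput; the ~4 ms per-item cost and the worker count of 16 are illustrative assumptions, not measured numbers.

```python
# A minimal sketch, not from the thread: how per-item latency and
# worker count combine into throughput. The ~4 ms per-item cost and
# 16 workers are illustrative assumptions.
import time
from concurrent.futures import ThreadPoolExecutor

def nlp_step(doc: str) -> str:
    """Stand-in for one classical NLP pipeline stage (~4 ms per item)."""
    time.sleep(0.004)  # simulate a few ms of work; sleep releases the GIL
    return doc.upper()

docs = [f"doc {i}" for i in range(200)]

# Serial baseline: ~250 items/s at 4 ms each.
start = time.perf_counter()
for d in docs:
    nlp_step(d)
serial_rate = len(docs) / (time.perf_counter() - start)

# Parallel: 16 workers push throughput toward 16x the serial rate.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(nlp_step, docs))
parallel_rate = len(docs) / (time.perf_counter() - start)

print(f"serial:   ~{serial_rate:.0f} items/s")
print(f"parallel: ~{parallel_rate:.0f} items/s")
```

Whether real NLP code scales this way depends on it releasing the GIL (native-code libraries generally do) or being sharded across processes, but the shape of the claim, that more workers multiply items/s at a fixed per-item latency, is what the comment is arguing.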
EvgeniyZh 1 month ago
250/s is a few (4) ms per iteration
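For concreteness, the conversion being made here is the reciprocal relationship between throughput and per-item latency, which holds for a strictly serial loop:

```python
# Throughput and per-item latency are reciprocals for a serial loop.
rate_per_s = 250
latency_ms = 1000 / rate_per_s  # 1000 ms / 250 items = 4.0 ms per item
print(latency_ms)               # 4.0
```

So 250/s already is "a few ms per iteration"; the two numbers in the argument describe the same speed. (With parallel or batched inference the reciprocal no longer holds, since throughput can exceed 1/latency.)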