Comment by why_only_15

8 months ago

Assuming the 10M records come to ~2000M input tokens + 200M output tokens, classifying them with llama-3.3-70b[1] would cost about $300. If using Llama lets you do this in, say, one day instead of two days for a traditional NLP pipeline, it's worthwhile.

[1]: https://openrouter.ai/meta-llama/llama-3.3-70b-instruct
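
For anyone who wants to check the arithmetic, here is a rough sketch. The per-million-token prices are assumptions picked so the total lands near the ~$300 figure; they are not quoted from the linked page, and the function name is made up for illustration.

```python
# Back-of-the-envelope cost estimate for classifying records with a hosted LLM.
# ASSUMPTION: both prices below are hypothetical placeholders, in USD per
# million tokens. Check the provider's current pricing before relying on them.
ASSUMED_INPUT_PRICE_PER_M = 0.12
ASSUMED_OUTPUT_PRICE_PER_M = 0.30


def estimate_cost_usd(input_tokens_m, output_tokens_m,
                      input_price_per_m=ASSUMED_INPUT_PRICE_PER_M,
                      output_price_per_m=ASSUMED_OUTPUT_PRICE_PER_M):
    """Return the total cost in USD, given token counts in millions."""
    return input_tokens_m * input_price_per_m + output_tokens_m * output_price_per_m


records = 10_000_000
input_tokens_m = 2000   # ~2000M input tokens, from the comment
output_tokens_m = 200   # ~200M output tokens, from the comment

total = estimate_cost_usd(input_tokens_m, output_tokens_m)

# Implied per-record sizes: 200 input tokens and 20 output tokens per record.
input_tokens_per_record = input_tokens_m * 1_000_000 / records
output_tokens_per_record = output_tokens_m * 1_000_000 / records

print(f"total: ${total:.2f}")
print(f"per record: ${total / records:.6f}")
print(f"tokens per record: {input_tokens_per_record:.0f} in, "
      f"{output_tokens_per_record:.0f} out")
```

Swap in real prices and your own token counts; the point is that at roughly 200 input tokens per record the bill stays in the hundreds of dollars, not thousands.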

2 comments

sangnoir 8 months ago

> ...two days for a traditional NLP pipeline

Why 2 days? Machine learning took over the NLP space 10-15 years ago, so the comparison is between small, performant, task-specific models and LLMs. There is no reason to believe "traditional" NLP pipelines are inherently slower than large language models, and they aren't.

why_only_15 8 months ago

My claim is not that such a pipeline would take two days to run, but that it would take two days to build, whereas an LLM pipeline would be faster to build.