Comment by why_only_15
2 months ago
Assuming the 10M records come to ~2000M input tokens + 200M output tokens, this would cost $300 to classify using llama-3.3-70b[1]. If using llama lets you do this in, say, one day instead of two days for a traditional NLP pipeline, it's worthwhile.
[1]: https://openrouter.ai/meta-llama/llama-3.3-70b-instruct
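For anyone who wants to check the arithmetic, here is a rough sketch of the estimate. The token counts are the ones assumed above; the per-million-token prices are hypothetical placeholders picked so the total lands on the $300 figure, not numbers taken from the linked page, so substitute the current pricing before relying on it.

```python
# Back-of-the-envelope cost estimate for classifying records with a hosted LLM.
# Token counts come from the comment above; the prices are ASSUMED placeholders.

RECORDS = 10_000_000
INPUT_TOKENS = 2_000_000_000   # ~2000M input tokens (from the comment)
OUTPUT_TOKENS = 200_000_000    # ~200M output tokens (from the comment)

# Hypothetical USD prices per 1M tokens; check the provider's page for real ones.
ASSUMED_INPUT_PRICE_PER_M = 0.12
ASSUMED_OUTPUT_PRICE_PER_M = 0.30


def estimate_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the total cost in USD given token counts and per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


total = estimate_cost_usd(
    INPUT_TOKENS, OUTPUT_TOKENS, ASSUMED_INPUT_PRICE_PER_M, ASSUMED_OUTPUT_PRICE_PER_M
)

# Implied per-record budget under the same assumptions.
print(f"input tokens per record:  {INPUT_TOKENS // RECORDS}")
print(f"output tokens per record: {OUTPUT_TOKENS // RECORDS}")
print(f"estimated total cost:     ${total:.2f}")
```

Under these assumptions each record gets roughly 200 input tokens and 20 output tokens, which is plausible for short-text classification with a brief label as output.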
> ...two days for a traditional NLP pipeline
Why 2 days? Machine Learning took over the NLP space 10-15 years ago, so the comparison is between small, performant task-specific models versus LLMs. There is no reason to believe the "traditional" NLP pipelines are inherently slower than Large Language Models, and they aren't.
My claim is not that such a pipeline would take two days to run, but that it would take two days to build an NLP pipeline, whereas an LLM pipeline would be faster to build.