Comment by girvo

17 hours ago

At work I regularly hit the 7.5 million tokens per hour limit that one of our tools has, and have to switch model or tool, and I'm not even remotely a heavy user. I think people don't realise how many tokens get burned by CoT and tool calls these days.

At a hard limit of 7.5 million tokens per hour, it would take 84 days to hit the grandparent's $3k.

That said, local models really are still slow, or fast enough but not that great.

They already stated they can only generate 57,600 tokens per hour locally (expressed as 16 tokens per second). So that's the limiting factor here.
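
For anyone who wants to check the numbers, here is a rough sketch of the conversion, using only the two figures quoted in this thread (16 tokens per second locally, and the 7.5 million tokens per hour cap). The function and variable names are made up for illustration, and the comparison assumes both rates are sustained for the full hour.

```python
SECONDS_PER_HOUR = 3600

def tokens_per_hour(tokens_per_second: float) -> float:
    """Convert a sustained generation rate from tokens/second to tokens/hour."""
    return tokens_per_second * SECONDS_PER_HOUR

# Figures quoted in the thread; everything else here is illustrative.
local_rate = tokens_per_hour(16)   # local generation speed
hosted_cap = 7_500_000             # hourly hard limit on the hosted tool

# How many times more tokens the hosted cap allows per hour than local generation.
ratio = hosted_cap / local_rate

print(f"local: {local_rate:,.0f} tokens/hour")
print(f"hosted cap is about {ratio:.0f}x the local rate")
```

So the hosted cap works out to roughly 130 times what the local setup can produce in the same hour.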