Comment by sethkim
10 days ago
Yes, we're a startup! And LLM inference is a major component of what we do - more importantly, we're working on making these models accessible as analytical processing tools, so we have a strong focus on making them cost-effective at scale.
I see your pricing page lists the average cost per million tokens. Is that because you're using the formula you describe, which depends on hardware time and throughput?
> API Price ≈ (Hourly Hardware Cost / Throughput in Tokens per Hour) + Margin
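For what it's worth, here's a quick sketch of that formula as code. All of the numbers in the usage example (the $2/hr hardware cost, 1,000 tokens/sec throughput, and $0.20 margin) are made up for illustration, not your actual figures:

```python
def price_per_million_tokens(hourly_hardware_cost: float,
                             tokens_per_second: float,
                             margin_per_million: float = 0.0) -> float:
    """Estimate API price per 1M tokens from hardware cost and throughput.

    Implements: price ≈ (hourly hardware cost / tokens per hour) + margin,
    scaled to a per-million-token quote.
    """
    tokens_per_hour = tokens_per_second * 3600
    base_cost = hourly_hardware_cost / tokens_per_hour * 1_000_000
    return base_cost + margin_per_million

# Hypothetical numbers: $2/hr GPU, 1,000 tok/s, $0.20 margin per 1M tokens
price = price_per_million_tokens(2.0, 1000, margin_per_million=0.20)
print(f"${price:.2f} per 1M tokens")  # ≈ $0.76 per 1M tokens
```

The interesting lever is throughput: doubling tokens/sec halves the hardware term of the price, which is presumably why batching and serving efficiency matter so much at scale.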