Comment by the_pwner224
7 days ago
Congratulations!
As the other commenter mentioned, this seems like an extremely competitive space. But I did a quick Google for speech-to-text and your solution is a few times cheaper than most others.
I know you may not want to share your secret sauce, but I'm curious how you managed that. I'm guessing you use Whisper; is the trick to just run it on your own hardware instead of paying AWS prices for compute?
I think people without tech skills aren't aware of the cheaper solutions, so you can still make a profit. But yeah, there are many cheaper options: e.g. Groq provides Whisper Large v3 Turbo for $0.04 per hour: https://groq.com/pricing . Gemini 2.0 Flash is around $0.08 per hour.
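For a rough sense of scale, here is a quick sketch that uses only the two per-hour rates quoted above. The 1,000 hours of audio per month is a made-up volume for illustration, and `monthly_cost` is just a hypothetical helper, not anyone's actual billing logic.

```python
# Per-hour rates (USD per hour of audio) as quoted in the comment above.
# These are the commenter's figures, not verified current prices.
RATES_USD_PER_HOUR = {
    "groq_whisper_large_v3_turbo": 0.04,
    "gemini_2_0_flash": 0.08,  # "around" $0.08 per the comment
}


def monthly_cost(hours_of_audio: float, rate_usd_per_hour: float) -> float:
    """Cost of transcribing `hours_of_audio` at a flat per-hour rate."""
    return hours_of_audio * rate_usd_per_hour


if __name__ == "__main__":
    hours = 1000  # hypothetical monthly volume, chosen only for illustration
    for name, rate in RATES_USD_PER_HOUR.items():
        print(f"{name}: ${monthly_cost(hours, rate):.2f} per month")
```

At that assumed volume the two come out to roughly $40 and $80 a month, so the gap only starts to matter at much larger scale.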
Oh, did not know Groq offers ASR at that rate. Will be tough to compete with.
Will probably have to at least match their price in STT API and take a hit to revenue. But we are working on an end-to-end voice offering, and Groq does not seem to have anything in that area yet, so hopefully will stay afloat.
It runs on an Airbnb-like GPU network (also our own).