Comment by ilyakaminsky
3 days ago
Fast is also cheap. Especially in the world of cloud computing where you pay by the second. The only way I could create a profitable transcription service [1] that undercuts the rest was by optimizing every little thing along the way. For instance, just yesterday I learned that the image size I've put together is 2.5× smaller than the next open source variant. That means faster cold boots, which reduces the cost (and provides a better service).
i've approached the same thing but slightly differently. i can run it on consumer hardware for vastly cheaper than the cloud and don't have to worry about image sizes at all (bare metal is 'faster'). i'm offering 20,000 minutes of transcription for free, up to the rate limit (1 request every 5 seconds)
https://geppetto.app
I contributed "whisperfile" as a result of this work:
* https://github.com/Mozilla-Ocho/llamafile/tree/main/whisper....
* https://github.com/cjpais/whisperfile
if you ever want to chat about making transcription virtually free, or at least very cheap, for everyone, let me know. I've been working on various projects related to it for a while, including an open source, cross-platform superwhisper alternative: https://handy.computer
> i can run it on consumer hardware for vastly cheaper than the cloud
Woah, that's really cool, CJ! I've been toying with the idea of standing up a cluster of older iPhones to run Apple's Speech framework. [1] The inspiration came from this blog post [2] where the author is using it for OCR. A couple of things are holding me back: (1) the OSS models are better according to the current benchmarks and (2) I have customers all over the world, so geographical load-balancing is a real factor. With that said, I'll definitely spend some time checking out your work. Thanks for sharing!
[1] https://developer.apple.com/documentation/speech
[2] https://terminalbytes.com/iphone-8-solar-powered-vision-ocr-...
ty! if there's any way I can help just lmk, always happy to lend a hand or an ear
Is S3 slow or fast? It’s both, as far as I can tell, and it represents a class of systems (mine included) that go slow to go fast.
S3 is “slow” at the level of a single request. It’s fast at the level of making as many requests as needed in parallel.
Being “fast” is sometimes critical, and often aesthetic.
We have common words for those two flavors of “fast” already: latency and throughput. S3 has high latency (arguable!), but very very high throughput.
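To make the latency/throughput distinction concrete, here is a minimal back-of-the-envelope sketch. It uses an idealized model (every request takes the same time, no queuing or rate limits), and the numbers are hypothetical, not measurements of S3:

```python
import math


def batch_time_ms(n_requests: int, latency_ms: int, concurrency: int) -> int:
    """Wall-clock time to finish n_requests when each one takes latency_ms
    and up to `concurrency` requests are in flight at once.

    Idealized: requests complete in full "waves" of equal duration.
    """
    waves = math.ceil(n_requests / concurrency)
    return waves * latency_ms


def throughput_per_s(n_requests: int, latency_ms: int, concurrency: int) -> float:
    """Completed requests per second under the same idealized model."""
    return n_requests / (batch_time_ms(n_requests, latency_ms, concurrency) / 1000)


# Hypothetical workload: 1000 requests at 100 ms each (assumed, not measured).
serial_ms = batch_time_ms(1000, 100, concurrency=1)      # one request at a time
parallel_ms = batch_time_ms(1000, 100, concurrency=200)  # 200 in flight

print(serial_ms, parallel_ms)
print(throughput_per_s(1000, 100, concurrency=200))
```

Per-request latency is identical in both cases; only the number of requests in flight changes, and that is what moves throughput.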
Fast is cheap everywhere. The only reasons software isn’t faster:
* developer insecurity and pattern lock-in
* platform limitations. This is typically related to the software execution context and toolchain more than to hardware
* most developers refuse to measure things
Even really slow languages can result in fast applications.
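As a rough illustration of the "measure things" point, a small timing helper might look like the sketch below. The `measure` helper and the two example functions are made up for illustration; the point is only that measuring two approaches to the same result is cheap, even in a slow language:

```python
import time


def measure(fn, *args, repeats: int = 5):
    """Run fn(*args) `repeats` times; return (result, best_seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


def sum_squares_loop(n: int) -> int:
    """Sum of i*i for i in [0, n), computed the obvious way."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_formula(n: int) -> int:
    """Same sum via the closed form (n-1) * n * (2n-1) / 6."""
    return (n - 1) * n * (2 * n - 1) // 6


loop_result, loop_s = measure(sum_squares_loop, 200_000)
formula_result, formula_s = measure(sum_squares_formula, 200_000)
print(f"loop:    {loop_s:.6f}s")
print(f"formula: {formula_s:.6f}s")
```

Timings will vary by machine, so none are quoted here.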
Well said. And it's not just the cloud. We self-host at my job and there are real cost savings to speed here too. Being able to continue using an old server for another year and having your staff be just a little more efficient adds up quickly.
Yep. I'm hoping that installed copies of PAPER (at least on Linux) will be somewhere under 2 MB total (including populating the cache with its own dependencies etc). Maybe more like 1 MB, although I'm approaching that line faster than I'd like. Compare 10-15 MB for pip (and a bunch more for pipx) or 35 MB for uv.
Fast doesn't necessarily mean efficient/lightweight and therefore cheaper to deploy. It may just mean that you've thrown enough expensive hardware at the problem to make it fast.
Your CSS is broken fyi
Not in development and maintenance dollars it's not
Hmm… That's a good point. I recall a few instances where I went too far to the detriment of production. Having a trusty testing and benchmarking suite thankfully helped with keeping things more stable. As a solo developer, I really enjoy the development process, so while that bit is costly, I didn't really consider that until you mentioned it.