Comment by geerlingguy

6 months ago

I like the conclusion. For me, Whisper has radically improved CC on my video content. I used to spend a few hours translating my scripts into CCs, and the tooling was poor.

Now I run it through whisper in a couple minutes, give one quick pass to correct a few small hallucinations and misspellings, and I'm done.

There are big wins in AI. But those don't pump the bubble once they're solved.

And the thing that made Whisper more approachable for me was when someone spent the time to refine a great UI for it (MacWhisper).

Author here. Indeed - it would be just as fantastical to deny that there has been any value from deep learning, transformers, etc.

Yesterday I heard Cory Doctorow talk about a bunch of pro bono lawyers using LLMs to mine paperwork and help exonerate innocent people. Also a big win.

There's good stuff - engineering - that can be done with the underlying tech without the hyperscaling.

It's not only Whisper; so much of the computer vision area is not as in vogue either. I suspect that's because the truly monumental solutions it unlocked are not that accessible to the average person, i.e. industrial manufacturing and robotics at scale.

  • I think that LLM hype is hiding a lot of very real and impactful progress in real world/robot intelligence.

    An essay writing machine is cool. A machine that can competently control any robot arm, and make it immediately useful is a world-changing prospect.

    Moving and manipulating objects without explicit human-coded instructions will absolutely revolutionize so much of our world.

  • That's because industrial manufacturing and robotics are failing to bring down costs and make people's lives more affordable.

    That's really the only value those technologies provide, so if people aren't seeing costs come down there really is zero value coming from those technologies.

I switched to Parakeet the other day.

It's better than Whisper, and faster, while running on CPU on my ten-year-old ThinkPad.

I had Claude make me Python bindings for it and add it to my voice typing app.

We live in the future.

I think a lot of AI wins are going to end up local and free, much like Whisper.

Maybe it could be a little bit more accurate, and it would be nice if it ran a little faster, but ultimately it's 95% complete software that can be free forever.

My guess is very many AI tasks are going to end up this way. In 5-10 years we're all going to be walking around with laptops with 100k cores and 1TB of RAM and an LLM that we talk to and it does stuff for us more or less exactly like Star Trek.