Comment by gopher_space
8 days ago
I just wrapped up a test project[0] based on a comment from that post! My takeaway was that there are a lot of steps in the process you can farm out to cheaper, faster ML models.
For example, the slowest part of my pipeline is picture description, since I need an LLM for that (and my project needs to run on low-end equipment). Locally I can spin up a tiny LLM and get one-word descriptions in about a minute, but anything larger takes more like 30. I might be able to send out only the sections I don't have the hardware to process.
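A rough sketch of that routing idea, assuming each section carries an estimate of how long it would take to process locally. The `Section` type, the `route` function, and the 5-minute budget are all made up for illustration and are not taken from the project:

```python
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical record for one piece of a document (e.g. an embedded picture).
    name: str
    est_local_minutes: float  # assumed estimate of local processing time


def route(sections, budget_minutes):
    """Split sections into those cheap enough to run locally and the rest."""
    local, remote = [], []
    for section in sections:
        if section.est_local_minutes <= budget_minutes:
            local.append(section)
        else:
            remote.append(section)
    return local, remote


# Illustrative numbers only: ~1 minute for a one-word description,
# ~30 minutes for anything that needs a larger model.
sections = [Section("cover photo", 1.0), Section("detailed diagram", 30.0)]
local, remote = route(sections, budget_minutes=5.0)
```

Only the `remote` list would need to go to better hardware; everything in `local` stays on the low-end machine.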
It was a good intro to ML models incorporating vision, and video is "just" another image pipeline, so it's been easy to treat e.g. facial recognition groupings like any other document section.