Comment by scyzoryk_xyz
17 hours ago
Well. Obviously if you have the attention span it probably makes most sense to actually learn the flags and teach yourself to write FFMPEG commands. That's the serious way to do it if you have a serious workflow.
But I've found it easier to brute force with LLM's because, like, every time I had to do video work it'd be something different. Prompts like 'I need to remove this and this and change the resultion from this to that', 'I need it to be this fps or that, or even I want this file to weigh this much. Or I 'need to split these two' or 'combine those three'. It'll usually get you a chunk of the way there. Another prompt or two of double-checking, copy paste into CMD line or terminal and either brr or error copy paste what does this mean. 3 minutes later it's doing the thing you wanted, and you're more or less understanding what's it giving you.
But I keep an Obsidian file with a bunch commands that made me happy before. Dumping that I to the context window helps.
Another one has been multi camera, multi screen recordings with OBS. I discovered it was easier to do the math, make a big canvas, record all the feeds onto those so I don't have to think about syncing anything later. Then brr an FFMPEG command to output that 1920x1080 and that 3840x2160
Whisper is great with that too - raw recording, output just the audio. 'give me whisper command to get this as srt'. Then 'now render subtitles onto this video'
There was an experiment I tried that kinda almost worked where I had this boring recording of some conversation but needed to extract scattered bits. Used whisper to get transcript, put that into LLM, used that to zero in on the actual bits that were important, then got it to spit out the timecodes. Then hobbled together this janky script that cut out those bits and stitched them together. That was faster than taking the time to do it with a GUI and listening it all through.
Of course there are tools like opus clip that spit that out for you now so...
Although to be honest, when the stakes go high and you're doing something serious that requires quality you do it slow.
The point at which I was doing this most was when I was doing video UX/UI research on a hardware/software product. We would set up multi-cams, set and forget so we could talk to subjects and not think about what's being captured.
Dozens of hours of footage, little clips that would end up as insights on the Product Discovery Jira for the thing. So quality wasn't really important.
No comments yet
Contribute on Hacker News ↗