Examples:
tar -caf archive.tar.gz foo bar # Create archive.tar.gz from files foo and bar (-a picks the compression from the file suffix; without it the archive is not compressed).
tar -tvf archive.tar.gz # List all files in archive.tar.gz verbosely.
tar -xf archive.tar.gz # Extract all files from archive.tar.gz
-l, --check-links
(c and r modes only) Issue a warning message unless all links to each file are archived.
I'd also include Regex in the list of dark arts incantations.
I'm ok with regex, but the ffmpeg manpage, it scares me...
Ffmpeg was designed to be unusable if it falls into enemy hands.
Regex is only difficult because it's complicated, the primitives are all sensibly arranged and predictable. FFmpeg is layers of dark magic where the primitives are often inscrutable before you compose them.
I am perfectly at home with regexp, but ffmpeg, magick, and jq are still on the list to master.
with gemini-cli and claude-cli you can now prompt while it prompts ffmpeg, and it does work.
Yeah, you can give an LLM queries like “make this smaller with libx265 and add the hvc1 tag” or “concatenate these two videos” and it usually crushes it. They have a similar level of mastery over imagemagick, too!
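For reference, a rough sketch of the kind of commands those two prompts tend to come back with. The file names (input.mp4, list.txt and so on) are placeholders I made up, and CRF 28 is just the x265 default, not a recommendation. The script only prints the commands unless ffmpeg and the input file are actually present.

```shell
#!/usr/bin/env bash
# Hypothetical file names; change them to match your own files.
in="input.mp4"
out="output_hevc.mp4"

# "Make this smaller with libx265 and add the hvc1 tag":
# re-encode the video with libx265, tag the stream hvc1, copy the audio untouched.
shrink_cmd=(ffmpeg -i "$in" -c:v libx265 -crf 28 -tag:v hvc1 -c:a copy "$out")

# "Concatenate these two videos" without re-encoding, via the concat demuxer.
# list.txt is assumed to hold one line per input, e.g.:  file 'part1.mp4'
concat_cmd=(ffmpeg -f concat -safe 0 -i list.txt -c copy joined.mp4)

# Show what would run.
echo "${shrink_cmd[*]}"
echo "${concat_cmd[*]}"

# Only run the re-encode when ffmpeg and the input are really there.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$in" ]; then
  "${shrink_cmd[@]}"
fi
```

The stream-copy concat only works cleanly when the inputs share codec and parameters; otherwise a re-encode is needed.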
Yeah, LLMs have honestly made ffmpeg usable for me, for the first time. The difficulty in constructing commands is not really ffmpeg's fault—it's just an artifact of the power of the tool and the difficulties in shoehorning that power into flags for a single CLI tool. It's just not the ideal human interface to access ffmpeg's functionality. But keeping it CLI makes it much more useful as part of a larger and often automated workflow.
It's funny because GPU stuff like what this article is about is where the LLMs totally fall apart. I can make any LLM produce volumes of hallucinations at the drop of a hat by asking it how to construct ffmpeg commands that use hardware acceleration.
Just seeking a clarification on how this would be done:
One would use gemini-cli (or claude-cli),
- and give a natural language prompt to gemini (or claude) on what processing needs to be done,
- with the correct paths to FFmpeg and the media file,
- and g-cli (or c-cli) would take it from there.
Is this correct?
Another option is to use a non-cli LLM and ask it to produce a script (bash/ps1) that uses ffmpeg to do X, Y, and Z to your video files. If using a chat LLM it will often provide suggestions or ask questions to improve your processing as well. I do this often and the results are quite good.
Yes. It works amazingly well for ffmpeg.
Curious to see how quickly each LLM picks up the new codecs/options.
I use the Warp terminal and I can ask it to run --help and it figures it out
the canonical (if that's the right word for a 2-year-old technique) solution is to paste the whole manual into the context before asking questions
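A minimal sketch of that "paste the manual first" approach, assuming you just want a text file to hand to whatever model you use. The tool name, the question and the prompt layout are all made up for illustration; tar stands in here because it is installed nearly everywhere.

```shell
#!/usr/bin/env bash
set -eu

tool="tar"                                   # stand-in; swap in ffmpeg or anything else
question="How do I list the contents of an archive?"
prompt_file=$(mktemp)

{
  echo "Here is the built-in help for $tool:"
  "$tool" --help 2>&1 || true                # some tools exit non-zero after printing help
  echo
  echo "Question: $question"
} > "$prompt_file"

wc -c "$prompt_file"                         # rough size check before pasting it to the model
```

For ffmpeg itself, `ffmpeg -h full` prints the long option list, which is the part worth pasting.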
OT, but yours has to be the best username on this site. Props.
Culón is Spanish for big-bottomed, for anyone else wondering.
nope, that would be handling tar balls
ffmpeg right after
Tough crowd.
fwiw, `tar xzf foobar.tgz` = "_x_tract _z_e _f_iles!" has been burned into my brain. It's "extract the files" spoken in a Dr. Strangelove German accent
Better still, I recently discovered `dtrx` (https://github.com/dtrx-py/dtrx) and it's great if you have the ability to install it on the host. It calls the right commands and also always extracts into a subdir, so no more tar-bombs.
If you want to create a tar, I'm sorry but you're on your own.
I used tar/unzip for decades I think, before moving to 7z, which handles all formats I throw at it and has the same switch for when you want to decompress into a specific directory, instead of having to remember which one of tar and unzip uses -d, and which one uses -C.
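For the record, a small sketch of the "extract into a specific directory" switches being compared. The archive and directory names are invented for the demo. Only the tar part actually runs; the unzip and 7z lines are left as comments since those tools may not be installed.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)            # scratch area with a made-up layout
cd "$work"
mkdir src dest
echo "hello" > src/a.txt

tar -czf demo.tgz src        # build a small gzip-compressed archive
tar -xf demo.tgz -C dest     # tar: -C <dir> picks the target directory

# The equivalents in the other tools:
#   unzip demo.zip -d dest   # unzip: -d <dir>
#   7z x demo.7z -odest      # 7z: -o<dir>, no space after -o

ls dest/src
```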
"also always extracts into a subdir" sounds like a nice feature though, thanks for sharing another alternative!
> tar xzf foobar.tgz
You don't need the z, as xf will detect which compression was used, if any.
Creating is no harder, just use c for create instead, and specify z for gzip compression:
Same with listing contents, with t for tell:
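The commands from this comment did not survive, so here is a sketch of what create, list and extract look like with those letters. The directory and file names are placeholders of mine, not the original poster's.

```shell
#!/usr/bin/env bash
set -eu

demo=$(mktemp -d)
cd "$demo"
mkdir foobar
echo "one" > foobar/one.txt
echo "two" > foobar/two.txt

tar czf foobar.tgz foobar          # c = create, z = gzip, f = archive file name
listing=$(tar tf foobar.tgz)       # t = list contents; compression is detected on read
echo "$listing"

mkdir out
tar xf foobar.tgz -C out           # x = extract; no z needed
```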
Personally I never understood the problem with tar balls.
The only options you ever need are tar -x, tar -c (x for extract and c for create). tar -l if you wanna list, l for list.
That's really it, -v for verbose just like every other tool if you wish.
Examples:
You never need anything else for the 99% case.
For anyone curious, unless you are running a 'tar' binary from the stone ages, just skip the gunzip and cat invocations. Replace .gz with .xz or other well known file ending for different compression.
> tar -l if you wanna list, l for list.
Surely you mean -t if you wanna list, t for lisT.
l is for check-Links.
And you don't need to uncompress separately. tar will detect the correct compression algorithm and decompress on its own. No need for that gunzip intermediate step.
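A quick sketch comparing the two routes, assuming a reasonably modern tar. The file and directory names are invented for the demo.

```shell
#!/usr/bin/env bash
set -eu

tmp=$(mktemp -d)
cd "$tmp"
mkdir data long short
echo "payload" > data/file.txt
tar -czf data.tar.gz data

# The long way: decompress to stdout, then feed tar from stdin ("-f -").
gunzip -c data.tar.gz | tar -xf - -C long

# The short way: tar recognises the gzip data and decompresses by itself.
tar -xf data.tar.gz -C short

# Both trees should come out the same.
diff -r long short && echo "identical"
```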
Yeah I never really understood why people complain about tar; 99% of what you need from it is just `tar -xvf blah.tar.gz`.
Except it's tar -t to list, not -l
You can skip a step in your pipeline.
The problem is it's very non-obvious and thus is unnecessarily hard to learn. Yes, once you learn the incantations they will serve you forever. But sit a newbie down in front of a shell and ask them to extract a file, and they struggle because the interface is unnecessarily hard to learn.
it was just a reference to xkcd#1168
I wasn't expecting the downvotes for an xkcd reference
I have so much of tar memorized. cpio is super funky to me, though.
cpio is not that hard.
A common use case is:
See:
and here is an example from its Wikipedia page, under the "Operation and archive format" section, under the Copy subsection:
Copy
Cpio supports a third type of operation which copies files. It is initiated with the pass-through option flag (p). This mode combines the copy-out and copy-in steps without actually creating any file archive. In this mode, cpio reads path names on standard input like the copy-out operation, but instead of creating an archive, it recreates the directories and files at a different location in the file system, as specified by the path given as a command line argument.
This example copies the directory tree starting at the current directory to another path new-path in the file system, preserving files modification times (flag m), creating directories as needed (d), replacing any existing files unconditionally (u), while producing a progress listing on standard output (v):
$ find . -depth -print | cpio -p -dumv new-path
nope, it's using `find`.