Comment by pmarreck

18 hours ago

Impressed anytime I have to use it (even if I have to study its man page again or use an LLM to construct the right incantation or use a GUI that just builds the incantation based on visual options). Becoming an indispensable transcoding multitool.

I think building some processing off of Vulkan 1.3 was the right move. (Aside, I also just noticed yesterday that Asahi Linux on Mac supports that standard as well.)

> incantation

FFmpeg arguments, the original prompt engineering

  • with gemini-cli and claude-cli you can now prompt while it prompts ffmpeg, and it does work.

    • Yeah, you can give an LLM queries like “make this smaller with libx265 and add the hvc1 tag” or “concatenate these two videos” and it usually crushes it. They have a similar level of mastery over imagemagick, too!

    • Just seeking a clarification on how this would be done:

      One would use gemini-cli (or claude-cli),

      - and give a natural language prompt to gemini (or claude) on what processing needs to be done,

      - with the correct paths to FFmpeg and the media file,

      - and g-cli (or c-cli) would take it from there.

      Is this correct?
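For reference, the two requests quoted above ("make this smaller with libx265 and add the hvc1 tag" and "concatenate these two videos") might map to commands like the ones printed below. This is only a sketch: `hevc_cmd` and `concat_cmd` are hypothetical helper names, they print the command rather than run it, and they assume filenames without spaces or quotes and an FFmpeg build that includes libx265.

```shell
# Hypothetical helpers: they only print an ffmpeg command, they do not run it.
# Assumes filenames contain no spaces or quotes.

# Re-encode the video with libx265, tag it hvc1 (commonly needed for
# playback in Apple's players), and copy the audio stream unchanged.
hevc_cmd() {
  printf 'ffmpeg -i %s -c:v libx265 -tag:v hvc1 -c:a copy %s\n' "$1" "$2"
}

# Write a list file for ffmpeg's concat demuxer, then print the join command.
# Stream copy (-c copy) only works when the inputs share codecs and parameters.
# Usage: concat_cmd LISTFILE OUTPUT INPUT...
concat_cmd() {
  list=$1; out=$2; shift 2
  for f in "$@"; do printf "file '%s'\n" "$f"; done > "$list"
  printf 'ffmpeg -f concat -safe 0 -i %s -c copy %s\n' "$list" "$out"
}
```

For example, `hevc_cmd in.mp4 out.mp4` prints a single re-encode command, and `concat_cmd list.txt joined.mp4 a.mp4 b.mp4` writes `list.txt` and prints the matching concat command. Whatever an LLM generates is worth comparing against a known-good shape like this before running it.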

  • nope, that would be handling tar balls

    ffmpeg right after

    • Tough crowd.

      fwiw, `tar xzf foobar.tgz` = "_x_tract _z_e _f_iles!" has been burned into my brain. It's "extract the files" spoken in a Dr. Strangelove German accent

      Better still, I recently discovered `dtrx` (https://github.com/dtrx-py/dtrx) and it's great if you have the ability to install it on the host. It calls the right commands and also always extracts into a subdir, so no more tar-bombs.

      If you want to create a tar, I'm sorry but you're on your own.

    • Personally I never understood the problem with tar balls.

      The only options you ever need are tar -x and tar -c (x for extract, c for create), plus tar -t if you want to list the contents (t for table of contents).

      That's really it, -v for verbose just like every other tool if you wish.

      Examples:

        tar -c project | gzip > backup.tar.gz
        cat backup.tar.gz | gunzip | tar -t
        cat backup.tar.gz | gunzip | tar -x
      

      You never need anything else for the 99% case.

LLMs and complex command line tools like FFmpeg and ImageMagick are a perfect combination and work like magic…

It’s really the dream UI/UX from science fiction movies: “take all images from this folder and crop 100px away except on top, saturate a bit and save them as uncompressed tiffs in this new folder, also assemble them in a video loop, encode for web”.

  • Had to do exactly that with a bunch of screenshots I took that happened to include a bunch of unnecessary parts of the screen.

    A prompt to ChatGPT and a command later and all were nicely cropped in a second.

    The dread of doing it by hand and having it magically there a minute later is absolutely mind blowing. Even just 5 years ago, I would have just done it manually, as it would definitely have taken longer to write the code for this task.

  • it can work but it's far from science fiction. LLMs tend to produce extremely subpar if not buggy ffmpeg code. They'll routinely do things like put the file parameter before the start time which needlessly decodes the entire video, produce wrong bitrates, re-encode audio needlessly, and so on.

    If you don't care enough about potential side effects to read the manual it's fine, but a dream UX it is not because I'd argue that includes correctness.
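To make the ordering point concrete, here is a minimal sketch of a trim command that avoids two of the pitfalls listed above. `trim_cmd` is a hypothetical helper that only prints the command; it assumes filenames without spaces and an FFmpeg build with libx264.

```shell
# Hypothetical helper: prints (does not run) an ffmpeg trim command.
# Usage: trim_cmd INPUT START DURATION OUTPUT
trim_cmd() {
  # -ss placed before -i is an input option, so ffmpeg seeks in the input
  # rather than decoding and discarding everything up to START.
  # -c:a copy passes the audio stream through without re-encoding it.
  printf 'ffmpeg -ss %s -i %s -t %s -c:v libx264 -c:a copy %s\n' \
    "$2" "$1" "$3" "$4"
}
```

Running `trim_cmd input.mp4 00:01:00 10 clip.mp4` prints a command that starts one minute in and keeps ten seconds. The slow variant has the same pieces with `-ss` moved after `-i input.mp4`, which is easy to miss when skimming generated output.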

LLMs are a great interface for ffmpeg. There are tons of tools out there that can help you run it with natural language. Here's my personal script: https://github.com/jjcm/llmpeg

  • i wrote a command “please” that allows me to say “please use ffmpeg to do whatever” and it generates the command with confirmation
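The confirmation step is the part that makes this kind of wrapper safe to use. A minimal sketch of it, assuming nothing about how the actual “please” command works: `confirm_run` is a hypothetical name, and the LLM call that would produce the command is left out entirely.

```shell
# Hypothetical helper: show a command and run it only after an explicit "y".
# Usage: confirm_run ffmpeg -i in.mov out.mp4
confirm_run() {
  printf 'About to run: %s\n' "$*"
  printf 'Proceed? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) "$@" ;;             # run the command exactly as given
    *)   echo 'Skipped.' ;;  # anything else, including empty input, declines
  esac
}
```

Defaulting to "no" means a stray Enter never launches a generated command that has not been read first.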