Comment by PaulDavisThe1st

1 year ago

I can't speak for ffmpeg, but I can report on why we use non-portable assembler inside Ardour (a cross-platform digital audio workstation).

Ardour's own code doesn't do very much DSP (it's a policy choice), but one thing that our own code does do is metering: comparing a current sample value to every previous sample value in a given audio data stream within a given time window to decide if it is higher (or lower) than the previous max (or min).
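For readers who want to see the shape of that metering loop, here is a minimal sketch in plain C. It is not Ardour's actual code: the `peak_range` type and the `find_peaks` name are made up for illustration, and it assumes samples are 32-bit floats.

```c
#include <stddef.h>

/* Hypothetical type for illustration only (not Ardour's API):
   the lowest and highest sample values seen so far. */
typedef struct {
    float min;
    float max;
} peak_range;

/* Scan n samples and widen the running min/max where a sample
   falls outside the range passed in as `current`. */
static peak_range find_peaks(const float *buf, size_t n, peak_range current)
{
    for (size_t i = 0; i < n; ++i) {
        float s = buf[i];
        if (s > current.max) {
            current.max = s;
        }
        if (s < current.min) {
            current.min = s;
        }
    }
    return current;
}
```

A loop like this is a simple reduction over the buffer, which is the kind of code that both hand-written SIMD and a vectorizing compiler can speed up.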

When someone stepped forward (hi Sampo!) to code this in hand-written SIMD assembler, we got a 30% reduction in CPU usage when using mid-sized buffers on moderate size sessions (say, 24 tracks or so).

That's a worthy tradeoff, even though it means that we now have 5 different asm versions of about half-a-dozen functions. The good news is that they don't really need to be maintained. New SIMD architectures mean new implementations, not hacks to existing code.

However, I should note that it is always very important to compare what compilers are capable of, and to keep comparing that. In the decade or more after our asm metering code was first written, gcc improved to the point where simply using C(++) and some compiler flags produced code that was within an instruction or two of our hand-crafted version (and may be more correct in the face of all possible conditions).

So ... you can get dramatic performance benefits that are worth the effort, and the maintenance costs are low, but you should keep checking how your code compares with today's compilers' best optimization efforts.

I'm not at all saying that it isn't worth it for ffmpeg to use assembly, but there is a tradeoff there. ffmpeg needs to either support only a limited number of architectures and duplicate code for all of them; have asm implementations for the most popular architectures (probably x86(_64) and ARM) plus a slower, architecture-independent fallback implementation in C for the rest; or have asm implementations in a large number of ISAs. I'm guessing ffmpeg does the middle option, especially since this guide focuses on x86 assembly, but ffmpeg supports many other architectures.
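The middle option can be sketched roughly as below. This is only an illustration under stated assumptions, not how ffmpeg actually does its dispatch: `peak_fn`, `peak_c`, `peak_x86_placeholder`, and `select_peak_fn` are hypothetical names, the "optimized" routine is a stand-in that just calls the C version (a real project would link a hand-written asm routine there), and the CPU check relies on the GCC/Clang builtin `__builtin_cpu_supports`, which is only available on x86 targets.

```c
#include <stddef.h>

/* Common signature shared by every implementation (hypothetical). */
typedef float (*peak_fn)(const float *buf, size_t n, float current);

/* Portable C fallback: return the larger of `current` and the
   highest sample in the buffer. */
static float peak_c(const float *buf, size_t n, float current)
{
    for (size_t i = 0; i < n; ++i) {
        if (buf[i] > current) {
            current = buf[i];
        }
    }
    return current;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Placeholder for an arch-specific routine. It simply forwards to
   the C version here; this is where an asm symbol would go. */
static float peak_x86_placeholder(const float *buf, size_t n, float current)
{
    return peak_c(buf, n, current);
}
#endif

/* Pick an implementation once at startup: the arch-specific one when
   the CPU supports it, otherwise the portable fallback. */
static peak_fn select_peak_fn(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2")) {
        return peak_x86_placeholder;
    }
#endif
    return peak_c;
}
```

The caller stores the returned pointer and uses it everywhere, so architectures without an asm version still work, just more slowly.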

The performance wins may very well be worth it, but it is still good to be aware of the tradeoff involved.

  • ffmpeg has multiple implementations for each architecture to take advantage of microarchitectural wins.

Ardour is a great piece of software! Thanks for that. I love hearing experiences like these.