Comment by dcsommer

21 hours ago

We must not continue to develop media codecs in memory unsafe languages. Small, auditable sections can opt-out perhaps, but choosing default-unsafe for this type of software is close to professional negligence.

Cryptography and video codecs are notable exceptions, they put a lot of effort to making the code provably memory safe: no recursion, limited use of stack variables, no dynamic allocations, etc. As a result, memory safe languages bring nothing but trouble by making it non deterministic, that’s especially true for crypto where compiler “optimisations” guarantee you side channels attacks.

  • Video codecs just don't need to do dynamic allocations because it's not relevant to the problem. There's still certainly plenty of opportunities for memory bugs because there's a lot of pointer math.

  • What in the world do you mean by “non-deterministic”?

    C compilers, Rust compilers, and assemblers are all deterministic.

    • In cryptography, you want operations to run in constant time, even if it’s wasteful, otherwise an attacker could guess information about the key or plaintext by measuring execution times.

      Modern compilers are extremely clever and will produce machine code that takes full advantage of modern CPU branch predictors, and reorder instructions to better take advantage of pipelining. This in itself will make the same code run at different speeds depending on the input data.

      Then there is the whole issue of compiler version roulette. As a developer you have no idea which version of compilers your users and distros will use, and what new and wonderful optimisation they will bring.

      1 reply →

    • > C compilers, Rust compilers, and assemblers are all deterministic.

      Within a version, yes, but not cross version. Different versions of GCC/Clang etc can give you completely different code.

For the codec itself, the majority of it is performance sensitive and often has a significant amount of assembly even, so a memory safe language doesn't change much.

However for the container/extractor... those should absolutely be in a memory safe language, and those are were a lot of the exploits/crashes are, too, as metadata is more fuzzy.

As a practical example of this see something like CrabbyAVIF. All the parser code is rust, but it delegates to dav1d for the actual codec portion

Of the 3 software AV1 encoders, the only one that is fully dead is the Rust encoder (rav1e). If people truly wanted memory safe encoders/decoders, they would fund and develop them.

  • I can totally understand why people would want a memory-safe decoder, but a memory-safe encoder is niche. Finding a memory-safety bug in a decoder is a matter of finding a single unchecked integer field somewhere; finding a memory-safety bug in an encoder requires first finding some sort of logic bug in the encoder and then crafting an adversarial input that survives a number of highly lossy transformations.

    Compare the number of CVEs against x264 (included decoders don't count!) and FFmpeg's H.264 decoder.

  • There are many paths to memory safety, even if the one Rust project seems to be going nowhere.

    There's other memory-safe languages, and there's formal verification.

    e.g. seL4 favors pancake.

  • > If people truly wanted memory safe encoders/decoders

    Really? How many codecs have your neighbors contributed money for the development of, just curious.

    • I think these conversations are directed by the parties funding the efforts. Example: "we (large company) want a fast AV2 decoder" -> they pay a specialized team to do it -> this team works in C for the most part, so it is done in C. If there were financial incentives to do it in Rust, they'd pay more for a Rust decoder.

      1 reply →