Comment by PaulDavisThe1st

11 days ago

There is a reason that most people do not use interpreted languages, or languages with garbage collection, for audio synthesis and DSP.

It's great that it works, and it may well work 99% of the time. And it may have been a great learning experience/platform, so congrats for that.

But it's important for people to understand why this is generally the wrong toolset for this sort of software development, even when it can be so much fun.

Python and other interpreted languages (Lua excepted, with conditions), and languages like Swift that have GC, cannot ensure non-blocking behavior in the code that needs to run in realtime. You can paper over this with very large audio buffers (which makes the synth feel sluggish) or with crossed fingers (which works a surprising amount of the time). But ultimately you need a language like C/C++/Rust etc. to ensure that your realtime DSP code is actually realtime.

Despite Apple pushing Swift "for everything", even they still acknowledge that you should not write AudioUnits (or plugins in any other format) using Swift.

Meanwhile, have fun with this, which it looks like you already did!

Got a friend who is in the high frequency trading industry and uses both Java and C#. I asked about GC. Turns out you just write code that doesn’t need to GC. Object pools, off-heap memory etc.

It won’t do the absolute fastest tasks in the stack quite as well but supposedly the coding speed and memory management benefits are more important, and there’s no GC so it’s reliable.

  • > Turns out you just write code that doesn’t need to GC. Object pools, off-heap memory etc.

    Some GCd languages make this easier than others. Java and C# allow you to use primitive types. Even just doing some basic arithmetic in Python (at least CPython) is liable to create temporary objects; locals don't get stack-allocated.

  • That's what we do in games too. If you know the scope of your project and how to avoid dynamic allocation it's fine.
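The object-pool idea translates to Python too. A minimal sketch (the `BufferPool` name and API are hypothetical, for illustration only) of recycling preallocated sample buffers so the hot path never allocates:

```python
import numpy as np

class BufferPool:
    """Preallocate a fixed set of sample buffers and recycle them,
    so the audio callback never triggers an allocation (and thus
    never creates garbage for a collector to chase)."""
    def __init__(self, n_buffers, frames):
        self._free = [np.zeros(frames, dtype=np.float32) for _ in range(n_buffers)]

    def acquire(self):
        # Pop a preallocated buffer; realtime code would handle exhaustion
        # by dropping work, never by allocating a new buffer.
        return self._free.pop()

    def release(self, buf):
        buf.fill(0.0)          # reset in place, no new object created
        self._free.append(buf)

pool = BufferPool(n_buffers=8, frames=256)
buf = pool.acquire()
buf[:] = 0.5               # write samples in place
pool.release(buf)
```

The same shape of code (minus numpy) is what the Java/C# object-pool approach looks like: allocation happens once at startup, and steady state only moves references around.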

This seems to conflate different things.

Interpreted is not a problem from the predictable-behaviour point of view. You may get less absolute performance. Though with Python you can do the heavy lifting in numpy etc., which run in native code. And this is what is done here, see eg https://github.com/gpasquero/voog/blob/main/synth/dsp/envelo...
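The pattern is easy to see in miniature. A hedged sketch (names and block size are made up for illustration) of rendering a whole block of samples in one vectorized numpy call instead of a per-sample Python loop:

```python
import numpy as np

SAMPLE_RATE = 48_000
FRAMES = 256                      # one callback's worth of samples

def sine_block(freq, phase):
    """Render FRAMES samples of a sine in one vectorized call;
    the per-sample loop runs in C inside numpy, not in the interpreter."""
    t = np.arange(FRAMES, dtype=np.float64)
    block = np.sin(phase + 2.0 * np.pi * freq * t / SAMPLE_RATE)
    next_phase = (phase + 2.0 * np.pi * freq * FRAMES / SAMPLE_RATE) % (2.0 * np.pi)
    return block.astype(np.float32), next_phase

block, phase = sine_block(440.0, 0.0)
```

The interpreter overhead is then paid once per block rather than once per sample, which is why this can keep up at audio rates at all.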

Languages that have garbage collection: not going to rehash the standard back-and-forth here, suffice it to say that the devil is in the details.

  • I was speaking in broad generalities (and did mention Lua as a counter-example).

    If you want realtime safe behavior, your first port of call is rarely going to be an interpreted language, even though, sure, it is true that some of them are or can be made safe.

Not being a dev writing code that runs in realtime, nor an audio type with experience of things not running in realtime: what happens when GC kicks in? Does the entire audio stack go silent? Does it only affect the one filter, so it sounds like a dropout, or is it a pause so that it is no longer in sync? In theory I get why it is bad, but I'm curious what it sounds like when it does go bad.

  • Python mainly uses reference counting for garbage collection, and the reference-cycle-breaking full-program gc can be manually controlled.

    For RC, each "kick in" of the GC is usually a small amount of work, triggered by the reference count of an object going to 0. In this program's case I'd guess you don't hear any artifacts.

  • The audio interface hardware expects to get N samples every M msecs, and stops for no man (or program). So, anything that stops or slows the flow enough that fewer than N samples are delivered every M msecs causes a click or pop in the output. How bad the pop actually sounds depends on a lot of different things, so it's hard to predict.
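The manual control of Python's cycle collector mentioned above is a stdlib feature. A sketch of disabling automatic cycle collection and running it only at a moment you choose (refcounting keeps freeing acyclic garbage immediately either way):

```python
import gc

gc.disable()                  # no automatic cycle-collection pauses;
                              # refcounting still frees acyclic garbage immediately

def process_block():
    # realtime DSP would go here: no allocation, no cycle collection
    pass

def idle_moment():
    # run the cycle collector only when a glitch wouldn't be audible,
    # e.g. between notes or when the UI is idle
    unreachable = gc.collect()
    return unreachable

process_block()
freed = idle_moment()
```

This doesn't make Python realtime-safe, but it does move the one unpredictable pause to a time of your choosing.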

> You can paper over this with very large audio buffers (which makes the synth feel sluggish) or with crossed fingers (which works a surprising amount of the time).

It’s been a while since I was involved in computer audio, but is there a difficulty I’m not seeing with simply using ring buffers and doing memory allocations upfront so as to avoid GC altogether?

  • Even if you avoid GC, you need the memory used by the realtime code to be pinned to physical RAM to avoid paging.

    The problem with GC is not (always) what it does, it's when it does it. You often do not have control over that, and if it kicks in in the middle of otherwise realtime code ... not good.
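There is no fundamental difficulty; it is the standard approach. A minimal sketch (class and method names are invented for illustration) of a preallocated ring buffer whose steady-state operation never allocates:

```python
import numpy as np

class RingBuffer:
    """Fixed-size ring of float32 samples, allocated once up front.
    Writes and reads copy in place, so steady state does no allocation."""
    def __init__(self, capacity):
        self._buf = np.zeros(capacity, dtype=np.float32)
        self._cap = capacity
        self._read = 0
        self._write = 0
        self._count = 0

    def write(self, samples):
        for s in samples:                 # illustrative; real code copies slices
            if self._count == self._cap:
                break                     # full: drop samples rather than grow
            self._buf[self._write] = s
            self._write = (self._write + 1) % self._cap
            self._count += 1

    def read(self, out):
        n = min(len(out), self._count)    # fill as much of `out` as we can
        for i in range(n):
            out[i] = self._buf[self._read]
            self._read = (self._read + 1) % self._cap
        self._count -= n
        return n
```

The catch, as noted above, is that avoiding your own allocations doesn't stop the runtime from allocating (or collecting) on your behalf.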

You can get pretty decent results (from past experiments I've run) by doing something similar to PyTorch et al., i.e., build the compute graph (or in this case, the wiring graph for the synth) in Python, then have the realtime stuff all inside a compiled extension.
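A toy version of that split, with hypothetical names throughout: describe the wiring as ordinary Python data, then flatten it into a plan the hot loop can execute without ever touching the graph again.

```python
import numpy as np

# Build the wiring graph in ordinary Python (cheap, happens once)...
graph = [("osc", {"freq": 440.0}), ("gain", {"amount": 0.5})]

def compile_plan(graph, frames=256, rate=48_000):
    """Flatten the graph into a list of vectorized numpy callables;
    in a real synth this stage would hand off to a compiled extension."""
    t = np.arange(frames) / rate
    plan = []
    for kind, params in graph:
        if kind == "osc":
            f = params["freq"]
            plan.append(lambda buf, f=f: np.sin(2 * np.pi * f * t, out=buf))
        elif kind == "gain":
            g = params["amount"]
            plan.append(lambda buf, g=g: np.multiply(buf, g, out=buf))
    return plan

# ...then the realtime path just runs the precompiled plan, in place.
buf = np.zeros(256)
for op in compile_plan(graph):
    op(buf)
```

The design point is the same one PyTorch makes: graph construction can be slow and dynamic because it is off the hot path; only the flattened plan runs per block.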

Now I'm curious-- what happens if the author adds a manual garbage collection call at the end of _audio_callback? Can it still Moog, or will that cause it to eternally miss deadlines?
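You can get a rough answer without an audio device at all. A sketch of timing a full `gc.collect()` against a typical callback deadline (256 frames at 48 kHz is about 5.3 ms); the workload sizes here are arbitrary:

```python
import gc
import time

FRAMES, SAMPLE_RATE = 256, 48_000
deadline_s = FRAMES / SAMPLE_RATE            # ~5.3 ms per callback

# Create some cyclic garbage so the collector has real work to do;
# plain refcounting cannot free these mutually-referencing lists.
for _ in range(50_000):
    a, b = [], []
    a.append(b)
    b.append(a)

start = time.perf_counter()
gc.collect()
elapsed = time.perf_counter() - start
print(f"gc.collect() took {elapsed*1000:.2f} ms "
      f"(deadline {deadline_s*1000:.2f} ms)")
```

Whether the printed time fits inside the deadline depends on the machine and on how much garbage has accumulated, which is exactly the unpredictability being discussed.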

> Lua excepted, with conditions

Where can I read more about this? Is Lua's garbage collector tuned for real-time operations?