Comment by duped

2 days ago

what in theory makes those "super easy" to isolate? Humans are terrible at this to begin with, it takes years to train one of them to do it mildly well. Computers are even worse - blind source separation and the cocktail party problem have been the white whale of audio DSP for decades (and only very recently did tools become passable).

The fact that you can do it with spectral analysis libraries, no LLM required.

This is much easier than source separation. It would be different if I were asking to isolate a violin from a viola or another violin; then you’d have to get much more specific about the timbre of each instrument and potentially understand what each instrument’s part was.

But a vibrating string produces a very distinctive waveform that is easy to pick out in a file.

  • Are you making this up? What spectral analysis libraries or tools?

    String instruments create harmonic series similar to those of horns, winds, and voice (because everything is a string in some dimension), and the major differences are in the spectral envelope, something that STFT tools are just OK at approximating because of the time/frequency tradeoff (aka the uncertainty principle).

    This is a very hard problem "in theory" to me, and I'm just above casually versed in it.

    • He's not making it up and there's no reason for that tone. Strings are more straightforward to isolate than vocals, horns, etc. because they produce a near-perfect harmonic series, which shows up as evenly spaced parallel lines in a spectrogram. The time/frequency tradeoff exists, but it's less of a problem for strings because of their slow attack.

      You can look up HPSS (harmonic/percussive source separation) and Python libraries like Essentia and librosa.

    • If you look at the actual harmonics of a string and of a horn, you will see how wrong you are. There is a reason why they sound different to the ear.

      It’s because of this that you can have a relatively inexpensive synthesizer (not sample or PCM based) that does a crude job of mimicking these different instruments by just changing the harmonics.

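For anyone following along, here is a rough sketch of the median-filtering idea behind the HPSS mentioned above. It is not librosa's or Essentia's implementation. It is plain Python run on a made-up magnitude spectrogram, and the names `median_filter_1d`, `hpss`, and `toy_spec` are hypothetical. In practice you would compute a real STFT and call something like `librosa.effects.hpss`. Note what it does and does not show: it splits sustained tonal energy (horizontal lines in a spectrogram) from transients (vertical lines), which is not the same as telling one pitched instrument from another.

```python
from statistics import median


def median_filter_1d(values, width):
    """Sliding median; the window is truncated at the edges."""
    half = width // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def hpss(spec, width=5):
    """Split spec[freq][time] magnitudes into (harmonic, percussive) parts.

    Uses a hard (binary) mask: each cell goes to whichever median-filtered
    estimate is larger, with ties going to the harmonic part.
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Median along time keeps energy that persists at one frequency.
    harm_est = [median_filter_1d(row, width) for row in spec]
    # Median along frequency keeps energy that is broadband at one instant.
    perc_cols = [
        median_filter_1d([spec[f][t] for f in range(n_freq)], width)
        for t in range(n_time)
    ]
    harmonic = [[0.0] * n_time for _ in range(n_freq)]
    percussive = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            if harm_est[f][t] >= perc_cols[t][f]:
                harmonic[f][t] = spec[f][t]
            else:
                percussive[f][t] = spec[f][t]
    return harmonic, percussive


# Toy spectrogram (assumed data): sustained partials at bins 2, 6, 10
# plus a broadband click in frame 4.
N_FREQ, N_TIME = 12, 9
toy_spec = [
    [(1.0 if f in (2, 6, 10) else 0.0) + (1.0 if t == 4 else 0.0)
     for t in range(N_TIME)]
    for f in range(N_FREQ)
]

harmonic, percussive = hpss(toy_spec)
print(harmonic[2])    # the sustained partial at bin 2
print(percussive[0])  # the click, seen in a bin with no partial
```

With a hard mask like this, click energy that lands on a partial's bin stays in the harmonic part. Real tools generally use soft masks and much larger filter windows to reduce that kind of leakage.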

>what in theory makes those "super easy" to isolate? Humans are terrible at this to begin with,

Humans are amazing at it. You can discern the different instruments way better than any stem separating AI.