Comment by yunwal

2 days ago

The fact that you can do it with spectral analysis libraries, no LLM required.

This is much easier than source separation. It would be different if I were asking to isolate a violin from a viola or another violin; you’d have to get much more specific about the timbre of each instrument and potentially understand what each instrument’s part was.

But a vibration made from a string makes a very unique wave that is easy to pick out in a file.

Are you making this up? What spectral analysis libraries or tools?

String instruments create similar harmonic series to horns, winds, and voice (because everything is a string in some dimension) and the major differences are in the spectral envelope, something that STFT tools are just ok at approximating because of the time/frequency tradeoff (aka: the uncertainty principle).

This is a very hard problem "in theory" to me, and I'm just above casually versed in it.
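To put rough numbers on the time/frequency tradeoff mentioned above, here is a minimal sketch. It assumes a 44.1 kHz sample rate and A4 = 440 Hz equal temperament (so A2 = 110 Hz); the helper names `stft_resolution` and `semitone_gap_hz` are made up for illustration. It only computes FFT bin spacing, and the resolution you actually get also depends on the window function.

```python
SAMPLE_RATE = 44_100  # Hz; assumed CD-quality audio


def stft_resolution(window_size, sample_rate=SAMPLE_RATE):
    """Return (FFT bin spacing in Hz, window duration in ms) for one STFT frame."""
    bin_hz = sample_rate / window_size
    window_ms = 1000.0 * window_size / sample_rate
    return bin_hz, window_ms


def semitone_gap_hz(freq_hz):
    """Distance in Hz from freq_hz to the next equal-tempered semitone above it."""
    return freq_hz * (2 ** (1 / 12) - 1)


# Longer windows give finer frequency bins but blur events in time.
for n in (512, 2048, 8192):
    bin_hz, window_ms = stft_resolution(n)
    print(f"window={n:5d}  bin spacing={bin_hz:6.2f} Hz  duration={window_ms:7.2f} ms")

# Gap between A2 (110 Hz, assumed tuning) and the semitone above it.
print(f"A2 to A#2 gap: {semitone_gap_hz(110.0):.2f} Hz")
```

With these assumptions, telling two low notes a semitone apart needs a window of several thousand samples, which is well over 100 ms of audio per frame. That is the cost being argued about in this thread.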

  • He's not making it up and there's no reason for that tone. Strings are more straightforward to isolate compared to vocals/horns/etc because they produce a near-perfect harmonic series in parallel lines in a spectrogram. The time/frequency tradeoff exists, but it's less of a problem for strings because of their slow attack.

    You can look up HPSS (harmonic-percussive source separation) and Python libraries like Essentia and librosa.
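For readers unfamiliar with HPSS, here is a toy sketch of the median-filtering idea behind it, written in plain Python on a hand-made magnitude grid. It is not librosa's or Essentia's implementation; the grid, the filter width, and the function names are all assumptions for illustration. Sustained tones show up as horizontal ridges in a spectrogram and clicks as vertical ones, so a median filter along time keeps the former and a median filter along frequency keeps the latter.

```python
from statistics import median


def median_filter_rows(spec, width):
    """Median-filter each row along its length (windows shrink at the edges)."""
    half = width // 2
    out = []
    for row in spec:
        n = len(row)
        out.append([median(row[max(0, t - half):min(n, t + half + 1)])
                    for t in range(n)])
    return out


def median_filter_cols(spec, width):
    """Median-filter each column by transposing, filtering rows, transposing back."""
    transposed = [list(col) for col in zip(*spec)]
    filtered = median_filter_rows(transposed, width)
    return [list(row) for row in zip(*filtered)]


def toy_hpss(spec, width=5):
    """Split a magnitude grid (rows = frequency bins, columns = frames)."""
    harm_est = median_filter_rows(spec, width)  # smooth along time
    perc_est = median_filter_cols(spec, width)  # smooth along frequency
    harmonic, percussive = [], []
    for row, h_row, p_row in zip(spec, harm_est, perc_est):
        # Binary mask: a bin is harmonic when the time-smoothed estimate wins.
        h = [s if he >= pe else 0.0 for s, he, pe in zip(row, h_row, p_row)]
        harmonic.append(h)
        percussive.append([s - hv for s, hv in zip(row, h)])
    return harmonic, percussive


# Hypothetical 9-bin x 12-frame grid: steady partials in bins 2 and 6,
# plus a broadband click in frame 5.
N_BINS, N_FRAMES = 9, 12
spec = [[1.0 if (f in (2, 6) or t == 5) else 0.0 for t in range(N_FRAMES)]
        for f in range(N_BINS)]

harmonic, percussive = toy_hpss(spec)
```

On real audio you would compute the grid with an STFT and use a library routine such as `librosa.effects.hpss` rather than this sketch.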

    • All wind instruments and all bowed string instruments produce a perfect harmonic series while emitting a steady tone. The most important difference between timbres of different instruments is in the attack, where inharmonic tones are also generated. Several old synths used this principle to greatly increase realism, by adding brief samples of attack transients to traditional subtractive synthesis, e.g.:

      https://en.wikipedia.org/wiki/Linear_arithmetic_synthesis

    • Hmmm... was 'tone' a pun?

      Why mention a string's 'slow attack' as less of a problem? No isolation software considers this an easy route.

      Vocals are more effectively isolated by virtue of the fact that they are unique-sounding. Strings (and other sounds) are similar in some ways but far more generic. All software out there indicates this, including the examples mentioned.

  • If you look at the actual harmonics of a string and of a horn, you will see how wrong you are. There is a reason why they sound different to the ear.

    It’s because of this that you can have a relatively inexpensive synthesizer (not sample or PCM based) that does a crude job of mimicking these different instruments by just changing the harmonics.

    • There is one important difference between the harmonics of string and wind instruments: it's possible to build a wind instrument that suppresses (although it does not entirely eliminate) the even harmonics, e.g. a stopped organ pipe. If it sounds like a filtered square wave, it's definitely a wind instrument. But if it sounds like a filtered sawtooth wave, it could be either.
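The square-versus-sawtooth point can be checked numerically. The sketch below uses idealized waveforms and a direct DFT over exactly one period. The waveform definitions, the 64-sample period, and the function name `harmonic_amplitudes` are assumptions for illustration, and real instruments are far less tidy than this.

```python
import cmath


def harmonic_amplitudes(period, n_harmonics):
    """Amplitudes of harmonics 1..n_harmonics via a DFT over exactly one period."""
    n = len(period)
    amps = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(period))
        amps.append(2 * abs(coeff) / n)
    return amps


N = 64  # samples per period (arbitrary choice for the sketch)
sawtooth = [2 * i / N - 1 for i in range(N)]              # ramp from -1 up toward +1
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]  # half high, half low

saw_amps = harmonic_amplitudes(sawtooth, 8)
square_amps = harmonic_amplitudes(square, 8)

for k, (s, q) in enumerate(zip(saw_amps, square_amps), start=1):
    print(f"harmonic {k}: sawtooth={s:.3f}  square={q:.3f}")
```

In this idealized case the square wave's even harmonics come out as zero up to floating-point error, while the sawtooth has energy at every harmonic.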