Comment by Aerroon

8 months ago

It's an experience thing. It's not about knowing what LLMs/diffusion models specifically do, but rather about knowing the pitfalls that the models you use have.

It's a bit like an audio engineer setting up your compressors and other filters. It's not difficult to fiddle with the settings, but knowing what numbers to input is not trivial.

I think it's a kind of skill that we don't really know how to measure yet.

When an audio engineer tweaks the pass band of a filter, there’s a direct causal relationship between inputs and outputs. I can imagine an audio engineer learning what different filters and effects sound like. Almost all of them are linear systems, so composing effects is easy to understand.

None of this is true of an LLM. I believe there’s a little skill involved, but it’s nothing like tuning the pass band of a filter. LLMs are chaotic systems (they kinda have to be to mimic humans); that’s one of their benefits, but it’s also one of their curses.

Now, what a human can definitely do is convince themselves that they can somewhat control the outputs of a chaotic system. Rain prognostication is perhaps a better model of the prompt engineer than the audio mixer.