Comment by mahogany

10 hours ago

Every time people point out a limitation or constraint of LLMs, I see a comment that is to the effect of “but humans…”. I don’t understand why this comparison is relevant to this particular thread. Is it just an amusing similarity?

I think it is often useful to push the conversation toward "we built a system for humans that dealt with this; what from that is or is not applicable for agents in the same context?" Humans randomizing resume review for screening is pretty well known; I've seen companies try to fight it with things like hiding information, panel reviews, etc. It's unclear to me how effective those would be for agents (honestly, it was unclear how effective those were for humans). I was depressed about the hiring process before we had AI screening and I remain depressed about it.

It may seem trite, but the point is that if separate humans were assigned the same task the LLM was here, the results would be similarly non-deterministic.

  • Indeed: LLMs do tasks that would otherwise be assigned to humans. So when pointing out deficiencies in LLM performance they should be compared to the alternative, which also isn't perfect.