Comment by magicalist
15 hours ago
"Harness" by itself was a widely used term at least as far back as the "[LLM] plays pokemon" trend, which was a year ago[1]. It was basically the term of art when arguing about just how much special treatment LLMs should get.
"Harness engineering" is the term that article claims originated in February. It does seem obvious in retrospect, and I don't remember an origination point, but there's at least one HN comment from December[2] that predates that and doesn't treat the term as novel.
I will admit that my bias is against any self-congratulatory buzzword fads (I'm still not over "MCP is the USB of LLMs" or whatever, and that's been a year now too). "Who coined the term harness engineering?" -> who cares? It was already being widely done.
I read your comment. I think we may be talking about slightly different contexts.
The Pokémon article you linked is basically about benchmarking. In that context, the harness functions as part of the benchmark setup: the controlled environment around the model, the available inputs, tools, and assistance.
The current usage of “harness,” at least in the agent engineering discussion, seems closer to a lower-level runtime layer, almost like an OS around the agent.
So I see this as a transition: from “harness” as a narrower benchmark/control-variable layer to “harness” as the broader operating environment of the agent.
That does not mean I think your point is wrong. With topics like this, the interpretation depends on which part of the lineage one emphasizes. The first appearance of the idea may go back to 2022 or earlier, while the usage that looks closer to the current meaning may have emerged at a different point.
I am probably giving more weight to the SIG article, while you are giving more weight to a different point in the lineage. Both seem reasonable to me.