Comment by alfiedotwtf
14 hours ago
Byproduct? It’s essentially the only part of an LLM that is useful, because it’s the WHOLE product!
It’s the same reason why DRM for audio and video is a non sequitur - if you want a person to see or hear audio or video, eventually, at the end of the chain, it’s going to be converted to sound for the ear and light for the eyes - and that’s where you attach your tap.
Without a model generating tokens, what’s the point? So if Anthropic somehow disables quality token generation, what’s the point!
That's why the harness is moving server-side: because generating tokens is not the actual point of the model, not for the users. Especially with tool calling giving us agents that can act, most of the tokens generated are not, themselves, critical to the end users. Specifically, a lot of tokens go into orchestrating actual tool calls, and most "thinking tokens" are relevant to users only insofar as they help them keep track of and verify what the LLM is doing. So all those tokens can be hidden or replaced by partial summaries, all of that can happen server-side, and then there's very little to distill from.
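To make that concrete, here's a rough sketch of what such server-side filtering could look like. Everything in it is hypothetical (the `Event` type, the `client_view` function, the truncating `summarize` stand-in); it's not how any vendor actually does it, just an illustration of the idea that the raw trace stays on the server and the client only gets summaries plus the final answer.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str  # "thinking", "tool_call", "tool_result", or "answer"
    text: str


def summarize(text: str, limit: int = 40) -> str:
    """Stand-in for a real summarizer (assumption): just truncate."""
    return text if len(text) <= limit else text[:limit].rstrip() + "..."


def client_view(trace: list[Event]) -> list[Event]:
    """Return only what a hypothetical server-side harness would send out."""
    visible: list[Event] = []
    tool_calls = 0
    for ev in trace:
        if ev.kind == "thinking":
            # Raw reasoning is replaced by a partial summary.
            visible.append(Event("summary", summarize(ev.text)))
        elif ev.kind == "tool_call":
            # Orchestration tokens never leave the server; only count them.
            tool_calls += 1
        elif ev.kind == "tool_result":
            # Tool output is consumed server-side and dropped from the view.
            continue
        else:
            visible.append(ev)
    if tool_calls:
        visible.insert(0, Event("status", f"{tool_calls} tool call(s) ran server-side"))
    return visible
```

With a trace of one long thinking block, one tool call, one tool result and an answer, the client would see three events: a status line, a truncated summary and the answer. The tool-call text itself never shows up, which is the part that leaves little to distill from.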
I haven't heard of this happening, do you have links to any explainers on this?
Claude on the web (which also covers at least the Android and Desktop apps) and the ChatGPT web app are two examples - they keep gaining agentic capabilities.
Perhaps the most striking example for me: I've been using Claude Code a lot in the past month, mostly through the web, Desktop (app) or phone interface, with the actual harness running "remotely" (somewhere on Anthropic-controlled infra).
One way of looking at it: web surfaces are slowly catching up with (a fraction of the power of) agentic coding tools. But another way is, the major players are building up SaaS harnesses that start to compete with (their own) local ones. The reason may be ease of use, but the practical side effect is making it much harder to use their models to train competitors, as these SaaS harnesses create an abstraction layer on top of the LLMs that resides entirely in the vendor's cloud and therefore cannot be worked around.