Comment by urutom

6 hours ago

What I find fascinating about the shared prompt isn’t just the result, but the visible thinking process. Math papers usually skip all the messy parts and just present the polished proof, but here you get something closer to the author’s notepad. I also find it oddly endearing when the AI says things like “Interesting!” It almost feels like a researcher encouraging themselves after making a bit of progress. It gives me the rare feeling of watching the search itself, not just the final result.

> the AI says things like “Interesting!”

My experience of those utterances is that they’re purely phatic mimicry: they lack genuine intuitive surprise and merely mark a very odd shift in direction. The problem isn’t the lack of a path; it’s that the rhetorical follow-ups to those leaps are usually relevant results, so the stream of tokens ends up rapidly over-playing its own conviction. That’s why it’s necessary (and often ineffective) to tell them to validate their findings thoroughly: too much of their training is “That’s odd” followed by “Eureka!” and not “Nevermind…”

  • And what I find fascinating is that I see similar mimicking from my 5-year-old. Perhaps we shouldn’t be so quick to call this a lack of genuineness. Sometimes emotions are learned in humans, but we wouldn’t call them fake.

    I don’t want to declare outright that machines have emotions, but to call mimicry evidence of falsehood is itself false.

    • Mimicry is how kids learn the expected reactions to particular emotions. A kid mimicking your surprise doesn’t mean they are surprised (since surprise requires a prior expectation about the outcome, which they may not yet have the experience to form), but when they do feel genuine surprise, they’ll know how to express it.

  • It’s funny that this is probably due to bias in the training texts, right? Humans are way more likely to publish their “Eureka!” moments than their screwups… if they did, maybe models would have exhibited this behavior.

    Now that AI labs have all these “Nevermind” texts to train on, maybe it’s getting easier to correct? (Would require some postprocessing to classify the AI outputs as successful or not before training)
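
    A minimal sketch of what that postprocessing might look like (every name here is hypothetical, not any lab's actual pipeline):

    ```python
    # Hypothetical postprocessing pass: label each generated trace as a
    # success or a failure before reusing it as training data.
    # `traces.jsonl` and its fields are made-up names for illustration.
    import json

    def is_success(trace: dict) -> bool:
        # Assumed verifier: compare the model's final answer to ground truth.
        return trace["final_answer"] == trace["reference_answer"]

    with open("traces.jsonl") as f:
        traces = [json.loads(line) for line in f]

    for trace in traces:
        # Keep the failures too, labeled, so "That's odd... Nevermind"
        # examples survive in the mix instead of being filtered out.
        trace["label"] = "eureka" if is_success(trace) else "nevermind"

    with open("labeled_traces.jsonl", "w") as f:
        for trace in traces:
            f.write(json.dumps(trace) + "\n")
    ```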

    • I think it's more explicit than that: part of post-training enforces this kind of behavior. I don't think it's emergent; rather, researchers steer the model to do it because they saw the CoT gets slightly better if the model doubts itself or cheers itself on. I don't recall if there was a paper outlining this; I tried to find where I got it from, but searches/LLMing have turned up nothing so far.

    • My understanding is that it’s more the result of these companies making sure to keep you engaged/happy than of the data they train on.

      I don’t know if it’s true or not, but it certainly tracks, given that LLMs are way more polite than the average post on the internet lol

  • I think that a lot of models have to sprinkle a lot of "fluff" into their thinking to stay within the right distribution. Language is their only medium; the way we annotate context is with brackets, then training them to hopefully respect the brackets (see the sketch at the end of this comment). I'd imagine that top labs either explicitly train the models, or through the RL process the models implicitly learn, to spam tokens that keep them 'within distribution', since everything goes through the same channel and there's no fine-grained separation between things.

    Philosophically, it's not like you're a detached observer who simply reasons over all possible hypotheses. Ever get stuck in a dead end and find it hard to dig yourself out? If you were a detached observer, it'd be pretty easy to just switch gears. But it isn't easy (for humans).
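
    To make the "same channel" point concrete, here's a toy illustration (the marker tokens are made up; real chat templates differ per lab):

    ```python
    # Toy illustration: instructions, user text, and "thinking" all share
    # one flat token stream, separated only by bracket-like marker tokens
    # the model is trained to (hopefully) respect. Markers are invented.
    prompt = (
        "<|system|>You are a careful assistant.<|end|>"
        "<|user|>Prove the lemma.<|end|>"
        "<|thinking|>Interesting! The bound tightens if we...<|end|>"
    )
    # The brackets are ordinary tokens like any others: nothing structural
    # separates the segments, so staying "in distribution" is all that
    # keeps them from bleeding into each other.
    print(prompt)
    ```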

    • Language really only exists at the input and output surfaces of the models. In the middle it's all numerical values. You might be quick to relate that to being just a numeric cipher of the words, which, while not totally false, misses that it is equally a numeric cipher of anything. You can train a transformer on anything you can assign tokens to.
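
      A minimal sketch of that point, assuming PyTorch (the "alphabet" here is amino acids, but any tokenizable symbols would do):

      ```python
      # Minimal sketch: a transformer consumes integer ids, not words, so
      # any symbol stream works once you define a vocabulary. Here the
      # "text" is a protein sequence; the model never knows the difference.
      import torch
      import torch.nn as nn

      vocab = {aa: i for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}

      def encode(seq: str) -> torch.Tensor:
          # Map arbitrary symbols to token ids, shape (batch=1, seq_len).
          return torch.tensor([[vocab[s] for s in seq]])

      embed = nn.Embedding(len(vocab), 32)
      layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
      encoder = nn.TransformerEncoder(layer, num_layers=2)

      hidden = encoder(embed(encode("MKTAYIAKQR")))  # all numeric inside
      print(hidden.shape)  # torch.Size([1, 10, 32])
      ```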

  • Interestingly, this is strikingly similar to how my mind would process something I find genuinely interesting.

  • I've somehow managed to train mine out of trying to fluff me up the whole time; it's become very factual.

    Overall, it saves me a lot of reading time when it just focuses on the details.

This is another underrated benefit of working with LLMs. When I work I don't take detailed notes about my thinking, decisions, context, etc. I just focus on code. If I get interrupted it takes me a while to get back into the flow.

With LLMs I just read back a few turns and I'm back in the loop.

I'd probably find the actual iteration through various learned approaches to problems fascinating if I understood the maths! Especially if I knew it well enough to tell which approaches were conventional and which weren't.

I find the AI pronouncing things "interesting!" less interesting, on the basis that even though in this case it crops up in the thinking rather than as flattery of the user in the chat, it's almost as much of an AI affectation as the em dash.

  • I always assumed the "interesting!" markers were actual markers: a kind of tag for the system to annotate its context.

    • It probably does function like that in terms of highlighting context, in this case to the system's benefit.

      But in general, exclamations of "interesting!" seem like the stereotypical AI default toward being effusive, and we've all seen the chat logs where an AI trained to write that way responds with "interesting" and "great insight!" to a user's increasingly dubious inputs. It's an antipattern...

The simulacrum of a thing is not the thing! Not only is the "interesting!" unrelated to any "thought process", the whole """thinking""" output is not a representation of a thought process but merely an after-the-fact confabulation that sounds appropriately human-like.

  • Can't help but think of this passage from Nietzsche, which I re-read recently:

    > When I analyze the process that is expressed in the sentence, "I think," I find a whole series of daring assertions that would be difficult, perhaps impossible, to prove; for example, that it is I who think, that there must necessarily be something that thinks, that thinking is an activity and operation on the part of a being who is thought of as a cause, that there is an "ego," and, finally, that it is already determined what is to be designated by thinking—that I know what thinking is.

  • Yes, I recently got access to an annotation platform for LLMs, and I've found many projects associated with generating chain-of-thought outputs.

    These CoT outputs are the same sort of illusion as the general output: someone is feeding the models scripts of what it looks like to solve problems, so they generate outputs that look like problem solving.

    I can't remember if I mentioned it previously on here, but an LLM seems to be an extremely powerful synthesis machine. If you give it all of the individual components of a complex problem that humans might find intractable due to scope or bias, it may be able to crack the problem.