Comment by fenomas

6 months ago

When I see these debates it's always the other way around - one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks.." or "the model knows.." or whatever.

To be honest the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI, and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.

As I write this, Claude Code is currently opening and closing various media files on my computer. Sometimes it plays the file for a few seconds before closing it, sometimes it starts playback and then seeks to a different position, sometimes it fast forwards or rewinds, etc.

I asked Claude to write an E-AC3 audio component so I can play videos with E-AC3 audio in the old version of QuickTime I really like using. Claude's decoder includes the ability to write debug output to a log file, so Claude is studying how QuickTime and the component interact, and it's controlling QuickTime via AppleScript.

Sometimes QuickTime crashes, because this ancient API has its roots in the classic Mac OS days and is not exactly good. Claude reads the crash logs on its own—it knows where they are—and continues on its way. I'm just sitting back and trying to do other things while Claude works, although it's a little distracting that something else is using my computer at the same time.
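
For the curious, here is a rough sketch (in Python, and emphatically not Claude's actual tooling) of the kind of loop it was running: poke QuickTime Player through osascript, then check whether a fresh crash report showed up. The AppleScript verbs come from the modern QuickTime Player scripting dictionary (the old player's dictionary differs in details), and the file names, log glob, and test movie are placeholders.

    # Sketch only: drive QuickTime via AppleScript from Python, then look for
    # new crash reports. Paths and file names below are illustrative.
    import glob
    import os
    import subprocess
    import time

    def osascript(script: str) -> str:
        """Run an AppleScript snippet and return its stdout."""
        result = subprocess.run(["osascript", "-e", script],
                                capture_output=True, text=True)
        return result.stdout.strip()

    def play_and_seek(path: str, seek_to: float, hold: float = 3.0) -> None:
        """Open a movie, play briefly, seek, then close -- enough to make the
        audio component decode packets at two different positions."""
        osascript(f'tell application "QuickTime Player" to open POSIX file "{path}"')
        osascript('tell application "QuickTime Player" to play document 1')
        time.sleep(hold)
        osascript('tell application "QuickTime Player" '
                  f'to set current time of document 1 to {seek_to}')
        time.sleep(hold)
        osascript('tell application "QuickTime Player" to close document 1 saving no')

    def new_crash_reports(since: float) -> list[str]:
        """Crash logs land in ~/Library/Logs/DiagnosticReports on modern macOS."""
        pattern = os.path.expanduser("~/Library/Logs/DiagnosticReports/QuickTime*")
        return [p for p in glob.glob(pattern) if os.path.getmtime(p) > since]

    started = time.time()
    play_and_seek("/tmp/test_eac3_5.1.mp4", seek_to=42.0)   # placeholder test file
    for report in new_crash_reports(started):
        print("QuickTime crashed; see", report)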

I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...

  • Would it help you to know that trial and error is a common tactic for machines? Yes, humans do it too, but that doesn't mean the process isn't mechanical. In fact, in computing we might call this a "brute force" approach. You don't have to cover the entire search space to brute force something, and brute forcing doesn't mean you can't have smarter optimization strategies than a plain grid search (e.g. Bayesian methods, multi-armed bandit approaches, or a whole world of other things; there's a toy sketch at the end of this comment).

    I would call "fuck around and find out" a rather simple approach. It is why we use it! It is why lots of animals use it. Even very dumb animals use it. Though, we do notice more intelligent animals use more efficient optimization methods. All of this is technically hypothesis testing. Even a naive grid search. But that is still in the class of "fuck around and find out" or "brute force", right?

    I should also mention two important things.

    1) As humans we are biased to anthropomorphize. We see faces in clouds. We tell stories of mighty beings controlling the world in an effort to explain why things happen. This is anthropomorphization of the universe itself!

    2) We design LLMs (and many other large ML systems) to optimize towards human preference. This reinforces an anthropomorphized interpretation.

    The rationale for doing (2) is a naive assumption[0]: If it looks like a duck, swims like a duck, and quacks like a duck, then it *probably* is a duck. But the duck test doesn't rule out a highly sophisticated animatronic. It's a good rule of thumb, but wouldn't it also be incredibly naive to assume that it *is* a duck? Isn't the duck test itself entirely dependent on our own personal familiarity with ducks? I think this is important to remember and can help combat our own propensity for creating biases.

    [0] It is not a bad strategy to build in that direction. When faced with many possible ways to go, this is a very reasonable approach. The naive part is if you assume that it will take you all the way to making a duck. It is also a perilous approach because you are explicitly making it harder for you to evaluate. It is, in the fullest sense of the phrase, "metric hacking."
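
    To make the grid-search vs. smarter-sampling point concrete, here is a toy sketch (purely illustrative, not tied to any particular system): both strategies below are mechanical trial and error over the same candidates; the bandit just spends its trials where results have looked best so far.

        # Toy comparison: exhaustive grid search vs. an epsilon-greedy bandit.
        import random

        CANDIDATES = [0.1 * i for i in range(11)]   # toy search space: 0.0 .. 1.0

        def reward(x):
            # noisy toy objective whose best value is near x = 0.7
            return -(x - 0.7) ** 2 + random.gauss(0, 0.05)

        def grid_search(trials_per_point=5):
            # brute force: sample every candidate the same number of times
            scores = {x: sum(reward(x) for _ in range(trials_per_point)) / trials_per_point
                      for x in CANDIDATES}
            return max(scores, key=scores.get)

        def epsilon_greedy(total_trials=55, eps=0.1):
            # still trial and error, but trials are steered toward the best guess so far
            counts = {x: 0 for x in CANDIDATES}
            means = {x: 0.0 for x in CANDIDATES}
            for _ in range(total_trials):
                if random.random() < eps:
                    x = random.choice(CANDIDATES)        # explore
                else:
                    x = max(means, key=means.get)        # exploit
                counts[x] += 1
                means[x] += (reward(x) - means[x]) / counts[x]   # running average
            return max(means, key=means.get)

        print("grid search picks  ", grid_search())
        print("epsilon-greedy picks", epsilon_greedy())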

    • It wasn't a simple brute force. When Claude was working this morning, it was pretty clearly only playing a file when it actually needed to see packets get decoded, otherwise it would simply open and close the document. Similarly, it would only seek or fast forward when it was debugging specific issues related to those actions. And it even "knew" which test files to open for specific channel layouts.

      Yes this is still mechanical in a sense, but then I'm not sure what behavior you wouldn't classify as mechanical. It's "responding" to stimuli in logical ways.

      But I also don't quite know where I'm going with this. I don't think LLMs are sentient or something, I know they're just math. But it's spooky.

Respectfully, that is a reflection of the places you hang out in (like HN) and not the reality of the population.

Outside the technical world it gets much worse. There are people who killed themselves because of LLMs, people who are in love with them, people who genuinely believe they have “awakened” their own private ChatGPT instance into AGI and are eschewing the real humans in their lives.

  • Naturally I'm aware of those things, but I don't think TFA or GGP were commenting on them so I wasn't either.

  • The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.

    It's going to take a lot to get him out of that mindset and frankly I'm dreading trying to compare and contrast imperfect human behaviour and friendships with a sycophantic AI.

    • > The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.

      The therapist thing might be correct, though. You can send a well-adjusted person to three renowned therapists and get three different reasons for why they need to continue sessions.

      No therapist ever says "Congratulations, you're perfectly normal. Now go away and come back when you have a real problem." Statistically it is vanishingly unlikely that every person who ever visited a therapist is in need of a second (or more) visit.

      The main problem with therapy is a lack of objectivity[1]. When people talk about what their sessions resulted in, it's always "My problem is that I'm too perfect". I've known actual bullies whose therapist apparently told them that they are too submissive and need to be more assertive.

      The secondary problem is that all diagnosis is based on self-reported metrics of the subject. All improvement is equally based on self-reported metrics. This is no different from prayer.

      You don't have a medical practice there; you've got an Imam and a sophisticated but still medically-insured way to plead with thunderstorms[2]. I fail to see how an LLM (or even the Rogerian M-x doctor in Emacs) will do worse on average.

      After all, if you're at a therapist and you're doing most of the talking, how would an LLM perform worse than the therapist?

      ----------------

      [1] If I'm at a therapist, and they're asking me to do most of the talking, I would damn well feel that I am not getting my money's worth. I'd be there primarily to learn (and practice a little) whatever tools they can teach me to handle my $PROBLEM. I don't want someone to vent at, I want to learn coping mechanisms and mitigation strategies.

      [2] This is not an obscure reference.

    • It's surprisingly common on Reddit that people talk about "my chatgpt", and they don't always seem like the type who are "in a relationship" with the bot or unlocking the secrets of the cosmos with it, but still they write "my chatgpt" and "your chatgpt". I guess the custom prompt and the available context do customize the model for them in some sense, but I suspect they have the wrong mental model of how this customization works: they seem to imagine their own little model stored on file at OpenAI, shaped by their interactions, and retrieved from cloud storage each time they connect, rather than something like the sketch below.
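
      For what it's worth, the more likely picture is mundane: the personalization most likely boils down to text (custom instructions, saved "memory" notes) included with each request to the same shared model everyone else uses. A minimal sketch of that mental model (the names and structure are illustrative, not OpenAI's actual internals):

          # Illustrative only: what "my chatgpt" roughly amounts to under the hood.
          # There is no per-user model; the personalization is text in the prompt.
          SHARED_MODEL = "the same base model every user talks to"

          user_profile = {
              "custom_instructions": "Be concise. Explain medical terms simply.",
              "memory": ["User's dog is named Pixel", "Prefers metric units"],
          }

          def build_request(conversation):
              """Assemble what actually gets sent for one turn."""
              system_text = (user_profile["custom_instructions"] + "\n"
                             + "\n".join(user_profile["memory"]))
              messages = [{"role": "system", "content": system_text}] + conversation
              return {"model": SHARED_MODEL, "messages": messages}

          print(build_request([{"role": "user", "content": "Hi again!"}]))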

Most certainly the conversation is extremely political. There are not simply different points of view. There are competitive, gladiatorial opinions ready to ambush anyone not wearing the right colors. It's a situation where the technical conversation is being drowned out.

I suppose this war will be fought until people are out of energy, and if reason has no place, it is reasonable to let others tire themselves out reiterating statements that are not designed to bring anyone closer to the truth.

  • If this tech is going to be half as impactful as its proponents predict, then I'd say it's still under-politicized. Of course the politics around it doesn't have to be knee-jerk mudslinging, but it's no surprise that politics enters the picture when the tech can significantly transform society.

    • Go politicize it on Reddit, preferably on a political sub and not a tech sub. On this forum, I would like to expect a lot more intelligent conversation.

Wait until a conversation about “serverless” comes up and someone says there is no such thing because there are servers somewhere, as if everyone - especially on HN - doesn’t already know that.

  • Why would everyone know that? Not everyone has experience in sysops, especially not beginners.

    E.g. when I first started learning webdev, I didn’t think about ‘servers’. I just knew that if I uploaded my HTML/PHP files to my shared web host, then they appeared online.

    It was only much later that I realized that shared webhosting is ‘just’ an abstraction over Linux/Apache (after all, I first had to learn about those topics).

    • I am saying that most people who come on HN and say “there is no such thing as serverless and there are servers somewhere” think they are sounding smart when they are adding nothing to the conversation.

      I’m sure you knew that your code was running on computers somewhere even when you first started and wasn’t running in a literal “cloud”.

      It’s about as tiring as people on HN who know just a little about LLMs thinking they sound smart when they say LLMs are just advanced autocomplete. Both responses are equally unproductive.

    • I think they fumbled the wording, but I interpreted them as meaning "the audience of HN", and it seems they confirmed that.

      We are always speaking to our audience, right? This is also what makes more general/open discussions difficult (e.g. talking on Twitter/Facebook/etc.): there are many ways to interpret anything, depending on prior knowledge, cultural biases, and so on. But I think it is fair that on HN we can assume that people here are tech savvy and knowledgeable. We'll definitely overstep and understep at times, but shouldn't we also cultivate a culture where it is okay to ask and okay to apologize for assuming too much?

      I mean, at the end of the day we have to make some assumptions, right? If we assume zero operating knowledge, then comments are going to get pretty massive and, frankly, worse at communicating with a niche audience even if better at communicating with a general one. But should HN be a place for the general public? I think not. I think it should be a place for people interested in computers and programming.
