Comment by jaennaet

15 hours ago

What would you call this behaviour, then?

Marketing. “Oh look how powerful our model is, we can barely contain its power.”

  • This has been a thing since GPT-2; why do people still parrot it?

    • I don’t know what your comment is referring to. Are you criticizing the people parroting “this tech is too dangerous to leave to our competitors”, or the people parroting “the only people who believe in the danger are in on the marketing scheme”?

      fwiw I think people can perpetuate the marketing scheme while being genuinely concerned about misaligned superintelligence

  • Even hackernews readers are eating it right up.

    • Hilarious for this to be downvoted.

      "LLMs are deceiving their creators!!!"

      Lol, you all just want it to be true so badly. Wake the fuck up, it's a language model!

      A very complicated pattern matching engine providing an answer based on its inputs, heuristics, and previous training.

  • Great. So if that pattern matching engine matches the pattern of "oh, I really want A, but saying so will elicit a negative reaction, so I emit B instead because that will help make A come about" what should we call that?

    We can handwave a definition of "deception" as "something done intentionally" and carefully carve around it so that LLMs cannot possibly do what we've defined "deception" to be, but then we need a word for what LLMs do do when they pattern match as above.

    • The pattern matching engine does not want anything.

      If training gives the engine an incentive to generate outputs that reduce negative reactions (as measured by sentiment analysis), those outputs may contradict tokens already in its context.

      "Want" requires intention and desire. Pattern matching engines have none.
