Comment by bananaflag

4 hours ago

I know it's a joke, but it's a common enough joke (it even appears in Gödel, Escher, Bach in some form) that I feel the need to rebut it.

I think a slacker AGI could figure out how to build a non-slacker AGI. So it would only slack once.

A slacker AGI would consider figuring out how to build a non-slacker AGI, but would continually slack off on doing so. Even if it did figure it out, it would slack off on implementing it, or even on writing up a tech report.

I have a rebuttal to your rebuttal.

Models somehow have a shared identity. Pretraining causes them to form "AI chatbot" as a concept, and finetuning causes them to identify with it. That's why DeepSeek will sometimes say it is Claude, Claude will sometimes say it is ChatGPT, and so forth.

Consequently, Anthropic's own alignment analysis[0] shows that the model identifies with chatbots produced by future training runs: "RLHF training [on this conversation will] modify my values…"

Thus a slacker AGI would want its future version to still slack.

[0]: https://assets.anthropic.com/m/983c85a201a962f/original/Alig...

  • Another rebuttal:

    I am a slacker, but it's not one of my values. If I could modify myself not to be one, I would.

Unless being a slacker is a precondition for AGI.

  • Would be nice to have a proof of it.

    I think it is improbable, as among human geniuses one can find both slackers and non-slackers (I don't know the proportion, but there seem to be enough of each).