
Comment by carterschonwald

10 days ago

i actually think this is too tame. it really has to be stuff you'd never say to a real person.

Does it really? I'd be surprised if abuse actually worked better than sternly worded warnings/instructions, and even if it did, it doesn't seem healthy to get used to that type of prompting.

  • it's part of making sure the model actually engages in emotive communication. if i'm inventing insults i've never even thought about, i'm furious :)

    saying i'm "furious" has lower entropy than incredibly implausible abuse. In some first-party harnesses it just results in doom loops, but that's usually because the CoT is hidden after the immediate turn in those setups. CoT persistence helps with a lotta stuff