Comment by rohtashotas

1 year ago

It's not a silly mistake. It was rlhf'd to do this intentionally.

When the results are more extremist than the unfiltered model, it's no longer a 'small mistake'.

rlhf: Reinforcement learning from human feedback

  • How is this pronounced out loud?

    • I was just saving folks a google, as I had no idea what the acronym was.

      I propose rill-hiff until someone who actually knows what they’re doing shows up!

Realistically, it was probably just how Gemini was prompted to use the image generator tool.