Comment by whacked_new

4 hours ago

Circa GPT-3.5 to GPT-4o I was involved in some research on figuring out how to make LLMs funny. We tried a bunch of different things, from giving it rules for homonym jokes [1] and double-entendre jokes, to fine-tuning on comedian transcripts and on publicly rated joke boards.

We could not make it funny. Also interesting: when chain-of-thought (CoT) research was getting a lot of attention, we tried a joke version of CoT, asking GPT-4 to explain why a joke was funny in order to produce training-set data. Most of the explanations were completely off base.
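For the curious, here's a minimal sketch of what that joke-CoT data-generation step might look like: ask a model for an explanation of a joke, then package the pair as a fine-tuning record. The function names, prompt wording, and record format are my guesses for illustration, not the original setup.

```python
# Hypothetical sketch of the joke-CoT pipeline: elicit an explanation
# of why a joke is funny, then store (joke, explanation) as one
# fine-tuning example. All names here are illustrative.

def explain_prompt(joke: str) -> str:
    """Build the prompt asking the model to explain the joke."""
    return (
        "Explain, step by step, why the following joke is funny:\n\n"
        f"{joke}\n\nExplanation:"
    )

def make_training_example(joke: str, explanation: str) -> dict:
    """Package a joke and its model-written explanation as a single
    prompt/completion fine-tuning record."""
    return {
        "prompt": explain_prompt(joke),
        "completion": explanation.strip(),
    }

example = make_training_example(
    "I told my wife she should embrace her mistakes. She hugged me.",
    "The joke turns on the double meaning of 'embrace'.",
)
```

The failure mode we hit wasn't in the plumbing, of course; it was that the explanations themselves were wrong, so the training data was poisoned at the source.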

After this work, I became a lot less worried about the AGI-takeover narrative.

Funny is very, very hard.

[1] without a dictionary, which at first seems inefficient, but this work demonstrated that GPT could perfectly reconstruct the dictionary anyway

The GPT-3 base model was pretty funny if you like nonsense. Instruction tuning and RLHF seem to destroy that when they recalibrate everything.