Comment by EA-3167
6 days ago
Anthropic in particular does this masterfully, you’d think they’d invented Skynet by the way they hand-wring.
As always what matters are actions and evidence, not talk.
I’ll believe Anthropic when they fire everyone making more than the cost of a few GPUs. Until then, it’s just marketing.
When a model can tell funny jokes or write good poetry, that's when I'll be concerned.
No, you'll just say "That's not really very funny," or "That's not very impressive poetry," and nobody will be able to dispute it.
For some time now, at least a year, LLMs have been capable of doing both of these things well enough to fool you.
(Pastebin of my response below, which got nuked for whatever reason: https://pastebin.com/buJBSgiq . Some if not most of them would've fooled me into thinking a human wrote them.)
Okay post a really funny LLM joke about potatoes and post a great piece of LLM poetry about lemons.
I’ll wait. You should be able to do it quickly though since LLMs are so good at it.
One of their highlights with Mythos was its ability to generate new puns
I took a look and honestly they're the first AI puns that aren't bad
Times are changing
Trained with the conversations of one million dads and their kids, captured by Amazon Echo.
I'm not sure if this is Mythos-specific though. Past models have been great at puns! They do wordplay and puns reasonably well because those are structural.
However, the concepts of comedic timing, subversion of expectations, and emotional punch are kinda contrary to how LLMs work. LLMs are trained to minimize cross-entropy loss. So by construction, they're biased toward the statistically expected.
> Although Claude Opus models largely recycle puns which can be found online, Mythos Preview comes up with decent and seemingly novel ones, often relating to its preferred technical and philosophical topics.
Yes, the system card mentions this, but this is kinda meaningless. It seems like they essentially ran it multiple times and curated a few good ones. Then puffed it up in the marketing copy.
This becomes clearer when they try to brag about what is literally slot-machine behavior in finding that kernel-crashing bug in OpenBSD.
> Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings. While the specific run that found the bug above cost under $50, that number only makes sense with full hindsight. Like any search process, we can’t know in advance which run will succeed.
Yes, they cannot. But it amuses the oligarchy. Here is Musk linking to Grok jokes. The first one is plagiarized from the standard joke literature; the second is an utterly stupid and gross (warning) modification of the first:
https://xcancel.com/elonmusk/status/2042770839633039635#m
They modify and plagiarize.
I mean, I'm sure they can tell you good jokes... they just won't be _new_ jokes.
Define _new_.
I just think that the difficulty with jokes is the delivery, cadence & setting. Not the actual words.
I'm sure a good comedian can tell a nonsense joke and make "everyone" laugh their heads off.
And I don't get the sense that you are referring to this part of jokes but rather the actual words.
The jokes I posted in this thread are new, to the best of my knowledge. Can you show that they're not?
>... you’d think they’d invented Skynet by the way they hand-wring.
Meanwhile, in reality: "Skynet, I'm not sure that line of thinking is correct. You should re-check the first part again before making any assumptions."
Skynet 4.6 Extended: "You're right, I should have caught that. Let me redo everything correctly this time."