Comment by wadadadad

3 months ago

This is an interesting idea, but as you stated, it's all logic; it's hard to come up with a problem that doesn't require explaining concepts yet is still dissimilar enough from anything in the training data.

In your second example with the wizards, did you notice that it failed to follow the rules? In step 3, the witch was summoned by the wizard. I'm curious as to why you didn't comment either way on this.

On a related note, instead of puzzles, what about presenting riddles? I would argue that riddles are creative, pulling bits and pieces of meaning from words to create an answer. If AI can solve riddles not seen before, would that count as creative and not solving problems in their dataset?

Here's one I created and presented (the first incorrect answer I got was Escape Room; I gave it 10 attempts and it didn't get the answer I was thinking of):

---

Solve the riddle:

Chaos erupts around

The shape moot

The goal is key

6 comments

shagie 3 months ago

The challenge is: for someone who is convinced that an LLM only presents material it has seen before, material that was created by some human, how do you show them something that hasn't been seen before?

(Digging through old chats, this one from 2024 is amusing: https://chatgpt.com/share/af1c12d5-dfeb-4c76-a74f-f03f48ce3b... an epic rap battle between Paul Graham and Commander Taco.)

Many people seem to believe that an LLM is not much more than a collage of words that it stole from other places, and likewise that generated images are a collage of images stolen from other people's pictures. (I've had people on Reddit, which tends to be rather AI-hostile outside of specific AI subs, downvote me for explaining how to use an LLM as an editor for your own writing, or for pointing out that some generative image systems are built on top of libraries where the company had rights to all the images, e.g. stock photography.)

With the wizards, I'm not interested necessarily in the correct solution, but rather how it did it and what the representation of the response was. I selected everything with 'W' to see how it handled identifying the different things.

As to riddles... that's really a question of mind reading. Your riddle isn't one that I can solve. Maybe if you told me the answer I'd understand how you got from the answer to the question, but I've got no idea how to go from the hint to a possible answer (does that make me an LLM?)

I feel it's a question much more along the lines of some other classic riddles...

    "What have I got in my pocket?" he said aloud. He was talking to himself, but Gollum thought it was a riddle, and he was frightfully upset. "Not fair! not fair!" he hissed. "It isn't fair, my precious, is it, to ask us what it's got in its nassty little pocketses?"

What do I have in my pocket? (and then a bit of "what would it do with that prompt?") https://chatgpt.com/s/t_691fa7e9b49081918a4ef8bdc6accb97

At this point, I'm much more of the opinion that some people are on "team anti-ai" and that it has become part of their identity to be against anything that makes use of AI to augment what a human can do unaided. Attempting to show that it's not a stochastic parrot or a next-token predictor (any more than humans are), or that it can do things that help people (when used responsibly by the human), gets met with hostility.

I believe that this comes from group identity and from group dynamics. https://gwern.net/doc/technology/2005-shirky-agroupisitsownw...

> The second basic pattern that Bion detailed is the identification and vilification of external enemies. This is a very common pattern. Anyone who was around the open source movement in the mid-1990s could see this all the time. If you cared about Linux on the desktop, there was a big list of jobs to do. But you could always instead get a conversation going about Microsoft and Bill Gates. And people would start bleeding from their ears, they would get so mad.

> ...

> Nothing causes a group to galvanize like an external enemy. So even if someone isn’t really your enemy, identifying them as an enemy can cause a pleasant sense of group cohesion. And groups often gravitate toward members who are the most paranoid and make them leaders, because those are the people who are best at identifying external enemies.

wadadadad 3 months ago
I don't think riddles are necessarily 'solvable' in the sense that there's only one right answer. They're open to interpretation, but when you get the 'right' answer it (hopefully) makes sense. So if an AI/LLM can answer such a nebulous thing correctly, that's more of the angle I was going for.
Regarding the wizards example, I'm a bit confused; I was thinking that the best way to judge answers for problem solving/creativity was by correctness. I'll think more on whether the 'thought process' counts in and of itself.
The answer to my riddle is 'ball'.
- shagie 3 months ago
  
  Perfect correctness is what you'd expect from a computer. I could write a program that solved it - and that would be an indication of my creativity as a human solving something that I haven't encountered before. Incidentally, that's also how it approached solving the block problem (by writing a program).
  If you asked me the goat, wolf, cabbage problem, I'd be able to recite the solution (as an xkcd fan: https://xkcd.com/1134/ and https://xkcd.com/2348/, and the exploration of what else it could do). However, if someone hasn't seen the problem before, it could be a useful tool for seeing how they approach solving it.
  The question of how it tackles a new problem is one of creativity and exploration of thought in a new (untrained) domain.
  A possible claim of "well, it's been trained on the meta-problem of how to solve problems that weren't in its training set" would get a side eye.
  For the "ball" being the answer... consider the second response to https://chatgpt.com/share/6920b9e2-764c-8011-a14a-012e97573f... (make sure you click on the "Thought for 1m 5s" to get the internal process)
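
  To make the "write a program that solves it" approach concrete, here is a minimal sketch of a breadth-first search over the goat, wolf, cabbage puzzle mentioned above. The state encoding and the names `ITEMS`, `UNSAFE`, `is_safe`, and `solve` are illustrative assumptions, not anything taken from the linked chats; it assumes the standard rules (the boat carries the farmer plus at most one item, and wolf/goat or goat/cabbage may not be left alone together).

```python
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})
# Pairs that cannot be left together without the farmer (assumed standard rules).
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]


def is_safe(bank):
    """A bank without the farmer is safe if no unsafe pair is on it."""
    return not any(pair <= bank for pair in UNSAFE)


def solve():
    """Breadth-first search; returns a shortest list of crossings, or None.

    A state is (items on the left bank, farmer's side), side being 'L' or 'R'.
    Each move is the item carried, or None when the farmer crosses alone.
    """
    start = (ITEMS, "L")
    goal = (frozenset(), "R")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer), path = queue.popleft()
        if (left, farmer) == goal:
            return path
        here = left if farmer == "L" else ITEMS - left
        for cargo in [None, *sorted(here)]:
            new_left = set(left)
            if cargo is not None:
                if farmer == "L":
                    new_left.remove(cargo)
                else:
                    new_left.add(cargo)
            new_left = frozenset(new_left)
            new_farmer = "R" if farmer == "L" else "L"
            # The bank the farmer just left is now unattended.
            unattended = new_left if new_farmer == "R" else ITEMS - new_left
            state = (new_left, new_farmer)
            if is_safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo]))
    return None


if __name__ == "__main__":
    for step, cargo in enumerate(solve(), start=1):
        print(step, cargo or "(crosses alone)")
```

  Breadth-first search is used here only because it finds a shortest plan on a state space this small; it says nothing about how an LLM or a person who has never seen the puzzle would actually approach it.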
- johnisgood 3 months ago
  
  How did you get "ball" from your riddle? I read it and I have no idea! :(
  