Comment by o10449366

15 hours ago

[flagged]

9 comments

o10449366

"Quirky and obscure" has the functional benefit of ensuring the source question is outside both the training data and the median user prompt, which makes the model less likely to cheat.

We have enough people complaining about Simon Willison's pelican test.

o10449366 10 hours ago

When you program, do you consider it cheating to use your prior knowledge of programming?

Bjartr 14 hours ago

What would make the prompt a better actual evaluation in your judgement?

leptons 11 hours ago
Not focusing on Pokemon, for a start. Maybe use something more people can recognize and evaluate. I have zero knowledge of Pokemon; I see it as a niche thing for ultra-nerdy people, not something everyone is familiar with. Nothing about that test can be evaluated by anyone but a Pokemon expert. Sorry, but Pokemon isn't as mainstream as some people might think it is.
- Bjartr 13 minutes ago
  
  I think you underestimate how popular Pokemon is.
  By most objective measures it's the largest entertainment franchise in all of history.
  Would you also object to any other pop-culture reference for the same reason?

tailscaler2026 14 hours ago

still #opentowork huh

beepbooptheory 13 hours ago
Where does one even use that hashtag?
- minimaxir 12 hours ago
  
  It's a LinkedIn joke.

codemog 15 hours ago

Ah yes, also known as C++ enjoyers.