Comment by dbish
1 day ago
If I recall correctly, a prior interview about Claude Plays Pokemon stated they purposely chose Pokemon as a use case that was not meant to be trained/fine-tuned on. That's what makes it an interesting problem, so hopefully they aren't training on it.
I believe the testing itself is done in very good faith.
But I believe the team at Anthropic looks for popular use cases like this one to improve their datasets. Same for every other big player in the LLM game.