Comment by energy123
6 days ago
Computer vision went through this two decades ago: you need to perturb the input data. The same thing may need to be done in RL pipelines.
Someone should make a new public benchmark called GPQA-Perturbed. Give the providers something to benchmaxx towards.
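To make "perturb the input data" concrete, here is a minimal sketch of one simple perturbation for a multiple-choice benchmark: shuffling the answer options while keeping track of where the correct one ends up. The function name `perturb_question` and the data layout are hypothetical, made up for illustration, and are not taken from GPQA or any existing tool.

```python
import random


def perturb_question(question, options, answer_idx, seed):
    """Shuffle the answer options of one multiple-choice item.

    Hypothetical helper for illustration only. Returns the unchanged
    question text, the shuffled options, and the new index of the
    correct answer.
    """
    rng = random.Random(seed)  # seeded so each perturbed variant is reproducible
    order = list(range(len(options)))
    rng.shuffle(order)  # order[k] is the original index now shown at position k
    shuffled = [options[i] for i in order]
    new_answer_idx = order.index(answer_idx)
    return question, shuffled, new_answer_idx


if __name__ == "__main__":
    q = "Which particle carries the electromagnetic force?"
    opts = ["Gluon", "Photon", "W boson", "Graviton"]
    for seed in range(3):
        _, shuffled, idx = perturb_question(q, opts, answer_idx=1, seed=seed)
        print(seed, shuffled, "correct:", shuffled[idx])
```

A model that has memorized answer positions rather than content would likely score worse on variants like these. Paraphrasing the question text or changing numeric values would be stronger perturbations, but those need more care to keep the answer key valid.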