simonw 2 days ago
OK that is recognizably a pelican, pretty great!

  qingcharles 2 days ago
  This feels like the best pelicanbike yet. The singularity might be closer than we imagine.
  Time for a leaderboard?

    lostmsu 2 days ago
    Ask and you'll receive: https://pelicans.borg.games/

      espadrine 2 days ago
      It would be interesting to have two generations per model without cherry-picking, so that the Elo estimation can include an easy-to-compute standard deviation estimate.

      qingcharles 2 days ago
      Nice! Is there a way I can click on the leaderboard items so I can view them?

      prabhasp 2 days ago
      Lol, can you add a "both of these are terrible" option?

    taytus 1 day ago
    I think they (LLM providers) are manually tuning these cases/examples.
    Pelican on a bike -> some dude (from these labs) creates it, and it becomes part of the training data.