simonw 20 hours ago
OK that is recognizably a pelican, pretty great!

qingcharles 19 hours ago
This feels like the best pelicanbike yet. The singularity might be closer than we imagine.
Time for a leaderboard?

lostmsu 18 hours ago
Ask and you'll receive: https://pelicans.borg.games/

espadrine 15 hours ago
It would be interesting to have two generations per model without cherry-picking, so that the Elo estimation can include an easy-to-compute standard deviation estimate.

qingcharles 15 hours ago
Nice! Is there a way I can click on the leaderboard items so I can view them?

prabhasp 14 hours ago
Lol, can you add a "both of these are terrible" option?

taytus 10 hours ago
I think they (LLM providers) are manually tuning these cases/examples.
Pelican on a bike -> some dude (from these labs) creates it, and it becomes part of the training data.