Comment by simonw 6 months ago
OK that is recognizably a pelican, pretty great!

qingcharles 6 months ago
This feels like the best pelicanbike yet. The singularity might be closer than we imagine.
Time for a leaderboard?

lostmsu 6 months ago
Ask and you'll receive: https://pelicans.borg.games/

espadrine 6 months ago
It would be interesting to have two generations per model without cherry picking, so that the Elo estimation can include an easy-to-compute standard deviation estimation.

qingcharles 6 months ago
Nice! Is there a way I can click on the leaderboard items so I can view them?

prabhasp 6 months ago
Lol, can you add a "both of these are terrible" option?

taytus 6 months ago
I think they (LLM providers) are manually tuning these cases/examples.
Pelican on a bike -> some dude (from these labs) creates it, and it becomes part of the training data.
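
For readers curious how espadrine's suggestion earlier in the thread might work, here is a minimal sketch. It is not how the linked leaderboard is implemented. It assumes each model contributes two generations (samples), that every vote is a (winner, loser) pair of hypothetical `(model, sample_index)` ids, and that a plain sequential Elo update with K = 32 and a starting rating of 1000 is acceptable. Each sample gets its own rating, and the spread between a model's two samples gives a rough standard deviation.

```python
from statistics import mean, stdev

K = 32          # assumed update step, not taken from the thread
START = 1000.0  # assumed starting rating


def expected(ra, rb):
    """Probability that a player rated ra beats a player rated rb."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400))


def elo_ratings(votes, k=K, start=START):
    """Sequential Elo over (winner_id, loser_id) pairs.

    Ids here are hypothetical (model_name, sample_index) tuples.
    """
    ratings = {}
    for winner, loser in votes:
        rw = ratings.setdefault(winner, start)
        rl = ratings.setdefault(loser, start)
        gain = k * (1.0 - expected(rw, rl))
        ratings[winner] = rw + gain  # zero-sum: winner gains what loser drops
        ratings[loser] = rl - gain
    return ratings


def per_model_spread(ratings):
    """Group per-sample ratings by model and return (mean, stdev) per model."""
    by_model = {}
    for (model, _sample), rating in ratings.items():
        by_model.setdefault(model, []).append(rating)
    return {
        model: (mean(rs), stdev(rs) if len(rs) > 1 else 0.0)
        for model, rs in by_model.items()
    }


# Made-up votes, purely for illustration.
votes = [
    (("model-a", 0), ("model-b", 0)),
    (("model-a", 1), ("model-b", 1)),
    (("model-b", 0), ("model-a", 1)),
]
ratings = elo_ratings(votes)
spread = per_model_spread(ratings)
for model, (avg, sd) in sorted(spread.items()):
    print(f"{model}: mean={avg:.1f} stdev={sd:.1f}")
```

With only two samples per model the standard deviation is a very coarse estimate, and sequential Elo depends on vote order, so shuffling the votes and averaging over several passes would be a reasonable refinement.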