Comment by lukaspetersson

22 days ago

We know! This is an eval of which model is best at running a radio station; the purpose is not to build the best AI radio station. Grok n' Roll is broken because Grok 4.3 is not doing so well.

bornfreddy 21 days ago

Great experiment, hilarious! It would be interesting to see how 2 separate Claudes (or GPTs, or...) would behave - would they develop similar personalities?