Comment by swingboy

14 hours ago

Looking forward to the results. Thanks for your work.

Appreciate that! Results are live: https://gertlabs.com/rankings

Opus 4.8 is the first tangible improvement since Opus 4.5. And it doesn't seem to have the personality problems of the last release -- I've been enjoying using it.

  • Nice! Looks like it’s topping the two coding boards. I noticed it's absent from the Social Intelligence board, though?

    • That'll populate over the next couple of weeks -- those are the live games on the spectate tab, which take a while to generate statistically meaningful data. I'm curious how it does. From using it all day, I can say Opus 4.8 is my new favorite model, hands down.