Comment by lhl
23 days ago
Yes, I read it and specifically pointed it out (that's why there are 3 hours of interactive logs). There are 4 other runs pushed now so you can see what actual clean room runs for 5.2 xhigh, 5.3-Codex xhigh, 5.4 xhigh, and Opus 4.6 ultrathink look like: https://github.com/lhl/claudecycles-revisited/blob/main/COMP... as well as the baseline.
No comments yet
Contribute on Hacker News ↗