Comment by jrjeksjd8d

2 days ago

I have started using the most token-intensive model I can find and asking for complicated tasks (rewrite this large codebase, review the resulting code, etc.)

The agent will churn in a loop for a good 15-20 minutes and make the leaderboard number go up. The result is verbose and useless but it satisfies the metrics from leadership.