Comment by elAhmo

20 hours ago

What is ultracode mode?


senko 18 hours ago

It's a combination of max reasoning effort plus a workflow that orchestrates multiple sub-agents.

After some interrogation, here's how it organized the work:

1. Design workflow (rts-game-design, 11 agents, ~13 min) ran first and produced SPEC.md + DESIGN.md:

1.1 Proposals (3 parallel agents): each designed a complete RTS from a different philosophy.

1.2 Judge (1 agent): evaluated all three and synthesized one unified design, committing to specific numbers (costs, HP, map size, etc.).

1.3 Deep-dives (6 parallel agents): each wrote an implementation-ready spec for one subsystem, all consistent with the chosen design.

1.4 Synthesis (1 agent): merged the design + all six subsystem specs into one conflict-free master spec.

2. Code-review workflow (rts-code-review, 25 agents, ~5 min) ran after the main agent had written and tested the code:

2.1 Review (6 agents, read-only Explore type): each scrutinized one dimension and returned structured findings.

2.2 Verify (19 agents): every finding got its own skeptic agent told to try to refute it. Result: 19 flagged → 16 confirmed, 3 rejected as non-bugs.
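
To make the shape of those two workflows concrete, here is a minimal Python sketch of the fan-out / judge / fan-out / merge pattern and the review-then-refute pattern. It is not the actual harness (that isn't shown anywhere in this thread): `run_agent`, `design_workflow`, `review_workflow`, and the canned findings are hypothetical names, and the stub just echoes its input where a real sub-agent call would invoke a model.

```python
import asyncio

calls = []  # records the role of every (hypothetical) sub-agent invocation


async def run_agent(role, task):
    """Stand-in for a real sub-agent call; it only echoes its input."""
    calls.append(role)
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{role}] {task}"


async def design_workflow(philosophies, subsystems):
    # 1.1 Proposals: one agent per design philosophy, run in parallel.
    proposals = await asyncio.gather(
        *(run_agent("proposal", p) for p in philosophies)
    )
    # 1.2 Judge: a single agent sees every proposal and picks one design.
    design = await run_agent("judge", " | ".join(proposals))
    # 1.3 Deep-dives: one agent per subsystem, all given the chosen design.
    specs = await asyncio.gather(
        *(run_agent("deep-dive", f"{s}: {design}") for s in subsystems)
    )
    # 1.4 Synthesis: a single agent merges the design and subsystem specs.
    return await run_agent("synthesis", "\n".join([design, *specs]))


async def review_workflow(findings_by_dimension, refutable):
    async def review(dimension):
        # 2.1 Review: one agent per dimension; findings are canned here.
        await run_agent("review", dimension)
        return findings_by_dimension[dimension]

    async def verify(finding):
        # 2.2 Verify: one skeptic per finding; a finding survives only
        # if the skeptic cannot refute it (simulated via `refutable`).
        await run_agent("verify", finding)
        return finding, finding not in refutable

    per_dimension = await asyncio.gather(
        *(review(d) for d in findings_by_dimension)
    )
    flagged = [f for findings in per_dimension for f in findings]
    verdicts = await asyncio.gather(*(verify(f) for f in flagged))
    confirmed = [f for f, ok in verdicts if ok]
    rejected = [f for f, ok in verdicts if not ok]
    return confirmed, rejected
```

With 3 philosophies and 6 subsystems, `design_workflow` makes 3 + 1 + 6 + 1 = 11 agent calls, matching the agent count above. The review side scales with the number of findings rather than a fixed team size: 6 reviewers plus one skeptic per flagged finding.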

What the main agent did in the main loop:

- Wrote all ~2,400 lines of index.html by hand from the spec.

- Did all browser testing/debugging via headless Chrome. I told it to use rodney by @simonw, love the tool :)

- Applied all 16 fixes from the review and re-verified them in the browser.

33MHz-i486 16 hours ago
seems like a Rube Goldberg-esque way to consume 10x the tokens. is this really where the industry is heading?
- e12e 14 hours ago
  
  I like to think of it like the difference between dropping a ball on a roulette wheel (you get one random number, or a random sequence if you repeat it) vs dropping a ball on a carved topographic map, where valleys guide the ball to a particular outcome.
  If you can stand a little AI expansion, here are a few points Gemini came up with. I think the idea has some merit:
  https://g.co/gemini/share/b5b97867eeb1
  (Maybe the better analogy is roulette vs pinball machine)
- derac 15 hours ago
  
  Why is it Rube Goldbergesque? The process doesn't seem arbitrary.
  
artur_makly 2 hours ago
Just to confirm - you did not generate this plan/orchestration/harness - it did all that on its own?
- senko 1 hour ago
  
  Correct, that's the "workflows" part they introduced in Claude Code alongside the new model.
chrisweekly 3 hours ago
Did you start with a clean slate or do you have global ~/.claude/CLAUDE.md and/or specific skills, plugins, etc?
- senko 1 hour ago
  
  I don't have a global CLAUDE.md, and the only non-default skill that was used here is the one for the rodney[0] headless browser. I didn't expressly tell Claude to do browser testing; it decided to do that on its own.
  So no extra guidance beyond the prompt.
  [0] https://github.com/simonw/rodney/
jmtame 15 hours ago

Thanks for sharing this. Going to try it out on a game inspired by Rust. It's helpful re: the point on rodney - I've had a hard time getting the testing to work well in the browser.

tcoff91 19 hours ago

it's a brand new mode

colechristensen 18 hours ago

Biases the model to solve problems with teams of agents