Comment by simonw
4 days ago
This is a pretty sophisticated setup. I particularly like how it uses Tailscale.
I've been using a simpler but less flexible alternative: I'm running Claude Code for web (Anthropic's version of Codex Cloud) via the Claude iPhone app, with an environment I created called "Everything" which allows all network access.
(This is moderately unsafe if you're working with private source code or environment variables containing API keys and other secrets, but most of my stuff is either open source or personal such that I don't care if the source code leaks.)
Anthropic run multiple ~21GB VMs for me on-demand to handle sessions that I start via the app. They don't charge anything extra for VM time, which is nice.
I frequently have 2-3 separate Claude Code for web sessions running at once, often prompted from my phone, some of them started while I'm out walking the dog. Works really well!
I don't like Claude Code web due to its lack of planning mode. I find the results are often lackluster compared to the Claude Code CLI.
My current setup: Tailscale + Terminus (iPad) + home machine (code base)
Need to look into how to work on multiple features at the same time next.
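For anyone wanting to copy this, the rough shape is just Tailscale SSH plus a persistent tmux session; the hostname, user and session name below are placeholders, not my actual config:

    # on the home machine: join the tailnet with Tailscale SSH enabled
    sudo tailscale up --ssh

    # from the iPad terminal app: hop onto the home machine over the tailnet
    ssh me@home-machine

    # attach to (or create) a persistent session so work survives disconnects,
    # then run Claude Code against the local checkout
    tmux new-session -A -s claude
    claude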
I've been using git worktrees with Claude and it's pretty awesome:
https://www.youtube.com/watch?v=up91rbPEdVc
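The core of it is only a couple of git commands; the repo and branch names here are just placeholders:

    # create a sibling checkout on its own branch for one Claude session
    git worktree add ../myrepo-feature-x -b feature-x

    # run a separate Claude Code instance inside that checkout
    cd ../myrepo-feature-x && claude

    # clean up once the branch has been merged
    git worktree remove ../myrepo-feature-x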
Pair worktrees with the ralph-wiggum plugin and I can have Claude work for hours without needing any input:
https://looking4offswitch.github.io/blog/2026/01/04/ralph-wi...
Worktrees took way too much setup and hand-holding for me, but https://conductor.build made it easy!
I haven't missed planning mode myself. I tend to tell it "write a detailed plan first in a file called spec.md for me to review", then use that as the ongoing plan.
I like that it ends up in the repo as it means it survives compaction or lets me start a fresh session entirely.
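The prompt is nothing fancy; something along these lines (illustrative wording, not an exact quote):

    Before writing any code, write a detailed implementation plan
    to spec.md for me to review. Stop once the plan is written.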
I was doing the same, but recently I noticed that Claude now writes its plans to a markdown file somewhere nested in the ~/.claude/plans directory. It will carry a reference to it through compaction. Basically mimicking my own workflow!
This can be customized via a shell env variable that I cannot remember ATM.
The downside (upside?) is that the plan will not end up in your repo. Which sometimes I want. I love the native plan mode though.
Plans in plan mode also survive compaction
The lack of Plan Mode is puzzling; I'm sure they must get to it at some point. But until then it CAN still plan: you just have to ask it to write a plan and not write code yet.
I've been really impressed with https://github.com/BloopAI/vibe-kanban to do this. Really really impressed.
Can you not use PAL MCP for this? Have one top agent as a controller, etc.? It's not ideal, but it feels like the multi-agent space is still evolving ... I notice there are a lot of posts on HN about these things, so we're all trying to do the same thing really.
Not sure if this works in claude code web, but running non-interactive claude code I can still get it to use plan mode by simply asking it. It's just a tool call.
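Something like this, for example (assuming the -p/--print non-interactive flag; the prompt wording is just an illustration):

    claude -p "Enter plan mode: research the codebase and write a plan for the change, but don't modify any files yet"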
I'm surprised to see people getting value from "web sandbox"-type setups, where you don't actually have access to the source code. Are folks really _that_ confident in LLMs as to entirely give up the ability to inspect the source code, or to interact with a running local instance of the service? Certainly that would be the ideal, but I'm surprised that confidence is currently running that high.
I still get the full source code back at the end: I tell it to include the code it wrote in the PR.
I also wrote my own tool to extract and format the complete transcript. It gives me back things like this, where I can see everything it did, including files and scripts it didn't commit. Here's an example: https://gistpreview.github.io/?3a76a868095c989d159c226b7622b...
Oh fascinating - so you're reviewing "your own" code in-PR, rather than reviewing it before PR submission? I can see that working! Feels weird, but I can see it being a reasonable adaptation to these tools - thanks!
What about running services locally for manual testing/poking? Do you open ports on the Anthropic VM to serve the endpoints, or is manual testing not part of your workflow?
Right - I’m missing how you get the source code in the OP. It says you tmux in with ssh agent forwarding for GH. But you can’t do that on your iOS device? So you have to set up all your repos in the morning before leaving the house, then collect and push all your branches when you return home?
I could imagine this working for a small number of branches/changes.
The output from Jules is a PR. And then it's a toss-up between "spot on, let's merge" and "nah, needs more work, I will check out the branch and fix it properly when I am at the keyboard". And you see the current diff on the webpage while the agent is working.
Claude Code on the web, ChatGPT Codex and Google Jules are not the same as Claude, ChatGPT and Gemini. They are entire apps where you authorize GitHub access and they work via PRs.
They'll include screenshots on your PRs etc.
I like using them a lot when I can.
Right, yes, that was precisely my point - it was weird to me that people were comfortable operating on a codebase that they don't have locally, that they can't directly interact with.
Are those VM specs documented anywhere? I have used Claude Code for web a lot and never really bothered with the details. I just connect it to my repo and let it cook.
Not documented, so I had Claude Code for web write me a report: https://github.com/simonw/research/blob/main/environment-rep...
Check out superconductor.dev (I’m building it) if you want live app previews, Docker-in-Docker functionality, multiple agents in one mobile app, and more.