Comment by simonw

1 month ago

This is a pretty sophisticated setup. I particularly like how it uses Tailscale.

I've been using the simpler but not as flexible alternative: I'm running Claude Code for web (Anthropic's version of Codex Cloud) via the Claude iPhone app, with an environment I created called "Everything" which allows all network access.

(This is moderately unsafe if you're working with private source code or environment variables containing API keys and other secrets, but most of my stuff is either open source or personal such that I don't care if the source code leaks.)

Anthropic run multiple ~21GB VMs for me on-demand to handle sessions that I start via the app. They don't charge anything extra for VM time which is nice.

I frequently have 2-3 separate Claude Code for web sessions running at once, often prompted from my phone, some of them started while I'm out walking the dog. Works really well!

45 comments

simonw

elpalek 1 month ago

I don't like claude code web due to its lack of planning mode. I found the result is often lackluster compare to claude code cli.

My current setup: Tailscale + Terminus(ipad) + home machine(code base)

Need to look into how to work on multiple features at the same time next.

LatencyKills 1 month ago
I've been using git worktrees with Claude and it's pretty awesome:
https://www.youtube.com/watch?v=up91rbPEdVc
Pair worktrees with the ralph-wiggum plugin and I can have Claude work for hours without needing any input:
https://looking4offswitch.github.io/blog/2026/01/04/ralph-wi...
- scubbo 1 month ago
  
  Worktrees took way too much setup and hand-holding for me, but https://conductor.build made it easy!
  
  3 replies →
simonw 1 month ago
I haven't missed planning mode myself. I tend to tell it "write a detailed plan first in a file called spec.md for me to review", then use that as the ongoing plan.
I like that it ends up in the repo as it means it survives compaction or lets me start a fresh session entirely.
- s900mhz 1 month ago
  
  I was doing the same, but recently I noticed that Claude now writes its plans to a markdown file somewhere nested in the ~/.claude/plans directory. It will carry a reference to it through compaction. Basically mimicking my own workflow!
  This can be customized via a shell env variable that I cannot remember ATM.
  The downside (upside?) is that the plan will not end up in your repo. Which sometimes I want. I love the native plan mode though.
  
  1 reply →
- dbbk 1 month ago
  
  Plans in plan mode also survive compaction
eclipxe 1 month ago

I've been really impressed with https://github.com/BloopAI/vibe-kanban to do this. Really really impressed.
dbbk 1 month ago

The lack of Plan Mode is puzzling, I'm sure they must get to it at some point. But until then it CAN still plan, you just have to ask it to write a plan and not write code yet.
nobodywillobsrv 1 month ago

Can you not use PAL MCP for this? Have one top agent as controller etc? It's not ideal but it feels like the space of multi agent stuff is evolving ... I notice that there are a lot of posts on hn about these things so we are trying to do the same thing really.
bakies 1 month ago

Not sure if this works in claude code web, but running non-interactive claude code I can still get it to use plan mode by simply asking it. It's just a tool call.

scubbo 1 month ago

I'm surprised to see people getting value from "web sandbox"-type setups, where you don't actually have access to the source code. Are folks really _that_ confident in LLMs as to entirely give up the ability to inspect the source code, or to interact with a running local instance of the service? Certainly that would be the ideal, but I'm surprised that confidence is currently running that high.

simonw 1 month ago
I still get the full source code back at the end, I tell it to include code it wrote in the PR.
I also wrote my own tool to extract and format the complete transcript, it gives me back things like this where I can see everything it did including files and scripts it didn't commit. Here's an example: https://gistpreview.github.io/?3a76a868095c989d159c226b7622b...
- scubbo 1 month ago
  
  Oh fascinating - so you're reviewing "your own" code in-PR, rather than reviewing it before PR submission? I can see that working! Feels weird, but I can see it being a reasonable adaptation to these tools - thanks!
  What about running services locally for manual testing/poking? Do you open ports on the Anthropic VM to serve the endpoints, or is manual testing not part of your workflow?
  
  13 replies →
nl 1 month ago
Claude Code on the web, ChatGPT Codex and Google Jules are not the same as Claude, ChatGPT and Gemini. They are entire apps where you authorize Github access and they work via PRs.
They'll include screenshots on your PRs etc.
I like using them a lot when I can.
- scubbo 1 month ago
  
  Right, yes, that was precisely my point - it was weird to me that people were comfortable operating on a codebase that they don't have locally, that they can't directly interact with.
  
  5 replies →
theptip 1 month ago

Right - I’m missing how you get the source code in the OP. It says you tmux in with ssh agent forwarding for GH. But you can’t do that on your iOS device? So you have to set up all your repos in the morning before leaving the house, then collect and push all your branches when you return home?
I could imagine this working for a small number of branches/changes.
smarx007 1 month ago

The output from Jules is a PR. And then it's a toss-up between "spot on, let's merge" and "nah, needs more work, I will check out the branch and fix it properly when I am the keyboard". And you see the current diff on the webpage while the agent is working.
theshrike79 1 month ago

Imagine you're a billionaire with infinite resources. You'd have gofers for everything. Get a weird idea while golfing? Shoot a text or call someone and it'll get done (or just tell the assistant that's always following you)
These web agents are similar. You pull out your phone while queueing in the shop, change to your "research" repo and tell it "investigate a way to create a privacy-preserving mobile application to store barcode-based loyalty cards", hit execute and put your phone away.
When you get to it, you can check out what it did for you.
Or if your system is set up properly, you can ask the same thing to make changes to any project, like adding a new feature you just thought up. Get back, review PR, maybe merge it.
suninsight 1 month ago

[dead]

vidar 1 month ago

Are those VM specs documented anywhere, I have used Claude Code for web a lot and never really bothered with the details. Just connect it to my repo and let it cook

simonw 1 month ago

Not documented, so I had Claude code for web write me a report: https://github.com/simonw/research/blob/main/environment-rep...

sergeyk 1 month ago

Check out superconductor.dev (I’m building it), if you want live app previews, docker-in-docker functionality, multiple agents in one mobile app, and more.

bschmidt25001 1 month ago

[dead]