Comment by insin

5 hours ago

I don't think it needs to specifically be a coding agent for the average user, creating apps for whatever they want to do. It just needs to be something that can use code, has appropriate access for what they're already asking it to do (instead of the model bullshitting to them that it can, which annoys them), and has some way to make things repeatable when needed, like skills.

I'm currently doing something like this in the internal, model-independent LLM chat app I work on at a Fortune 100, specifically targeted at our everyday users. `<input type="file" webkitdirectory>` lets the user give the model read and write access to a local folder (and OPFS lets us reuse the same fs tools we give the model for files manually attached to the chat, or for files tools want to create when the user hasn't granted folder access).
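Roughly, the wiring looks like this. The browser APIs (`webkitdirectory`, OPFS via `navigator.storage.getDirectory()`) are real; `stripRoot` and the event handler are an illustrative sketch, not our actual code:

```javascript
// A picked folder arrives as a FileList where each File carries
// webkitRelativePath, e.g. "reports/2024/jan.xlsx" for a picked "reports" folder.

// Pure helper: drop the picked folder's own name so tool paths are
// relative to the granted root.
function stripRoot(relativePath) {
  const i = relativePath.indexOf('/');
  return i === -1 ? relativePath : relativePath.slice(i + 1);
}

// Browser-only wiring (won't run outside a page):
// const input = document.querySelector('input[type=file][webkitdirectory]');
// input.addEventListener('change', async () => {
//   const opfsRoot = await navigator.storage.getDirectory(); // OPFS root
//   for (const file of input.files) {
//     // Mirror each picked file into OPFS so the same fs tools work for
//     // granted folders and manually attached files alike. (Nested paths
//     // need a getDirectoryHandle walk per segment; elided here.)
//     const handle = await opfsRoot.getFileHandle(
//       stripRoot(file.webkitRelativePath), { create: true });
//     const writable = await handle.createWritable();
//     await writable.write(await file.arrayBuffer());
//     await writable.close();
//   }
// });
```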

Every time we used to release a new version, the feedback was "still can't handle the 6MB Excel file I drop into it", back when that file was being extracted to CSV and added to context. Now the model can poke about in the big Excel file directly with SheetJS to pull the sheet names/headers and inspect the shape of the data, then use locally sandboxed code execution to write code against either the extracted data or the spreadsheet itself via SheetJS for pivot tables and such (all locally, and none of it needs to go into the context).
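The "inspect the shape first" step is basically this. The SheetJS calls (`XLSX.read`, `SheetNames`, `sheet_to_json` with `header: 1`) are the real library API; `sheetShape` is an illustrative helper, not a copy of our tool code:

```javascript
// Given array-of-arrays rows (what sheet_to_json({ header: 1 }) returns),
// summarise headers and dimensions without putting the data itself in context.
function sheetShape(rows) {
  const headers = rows.length ? rows[0] : [];
  return {
    headers,
    rowCount: Math.max(rows.length - 1, 0), // data rows, excluding the header row
    colCount: headers.length,
  };
}

// SheetJS wiring (needs the xlsx package; shown for context):
// const wb = XLSX.read(buffer);          // parse the attached .xlsx
// for (const name of wb.SheetNames) {
//   const rows = XLSX.utils.sheet_to_json(wb.Sheets[name], { header: 1 });
//   console.log(name, sheetShape(rows)); // only this summary reaches the model
// }
```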

The base models are good enough at tool calling (I really mean Claude, though; the GPTs just go on a tear calling tools with no context for the user) that they're already decent at automating stuff for the user without a dedicated harness (our default system prompt is still "You are a helpful AI assistant", lol). Add tools for Graph API stuff, and now it can pull the nightly batch file from a support inbox, unzip the spreadsheet within, diff it against yesterday's, generate an import file for new users, and draft an email to welcome them: something that used to be a daily support task (which I'd already automated most of, but now you don't need a dev for this kind of thing). Or it can go find the big 450,000+ row spreadsheet that's being automated somewhere on SharePoint, pull it down in 150,000-row chunks (the Graph Excel REST API limit), and write code to figure out whatever the user is asking.
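The chunked pull is mechanical once you know the row cap. A minimal sketch, where `chunkRanges` and the URL shape in the comment are illustrative (the 150,000-row cap is the limit mentioned above, not something baked into this helper):

```javascript
// Split an N-row sheet into Excel range addresses of at most chunkSize rows,
// e.g. 450,000 rows -> A1:Z150000, A150001:Z300000, A300001:Z450000.
function chunkRanges(totalRows, lastCol, chunkSize = 150000) {
  const ranges = [];
  for (let start = 1; start <= totalRows; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, totalRows);
    ranges.push(`A${start}:${lastCol}${end}`);
  }
  return ranges;
}

// Each range then becomes one Graph call, along the lines of:
// GET /drives/{driveId}/items/{itemId}/workbook/worksheets/{sheet}/range(address='A1:Z150000')
```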

Having implemented and used it, I like this setup so much it kinda ruined Claude.ai and ChatGPT.com for me, so I've hooked up similar access for them: a browser extension adds the folder picker input and talks to a local server to tell it which folder to grant access to, and Claude/ChatGPT talk to the same server over MCP via a Cloudflare Tunnel to work with the selected folder.