Comment by MatrixMan

2 months ago

I'm interested to see where we'll land re: organizing larger codebases to accommodate agents.

I've been having a lot of fun taking my larger projects and decomposing them into directed graphs where the nodes are Nix flakes. If I launch Claude Code in a flake devshell, it has access to only the tools in that devshell, and it sees the flake.nix and assumes that the project is bounded by the CWD even though it's actually much larger. Its context stays small and it doesn't get overwhelmed.

Inputs/outputs are a nice language-agnostic mechanism for coordinating between flakes (just gotta remember to run `nix flake update <input>`, or `nix flake lock --update-input <input>` on older Nix versions, when you want updated outputs from an adjacent flake). Then I can have the agents in each flake write feature requests for each other and help each other test fixtures and features. I also like watching them debate over a design: they get lazy and assume the other "team" will do the work, but eventually settle on something reasonable.

I've been running with the idea for a few weeks. Maybe it's dumb, but I'd be surprised if this kind of rethinking didn't eventually yield a radical shift in how we organize code, even if the details look nothing like what I've come up with. Somehow we gotta get good at partitioning context so we can avoid the worst parts of the rapid (roughly quadratic) growth in token volume that comes from resubmitting the entire chat session history just to get the next response.
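To put rough numbers on the token-volume point, here is a minimal sketch. It assumes every turn adds the same number of tokens and that each request resends the full history; real sessions vary, and the function names are made up for illustration. Under that simple model the total grows with the square of the number of turns, so splitting the same work across several small, independent sessions cuts it considerably.

```python
def cumulative_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total tokens submitted over one session when every request
    resends the whole history (simplifying assumption: fixed-size turns)."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # the new message joins the history
        total += history            # the entire history is submitted again
    return total


def partitioned_tokens(turns: int, tokens_per_turn: int, partitions: int) -> int:
    """Same number of turns, spread as evenly as possible over
    `partitions` independent sessions that do not share history."""
    per_session, remainder = divmod(turns, partitions)
    sizes = [per_session + (1 if i < remainder else 0) for i in range(partitions)]
    return sum(cumulative_tokens(n, tokens_per_turn) for n in sizes)


if __name__ == "__main__":
    print(cumulative_tokens(100, 500))      # one long session
    print(partitioned_tokens(100, 500, 4))  # four bounded sessions
```

With 100 turns of 500 tokens each, the single session submits 2,525,000 tokens in total, while four sessions of 25 turns submit 650,000.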

5 comments

salty_frog 2 months ago

I'd be keen to read/hear more about the experiment you've been undertaking, as I too have been thinking about the impact on the design/architecture/organisation of software.

The focus mainly seems to be on enhancing existing workflows to produce the code we currently expect; you often hear it's like a junior dev.

The type of rethinking you outlined could have code organised in a way that a junior dev would never be able to extend, but that our 'junior dev' LLM can iterate on easily.

I care more about the properties of software (e.g. testable, extendable, secure) than how it is organised.

It gets me thinking about questions like:

- What is the correlation between how code is organised and its properties?
- What is the optimal organisation of code to help LLMs modify and extend software?

__MatrixMan__ 2 months ago

It's not even a POC at this point, just a readme and a sandbox for testing it while I work on it. But you might find the readme interesting:
https://github.com/MatrixManAtYrService/poag
I'm especially pleased with how explicit it makes the inner dependency graph. Today I'm tinkering with Pact (https://docs.pact.io/). I like that I'm forced to add the Pact contracts generated during consumer testing as flake outputs (so they can then be inputs to whichever flake does provider testing). It's potentially a bit more work than it would be under other schemes, but it also makes the directionality of the dependency a first-class citizen rather than an implementation detail. Otherwise it would be easy to forget which batch of tests depends on artifacts generated by the other.
I suppose there are things like Bazel for that sort of thing too, but I don't think you can drop an agent into a Bazel... thingy... and expect it to feel at home.
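
To illustrate what an explicit, directed dependency between test batches buys you, here is a small sketch. It is not how poag or Nix works internally: the flake names are hypothetical, and the graph is just a dict mapping each flake to the flakes whose outputs it takes as inputs. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical flakes. Each key maps to the flakes it takes as inputs.
# "consumer-tests" emits the contracts that "provider-tests" consumes.
flake_inputs = {
    "consumer": set(),
    "provider": set(),
    "consumer-tests": {"consumer"},
    "provider-tests": {"provider", "consumer-tests"},
}


def build_order(inputs: dict[str, set[str]]) -> list[str]:
    """Order in which flakes can be built so every input exists first.
    Raises graphlib.CycleError if the dependencies are circular."""
    return list(TopologicalSorter(inputs).static_order())


def downstream_of(inputs: dict[str, set[str]], changed: str) -> set[str]:
    """Flakes that directly or transitively consume `changed`,
    i.e. the ones whose locked inputs would need refreshing."""
    affected: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for name, deps in inputs.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


if __name__ == "__main__":
    print(build_order(flake_inputs))
    print(downstream_of(flake_inputs, "consumer"))
```

Because the direction is written down, "which tests depend on artifacts from the other batch" becomes a lookup rather than something to remember: a change to `consumer` affects both test flakes, while a change to `provider` affects only `provider-tests`.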

quinnjh 2 months ago

yeah this is an interesting approach, both for the context partitioning and for reproducibility and dependency pinning. i was toying with this before needing to run with just docker on a project. would be nice to find a tool that streamlines some of this

__MatrixMan__ 2 months ago

Re: dependency pinning, I put together a little write-up about that: https://gist.github.com/MatrixManAtYrService/6eaf50373448c0b...
You can use it as an alternative to `git bisect` where you're only bisecting the history of a single subflake. I imagine writing a new test that indicates the presence of an old bug, and then going back in time to see when the bug was reintroduced. With git bisect, going back in time means your new test goes away too.
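
A rough sketch of the search itself, not taken from the linked write-up: a plain binary search over one subflake's revisions, where the test stays fixed in the current checkout and only the pinned revision changes. The `is_bad` callback is hypothetical; in practice it would pin the subflake input to the given revision and run the new test. The sketch assumes the list is ordered oldest to newest, the oldest revision is good, the newest is bad, and the bug does not come and go in between.

```python
from typing import Callable, Sequence


def first_bad_revision(
    revisions: Sequence[str],
    is_bad: Callable[[str], bool],
) -> str:
    """Return the earliest revision for which `is_bad` is true.

    Assumes revisions[0] is good and revisions[-1] is bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # bug already present: look earlier
        else:
            lo = mid  # still good: look later
    return revisions[hi]


if __name__ == "__main__":
    # Stand-in history: pretend the bug came back at "r6".
    history = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"]
    print(first_bad_revision(history, lambda rev: history.index(rev) >= 5))
```

In the stand-in history above the search lands on `r6` after checking three revisions.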
