Comment by kuehle

5 months ago

> I find the most frustrating part of using CI to be to wait for a CI run to finish on a server and then try to deduce from the run log what went wrong. I’ve alleviated this by writing an extension to rad to run CI locally: rad-ci.

locally running CI should be more common

Agreed. I've got a crazy idea that I think might help...

Most tests have a step where you collect some data and another step where you make assertions about that data. Normally, that data only ever lives in a variable, so it isn't kept around for later analysis. All you get when viewing a failed test is a log with either an exception or a failed assertion. That's not enough to tell the full story, and I think this contributes to the frustration you're talking about.

I've been playing with the idea that all of the data generation should happen first (since it's the slow part) and get added to a commit (overwriting the data from the previous CI run), and that all of the assertions should run afterwards (this is typically very fast).

So when CI fails, you can pull the updated branch and either:

- rerun the assertions without bothering to regenerate the data (faster, and useful if the fix is changing an assertion)

- diff the new data against data from the previous run (often instructive about the nature of the breakage)

- regenerate the data and diff it against whatever caused CI to fail (useful for knowing that your change will indeed make CI happy once more)

Most tools aren't comfortable with using git to transfer state from the failed CI run to your local machine so you can rerun just the relevant parts locally, which means some hackery is involved, but when it works out it feels like a superpower.
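To sketch the shape of that split (the filenames and values here are just illustrative), a minimal version might look like:

```python
# Hypothetical sketch: the slow "generate" phase persists its output (which CI
# would commit back to the branch), and the fast "check" phase only reads that
# file, so either phase can be rerun on its own after a failure.
import json
from pathlib import Path

DATA = Path("ci-data/report.json")  # illustrative path; CI commits this dir

def generate():
    # Slow part: collect the data and write it out instead of keeping it
    # only in a variable.
    DATA.parent.mkdir(exist_ok=True)
    DATA.write_text(json.dumps({"total": sum(range(1, 11)), "count": 10}))

def check():
    # Fast part: assertions read the committed file, nothing else.
    report = json.loads(DATA.read_text())
    assert report["total"] == 55
    assert report["count"] == 10

if __name__ == "__main__":
    generate()
    check()
```

On failure you can rerun `check()` against the committed `ci-data/` without paying for `generate()` again, and the file itself is what you diff between runs.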

Hear, hear.

Although I'd slightly rephrase that to "if you don't change anything, you should end up running pretty much the same code locally as in CI".

GitHub Actions is really annoying for this, as it has no supported local mode. Act is amazing but insufficient: the default runner images are huge, so you can't use the same environment, and it isn't officially supported.

Pre-commit, on the other hand, is fantastic for this kind of issue: you can run it locally, and it will fairly trivially run the same checks in CI as it does locally. You want it to be fast, though, so in practice I normally have pre-commit run only cacheable checks and exclude any build and test hooks, because I run those as separate CI jobs.
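As a sketch, a `.pre-commit-config.yaml` along those lines keeps only fast, cacheable hooks (the repo and hook ids below are from the standard pre-commit-hooks collection; the heavier build/test steps would live in separate CI jobs instead):

```yaml
# fast, cacheable hooks only; build and test steps run as separate CI jobs
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
```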

I did release my own GHA action for pre-commit (https://github.com/marketplace/actions/cached-pre-commit), because the official one doesn't cache very heavily and the author prefers folk to use his competing service.

  • I have to disagree about Act; my experience is that it only works for extremely simple workflows, and even then it's easy to run into differences between Act and GitHub Actions. I've raised many bugs, but AFAIK there's like one guy working on it in his own time.

    It’s terrible that the community has had to invent something like this when it should be provided by GitHub. I suspect the GitHub Actions team is a skeleton crew because nothing ever seems to get done over there.

    • I think we're in agreement, generally -- and GitHub have no incentive to help, because Actions are a moat and if it's too easy to run outside their environment then it's too easy to move away from their environment.

      It's no shade on the author of Act (or on the quality of the code they've written) that they can't keep up with GitHub.

    • > AFAIK there’s like one guy working on it in his own time

      That may very well be true of nektos/act itself, but it now has at least two ongoing forks (depending on how one counts such things) over in the Gitea and Forgejo projects, as they foolishly try to squat on it instead of using a real CI tool.

I've been using brisk to run my CI from my local machine (it runs in the cloud from my local terminal). The workflow is a drop-in replacement for running locally. They've recently changed their backend, and it seems to be working pretty smoothly. It also works very well with AI agents running in the terminal: they can run my tests for me when they make a change, and it doesn't kill my machine.

Yeah, and I will never understand why developers accept anything less. GitHub CI is really bad for this. GitLab is a lot better as you can just run the exact thing through Docker locally. I like tools like tox too that automate your whole test matrix and do it locally just as well.

This can be achieved by running in CI exactly what you commonly run locally.

E.g. if your build process is simply invoking `build.sh`, it should be trivial to run exactly that in any CI.
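A minimal sketch of that, assuming GitHub Actions: the workflow does nothing but invoke the same script you run on your laptop.

```yaml
# hypothetical workflow: CI only delegates to the build.sh you run locally
name: build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh
```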

  • This is fine until you run into differences between your machine and the CI one (or you're writing code for a different architecture than the one you're using), but I agree, this is definitely the first step.

  • Be sure to run it in a container, so you have a semblance of parity.

    • Where possible. (If your build process builds containers and your tests get them up and make them talk, doing that in a container is a challenge.)

      However, there are stateless VMs and stateless bare-metal machines too.


I've poked at this a few times, and I think it breaks down for CI that needs/wants to run integration tests against other services, e.g. spinning up a Postgres server to actually execute queries against.

Managing those lifetimes is annoying, especially when it needs to work on desktops too. On the server side, you can do things like spin up a VM that CI runs in, use Docker in the VM to make dependencies in containers, and then delete the whole VM.

That's a lot of tooling to do locally though, and even then it's local but has so many abstractions that it might as well be running in the cloud.
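One way to make those lifetimes less painful on a desktop is to tie the service to the test process itself. A sketch in Python, with the actual docker commands stubbed out as comments since the container name and image are just illustrative:

```python
# Hypothetical sketch: a context manager owns the throwaway service's lifetime,
# so local and CI runs start and clean it up identically, even when tests fail.
import contextlib

@contextlib.contextmanager
def throwaway_postgres(name="ci-pg"):
    # In real use, something like:
    #   subprocess.run(["docker", "run", "-d", "--name", name,
    #                   "-e", "POSTGRES_PASSWORD=test", "postgres:16"])
    state = {"running": True}
    try:
        yield state
    finally:
        # In real use: subprocess.run(["docker", "rm", "-f", name])
        state["running"] = False

with throwaway_postgres() as pg:
    # Integration tests would connect and run queries here.
    assert pg["running"]
```

The `finally` block runs even when a test raises, which is the part that's easy to get wrong with ad hoc setup/teardown scripts.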

nix-ci.com is built with this as one of its two central features. The other is that it figures out what to do by itself; you don't have to write any YAML.

dagger does this, at the expense of increased complexity.

  • I tried out Dagger hoping to tidy up some of our CI pipelines. I really wasn't impressed, to be honest. I initially just wanted to build a basic Docker image using BuildKit, and even that was a nightmare that the Dagger guys basically told me just isn't supported.

It should!

And yet, that's technically not CI.

The whole point of starting to use automation servers as an integration point was to avoid the "it works on my machine" drama. (I've watched at least 5 seasons of it - they were all painful!)

+1 on running the test harness locally though (where feasible) before triggering the CI server.

You probably have really bad code and tests if everything passes locally but regularly fails on CI.

  • Not necessarily. For one, the local dev environment may be different or less pristine than what's encountered in CI. I sometimes use bubblewrap (the sandboxing engine behind flatpak) to isolate the dev environment from the base system. Secondly, CI often does a lot more than what's possible on the local system. For example, it may run far more tests than is practical locally, or the upstream repo may have code that you don't have in your local repo yet.

    Besides all that, this is not at all what the author and your parent commenter are discussing. They are saying that the practice of triggering and running CI jobs entirely locally should be more common, rather than having to rely on a server. We do have CI runners that work locally, but CI job management is still done largely from servers.

    • > For example, it may run a lot more tests than what's practical in a local system.

      Yes, this is what I was talking about. If there are lots of tests that aren't practical to run locally, then they are bad tests, no matter how useful one might think they are. The only good tests are the ones that run fast. It is also a sign that the code itself is bad if you are forced to write tests that interact with the outside world.

      For example, you can extract logic into a presentation layer and write unit tests for that, instead of mixing UI and business logic and writing browser tests for it. There are also well-known patterns for this, like 'model view presenter'.

      I would rather put my effort into that than into figuring out how to run tests that launch databases and browsers, call APIs, start containers, etc. Everywhere I've seen these kinds of tests, they've contributed to the "it sucks to work on this code" feeling, and bad vibes are the worst thing that can happen to a codebase.
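      As a sketch of that 'model view presenter' split (all names below are made up for illustration), a presenter that holds the display logic can be covered by a plain, fast unit test with no browser involved:

```python
# Hypothetical sketch: the presenter formats the model for display, so a unit
# test exercises it directly instead of driving a browser.
from dataclasses import dataclass

@dataclass
class Order:  # the model: plain data, no UI
    items: list

class OrderPresenter:
    """Holds display logic for an Order; contains no UI framework code."""
    def __init__(self, order):
        self.order = order

    def total_label(self):
        return f"Total: ${sum(self.order.items):.2f}"

# A fast unit test needs only the presenter, not a browser or a server.
def test_total_label():
    assert OrderPresenter(Order([1.5, 2.25])).total_label() == "Total: $3.75"

test_total_label()
```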
