Comment by pxc

4 days ago

You still inevitably need a bunch of CI platform-specific bullshit for determining "is this a pull request? which branch am I running on?", etc. Depending on what you're trying to do and what tools you're working with, you may need such logic both in an accursed YAML DSL and in your build script.

And if you want your CI jobs to do things like report cute little statuses, integrate with your source forge's static analysis results viewer, or block PRs, you have to integrate with the forge at a deeper level.

There aren't good tools today for translating between the environment variables and other context that various CI platforms expose, or for managing secrets (if you use CI to deploy things), which each platform surfaces in its own way.

If all you're doing with CI is spitting out some binaries, sure, I guess. But if you actually ask developers what they want out of CI, it's typically more than that.

A lot of CI platforms (such as GitHub) automatically set environment variables that can help with the logic in your build script. If they don't, they should at least give you a way to set them. One approach is to keep the majority of the logic in your build script and use the platform-specific stuff only to configure the environment for it.
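
For example, a minimal sketch of that split, using a few of GitHub Actions' documented default variables (other platforms expose equivalents under different names):

    #!/usr/bin/env bash
    set -euo pipefail

    # Context the platform sets automatically on GitHub Actions runners.
    if [[ "${GITHUB_EVENT_NAME:-}" == pull_request* ]]; then
      mode=check      # fast feedback for PRs
    elif [[ "${GITHUB_REF_NAME:-}" == "main" ]]; then
      mode=release
    else
      mode=dev
    fi

    echo "building in $mode mode"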

Of course, as you mention, if you want to do things like comment on PRs or report detailed status information, you have to dig deeper.

  • Yes, and real portability for working with those environment variables is doable, but AFAIK there's nothing out there that provides it for you. You just have to read a lot of documentation carefully.

    My team offers integrations of static analysis tools and inventorying tools (SBOM generation + CVE scanning) to other teams at my organization, primarily for appsec purposes. Our organization's departments have a high degree of autonomy, and tooling varies a lot. We have code hosted in GitLab, GitHub, Azure DevOps, and, in distant corners my team has not yet worked with, elsewhere. Teams we've worked with run their CI in GitLab, GitHub, Azure DevOps, AWS CodeBuild, and Jenkins. The actual runners teams use may be SaaS-provided by the CI platform or self-hosted on AWS or Azure. In addition to running in CI, we provide the same tools locally, for use on macOS as well as Linux via WSL.

    The tools my team uses for these scans are common open-source tools, and we distribute them via Nix (and sometimes Docker). That saves us a lot of headaches. But every team has their own workflow preferences and UI needs, and we have to meet them on the platforms they already use. For now we manage it ourselves, and it's not too terrible. But if there were something that actually abstracted away boring but occasionally messy differences, like what the various environment variables mean in different CI systems, that would be really valuable for us. (The same goes even for comment bots and PR management tools. GitHub and GitLab are popular, but Azure DevOps is deservedly marginal, so even general-purpose tools rarely support both Azure DevOps and the other forges.)
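
    The shim itself isn't the hard part; the careful reading is. A sketch of the detection layer for a few of these platforms (variable names are per each platform's documentation, but verify against the runners you actually use):

        # ci-env.sh: normalize basic facts about the current CI run.
        if [[ -n "${GITHUB_ACTIONS:-}" ]]; then
          CI_PLATFORM=github
          CI_BRANCH="${GITHUB_REF_NAME:-}"
          [[ "${GITHUB_EVENT_NAME:-}" == pull_request* ]] && CI_IS_PR=1 || CI_IS_PR=0
        elif [[ -n "${GITLAB_CI:-}" ]]; then
          CI_PLATFORM=gitlab
          CI_BRANCH="${CI_COMMIT_REF_NAME:-}"
          [[ -n "${CI_MERGE_REQUEST_IID:-}" ]] && CI_IS_PR=1 || CI_IS_PR=0
        elif [[ -n "${TF_BUILD:-}" ]]; then            # Azure DevOps
          CI_PLATFORM=azure-devops
          CI_BRANCH="${BUILD_SOURCEBRANCHNAME:-}"
          [[ "${BUILD_REASON:-}" == "PullRequest" ]] && CI_IS_PR=1 || CI_IS_PR=0
        elif [[ -n "${JENKINS_URL:-}" ]]; then
          CI_PLATFORM=jenkins
          CI_BRANCH="${BRANCH_NAME:-}"                 # multibranch pipelines
          [[ -n "${CHANGE_ID:-}" ]] && CI_IS_PR=1 || CI_IS_PR=0
        else
          CI_PLATFORM=unknown CI_BRANCH="" CI_IS_PR=0
        fi
        export CI_PLATFORM CI_BRANCH CI_IS_PR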

    If your concern is that one day, a few years from now, you'll need to migrate from one forge to another, maybe you can say "my bash script handles all the real build logic" and get away with writing off all the things it doesn't cover. Maybe you spend a few days or even a few weeks rewriting some platform-specific logic when that time comes and forget about it. But when you're actually contending with many such systems at once, you end up wishing for sane abstractions or crafting them yourself.

  • How can you build your containers in parallel?

    Over multiple machines? I'm not sure that a sh script can do that with GitHub.

    • If you build them with Nix, you can. Just call `nix build` with a trailing `&` a bunch of times.
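
      Something like this, assuming a flake with several image outputs (the attribute names are made up):

          # The daemon schedules all three builds, sharing common dependencies.
          nix build .#api-image &
          nix build .#worker-image &
          nix build .#frontend-image &
          wait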

      But it's kind of cheating, because the Nix daemon actually handles per-machine scheduling and cross-machine orchestration for you.

      Just set up some self-hosted runners with Nix and an appropriate remote builders configuration to get started.
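
      The builders setting is just a list of machines. A one-off override looks like this (hostnames hypothetical; normally this lives in nix.conf or /etc/nix/machines):

          # Fields per machine: URI, platform, SSH key ('-' = default), max jobs.
          nix build .#release \
            --builders 'ssh://builder1 x86_64-linux - 8 ; ssh://builder2 aarch64-linux - 8'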

      If you really want to, you can graduate after that to a Kubernetes cluster where Nix is available on the nodes. Pass the Nix daemon socket through to your rootless containers, and you'll get caching in the Nix store for free even with your ephemeral containers. But you probably don't need all that anyway. Just buy or rent a big build server. Nix will use as many cores as you have by default. It will be a long time before you can't easily buy or rent a build server big enough.
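
      For the socket pass-through, the shape is roughly this, with plain Docker standing in for a k8s hostPath volume (image and attribute names hypothetical; the image needs Nix in it):

          # Mount the host's store and daemon socket; NIX_REMOTE=daemon makes
          # nix inside the container delegate builds to the host's nix-daemon.
          docker run --rm \
            -v /nix:/nix \
            -e NIX_REMOTE=daemon \
            my-ci-image nix build .#something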

    • These problems are general ones, and the solution is the same as for running programs in parallel or across machines. When you need to build for different architectures (and need a host to provide the toolchains), what's stopping you from issuing more than one command in your CI/CD pipeline? Most pipelines have a way of running something on a specific host. So do k8s, ECS, <pick your provider>, and probably your IT team.
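
      Concretely, one job can fan out with plain ssh (hostnames hypothetical):

          # Build each architecture where its toolchain lives, in parallel.
          ssh build-amd64 'cd app && ./build.sh -config Release' &
          ssh build-arm64 'cd app && ./build.sh -config Release' &
          wait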

      In my experience, when it comes time to actually build the thing, a one-liner (with args if you need them) is the best approach. If you really REALLY need to, you can have more than one script for doing it, depending on what path down the pipeline you take. Maybe it's

          1) ./build.sh -config Release
          2) ./deploy.sh -docker -registry=<$REGISTRY> --kick

      Just try not to go too crazy. The larger the org, the larger this wrangling task can be. Look at Google and gclient/gn. Not saying it's bad, just saying it's complicated for a reason. You don't need that (you'll know if you do).

      The point I made is that I hate seeing 42 lines in a build workflow YAML that aren't syntax highlighted because they've been |'d in there. The YAML of your pipelines should be configuration for the pipeline, and the actual execution should be outsourced to a script you provide.
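
      That is, the workflow shrinks to something like this (GitHub Actions syntax; a sketch, with the deploy condition made up):

          # ci.yml: configuration only; execution lives in the scripts.
          steps:
            - uses: actions/checkout@v4
            - name: Build
              run: ./build.sh -config Release
            - name: Deploy
              if: github.ref == 'refs/heads/main'
              run: ./deploy.sh -docker -registry="$REGISTRY" --kick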

There are some very basic tools that can help with portability, such as https://github.com/milesj/rust-cicd-env , but I agree that there is a lot of proprietary, vendor-specific, valuable functionality available in the average "CI" system that you cannot make effective use of with this approach. Still, it's the approach I generally favor for a number of reasons.