Comment by reactordev

4 days ago

The rule for CI/CD and DevOps in general is to boil your entire build process down to one line:

    ./build.sh

If you want to ship containers somewhere, do it in your build script, where you check whether you’re running in “CI”. No fancy-pants workflow YAMLs to vendor-lock yourself into whatever CI platform you’re using today, or tomorrow. Just check out, build with params, and point your coverage checker at it.
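
As a rough sketch (the image name and registry are placeholders), the “am I in CI?” check lives inside the script itself:

    #!/usr/bin/env bash
    set -euo pipefail

    # The same script runs everywhere; only the environment differs.
    ./configure && make        # stand-in for whatever your real build is
    make test

    # Ship containers only under CI; most platforms export CI=true.
    if [ "${CI:-}" = "true" ]; then
      docker build -t registry.example.com/myapp:latest .
      docker push registry.example.com/myapp:latest
    fi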

This is also the rule for onboarding new hires. They should be able to check out, build, and set up a local environment with no issues or caveats. This ensures they’re ready to PR by the end of the day.

(Fmr Director of DevOps for a Fortune 500)

Yeah, that's a good rule. Except: do you want to build Debug or Release? Or maybe RelWithDebInfo? And do you want that with sanitizers, maybe? And what should the sanitizers' options be? Do you want to compile your tests too, in case you want to run them later on a different machine? And what about that dependency that takes two hours to compile? Maybe you just want to reuse the previous compilation of it, and if so, where do you take it from? Etc., etc.

Before long, you need another script that will output the train of options to your `build.sh`.
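
Something like this hypothetical wrapper, just to assemble that train of options (every variable name here is illustrative):

    #!/usr/bin/env bash
    # Hypothetical wrapper that assembles the "one-liner" for build.sh.
    set -euo pipefail

    BUILD_TYPE="${BUILD_TYPE:-Release}"                   # Debug | Release | RelWithDebInfo
    SANITIZERS="${SANITIZERS:-}"                          # e.g. "address,undefined"
    BUILD_TESTS="${BUILD_TESTS:-ON}"                      # compile tests to run on another machine
    DEP_CACHE_DIR="${DEP_CACHE_DIR:-$HOME/.cache/deps}"   # reuse the two-hour dependency build

    exec ./build.sh \
      --build-type "$BUILD_TYPE" \
      ${SANITIZERS:+--sanitizers "$SANITIZERS"} \
      --build-tests "$BUILD_TESTS" \
      --dep-cache "$DEP_CACHE_DIR"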

(If Fortune 500 companies can do a one-line build with zero parameters, I suspect I'd be very bored there.)

  • Of course we had parameters, but we never ship debug builds. Treat everything like production.

    If you want to debug, use docker compose, or add logs and metrics and go find what you’re looking for.

You still inevitably need a bunch of CI platform-specific bullshit for determining "is this a pull request? which branch am I running on?", etc. Depending on what you're trying to do and what tools you're working with, you may need such logic both in an accursed YAML DSL and in your build script.
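
Even answering just those two questions portably turns into a sketch like this (a minimal version covering only GitHub Actions and GitLab CI, using their documented variables):

    # Normalize "is this a PR?" and "which branch?"; every other platform needs its own case.
    if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
      [ "${GITHUB_EVENT_NAME:-}" = "pull_request" ] && IS_PR=true || IS_PR=false
      BRANCH="${GITHUB_HEAD_REF:-${GITHUB_REF_NAME:-}}"
    elif [ "${GITLAB_CI:-}" = "true" ]; then
      [ "${CI_PIPELINE_SOURCE:-}" = "merge_request_event" ] && IS_PR=true || IS_PR=false
      BRANCH="${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME:-${CI_COMMIT_REF_NAME:-}}"
    else
      IS_PR=false
      BRANCH="$(git rev-parse --abbrev-ref HEAD)"
    fi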

And if you want your CI jobs to do things like report cute little statuses, integrate with your source forge's static analysis results viewer, or block PRs, you have to integrate with the forge at a deeper level.

There aren't good tools today for translating between the environment variables and other things that various CI platforms expose, for managing secrets (if you use CI to deploy things) that are exposed in platform-specific ways, and so on.

If all you're doing with CI is spitting out some binaries, sure, I guess. But if you actually ask developers what they want out of CI, it's typically more than that.

  • A lot of CI platforms (such as GitHub) spit out a lot of environment variables automatically that can help you with the logic in your build script. If they don't, they should give you a way to set them. One approach is to keep the majority of the logic in your build script and just use the platform-specific stuff to configure the environment for the build script.

    Of course, as you mention, if you want to do things like comment on PRs or report detailed status information, you have to dig deeper.

    • Yes, and real portability for working with the environment variables is doable, but there's nothing out there that provides it for you, afaik. You just have to read a lot of documentation carefully.

      My team offers integrations of static analysis tools and inventorying tools (SBOM generation + CVE scanning) to other teams at my organization, primarily for appsec purposes. Our organization's departments have a high degree of autonomy, and tooling varies a lot. We have code hosted in GitLab, GitHub, Azure DevOps, and, in distant corners my team has not yet worked with, elsewhere. Teams we've worked with run their CI in GitLab, GitHub, Azure DevOps, AWS CodeBuild, and Jenkins. The actual runners teams use may be SaaS-provided by the CI platform, or self-hosted on AWS or Azure. In addition to running in CI, we provide the same tools locally, for use on macOS as well as Linux via WSL.

      The tools my team uses for these scans are common open-source tools, and we distribute them via Nix (and sometimes Docker). That saves us a lot of headaches. But every team has their own workflow preferences and UI needs, and we have to meet them on the platforms they already use. For now we manage it ourselves, and it's not too terrible. But if there were something that actually abstracted away boring but occasionally messy differences, like what the equivalent environment variables are in different CI systems, that would be really valuable for us. (The same goes even for comment bots and PR management tools. GitHub and GitLab are popular, but Azure DevOps is deservedly marginal, so even general-purpose tools rarely support both Azure DevOps and other forges.)

      If your concern is that one day, a few years from now, you'll need to migrate from one forge to another, maybe you can say "my bash script handles all the real build logic" and get away with writing off all the things it doesn't cover. Maybe you spend a few days or even a few weeks rewriting some platform-specific logic when that time comes and forget about it. But when you're actually contending with many such systems at once, you end up wishing for sane abstractions or crafting them yourself.

    • How can you build your containers in parallel?

      Over multiple machines? I'm not sure that a shell script can do that with GitHub.


  • There are some very basic tools that can help with portability, such as https://github.com/milesj/rust-cicd-env , but I agree that there is a lot of proprietary, vendor-specific, valuable functionality available in the average "CI" system that you cannot make effective use of with this approach. Still, it's the approach I generally favor for a number of reasons.

The other rule is that the script should run as a regular user, and operate solely on its working directory.

There are too many scripts like that which start, ask for sudo, and then it's off to implementing someone's "great idea" about your system's network interfaces.

  • sudo should not be required to build software.

    If there’s something you need that requires sudo, it’s pre-build environment setup on your machine. On the host. Or wherever. It’s not part of the build. If you need credentials, get them from secrets or environment variables.

    • For use cases like making tar files with contents owned by root, Debian developed the tool "fakeroot", which intercepts standard library calls: when the build script sets a file's owner to root and later reads the ownership back, it sees root, so root ownership is what gets recorded in the tar file.
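
      For example (paths hypothetical), the whole thing runs unprivileged:

          # fakeroot fakes the chown, so tar records root ownership without sudo.
          fakeroot sh -c '
            chown -R root:root staging/
            tar -cf rootfs.tar -C staging .
          '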


You’re not wrong, but your suggestion also throws away a lot of the major benefits of CI. I agree jobs should be one-liners, but we still need more than one…

The single-job pipeline doesn’t tell you what failed. It doesn’t parallelize unit and integration test suites while dealing with the combinatorial matrix of build type, target device, etc.

At some point, a few CI runners become more powerful than a developer’s workstation. Parallelization can really matter for reducing CI times.

I’d argue the root of the problem is that we are stuck on using “make” and scripts for local build automation.

We need something expressive enough to describe a meaningful CI pipeline but that also allows local execution.
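
The usual stopgap is a bespoke dispatcher, as sketched below (job names and helper scripts are hypothetical): one entry point that both the CI config and developers call, so each pipeline job degenerates into a one-liner again.

    #!/usr/bin/env bash
    # ci.sh: hypothetical single entry point; the CI YAML and developers both call it,
    # so every pipeline job is just "./ci.sh <job-name>".
    set -euo pipefail

    case "${1:-}" in
      build)            ./build.sh ;;
      unit-test)        ./build.sh --tests && ./run-tests.sh unit ;;
      integration-test) ./build.sh --tests && ./run-tests.sh integration ;;
      *) echo "usage: $0 {build|unit-test|integration-test}" >&2; exit 2 ;;
    esac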

Sure, one can develop a bespoke solution, but reinventing the wheel each time gets tiring and eventually becomes a sizable time sink.

In principle, we should be able to execute pieces of .gitlab-ci.yml locally, but even that becomes non-trivial with all the nonstandard YAML behaviors GitLab adds, not to mention the varied executor types.

Instead we have a CI workflow and a local workflow and hope the two are manually kept in sync.

In some sense, the current CI-only automation tools (GitLab, Jenkins, etc.) shouldn’t even need to exist: why didn’t we just use a cron job running “build.sh”?

I’d argue these tools should mainly focus on the “reporting/artifacts” side, with the pipeline execution parts handled elsewhere (or locally by a developer).

Shame on you GitLab!

  • You are mistaking a build for a pipeline. I still believe in pipelines and configuring the right hosts/runners to produce your artifacts. Your actual build on that host/runner should be a one-liner.

How do you get caching of build steps with this approach? Or do you just not?

  • Use a modern hermetic build system with remote caching or remote execution: Nix, Bazel, Buck, Pants. There are many options.

    • This is like fighting complexity with even more complexity. Nix and Bazel are definitely not close to actually achieving hermetic builds at scale. And when they break, the complexity of fixing them increases exponentially.


  • Even just makefiles have 'caching', provided you set dependencies and outputs correctly.

    A good makefile is really nice to use, though unfortunately not nice to read or trace.

  • We get that with Docker.

    Everything becomes a container, so why not use the container engine for it? It works well if you know how layers work…
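
    A minimal sketch of that idea (registry path hypothetical): pull the last pushed image and let BuildKit reuse its layers, so only the steps whose inputs changed get rebuilt.

        export DOCKER_BUILDKIT=1
        # Best-effort pull; the very first build simply has no cache to reuse.
        docker pull registry.example.com/app:cache || true
        docker build \
          --cache-from registry.example.com/app:cache \
          --build-arg BUILDKIT_INLINE_CACHE=1 \
          -t registry.example.com/app:latest \
          -t registry.example.com/app:cache .
        docker push registry.example.com/app:cache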