Comment by craftkiller

1 day ago

Along similar lines, when I was reading the article I was thinking "this just sounds like a slightly worse version of nix". Nix has the whole content addressed build DAG with caching, the intermediate language, and the ability to produce arbitrary outputs, but it is functional (100% of the inputs must be accounted for in the hashes/lockfile, as opposed to Docker where you can run commands like `apk add firefox` which is pulling data from outside sources that can change from day to day, so two docker builds can end up with the same hash but different output, making it _not_ reproducible like the article falsely claims).

Edit: The claim about the hash being the same is incorrect, but an identical Dockerfile can produce different outputs on different machines/days whereas nix will always produce the same output for a given input.

> whereas nix will always produce the same output for a given input.

If they didn't take shortcuts. I don't know if it's been fixed, but at one point Vuze in nix pulled in an arbitrary jar file from a URL. I had to dig through it because the jar had been updated at some point but not the nix config and it was failing at an odd place.

  • This should result in a hash mismatch error rather than an output different from the previous one. If there is a way to locate the original jar file (hash matching), it will still produce the same output as before.

  • Flakes fixes this for Nix, it ensures builds are truly reproducible by capturing all the inputs (or blocking them).

    • Apparently I made note of this in my laptop setup script (but not when this happened so I don't know how long ago this was) so in case anyone was curious, the jar file was compiled with java 16, but the nix config was running it with java 8. I assume they were both java 8 when it was set up and the jar file upgraded but don't really know what happened.

    • No it doesn't. If the content of a url changes then the only way to have reproducibility is caching. You tell nix the content hash is some value and it looks up the value in the nix store. Note, it will match anything with that content hash so it is absolutely possible to tell it the wrong hash.

> so two docker builds can end up with the same hash but different output

The cache key includes the state of the filesystem so I don’t think that would ever be true.

Regardless, the purpose of the tool is to generate [layer] images to be reused, exactly to avoid the pitfalls of reproducible builds, isn’t it? In the context of the article, what makes builds reproducible is the shared cache.

  • It's not reproducible then, it's simply cached. It's a valid approach but there's tradeoffs of course.

    • it's not an either or, it can be reproducible and cached

      similarly, nix cannot guarantee reproducibility if the user does things to break that possibility

      3 replies →

  • Ah you're right, the hash wouldn't be the same but a Dockerfile could produce different outputs on different machines whereas nix will produce identical output on different machines.

    • Producing different outputs isn't dockerfile's fault. Dockerfile doesn't enforce reproducibility but reproducibility can be achieved with it.

      Nix isn't some magical thing that makes things reproducible either. nix is simply pinning build inputs and relying on caches. nixpkgs is entirely git based so you end up pinning the entire package tree.

    • If you are building a binary on different arches, it will not be the same. I have many container builds that I can run while disabling the cache and get the same hash/bytes in the end, i.e. reproducible across machines, which also requires whatever you build inside be byte reproducible (like Go)