Comment by hungryhobbit
1 day ago
Fun article, but it leaves out my favorite "almost ignore" feature in Git: `.gitattributes`.
This file lets you specify that git should "ignore" the diff from certain files. For instance, Node projects have a `package-lock.json` that is pure noise from a Git standpoint (it's just massive amounts of diff specifying specific versions of libraries, and the real human-readable version is in a separate `package.json` file).
With `.gitattributes` in the root of your project, you can just add a line:
`package-lock.json -diff`
Now, that file will still get staged/committed (which you want) ... but when you `git diff` you won't see the massive amounts of pointless diff in that file.
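For anyone who wants to script that step, here is a minimal sketch that appends the line only when it is missing. The helper name `ensure_gitattribute` is made up for illustration, and it only does plain line matching; it does not understand Git's full attribute syntax.

```python
from pathlib import Path


def ensure_gitattribute(repo_root: str, pattern: str, attribute: str) -> bool:
    """Append "<pattern> <attribute>" to .gitattributes unless already present.

    Returns True if the file was modified. Hypothetical helper, not part of Git.
    """
    path = Path(repo_root) / ".gitattributes"
    wanted = f"{pattern} {attribute}"
    existing = path.read_text().splitlines() if path.exists() else []
    # Compare whitespace-normalised lines so "a   -diff" matches "a -diff".
    if any(" ".join(line.split()) == wanted for line in existing):
        return False
    existing.append(wanted)
    path.write_text("\n".join(existing) + "\n")
    return True
```

Calling `ensure_gitattribute(".", "package-lock.json", "-diff")` from the repository root would add the line shown above, and a second call would leave the file alone.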
> that is pure noise from a Git standpoint
It shouldn't be noise. Don't update it if you're not intentionally trying to, otherwise you're exposing yourself to supply-chain risk for no reason. If you are regularly getting unexpected `package-lock.json` changes then you are doing something wrong.
It also directs GitHub to automatically collapse those files to the "Show Diff" interface by default. I'd still call the contents of things like lockfiles, protobuf output, big JSON blobs, etc., "noise" when reviewing PRs for code changes, but that doesn't mean I don't look at them.
It's not about unexpected changes. It's about DX in git CLI. You don't want to see massive diffs that are basically unreadable for humans, you just want to see that the file changed.
> you just want to see that the file changed
I check the diff for uv.lock (the Python counterpart of package-lock.json) every time I merge a PR. It is important to know which direct or transitive dependencies have been updated. We don't blindly bump all dependencies to the latest versions (you shouldn't either).
If your diffs are too large to review, your project structure needs to change. I go by the broad statement that EVERY line should be read, understood and explainable by the developer.
For critical files like package-lock.json I'd also expect developers to explain why a library was added or a version was changed and the impact of the version change. The lack of such basic hygiene is why supply chain attacks are so common these days.
But it's not always massive, it's a good practice to see what the diff is and ensure there is no weird dependency (aka supply chain attack) showing up in there.
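As a rough illustration of what "checking for a weird dependency" could look like when automated, here is a sketch that compares two parsed lockfiles. It assumes the npm layout with a top-level `packages` object whose entries may carry a `resolved` URL; the function names and the default allowed host are assumptions for the example, not an established tool.

```python
from urllib.parse import urlparse


def added_packages(old_lock: dict, new_lock: dict) -> list[str]:
    """Return lockfile entries present in new_lock but not in old_lock."""
    old = old_lock.get("packages", {})
    new = new_lock.get("packages", {})
    # The "" key is the root project itself, so skip it.
    return sorted(key for key in new if key and key not in old)


def unexpected_sources(lock: dict, allowed_hosts=("registry.npmjs.org",)) -> list[str]:
    """Return entries whose "resolved" URL points outside allowed_hosts."""
    flagged = []
    for key, meta in lock.get("packages", {}).items():
        resolved = meta.get("resolved")
        if resolved and urlparse(resolved).hostname not in allowed_hosts:
            flagged.append(f"{key} -> {resolved}")
    return sorted(flagged)
```

This only surfaces new entries and unusual download hosts. It says nothing about whether a given version from the normal registry is trustworthy.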
Are you saying "as a developer, you don't want to see what code you ship as transitive dependencies"?
I guess it's the norm in the software industry, but that's slightly irresponsible.
You know what’s bad DX? Your company’s product having a massive security breach, people stop using it, and having to lay off all the software engineers
DX = ? Developer experience maybe?
The point is that it should not be massive.
It's a CLI. DX is not the only concern. What about scripts that expect the default git behavior?
You could argue "those scripts are dumb then! outta my way!", but then you shouldn't be using a CLI for whatever it is you're trying to do. If you insist, you can just grep or use the --stat option.
We already know the git CLI has plenty of antifeatures like this. It is up to the devs how they want to proceed, but it doesn't change the fact that hiding things is a footgun.
I think you're missing the point there. It's like how I need to commit my project files for the project to compile; they're in XML format, so they're human readable. But that doesn't mean I need to see the diff, because I'm not going to review them.
People are jumping on it being an important file to review. You don't want to ignore the diff.
Even if that's true, you definitely do not want to attempt to merge two lock files, and using the .gitattributes file to set the merge strategy is a good idea!
There are also "semantic" diff and merge tools for a variety of languages, and a few specialized for JSON. That stuff was always pretty niche, but it's becoming more popular with AI agents and not wanting to or not being able to review every merge by hand.
thank you for being the first in this thread to actually prescribe what strategy to use, it's been infuriating reading through but not having the same level of knowledge
package-lock.json shows all your transitive dependencies, package.json just shows your direct dependencies. It is simply not true that the latter is "the real human-readable version". They serve different purposes and it is dangerous to say you can always ignore the diff in your lock file.
It also leaves out my favorite way of ignoring files in Git: actually just ignore them yourself and never use shortcuts that break that (like "git commit -a" or "git add .")!
To me it still sounds like a build artifact, and not source code: yes, you want to keep it and track changes to it, but freeze tools should allow one to easily get a reproducible build of package-lock.json too (eg. by passing a timestamp, it should be able to regenerate the lock file with latest-as-of-timestamp).
Maybe they do — I am not too deep in JS ecosystem — but that should be the basis of a true SBoM (generated, static artifact tied to a release build) and reproducible builds (able to regenerate byte-for-byte identical artifacts from actual source of truth which is your package.json).
Better: set up a git diff driver so you see the semantic changes, not line-by-line changes.
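One way to get there is Git's `textconv` mechanism: Git runs a command on each version of the file and diffs the command's output instead of the raw content. Below is a minimal sketch of such a converter, assuming the npm lockfile layout with a top-level `packages` object; the script name `lock_textconv.py` and the driver name `lockfile` are hypothetical.

```python
import json
import sys


def summarize(lock: dict) -> list[str]:
    """Return one "path@version" line per locked package, sorted by path."""
    lines = []
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the "" key is the root project itself
            continue
        lines.append(f"{path}@{meta.get('version', '?')}")
    return sorted(lines)


if __name__ == "__main__":
    # Git invokes a textconv command with one argument: the path of a
    # temporary file holding the blob to convert.
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8") as fh:
            print("\n".join(summarize(json.load(fh))))
```

To wire it up you would put `package-lock.json diff=lockfile` in `.gitattributes` and run `git config diff.lockfile.textconv "python3 lock_textconv.py"`. After that, `git diff` on the lockfile should show one changed line per package whose version moved, rather than the full JSON churn.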
as someone who deals with dep upgrades and forensics when trying to figure out a bug I would get _so mad_ if `git diff` didn't show the diffs to lock files.
I get what you're saying about it being line noise but when you need it you need it!
and in today's world of constant supply-chain attacks, you probably _do_ need it!
We've adapted:
- our CI and git hooks, so that our dependency or .lock files are visible when they change, and error if they change inconsistently
- and our team procedures, to confine dependency updates to dedicated commits
The idea being that when you see one of those "messy" .lock file changes...you were expecting it. If you see one and are annoyed by it (like OP) that's actually a waving red flag that a dependency changed.
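A hook along those lines might look roughly like the sketch below. It is only one reading of "dedicated commits": if a commit touches a lock file, every other file in it must also be a dependency file. The file sets and the function name `mixed_in_files` are assumptions for illustration, not the commenter's actual setup.

```python
from pathlib import PurePosixPath

# Assumed sets of dependency files; adjust for the ecosystems in your repo.
DEPENDENCY_FILES = {"package.json", "package-lock.json", "pyproject.toml", "uv.lock"}
LOCK_FILES = {"package-lock.json", "uv.lock"}


def mixed_in_files(changed_paths: list[str]) -> list[str]:
    """If a commit touches a lock file, return the non-dependency files it also touches."""
    names = {p: PurePosixPath(p).name for p in changed_paths}
    if not any(name in LOCK_FILES for name in names.values()):
        return []  # no lock file changed, nothing to enforce
    return sorted(p for p, name in names.items() if name not in DEPENDENCY_FILES)
```

In CI, the input could come from something like `git diff --name-only HEAD~1 HEAD`, with the job failing whenever the returned list is non-empty.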
Sounds like a powerful feature for subverting code review…
You should 100% track package-lock.json, and I'll go a step further and say you should most likely track node_modules too.
If the underlying infrastructure does not provide reproducible builds, I'd suggest you should instead fix that.
This is probably the most batshit insane insecure advice I've ever read on Hacker News ever. And everyone is wondering why NPM based attacks are so prevalent? Advice like this is being followed.
Explain the attack that gets mitigated by reading the diff of a lockfile?
Every major npm attack I can think of essentially follows the pattern of "version X.Y.Z is secretly evil". How does seeing package@X.Y.Z in your lockfile alert you to that?
I think you misunderstand the functionality. It doesn't ignore the diff completely. It just replaces the full contents with `Binary files differ`.
> Use -diff to completely hide the internal file content during a diff. Git will only report `Binary files differ` if the file changes.
The same as you would see for binary files. It's still good advice to actually review the lockfile changes at some point.
You can also apparently write transformers to make it more human readable.
It’s fine imo, you’ll still see the diffs in PRs before merging, but the majority of the time it’s just noise when developing locally. LLM agents also use git diffs frequently, so why spend 10x the tokens analyzing package lock diffs instead of actual business logic changes?