Comment by ramon156

12 hours ago

> The 20 most-changed files in the last year. The file at the top is almost always the one people warn me about. “Oh yeah, that file. Everyone’s afraid to touch it.”

The most changed file is the one people are afraid of touching?

28 comments

rbonvall 11 hours ago

Just like that place that's so crowded nobody goes there anymore.

dewey 11 hours ago

I've just tried this, and in my tests the most touched files are also the most irrelevant or boring ones (auto-generated files, the entry point of the service, etc.).

nulltrace 11 hours ago

Yeah same thing happens with lockfiles and CI configs. You end up filtering out half the list before it tells you anything useful.
pydry 10 hours ago
I just tried it too and it basically just flagged a handful of 1500+ line files which probably ought to be broken up eventually but aren't causing any serious problems.
- Cthulhu_ 9 hours ago
  
  If it's (like in my case) dependency management, localization or config files, breaking them up will likely only cause more issues. Make sure that it's an actual improvement before breaking things up.
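The filtering described in this subthread (auto-generated files, lockfiles, CI configs, dependency manifests) can be sketched roughly as below. This is only an illustration, not the article's command: the `NOISE_PATTERNS` list and all function names are hypothetical and would need adjusting per repository. It assumes the input is the output of `git log --since="1 year ago" --name-only --pretty=format:`.

```python
import subprocess
from collections import Counter
from fnmatch import fnmatch

# Hypothetical noise patterns; every repository will need its own list.
NOISE_PATTERNS = [
    "*.lock", "package-lock.json", "package.json", "pom.xml",
    "CHANGELOG*", "README*", ".github/*",
]


def is_noise(path, patterns=NOISE_PATTERNS):
    """True if the full path or its basename matches any noise pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(path, p) or fnmatch(name, p) for p in patterns)


def count_changes(log_output):
    """Count how often each path appears in the name-only log output."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def top_changed(log_output, n=20, patterns=NOISE_PATTERNS):
    """Most-changed paths after dropping anything that looks like noise."""
    counts = count_changes(log_output)
    kept = Counter(
        {path: c for path, c in counts.items() if not is_noise(path, patterns)}
    )
    return kept.most_common(n)


def git_log_names(since="1 year ago"):
    """Run git in the current directory and return one changed path per line."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def main():
    for path, count in top_changed(git_log_names()):
        print(f"{count:5d}  {path}")
```

Calling `main()` inside a repository prints the filtered list. Even then, as noted elsewhere in the thread, what remains may say more about which features were worked on than about where the bugs are.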

jbjbjbjb 10 hours ago

This command needs a warning. Using this command and drawing too many conclusions from it, especially if you’re new, will make you look stupid in front of your teammates.

I ran this on the repo I have open and after I filtered out the non code files it really can only tell me which features we worked on in the last year. It says more about how we decided to split up the features into increments than anything to do with bugs and “churn”.

Pay08 10 hours ago
Good thing that the article contains that warning, then.
- jbjbjbjb 10 hours ago
  
  Not really strong enough in a post about what to do in a codebase you’re not familiar with. In that situation you’re probably new to the team and organisation and likely to get off on the wrong foot with people if you assume their code “hurts”.
  
functional_dev 5 hours ago

I found it interesting that Git itself has a built-in notion of similarity: when it packs objects, it groups files by path and size, then runs delta compression to find which ones are close.
Very different from just counting commits - https://vectree.io/c/delta-compression-heuristics-and-packfi...
mayama 8 hours ago

These commands are just about which files to start looking at to understand a new codebase.
Eldt 9 hours ago
Better for people to know they're just blindly copying tools and parroting their output as if it's automatically meaningful. Any warning against that should be built into the individual, for their own sake.
- thiisguy 4 hours ago
  
  Right? Some of these comments feel like “you gave me commands to run and I should be able to turn my brain off to interpret the outputs”. These aren’t newbie commands, so the assumption would be that you kinda know what you’re doing, at least a little bit. If not, then don’t run them… similar to how you should approach all commands/things from the internet.

berkes 6 hours ago

Plotting churn against complexity is far more useful than churn alone.

It shows problematic places much better. High churn, low complexity: fine. It's recognized that this gets worked on a lot, and it's optimized for that (e.g. some mapping file, a DSL, business rules, etc.). Low churn, high complexity: fine too. It's a mess, but no one has to be there. But both? That's probably where most bugs originate, where PRs block, where test coverage is poor, and where everyone knows time is needed to refactor.

In fact, quite often I found that a team's call "to rewrite the app from scratch" was really about those few high-churn, high-complexity modules, files or classes.

Complexity is a deep topic, but even simple checks, like how deeply nested something is or how many statements it has, can do.
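A minimal sketch of that churn-versus-complexity idea, assuming churn is a per-file commit count you already have and using maximum indentation depth as the "how nested is it" check. The function names and the thresholds are hypothetical, and indentation depth is only a crude stand-in for a real complexity metric.

```python
def indentation_depth(source, tab_width=4):
    """Crude complexity proxy: deepest indentation level among non-blank lines."""
    deepest = 0
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines say nothing about nesting
        expanded = line.expandtabs(tab_width)
        indent = len(expanded) - len(expanded.lstrip(" "))
        deepest = max(deepest, indent // tab_width)
    return deepest


def hotspots(churn, complexity, churn_min, complexity_min):
    """Files that are high on BOTH axes, sorted by churn * complexity.

    churn and complexity are dicts mapping path -> number. The thresholds
    are arbitrary cut-offs chosen by the caller, not established values.
    """
    both = [
        (path, churn[path], complexity[path])
        for path in churn
        if path in complexity
        and churn[path] >= churn_min
        and complexity[path] >= complexity_min
    ]
    return sorted(both, key=lambda row: row[1] * row[2], reverse=True)
```

For example, with `churn = {"a.py": 40, "b.py": 50, "c.py": 2}` and `complexity = {"a.py": 6, "b.py": 1, "c.py": 9}`, calling `hotspots(churn, complexity, 10, 4)` keeps only `a.py`: `b.py` is high churn but simple, and `c.py` is complex but rarely touched.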

agumonkey 1 hour ago

Maybe it's a start for finding conflict-prone regions?

Otherwise you're right, it could be a long linear list of appends where people are happy to contribute.

mememememememo 12 hours ago

Yes. Because the fear is buttressed with necessity. You have to edit the file, and so does everyone else, and that is a recipe for a lot of mess. I can think back over years of files like this. Usually kilolines of impossible-to-reason-about do-everything code.

mchaver 11 hours ago

Definitely not in my experience. The most changed are the change logs, files with version numbers and readmes. I don't think anyone is afraid of keeping those up to date.

zikani_03 4 hours ago

pom.xml and package.json came up on a couple of separate projects I ran the commands on. Which makes sense, because the versions get bumped rather frequently. I guess context matters, as usual.

jollyllama 9 hours ago

Yeah, the truth is going to be a lot more subtle than this.

furyofantares 8 hours ago

The LLM that wrote the copy is an idiot.

jamwil 6 hours ago

This is such obvious LLM slop.

szszrk 11 hours ago

It could also be that a frequently edited file had the most opportunity to be broken. And it was edited by the most random crowd.

KptMarchewa 10 hours ago

In my case, it's .github/CODEOWNERS.

Nobody is afraid of changing it.

mayama 8 hours ago

Why does the GitHub CODEOWNERS file need frequent changes? Do the members of your team change so often?