
Comment by suzzer99

1 month ago

Dev productivity is like quantum mechanics. The second you try to measure it, the wave function collapses, and you've fundamentally altered the thing you were trying to measure.

That said, at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits. This Tim sensei situation may be more common than I think, but I've never run into it.

> at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits.

This works for Junior through Senior level roles, but it falls apart quickly when you have Staff+ roles in your company. Engineers in those roles still code, but they write a fraction of what they used to and that is by design—you want these people engineering entire initiatives, integrations, and migrations. This kind of work is essential and should be done by your top devs, but it will lead to many fewer commits than you get out of people who are working on single features.

With a few exceptions for projects with a high degree of source-level complexity, a Staff+ engineer who's committing as much as your Senior engineers is probably either misleveled or misused.

  • Yeah, I've always worked at non-tech companies with pretty small teams and no roles like that. But it seems like any place with Staff+ engineers should know better than to try to stack them up against more junior devs based on some metric.

    • Case in point, I worked with a Staff software engineer (I was also the same level) who consistently, half-year after half-year, had zero diffs to their name.

      Because they were leaning more into a product archetype - which it turns out they were very suited to.

  • I have a decent number of commits as a staff engineer, but barely any are on ‘features’ as measured by story points.

    We’re building the things that other people use to deliver features better and faster.

I've worked on projects where every single ounce of enthusiasm is coming out of one mediocre developer. Nobody pays attention to team dynamics like this, but they're all over the place.

With my minor dabbling in game theory, I've considered funny angles like keeping internal metrics secret-ish and punishing those who simply game the incentives. Example: bot detection and banwaves in MMOs. Rather than instantly banning plausible bots, the company hides its (ever-changing) internal algo and bans in waves.

Basically, treating (non-)productivity like bot detection, lol.
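
A minimal sketch of that banwave idea applied generically, with entirely hypothetical heuristics and thresholds; the point is the shape (hidden rotating scorer, delayed batch action), not the specifics:

```python
import random

# Hypothetical checks over a per-user stats dict; names and thresholds
# are illustrative only, not any real anti-bot system.
HEURISTICS = [
    lambda u: u["actions_per_min"] > 200,  # inhumanly fast input
    lambda u: u["session_hours"] > 20,     # never logs off
    lambda u: u["input_entropy"] < 0.1,    # robotic, repetitive input
]

def run_banwave(users):
    """Score everyone privately, then act once per wave, not per detection."""
    heuristic = random.choice(HEURISTICS)  # the hidden algo rotates between waves
    flagged = [u["name"] for u in users if heuristic(u)]
    # Acting in one delayed batch denies gamers of the system a tight
    # feedback loop on which behavior tripped the detector.
    return flagged
```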

  • A few years ago we started tracking both office attendance and PR count. Management swore up and down that they understood the nuances, and these metrics would only be used to follow up on outliers.

    But then one middle manager sets a target. His peers, not to be outdone, set more ambitious targets. And their underlings follow up aggressively with whoever is below target. For a few months there was an implicit "no vacations" policy in much of the company, until they updated the attendance dashboard to account for PTO. And now an engineer's position in the stack rank is presumed to be her position in the PR count rank barring very strong evidence to the contrary.

    The approach you outline is probably optimal, but I don't think middle management as a culture would be capable of pulling it off. You give them a KPI in a dashboard, they're going to compete on it, in a very open and simple way. Metrics are hard to argue with, and nuance requires constant vigilance to maintain.

> That said, at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits

Yea, in the current place I work we do not measure coding performance metrics, but if you look at the top committers to the repo — by number of commits — you will see all of the people who bring the most value to the company. Even at staff eng the best engineers are still the ones who crank out code, albeit maybe not as much as a senior engineer. Staff engineers who only commit once a month are usually the ones who don't bring value to the company, especially given their high pay.
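
A minimal sketch of what that starting point looks like in practice, assuming a local clone; `git shortlog -sn` does the actual counting, and the wrapper's name and defaults are just illustrative:

```python
import subprocess
from collections import Counter

def top_committers(repo_path=".", n=10):
    """Rank authors by raw commit count; a starting point, not a verdict."""
    # `git shortlog -sn HEAD` prints "<count>\t<author>", sorted descending.
    out = subprocess.run(
        ["git", "-C", repo_path, "shortlog", "-sn", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts = Counter()
    for line in out.splitlines():
        count, author = line.strip().split("\t", 1)
        counts[author] = int(count)
    return counts.most_common(n)
```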

  • At my last gig I spent the last year and a half as staff engineer and made almost no commits because I was constantly writing proposals, defending them to execs, doing architecture reviews, doing design consultations, and planning long term responses to large incidents. I know for a fact that I brought a ton of value to the company but it was very decoupled from commits. I didn't really like it because I need to code, so I've moved on to a role where I commit like a maniac again. :)

    • Yea it is dependent on your company.

      When I think of highly functioning staff engineers I'm thinking of people like Jeff Dean and Mitchell Hashimoto. Similarly, at companies I have been at, even the highest-level principal super staff++ engineers still code. They do the high-level strategy and design, but they are still writing PRs at least every week or two. If you are spending 100% of your time on politics and architecture as a staff eng, I think that shows a somewhat dysfunctional organization.

    • Just check all of your proposals and design documents into source control, and you’ll still have plenty of commits!

    • Writing proposals, doing architecture reviews and doing design consultations sounds to me largely like busywork and bureaucracy rather than real and productive work.

      Maybe in a business where the cost of iteration is very high, it makes sense to do this. But in a good business you should not be doing architecture reviews and writing proposals, you should be doing refactoring and creating prototypes, respectively.


> The second you try to measure it, the wave function collapses, and you've fundamentally altered the thing you were trying to measure.

Related to Goodhart's law: "When a measure becomes a target, it ceases to be a good measure".

I'm wondering if this is only true if the act of measurement is transparent (so that the people being measured can game them). If the act is opaque enough such that the form of the metric cannot easily be guessed SEO-style, then maybe wave function collapse can still be avoided?

You probably can't do it observationally. You could give several developers the same project, and probably see pretty big differences in timeline and quality. But no one is going to put in the work to run that kind of assessment at scale with the level of realism and time duration (probably several weeks) to make it meaningful.

I like looking at lines removed in context with lines added or PR count. Someone more experienced will be refactoring and removing old stuff while adding new stuff, and will have some balance (barring vendoring or moving file trees between repos).

  • Lines removed should count like 5x lines added. I don't trust any developer who doesn't get a thrill out of removing code.
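
    A rough sketch of that weighted balance, built on `git log --numstat`; the 5x weight and the helper's name are my own illustration, and the vendoring/tree-move caveat above still applies:

    ```python
    import subprocess
    from collections import defaultdict

    REMOVED_WEIGHT = 5  # the 5x above; entirely a judgment call

    def add_remove_balance(repo_path="."):
        """Per-author added/removed line totals, sorted by weighted score."""
        # `--numstat` emits "<added>\t<removed>\t<path>" per file per commit.
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--numstat", "--pretty=format:@%an"],
            capture_output=True, text=True, check=True,
        ).stdout
        stats = defaultdict(lambda: [0, 0])  # author -> [added, removed]
        author = None
        for line in out.splitlines():
            if line.startswith("@"):
                author = line[1:]
            elif line and author:
                added, removed, _path = line.split("\t", 2)
                if added != "-":             # binary files report "-"
                    stats[author][0] += int(added)
                    stats[author][1] += int(removed)
        return sorted(
            stats.items(),
            key=lambda kv: kv[1][0] + REMOVED_WEIGHT * kv[1][1],
            reverse=True,
        )
    ```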