Comment by morsecodist

1 month ago

The idea of measuring individual developer productivity is kind of absurd to me. I'm not saying that what we do is magic, there are just so many variables.

Measuring story points or lines of code is kind of the opposite of productivity. This encourages developers to do as much meaningless work as possible. I'd want a developer to make a task simpler or use an existing tool which means less time writing code. The value of them saving that work from knowing another way is high but hard to measure.

What you want is measuring business outcomes but those are hard to attribute to a particular developer.

I think unfortunately we're left with our subjective judgement here. I think we'd do better admitting to ourselves that we can't measure this than to pretend we have some sort of science here.

I had a director who was obsessed with GitHub Enterprise stats. He forbade people from squashing commits and told everyone to commit every day, even in the middle of something. This was so that he could see who was writing the most code.

One of our interns was close to the end of his term and this director wanted to hire him. He thought the intern was amazing based on the amount of code he wrote. The problem was that this intern was bad, so we had him write unit tests. However, he was also bad at writing unit tests. He would run the code and then write tests that enforced the current results, instead of considering that the current behaviour might be incorrect. Thankfully we didn't hire him, after others and I explained why the intern had so many commits and lines of code written.

  • Reminds me of an old joke about a policeman who pulls over a poor driver. The driver complains, "But I haven't been in a single accident." The policeman replies, "Yes, but you've caused dozens."

    Some people are just oblivious to the trail of destruction left in their wake.

    • “Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.” ― John Ousterhout, A Philosophy of Software Design

  • Reminds me of a manager and a QA I once knew. QA was a nice guy, but a terrible QA. Would fail stories on the most arbitrary guidelines. Story is about changing the font size on the home page? He'd fail it because he wasn't able to log in (he tested while the auth service was undergoing planned maintenance).

    Manager loved this guy, and pushed him through several promotions. Eventually other employees got tired of being skipped for promotions and left the company, creating a minor staffing crisis.

    • We kinda have this, but… At least we have a dedicated QA team. That spends their whole day literally just confirming that the shit we write is up to spec. We spend a lot of time resolving discrepancies between implementation, spec and test, but when things roll out the other end, they work.

    • Then the other employees found that this is how the world works: don't promote people that deliver results or you may have to replace them.

    • Well, I've seen a fair share of "blind" programmers who just did their (very limited) scope, perhaps they did it well... yet the whole tool just didn't work and nobody cared.

      I imagine that if the guy pointed out things like this, he wasn't popular with the development team, but popular with management and users.

    • people get promoted based upon their ability to conform to management culture, and their ability to deliver outcomes; in most places the former is far more important than the latter but not always

  • > who was writing the most code

    There's yer problem right there. Code quantity is not correlated with value. In fact, it can be negatively correlated with value if it's buggy and laden with technical debt. Measuring productivity by lines of code produced actively discourages writing clean, maintainable, bug-free code.

    • I’m getting a similar sense from many of the AI “success” stories we’ve been hearing. There’s amazement about how many lines of code the AI produces in a short amount of time.

      But not so much about maintaining and debugging all those lines of code once they’re created.

    • Turning a 100+ line function with a mess of 5+ intertwined control flows into a single expression of just a few lines of chained method calls is always a delightful experience to my mind.

    • > Code quantity is not correlated with value. In fact, it can be negatively correlated with value if it's buggy and laden with technical debt.

      ** "No Code" or Nihilist Software Engineering **

      No code runs faster than no code.

      No code has fewer bugs than no code.

      No code uses less memory than no code.

      No code is easier to understand than no code.

      No code is the best way to have secure and reliable applications. Write nothing; deploy nowhere.

      One of my most productive days was throwing away 1,000 lines of code. -- Ken Thompson

      The cheapest, fastest, and most reliable components are those that aren’t there. -- Gordon Bell

      Deleted code is debugged code. -- Jeff Sickel

      Measuring programming progress by lines of code is like measuring aircraft building progress by weight. -- Bill Gates

      * Master Foo and the Ten Thousand Lines *

      Master Foo once said to a visiting programmer: “There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

      The programmer, who was very proud of his mastery of C, said: “How can this be? C is the language in which the very kernel of Unix is implemented!”

      Master Foo replied: “That is so. Nevertheless, there is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

      The programmer grew distressed. “But through the C language we experience the enlightenment of the Patriarch Ritchie! We become as one with the operating system and the machine, reaping matchless performance!”

      Master Foo replied: “All that you say is true. But there is still more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

      The programmer scoffed at Master Foo and rose to depart. But Master Foo nodded to his student Nubi, who wrote a line of shell script on a nearby whiteboard, and said: “Master programmer, consider this pipeline. Implemented in pure C, would it not span ten thousand lines?”

      The programmer muttered through his beard, contemplating what Nubi had written. Finally he agreed that it was so.

      “And how many hours would you require to implement and debug that C program?” asked Nubi.

      “Many,” admitted the visiting programmer. “But only a fool would spend the time to do that when so many more worthy tasks await him.”

      “And who better understands the Unix-nature?” Master Foo asked. “Is it he who writes the ten thousand lines, or he who, perceiving the emptiness of the task, gains merit by not coding?”

      Upon hearing this, the programmer was enlightened.

      Source: http://www.catb.org/~esr/writings/unix-koans/ten-thousand.ht...

    • Code is a liability for both the business and the thinker at the end of the day. It's worth spending as much time as possible figuring out how to avoid writing it and how to keep it minimal.

  • https://yosefk.com/blog/engineers-vs-managers-economics-vs-b...

    > ...It's a common story and an interesting angle, but the "best vs good enough" formulation misses something. It sounds as if there's a road towards "the best" – towards the 100%. Engineers want to keep going until they actually reach 100%. And managers force them to quit at 70%:

    > > There comes a time in the life of every project where the right thing to do is shoot the engineers and ship the fucker.

    > However, frequently the road towards "the best" looks completely different from the road to "the good enough" from the very beginning. The different goals of engineers and managers make their thinking work in different directions. A simple example will illustrate this difference.

    > Suppose there's a bunch of computers where people can run stuff. Some system is needed to decide who runs what, when and where. What to do?

    > * An engineer will want to keep as many computers occupied at every moment as possible – otherwise they're wasted.

    > * A manager will want to give each team as few computers as absolutely necessary – otherwise they're wasted.

    > These directions aren't just opposite – "as many as possible" vs "as few as necessary". They focus on different things. The engineer imagines idle machines longing for work, and he wants to feed them with tasks. The manager thinks of irate users longing for machines, and he wants to feed them with enough machines to be quiet. Their definitions of "success" are barely related, as are their definitions of "waste".

    > The "good enough" is not 70% of "the best" – it's not even in the same direction. In fact, it's more like -20%: once the "good enough" solution is deployed, the road towards "the best" gets harder. You restrict access to machines, and you get people used to the ssh session interface, which "the best" solution will not provide.

  • Probably not the point, but committing every day is one methodology for TBD (trunk-based development), and it’s great for generating productivity. Not so useful for measuring productivity, though.

    I can’t imagine how miserable it would be to try to measure dev productivity by quantifying commits on a daily basis. What a waste.

  • This story tracks pretty closely with the Dilbert cartoon in the article. Except unintentionally.

  • > The problem was that this intern was bad, so we had him write unit tests.

    Please tell me you assigned this person an "increase code coverage" task and not an "I wrote a new feature, write the unit tests for it" task.

    • Yes, it was an "increase code coverage" task. The same director wanted 100% code coverage so I had the intern write tests for existing code that didn't have many tests.

Dev productivity is like quantum mechanics. The second you try to measure it, the wave function collapses, and you've fundamentally altered the thing you were trying to measure.

That said, at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits. This Tim sensei situation may be more common than I think, but I've never run into it.

  • > at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits.

    This works for Junior through Senior level roles, but it falls apart quickly when you have Staff+ roles in your company. Engineers in those roles still code, but they write a fraction of what they used to and that is by design—you want these people engineering entire initiatives, integrations, and migrations. This kind of work is essential and should be done by your top devs, but it will lead to many fewer commits than you get out of people who are working on single features.

    With a few exceptions for projects with a high degree of source-level complexity, a Staff+ engineer who's committing as much as your Senior engineers is probably either misleveled or misused.

    • Yeah, I've always worked at non-tech companies with pretty small teams and no roles like that. But it seems like any place with Staff+ engineers should know better than to try to stack them up against more junior devs based on some metric.

    • I have a decent number of commits as a staff engineer, but barely any are on ‘features’ as measured by story points.

      We’re building the things that other people use to deliver features better and faster.

  • I've worked on projects where every single ounce of enthusiasm is coming out of one mediocre developer. Nobody pays attention to team dynamics like this, but they're all over the place.

  • With my minor dabbling in game theory, I've considered funny angles like keeping internal metrics secret-ish and punishing those who simply game the incentives. Example: bot detection and banwaves in MMOs. Instead of instantly banning plausible bots, the company has to hide its (ever-changing) internal algo and ban in waves instead.

    Basically, treating (non-)productivity like bot detection, lol.

    • A few years ago we started tracking both office attendance and PR count. Management swore up and down that they understood the nuances, and these metrics would only be used to follow up on outliers.

      But then one middle manager sets a target. His peers, not to be outdone, set more ambitious targets. And their underlings follow up aggressively with whomever is below target. For a few months there was an implicit "no vacations" policy in much of the company, until they updated the attendance dashboard to account for PTO. And now an engineer's position in the stack rank is presumed to be her position in the PR count rank barring very strong evidence to the contrary.

      The approach you outline is probably optimal, but I don't think middle management as a culture would be capable of pulling it off. You give them a KPI in a dashboard, they're going to compete on it, in a very open and simple way. Metrics are hard to argue with, and nuance requires constant vigilance to maintain.

  • > That said, at every place I've worked you could get a starting point idea of who the top devs were by looking at overall code commits

    Yea, in the current place I work at we do not measure coding performance metrics, but if you look at the top committers to the repo (by number of commits), you will see all of the people who bring the most value to the company. Even at staff eng the best engineers are still the ones who crank out code, albeit maybe not as much as a senior engineer. Staff engineers who only commit once a month are usually the ones who don’t bring value to the company, especially given their high pay.

    • At my last gig I spent the last year and a half as staff engineer and made almost no commits because I was constantly writing proposals, defending them to execs, doing architecture reviews, doing design consultations, and planning long term responses to large incidents. I know for a fact that I brought a ton of value to the company but it was very uncoupled to commits. I didn’t really like it because I need to code, so I’ve moved on to a role where I commit like a maniac again. :)

  • > The second you try to measure it, the wave function collapses, and you've fundamentally altered the thing you were trying to measure.

    Related to Goodhart's law: "When a measure becomes a target, it ceases to be a good measure".

    I'm wondering if this is only true if the act of measurement is transparent (so that the people being measured can game them). If the act is opaque enough such that the form of the metric cannot easily be guessed SEO-style, then maybe wave function collapse can still be avoided?

  • You probably can't do it observationally. You could give several developers the same project, and probably see pretty big differences in timeline and quality. But no one is going to put in the work to run that kind of assessment at scale with the level of realism and time duration (probably several weeks) to make it meaningful.

  • I like looking at lines removed in the context of lines added or PR count. Someone more experienced will be refactoring and removing old stuff while adding new stuff, and will have some balance (barring vendoring or moving file trees between repos).

    • Lines removed should count like 5x lines added. I don't trust any developer who doesn't get a thrill out of removing code.

> What you want is measuring business outcomes but those are hard to attribute to a particular developer.

I think we could solve this by eliminating some middle tiers and putting the developers on the actual customer calls every week. Each senior developer owns at least one customer. That sort of thing.

It's a completely different ball game when you are a developer on some B2B/SaaS and you are answering directly to the customer regarding your work. There's no one to hide behind: your teammates are now beside you and can only render supplemental aid. Once you have developers answering directly to the customers, you have a simple, ~boolean, qualitative metric that virtually any manager/investor can get behind: Is the customer happy?

If the developers are keeping the customers happy, then they are performing. If the customer is unhappy, then there is a performance issue. Whether or not most or all of it can be attributed to blockers on the customer's side is irrelevant at the senior levels of play. Developers should be helping the customers get unblocked. Offer meetings & screen share time. Generally, make yourself as available as possible. If you do this correctly, the customer is likely to take note and provide you with some amount of cover (i.e., not be in a raging fit when they call your CEO).

There is no reality in which you can inject more middle management or process to get developers to take more accountability. The only thing that works is to get them out of the bullshit matrix and in front of the customers. They need to experience how it feels to work directly with a happy customer and then realize how important it is to keep them that way.

This path also happens to solve a whole host of other maladies in technology business, most notably shiny rabbit chases. When you have the fear of god in you regarding the client, you aren't going to be as inclined to play in traffic with a NuSQL engine on their watch.

  • lol no, I have worked with companies that do that. Many developers have very specific personality types, and they also think in technical terms rather than customer terms. It isn’t a bad thing, it is why they are good at what they do.

    The most successful companies I have worked with have a good product manager that can take customer input and work with a technical manager to balance priorities/effort. The technical manager consults the team before making decisions (such as in Scrum meetings, if using that).

    Issues come into play when folks start distrusting developers. When we say it will take 4 weeks or 8 weeks to implement something, there is a reason for that. We know the code. We know how much of a PITA it is to work with, and we aren’t being misleading. On the flip side, we have been trained to give conservative estimates thanks to crap management and unexpected pitfalls, so we try to under-promise and over-deliver. If management could recognize this, everyone would be happy, provided they recognize that the 8-week timeline is fine, don’t promise something different to the client, and trust devs to do their thing.

    EDIT: Managers also tend to forget that we aren't a machine in a factory. We have good days, bad days, and everything in between. We excel with using our brains, however our brains suffer from anything between lack of sleep, depression, and other nonsense due to just plain having a bad day. I feel like it is more noticeable with us because we rely on our brains so much more than other folks in other professions do. Shoot, even random noise in an office, whether working at home or at an office, can hurt productivity.

    Now I have myself missing dev work. Hoping to go back soon. Currently unemployed.

    • Farming would be extremely effective if the crops would be plowed down into the ground when they're ripe, instead of bothering with all the work of harvesting, refining, packaging and distributing to customers.

      But fortunately, software developers are not involved in architecting farming.

      What I'm saying is that developing for development's sake is completely pointless. The purpose of software is to help in the real world. And every developer should understand that connection, no matter how shy or antisocial.

Could a company keep a subjectively poor performer on for the lifespan of the company? As in, what is the plus or minus in overall revenue or profit from this charity? What if all companies did that? Could we distribute the "burden" of the charity across all companies for a better society? My point is, I don't even know if metrics are good or bad, we may need to look at why we see each other like this. Is it so offensive to the mind of the captain of a ship that they may have a few sailors who are not the best? It's a chance for them to be on a ship, go on a journey. The concept of a "free ride" appears to be a serious moral hazard for us, but I can't figure out why.

  • Sure, as long as they don't have any responsibilities. I've been a great performer at some companies and a poor performer at others, including one that kept me around without letting me have any responsibilities. It was not good for me. But I have been through worse.

    A better alternative I've seen is to help the poor performer hunt for a new job.

  • This is what Japan does. Employment is presumed to be for life. There are games companies play to get out very poor performers but most people get approximately lifetime employment.

  • Are you going to accept such a free rider on your team? You will eventually do both your tasks and his, but for the same or worse pay. Given you have a fixed budget for the team, do you want to split it more ways?

    Where can I send a CV?

    • If we were to do this, we wouldn't just pick someone so out of the range of what's needed. So basically, if you reframe your question, would I hire a 70 average student on a team of 90 average students? Yes. We would not pick the 50 average students. This is all possible within reason, the kind of stuff being discussed in this thread is to purge people who are 80 average students (free riders in a 95 average class).

      Will we need to tutor and support the 70 average student? Yes. Why would we do this? It's good for the soul. The stuff I'm talking about has no place in business, as far as we care to understand as a society at the moment.

  • This reminds me a bit of how VCs bet on a bunch of startups in full knowledge that many will fail. The failures in some sense get a “free ride”. Why not the same with people?

This is why I like working on my own stuff. All that nonsense goes away and I’m left with just the two numbers that matter to me:

- how much money did it bring in this month, and

- how many hours did I have to work to make that happen.

The goal is to make the first number as big as possible while keeping the other as close to zero as possible.

Where I work, the CI pipelines measure changes in code coverage.

If I refactor repetitive code into something more concise and maintainable, code coverage goes DOWN! (Now the proportion of untested code is larger!)

Yeah, measurement is hard…
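The arithmetic behind that can be sketched with made-up numbers. Assume a hypothetical codebase of 1,000 lines with 800 covered, where the refactor collapses 400 covered, repetitive lines into 100 covered lines and leaves the 200 untested lines alone. The function name and all figures are illustrative assumptions, not anyone's real pipeline:

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Line coverage as a percentage of all lines."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * covered_lines / total_lines


# Hypothetical starting point: 800 of 1,000 lines are covered by tests.
before = coverage_percent(covered_lines=800, total_lines=1000)

# Hypothetical refactor: 400 covered lines shrink to 100 covered lines.
# The 200 untested lines are untouched, so they now weigh more.
after = coverage_percent(
    covered_lines=800 - 400 + 100,
    total_lines=1000 - 400 + 100,
)

print(f"before: {before:.1f}%  after: {after:.1f}%")
```

Nothing became less tested in this sketch; the percentage drops only because the denominator shrank.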

the worst programmer nowadays is the vibe coder. vibe coding is so overrated imo https://www.lycee.ai/blog/why-vibe-coding-is-overrated

  • Vibe coding is now a skill requirement for real, actual jobs but, thankfully I guess, no place I want to work (yet). The other day I saw a YC24 company in the "financial services industry" with a "vibe coder position". A prerequisite for the position was at least 50% of your current code being generated by AI; vibe-coding experience was "non-negotiable". Traditional programmers need not apply. And you'd better be ready to grind, 12 hours a day up to 7 days a week. It pays up to $120,000 a year plus squat for equity and relocation to San Francisco is required. That's like, McDonald's money in San Francisco, so guess they figure running a vibe-coding sweatshop is a huge savings for them.

    Oh, and the cherry on top? The "financial service" this startup provides? Automated robocalls demanding money from debtors on behalf of debt collectors.

    https://www.ycombinator.com/companies/domu-technology-inc/jo...

    YC has entered its villain arc.

    • “Your onboarding will be making collection calls.”

      So, your AI debt collection startup onboards engineers by...having them work as human call-center debt collectors?

      Is there really AI at all?

      Oh, but after passing through the onboarding, your vibe-coding engineers will be focused on building product by vibe coding, right?

      “(what you will be doing...) Onboard customers, talk to them and travel to visit them.”

      So, a traveling sales rep. Your onboarding is pretending to be the product, and after that you are a traveling sales rep who vibe codes on the side?

    • Fortunately, vibe coding skills are easy to demonstrate through vibe statements of such; recruiters should be skilled in judging the vibe of the statement claiming vibe coding skills and accepting or rejecting it accordingly.

      If, however, you are having trouble doing that as a recruiter, I am offering professional vibe-rating services as well as a full-blown software solution, the first of its kind, Job Market Vibe-Rater™.

    • it's just a marketing stunt like prompt engineer with 200K salary back in the day. And the fact that you are talking about it and sharing the link to the company's profile means the stunt is working.

    • > +3 years of experience.

      Ummmm..... what?

      This seems like it was written by someone who loves Dystopian Sci-Fi. Loves in the way that it is the future they want to live in.

    • I honestly call this a scam position:

      - Vibe coding "non negotiable"

      - 3 years of experience (of what, exactly? if the "Vibe coding" term was coined just the other day...)

      - 12 to 15 hours a day

      No thanks...

  • I've been writing code professionally for a decade but I've never written so much code in my free time because I can just vibe code it all. I wouldn't do it at work and I probably wouldn't trust a junior's vibes as much as my own, but no tools have made me feel quite so powerful.

    • I never tried vibe coding as described in this article, but I could see where it breaks. Initially, code works. Eventually, there's a bug, and the LLM isn't able to fix it. At that stage, you're on your own with a potentially large codebase which is totally new to you.

      Even on small projects, sometimes I'm tired and I just try to get the LLM to do some refactoring for me or write a couple of tests. First, whatever the LLM writes is going to code review, and I'm not submitting code that I haven't read and understood just to have colleagues complaining. Second, if the code doesn't work, it gets frustrating. For LLMs to help, I like to have a pretty clear idea of what I want and what the "right" code looks like, so I can give more indications.

    • Vibe coding works for experienced developers who give very detailed instructions, including edge cases and potential issues or solutions. The prompt is almost code already, since it includes all the important business logic (leaving the simple parts for AI to fill in). It is a maintenance disaster for junior developers or non-coders.

Well said. There’s a noticeable lack of judgement in many places these days; people think they can abdicate it to metrics and data but this is mistaken.

Everyone on a programming team knows who the good and not-so-good programmers are.

  • Kind of a meaningless statement though. Even if they do, which I'm not sure is true in the general case because you have to be good yourself to know who’s really good, they are not telling you anyway.

    • In every organization I've worked in, everyone knew who the productive and non-productive people were.

      Just like in school. Everyone knew who the good teachers were, and who the good students were.

      I don't fathom how you can work with other people and not be aware of it.

  • This is the key. Ask all the developers for a ranked top-3 list of who they would most prefer to go to for help with a programming problem. Put that data together and you know who your best are.

    Before the inevitable protest: I believe the ability and willingness to communicate with other developers is a vital skill in this business.
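One way to "put that data together" is a simple points tally. This is only a sketch under assumptions: the 3/2/1 scoring, the rule that self-votes are ignored, and the names in the sample ballots are all hypothetical, not something the comment above specifies.

```python
from collections import Counter


def tally_top3(ballots: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Give 3/2/1 points for a voter's 1st/2nd/3rd choice.

    Self-votes are ignored (an assumption of this sketch).
    """
    points: Counter[str] = Counter()
    for voter, ranked in ballots.items():
        for position, name in enumerate(ranked[:3]):
            if name != voter:
                points[name] += 3 - position
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ballots: each developer lists who they would ask for help.
ballots = {
    "ana": ["bo", "cy", "di"],
    "bo": ["cy", "ana", "di"],
    "cy": ["bo", "di", "ana"],
    "di": ["bo", "cy", "ana"],
}
ranking = tally_top3(ballots)
print(ranking)
```

A tally like this inherits every caveat in the thread: once people know it feeds a ranking, the ballots themselves can be gamed.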

I don't think measuring productivity itself is absurd, but it's definitely a hard thing to do, especially over small time intervals.

I think it's important to measure employee performance, and you can definitely take a macro look and ask "what have you accomplished in the last year", which might be measured by delivery of "t-shirt"-sized projects.

I don’t think it’s that we _can’t_ accurately measure developer value/productivity, it’s that doing so isn’t feasible with any methodology we currently have. As you said, we’re not doing magic, therefore our value is measurable. Measuring just requires an impractical degree of time and energy.

But also, the reason we come together to work as groups is so that we can achieve more than the sum of the parts. The same is true for horses: two horses that work well together put out more power than two individual horses. So the problem comes when you identify that one of your horses has 1.5x the output of the other and think firing the weaker horse will cost little productivity. You end up with egg on your face when you realise you are not left with a horse that works at 1.5x, but rather a horse now working at 0.5x what it was in a team. The exact same thing is true for teamwork, team sports, etc. There is no one person 'holding it all together'. If there is, then that person needs way better support and, realistically, should get the entire department's pay when the rest of the department is let go.

  > The idea of measuring individual developer productivity is kind of absurd to me.

It is absurd to do in a lot of settings. We had a similar story about the measurement of "value" just the other day[0]. I think there's been a big bureaucratic takeover where people are measuring for the sake of measuring rather than recognizing that these things are proxies and you need to be careful to ensure your proxy is properly aligned with the actual thing you want to measure.

You see this in research and academia in attempts to measure by number of papers, citations, venue, or novelty. You see this in the workforce with things like lines of code, story points, and any of that other bullshit. You see this with how people think about money. Like, just think about what you *couldn't* do with a million dollars a month.

The problem is that we've created scoring systems and people see the ultimate goal as maximizing their scores. It doesn't matter whether the score is actually meaningful or not; we'll sure as hell passionately argue that it is. I'll refer to my comment to [0] for more[1].

I've been calling this concept: Goodhart's Hell

The truth is that every system has noise in it. I completely understand wanting to remove all possible noise from the system, but after a certain point, attempts to remove noise are actually ignoring noise. Maybe the problem is that people don't recognize that randomness is itself a measurement of uncertainty. There is no measuring device that is without uncertainty. Embrace the noise. Remove as much as you can, but you need to embrace what is impossible to remove. You create more noise by ignoring the noise.

[0] https://news.ycombinator.com/item?id=43448860

> The idea of measuring individual developer productivity is kind of absurd to me.

that idea is very attractive to every manager on planet earth. they seem to quickly lose their ability to maintain contact with reality and they start using metrics to track everything, rewarding those who game the metrics best.

If you are not committing any code that is a problem. But software development is more than just code. It is also more than just the individual.

It is not absurd. From a manager's point of view, when they are deciding whether to let someone go or promote them, they often need some arguments. And since software is invisible, it is often hard to say straight away who is over- or underperforming and why you think so, apart from a "vibe" you are having. Measuring some relevant metrics can strengthen your argument and/or your overview of the situation.

I don't think that the blog makes a compelling argument why measuring productivity is bad per se. It makes a compelling argument that metrics should not be interpreted blindly, but the metrics in this case identified a guy that was doing something unusual, and the managers managed to interpret why this is the case. But if it was an IC that is supposed to deliver, or if you did not want "Tim" to spend his time coaching people, this could still have been valuable info.

  • I don't think it is absurd to reason about individual contributions, just to focus on measuring them with metrics. I have been on the management side as well though I admit I am much more of an IC. But the way I think that should go is laddering up to business value by describing what the person did and why that contribution was important.

    So for example I would say something like: X deserves a promotion because their work was important to delivering project A on time. Project A was difficult but worthwhile because it allowed us as a business to meet goals 1 and 2.

    That said I mostly work in smaller orgs. I have never been in a situation where a manager would be so removed from a team they would need a sort of proxy metric to direct them where to focus their attention to understand what the people on their team are doing day to day. I can see how this would get more difficult as a company grows.