Comment by AllegedAlec

2 days ago

> Like can we determine the productivity of doctors, lawyers, journalists, or pastry chefs?

Yes, yes we can.

Programmers really need to stop this cope about us being such special snowflakes that we can't be assessed and that our managers just need to take it on good faith that we're worth keeping around.

News to me. How do you determine the productivity of a doctor? Patients seen? Patients cured? (For real, where did you get that data?) Number of medicines prescribed? Procedures performed? Does a triple bypass surgery count the same as a pap smear? Hours worked? Amount of help they provided to colleagues? It's easy to come up with another 100 metrics that might be worth looking at. How are they all weighted?

Like I get that in SWE (like all other fields), managers have to make judgement calls and try to evaluate which reports contribute the most, but the GP post seemed surprised that this wasn't a solved problem by now, which just seems incomprehensible to me.

  • > How do you determine the productivity of a doctor?

    At the end of the road. Patient outcome and contentedness compared to others with similar indications. Patients seen and all that is the sort of short-term BS you see everywhere that's giving metrics a bad name. It'd be like determining a mechanic's productivity by how many times he twisted a wrench.

    • > Patient outcome and contentedness compared to others with similar indications

      Which would incentivise doctors to refuse to treat patients who are more ill, lest their ratings go down.

    • > At the end of the road. Patient outcome and contentedness compared to others with similar indications.

      Well, I would first of all remark that this doesn’t seem to be how it’s normally done, as I’ve never been asked to rate my “contentedness” or similar with my medical care.

      And where is the “end of the road?” Most medical interventions could be plausibly evaluated at all manner of different intervals.

      Also, “similar indications” is doing a lot of work here. Patient outcomes are often influenced more by the individual than the doctor. By the time you bucket all the patients by age, diet, activity level, smoking status, alcohol intake, metabolic health, BMI, family history, etc., the buckets are going to be pretty tiny. Clinics and hospitals aren’t that big; there won’t be anything to compare. If you only bucket by the most obvious categories like age, you’ll have comparisons, but they will just be noise.

    • > Patient outcome and contentedness compared to others with similar indications.

      And how would you achieve it? The "similar indications" would be coming from the very doctor you are trying to rate.

      Rating "contentedness" gets you doctors prescribing useless medications to keep patients happy.

      Expert surgeons often have bad survival rates because they get the complicated cases, and trying to rate how complicated those cases are, in order to compare two experts, would be a nightmare as bad as rating the doctors themselves - so you only replace one hard problem with another equally hard one.

  • Life expectancy, standardized QoL metrics. Or patients seen, revenue per patient, hours worked, etc. can be metrics if those _were_ what you wanted. The point is, the answer is yes: they have measures of better and worse docs in their field.

    • Yea, those are all pretty shit metrics.

      We have shit like that in SWE too. Lines of code, GitHub issues closed, features shipped, etc…

> Yes, yes we can.

Of course we can. But can we do it in a meaningful way, such that the metric itself doesn't become a subject to optimization?

"When a measure becomes a target, it ceases to be a good measure"

  • > the metric itself doesn't become a subject to optimization

    By making the metrics part of a sustainable company-wide goal. If there's a company-wide goal to increase X kind of revenue by Y%, make actionable targets for how a team can contribute (not lazy shit like "our changes should contribute Z% of that Y%"), and within that, create another, smaller per-person metric based on those targets.

    • It's all so great in theory, where you get to imagine an ideal world in which all our incentives align and we're all rowing on a big boat towards success.

      In the real world, most things don't work out that way. What metrics do you use to measure surgeons' success? If you use fatality rate, then surgeons will refuse to do riskier surgeries that would put their ratings at risk, which makes the healthcare worse instead of better.

    • This would be difficult to apply to R&D orgs or anything seen as a typical cost center.

      Also, medical facilities… you certainly could define it as profit, but that bothers me and many other people.

      You could define it as patients seen, or “cured” but that incentivizes very quick but probably poor care.

      You could define it as intensity of treatment or amount of care given, but you’d probably end up in a situation where 1 incredibly sick person has every doctor treating them.

      You could define it as…

> Yes, yes we can.

Could you make an effort to explain how, or at the very least link to some reasoning? Otherwise your comment is basically the equivalent of “nuh-uh”, which doesn’t meaningfully contribute to the discussion.

> Programmers really need to stop this cope about us being such special snowflakes

Which is not at all what is happening in your parent comment. On the contrary, they’re putting developers on even footing with other professions.

  • > Could you make an effort to explain how, or at the very least link to some reasoning? Otherwise your comment is basically the equivalent of “nuh-uh”, which doesn’t meaningfully contribute to the discussion.

    You can look at the kind of work they're doing, how effective their solutions are, and how long it takes them to do it. That's the basics of it across a wide range of professions. Now, there's no one-size-fits-all metric or formula you can just calculate based on objective facts for most of this, because the work is more varied than e.g. factory work, but it's also not impossible to make the comparison, if you actually understand the work reasonably and you use judgement.

    In the case of this study, because the assignment in the comparison was random, just measuring time to completion across a range of tasks is a perfectly reasonable metric: there's nothing to really bias the outcome, just a lot of factors that add noise. But it is worth noting that the result is a very broad average, and there is likely a very complicated distribution of details underneath, which is much harder to measure.

    • > You can look at the kind of work they're doing, how effective their solutions are, and how long it takes them to do it. That's the basics of it across a wide range of professions. Now, there's no one-size-fits-all metric or formula you can just calculate based on objective facts for most of this, because the work is more varied than e.g. factory work, but it's also not impossible to make the comparison, if you actually understand the work reasonably and you use judgement.

      AKA, be subjective! Which is exactly what people are wary of, because what it brings is politics and tribalism.

"Software engineers can be qualitatively assessed for the purposes of pay and promotion" and "software engineers can have their productivity measured and quantified" are two very different things.