Comment by overfeed

21 days ago

> we've merged almost 1,000 pull requests contributed by Copilot

I'm curious to know how many Copilot PRs were not merged and/or required human take-overs.

20 comments

overfeed

Textbook survivorship bias: https://en.wikipedia.org/wiki/Survivorship_bias

Every bullet hole in that plane is one of the 1k PRs contributed by Copilot. The missing dots, and whole missing planes, are unaccounted for, i.e., "AI ruined my morning".

n2d4 21 days ago
It's not survivorship bias. Survivorship bias would be if you drew any conclusions from the 1000 merged PRs (e.g., "90% of all merged PRs did not get reverted"). But simply stating the number of PRs is not that.
- tines 20 days ago
  
  As with all good marketing, the conclusions are omitted and implied, no?
  
- krainboltgreene 20 days ago
  
  Given that GitHub is continuing with the product and marketing to us, it feels sufficient to count that as a conclusion.
MoreQARespect 21 days ago

If they measured that too it would make it harder to justify a MSFT P/E ratio of 29.6.

philipwhiuk 20 days ago

I'm curious how many were much more than Dependabot changes.

xeromal 20 days ago

I see the number of PRs as the modern LOC: something that doesn't tell me anything about quality.

literalAardvark 21 days ago

"We need to get 1000 PRs merged from Copilot" "But that'll take more time" "Doesn't matter"

worldsayshi 21 days ago
I do agree that some scepticism is due here but how can we tell if we're treading into "moving the goal posts" territory?
- overfeed 21 days ago
  
  I'd love to know where you think the starting position of the goal posts was.
  Everyone who has used AI coding tools interactively or as agents knows they're unpredictably hit or miss. The old, non-agent Copilot has a dashboard that shows org-wide rejection rates for paying customers. I'm curious to learn what the equivalent rejection rate for the agent is for the people who make the thing.
- taurath 20 days ago
  
  I think the implied promise of the technology, that it is capable of fundamentally changing organizations' relationships with code and software engineering, deserves deep understanding. Companies will be making multimillion-dollar decisions based on their belief in its efficacy.
- internet101010 20 days ago
  
  When someone says that the number given is not high enough. I wouldn't consider trying to get an understanding of PR acceptance rate before and after Copilot to be moving the goal posts. Using raw numbers instead of percentages is often done to emphasize a narrative rather than simply inform (e.g. "Dow plummets x points" rather than "Dow lost 1.5%").
Cthulhu_ 20 days ago
I feel the same about automated dependency updates, but if your tests and verifications are good, these become trivial.
- skydhash 20 days ago
  
  Sometimes there are paradigm shifts in a dependency that get past your current tests. So it’s always good to read the changelog and plan the update accordingly.
- no_wizard 20 days ago
  
  Strong automated tests and verifications seem to be nearly as rare as unicorns, at least if you go by most developers' feelings on this.
  It seems places don't prioritize it, so you don't see it very often. Some developers are outright dismissive of the practice.
  Unfortunately, AI seemingly won't help with that.