
Comment by taurath

21 days ago

> Copilot excels at low-to-medium complexity tasks in well-tested codebases, from adding features and fixing bugs to extending tests, refactoring, and improving documentation.

Bounds bounds bounds bounds. The important part for humans seems to be maintaining boundaries for AI. If your well-tested codebase has tests that were themselves built through AI, it's probably not going to work.

I think it's somewhat telling that they can't share numbers for how they're using it internally. I want to know that Microsoft, the company famous for dog-fooding, is using this day in and day out, with success. There's real stuff in there, and my brain has an insanely hard time separating the trillion dollars of hype from the usefulness.

We've been using Copilot coding agent internally at GitHub, and more widely across Microsoft, for nearly three months. That dogfooding has been hugely valuable, generating tonnes of feedback (and bug bashing!) that has helped us get the agent ready to launch today.

So far, the agent has been used by about 400 GitHub employees in more than 300 of our repositories, and we've merged almost 1,000 pull requests contributed by Copilot.

In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)

(Source: I'm the product lead at GitHub for Copilot coding agent.)

  • > we've merged almost 1,000 pull requests contributed by Copilot

    I'm curious to know how many Copilot PRs were not merged and/or required human takeovers.

  • > In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)

    Really cool, thanks for sharing! Would you perhaps consider implementing something like these stats that aider keeps on "aider writing itself"? - https://aider.chat/HISTORY.html

    • Nice idea! We're going to try to put together a blog post in the next couple of weeks on how we're using Copilot coding agent at GitHub - including to build Copilot coding agent ;) - and having some live stats would be pretty sweet too.

  • > In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)

    That's a fun stat! Are humans in the #1-4 slots? It's hard to know what processes are automated (300 repos sounds like a lot of repos!).

    Thank you for sharing the numbers you can. Every time a product launch is announced, I feel like it's a gleeful announcement of a decrease in my usefulness. I've got imposter syndrome enough; perhaps Microsoft might want to speak to the developer community and let us know what they see happening? Right now it's mostly the pink slips that are doing the speaking.

    • Humans are indeed in slots #1-4.

      After hearing feedback from the community, we’re planning to share more on the GitHub Blog about how we’re using Copilot coding agent at GitHub. Watch this space!


  • How strong was the push from leadership to use the agents internally?

    As part of the dogfooding, I could see them pushing really hard to have agents make and merge PRs, at which point the data is tainted: you don't know whether the 1,000 PRs were created or merged to meet demand, or because devs genuinely found the agent useful and accurate.

  • > 1,000 pull requests contributed by Copilot

    I'd like a breakdown of this phrase: how much was human work vs. Copilot, and in what form, autocomplete vs. agent? It's not specified, so it seems more like marketing trickery than real data.

    • The "1,000 pull requests contributed by Copilot" datapoint is specifically referring to Copilot coding agent over the past 2.5 months.

      Pretty much every developer at GitHub is using Copilot in their day-to-day work, so its influence touches virtually every code change we make ;)


  • So I need to ask: what is the overall goal of your project? What will you do, say, 5 years from now?

    • What I'm most excited about is allowing developers to spend more of their time on the work they enjoy, and less of it on mundane, boring, or annoying tasks.

      Most developers don't love writing tests, updating documentation, or working on tricky dependency updates - and I really think we're heading toward a world where AI can take that load off and free us up to work on the most interesting and complex problems.


  • > In the repo where we're building the agent, the agent itself is actually the #5 contributor

    How does this align with Microsoft's AI safety principles? What controls are in place to prevent Copilot from deciding that it could be more effective with fewer limitations?

    • Copilot only does work that has been assigned to it by a developer, and all the code that the agent writes has to go through a pull request before it can be merged. In fact, Copilot has no write access to GitHub at all, except to push to its own branch.

      That ensures that all of Copilot's code goes through our normal review process, which requires a review from an independent human.
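
      To make the control concrete, here's a minimal sketch of that kind of guardrail using GitHub's public branch protection REST API. The endpoint is real, but the repo names and token below are placeholders, and this illustrates the review requirement rather than our actual internal configuration:

      import requests

      OWNER, REPO, BRANCH = "example-org", "example-repo", "main"  # hypothetical names
      headers = {
          "Authorization": "Bearer <YOUR_TOKEN>",  # placeholder token
          "Accept": "application/vnd.github+json",
      }

      # Require at least one approving human review before anything merges to
      # BRANCH, with no admin bypass. Pushes to the agent's own branch are
      # unaffected; only merges into the protected branch are gated.
      protection = {
          "required_status_checks": None,  # no CI gate in this sketch
          "enforce_admins": True,
          "required_pull_request_reviews": {"required_approving_review_count": 1},
          "restrictions": None,
      }

      resp = requests.put(
          f"https://api.github.com/repos/{OWNER}/{REPO}/branches/{BRANCH}/protection",
          headers=headers,
          json=protection,
      )
      resp.raise_for_status()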


  • What's the motivation for restricting to Pro+ if billing is via premium requests? I have a (free, via open source work) Pro subscription, which I occasionally use. I would have been interested in trying out the coding agent, but how do I know if it's worth $40 for me without trying it ;).

    • Great question!

      We started with Pro+ and Enterprise first because of the higher number of premium requests included with the monthly subscription.

      Whilst we've seen great results within GitHub, we know that Copilot won't get it right every time, and a higher allowance of free usage means that a user can play around and experiment, rather than running out of credits quickly and getting discouraged.

      We do expect to open this up to Pro and Business subscribers - and we're also looking at how we can extend access to open source maintainers like yourself.

  • Question you may have a very informed perspective on:

    where are we wrt the agent surveying open issues (say, via JIRA), evaluating which ones it would be most effective at handling, and taking them on, ideally with some check-in for confirmation?

    Or, contrariwise, how far are we from having product management agents that track and assign work?

    • Check out this idea: https://news.ycombinator.com/item?id=44030394

      The entire website was created by Claude Sonnet through Windsurf Cascade, but with the “Fair Witness” prompt embedded in the global rules.

      If you regularly guide the LLM to “consult a user experience designer”, “adopt the multiple perspectives of a marketing agency”, etc., it will make rather decent suggestions.

      I’ve been having pretty good success with this approach, granted mostly at the scale of starting the process with “build me a small educational website to convey this concept”.


  • Is Copilot _enforced_ as the only option for an AI coding agent? Or can devs pick and choose whatever tool they prefer?

    I'm interested in the [vague] ratio of {internallyDevelopedTool} vs alternatives - essentially the "preference" score for internal tools (accounting for the natural bias towards one's own agent for testing/QA/data purposes). Any data, however vague, would be great.

    (and if anybody has similar data for _any_ company developing their own agent, please shout out).

  • 400 GitHub employees are using GitHub Copilot day in, day out, and it comes out as the #5 contributor? I wouldn't call that a success. If it were truly useful, I'd expect it to be the #1 contributor in every project: even if each developer wrote only 10% of their code with it, Copilot's pooled share across hundreds of developers would dwarf any single human's contributions.

  • Re: 300 of your repositories... so it sounds like y'all don't use a monorepo architecture. I've been wondering if that would be a blocker to using these agents most effectively. Expect some extra momentum to swing back to the multirepo approach accordingly.

  • What model does it use? gpt-4.1? Or can it use o3 sometimes? Or the new Codex model?

    • At the moment, we're using Claude 3.7 Sonnet, but we're keeping our options open to change the model down the line, and potentially even to introduce a model picker like we have for Copilot Chat and Agent Mode.


  • When I repeated to other tech people from about 2012 to 2020 that the technological singularity was very close, no one believed me. Coding is just the easiest work to automate away into near oblivion. And too many non-technical people drank the Flavor Aid for the fallacy that it can be "abolished" completely soon. It will gradually come for all sorts of knowledge-work specialists, including electrical and mechanical engineers, and probably doctors too. And, of course, office work. Some iota of specialists will remain to tune the bots, and some will remain in the field where expertise is absolutely required, but the roles that offered potential upward mobility into the middle class are being destroyed and replaced with nothing. There won't be "retraining" or hand-waved other opportunities for the "basket of labor", just competition among many uniquely, far-overqualified people for ever-dwindling opportunities.

    It is difficult to get a man to understand something when his salary depends upon his not understanding it. - Upton Sinclair

    • I don't think it was unreasonable to be very skeptical at the time. We generally believed that automation would get rid of repetitive work that didn't require a lot of thought, and in many ways programming was seen as near the top of the heap: intellectually demanding, and requiring high levels of precision and rigor.

      Who would've thought (except you) that this would be one of the things AI would be especially suited for? I don't know what this progression means in the long run. Will good engineers just become 1000x more productive as they manage X number of agents building increasingly complex code (with other agents constantly testing, debugging, refactoring, and documenting it), or will we move to a world with far fewer engineers because there is only a need for so much code?


    • Do you have any textual evidence from this 8-year stretch of your life in which you see yourself as having been perpetually correct? Do you mean that you were very specifically predicting flexible natural-language chatbots, or vaguely alluding to some sort of technological singularity?

      We absolutely have not reached anything resembling anyone's definition of a singularity, so you are very much still not proven correct in this. Unless there are weaker definitions of that than I realised?

      I think you'll be proven wrong about the economy too, but only time will tell there.

  • TBF, you could hardly be more biased on this, so I definitely take your opinion with a whole bottle of salt.

    Without data, a comprehensive study, and peer review, it's a hell no. Would GitHub be willing to submit to academic scrutiny to prove it?

  • > In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)

    Ah yes, the takeoff.

From talking to colleagues at Microsoft, it's a very management-driven push, not a developer-driven one. A friend on an Azure team had a team member who was nearly put on a PIP because they refused to install the internal AI coding assistant. Every manager has "number of developers using AI" as an OKR, but anecdotally most devs are installing the AI assistant and not using it, or using it only occasionally. Allegedly it's pretty terrible at C# and PowerShell, which limits its usefulness at MS.

  • [flagged]

    • That's exactly what senior executives who aren't coding are saying everywhere.

      Meanwhile, engineers are using it for code completion and as a Google search alternative.

      I don't see much difference here at all; the only habit to change is learning to trust an AI solution as much as a Stack Overflow answer. Though the benefit of SO is that each comment is timestamped and there are alternative takes, corrections, and caveats in the comments.


    • What does this have to do with my comment? Did you mean to reply to someone else?

      I don't understand what this has to do with AI adoption at MS (and Google/AWS, while we're at it) being management-driven.

    • There's a large group of people who claim that AI tools are no good, and I can't tell if they're in some niche where the tools truly aren't useful, they don't care to put any effort into learning them, or they're simply in denial.


    • It's just tooling. It costs nothing to wait for it to get better. It's not like you're going to miss out on AGI. The cost of actually testing every slop code generator is non-trivial.

> I want to know that Microsoft, the company famous for dog-fooding is using this day in and day out, with success

Have they tried dogfooding their dogshit little tool called Teams in the last few years? Cause if that's what their "famed" dogfooding gets us, I'm terrified to see what lies in wait with Copilot.

I feel like I saw a quote recently that said 20-30% of MS code is generated in some way. [0]

In any case, I think this is the best use case for AI in programming: as a force multiplier for the developer. It's to the benefit of both AI and humanity for AI to avoid diminishing the creativity, agency, and critical thinking skills of its human operators. AI should be task-oriented, but high-level decision-making and planning should always be a human task.

So I think our use of AI for programming should remain heavily human-driven for the long term. Ultimately, its use should be about enriching humans' capabilities rather than churning out features for profit, though there are obvious limits to that.

[0] https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-a...

  • You might want to study the history of technology: how rapidly compute efficiency has increased, as well as how quickly the models are improving.

    In this context, assuming that humans will still be able to do high level planning anywhere near as well as an AI, say 3-5 years out, is almost ludicrous.

    • Reality check time for you: people were saying this exact thing 3 years ago. You cannot extrapolate like that.

"I want to know that Microsoft, the company famous for dog-fooding is using this day in and day out, with success."

They just cut down their workforce, letting some of their AI people go. So, I assume there isn't that much success.

> Microsoft, the company famous for dog-fooding

This was true up until around 15 years ago. It hasn't been the case since.

Whatever the true stats for mistakes or blunders are now, remember that this is the worst it's ever going to be. And there is no clear ceiling in sight that would prevent it from quickly getting better and better, especially given the current levels of investment.

  • That sounds reasonable enough, but the pace or end result is by no means guaranteed.

    We have invested plenty of money and time into nuclear fusion with little progress. The list of key achievements from CERN[1] is also meager in comparison to the investment put in, especially if you consider their ultimate goal to be applying research to more than just theory.

    [1] https://home.cern/about/key-achievements