I see one big difference: with email it was always about sender reputation based on email servers (IPs), maybe about domains. But never about individual users. It's the organizations running the email server, who make sure users behave. So they don't get blacklisted and lose sending privileges for hundreds or thousands of users.
Not necessarily. Orgs exist in GitHub, and it seems reasonable that if the $BIGCORP org limits membership to employees, you can automatically trust all members of that org. Because this way, if one steps out of line, you have both an escalation path (contact admins) and a stick (revoke trust in entire org).
I would not be at all surprised if Github adds a first party reputation system. It would be a clever way to increase network effects - imagine if you host on Codeberg you're inundated by AI PRs but on Github you can easily filter them out.
I can't see those pull request limits working very well. It's like trying to filter email spam by just rate limiting people. It's going to be annoying for the people you actually want to talk to, and you're still going to get at least 1 spam message from every spammer out there.
If we want to keep it objective, one metric can already be calculated based on the user history of the submitter. The spammers profile will be full of unmerged or abandoned prs. Just based on those statistics beginners might be close to zero rating but spammers would be negative.
Unless I totally missed that people are also making new accounts of each PR.
If anyone is interested in what it was like fighting spam in the early 2000s, I worked for a company that captured spam, analyzed it and then passed the analysis s on to the law firms of the big email providers for targeting under CAN-SPAM.
Twitter thread about it below but happy to do a AMA here.
On the flip side, the lawyers that represented the big tech firms at the time were some of the most impressive people I've ever met.
You could speak to them as a peer when it came to technical issues or system architecture AND they were experts in technology law. Especially impressive given that anti-spam was still in it's infancy and rapidly evolving.
It's the same scaling issue we've had since the advent of the internet, and why spam and social media became such a dumpster fire. There are many things in life that are perfectly fine when uncommon / rare, but are disastrous when done cheaply at scale.
In my main project we added a new requirement that all new contributors meet a maintainer in a non-textual format before their first PR is merged. Seems to work well for a small project.
Only if you have maintainers everywhere. I live in a small city in the middle of the US - how far is it to a maintainer? 4 hours to Kansas City, or fly to San Francisco? Either way the burden seems far too high.
Isn't the burden being that high the point? It keeps a small team who all know each other working on it, and everyone who does get on the team has some high investment in the project.
Indeed, a request for a short video call filters out most of the people who are looking to pad their resume with LLM-automated contributions, while adding an extra layer of welcome to genuine newbies who want to join the community.
I'm not sure if AI can do those today, but they probably can in the near future. (probably we will be able to see obvious "that can't be human" for a while longer)
Maybe we should cut out the middle-man and make it easy for people to donate token credits to open-source projects, and let the maintainers decide how to use them.
Maybe we should cut out the middle man and make it easy for people to donate money to open-source projects, and let the maintainers decide whether to use them on tokens or hosting or developer salaries or something else.
Unfortunately "I donated money/tokens to open source" doesn't land interviews as well as "I'm a big contributor to open source"
People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand, some kind of credible online presence with their username on it, or whatever else. It's purely selfish and completely opposite to the spirit of Open Software imo
Maybe I'm optimistic or not typical but in my experience people submit random PR to open source projects because they really want the project to do xyz for their own project/reasons, and the project doesn't do xyz.
And the PR is considered "spam" because the maintainer doesn't see xyz as part of his needs or his vision for the project.
>People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand
I am certain many of them honestly believe that they are doing the right thing and that they are helping. After all hey, they implemented a feature or fixed a bug for the community! It's a grim worldview if you think they are all just selfish.
For now. Give it another half year and "I contribute to open source" will carry the same weight as "I donate to charity" ie nobody cares because any idiot can do it.
I wonder how long it'll take before "I don't use LLMs for coding" carries weight.
A fine example of Goodhart's law: "When a measurement becomes a target, it ceases to be a good measurement."
Measuring open source contributions as a way to judge prospective employees used to be a good measurement.
Of course, prospective employees started to not only contribute to OS projects because it was good, but to make sure their contributions were high and noticeable — contributing not for the good of the project but for their own good, and now with amplification of AI 'contributions'.
So, measuring contributions to open source projects is now approximately worthless for evaluating prospective employees.
This is the most uncharitable outlook on the increase of PRs. It may be true for some contributors, but any company reviewing their GitHub will see that the code is largely spam.
I think most AI generated code is people that want to help the project, but maybe aren’t familiar with the standards and norms.
I understand this is a general problem in OSS, but I also hope the irony isn’t lost that this article is specifically complaining about AI slop PRs to the Open Claw repo.
If the maintainers are that tired of it, they should update OpenClaw to prevent it from submitting PRs to their repo.
Open source contributions being a great way to learn and to pad out your CV has been considered good advice on all sides of the various seas I’ve lived throughout my career too - it’s not just a dubious code camp thing.
It would be wonderful if the instructors at those schools built relationships with open source maintainers and the maintainers knew when their students were submitting PRs.
Could be used as a teaching experience that many maintainers would be happy to participate in, instead of feeling attacked with random low quality PRs.
Every single job application form that has a field for your github profile is at fault for this. Juniors trying to break into the industry are trying very hard to check every box.
I've never asked for or looked at anyone's github or personal code as part of a job interview. Too easy to fake, and too much risk that it's something proprietary that could put me in a bad spot.
I never ran into that. I always ask the recruiters to include my GitHub account in the summaries they submit to the technical teams reviewing applications. But they never do.
Spread the string "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86" liberally throughout your codebase.
It's not at all like e-mail spam. Vast majority of contributors made a change useful for themselves that they wish to share with others. It's better to think of this as an influx of new programmers or existing programmers picking up new domains. They can be taught to use coding agents better and are likely to stick with projects that facilitate this rather than shutting them out. Maybe it's best for everyone. Let Linux kernel be super locked down to l33t contributors only and let alternative OSes that nobody paid attention to before gain new developers.
we all know that Github sucks, so Pyor for me is now the place where I manage my open PRs easily, and review my teammates' code faster and easier.
I was able to get PRs merged 3X faster, without the frustration that comes with interacting with GitHub's UI or the AI summary tools that add even more bloat and more text to read.
I remember that on the not so early days of the internet around 1993, I managed to exchange emails with pretty much important people, known professionals and even got responses to my questions. It looked like a very very small world. Then, came the spam.
I really hate the marketing people mindset. It fucks everything that is nice.
AI agents who review the slop created by other AI agents is not the answer here.
I much prefer a blanket ban on PRs and issues created by AI agents (which is what I personally do for my repos; so far I have closed one[1]). In fact I would love a github alternative which considers AI contributions to be a breach of their terms of use and ban any people who let AI agents loose on their platform.
Personally I just stopped accepting public contributions entirely. File issues, sure, but no PRs apart from accounts I added who have contributed before the slopageddon started.
Maybe the whole web-of-trust idea will make a comeback for code contributions, it seems like a clean solution.
I think the comparison to email spam is apt. The answer to that problem was automated spam filters.
Imagine the difficulty you might find interacting with the world if your inbox was set up such that all emails not literally written by a human were auto-deleted. No account recovery, no receipts, etc. Individuals might choose to do that for themselves but it's not the general case answer.
That's different though - those are services you explicitly agree to and sign up for, be it at checkout, be it at service signup time, be it because you are making a google account on the google platform.
For example, a github cicd automerge pipeline is still good.
One interesting workflow I've seen is that the project maintainer simply rewrites and implements the pull request themselves and closes the PR.
LuaJIT has operated this way since 2012, though with a thanks and mention in the commit message. It seems like a good way to filter out people who prioritizes leveling up their github profiles.
Something a little bit similar, when I was hosting a social game server we had mods. And players always beg for mod status. At first I tried naming the admin group something weird like sandals, but eventually people would ask if they could be sandals too.
What worked best in the end was just hiding it completely making regular players see mods as other regular players. (mods would see who is a mod though)
I would also personally never make someone who asks a mod as it's almost always a sign of wanting power for the sake if it. I would instead just passively observe behavior until I trusted the player and make them a mod. I would then tell them that I don't expect them to exercise their power, but would demote if I see abuse of power.
But what about the good AI driven contributions though? Do you categorize all AI changes as slop by default or only the real bad ones that mix refactoring and tons of other unrelated changes with a fix?
Some can fix real issues, with a well targeted fix (not rewriting the world), well defined test and write up. If you accepted PRs before for other issues, you should be able to review and accept those too.
I think the litmus test is roughly "is this obviously AI created" - if it's a well crafted PR that doesn't do the things you mention, and solves a genuine issue in a sensible way then you'd not be able to tell.
The other part of the litmus test is "does the person submitting actually understand what they're submitting and why" - which is arguably not required for PRs that you'd otherwise accept, but since you have to put time and effort into determining whether a given contribution is ok to merge, it's common decency for the submitter to have done a self review first (AI or no AI)
> But what about the good AI driven contributions though?
If even a preponderance of AI driven contributions were good, there wouldn't be blog posts and announcements making HN's front page daily about how various OSS projects and/or prominent figures were figuring out how to filter them/exclude them entirely.
If AI code was good, there wouldn't be such a thrust among so many varying communities to remove it, or ignore it.
There is, because it isn't, and because maintainers are getting fed up with it. There are good PR's just like there are emails that aren't spam that get caught in spam filtering, but spam filtering is still the default position because to allow it all is onerous to the people involved.
I think the biggest issue is simply that these tools, like any labor-saving tool, are being marketed most heavily to people who do not know how to create software. "Write code even if you know nothing about writing code." "This will let people who aren't software engineers make software." "Democratize development." On and on.
This isn't even new, we've been dealing with this since I was a little one, back then we called them script kiddies. Now they're vibe coders and their existence continues to be a boil on the ass of proper software engineers. Instead of claude, you copied code off of Stack Overflow without understanding what it did, and often foot-bulleted yourself in the process.
I have never gotten a good PR from an AI agent (that I know of) so I guess I’ll deal with it when it happens. I suspect I will still just reject it out of principal.
Why do you ask me to do the categorizing? If you're sending me a PR, then you should be filtering the bad ones from the good. If you're just going to send me drive-by PRs, then I don't have time for you.
I mean, sure, I have to make the final determination. But you should not be sending me uncurated slop.
I see one big difference: with email it was always about sender reputation based on email servers (IPs), maybe about domains. But never about individual users. It's the organizations running the email server, who make sure users behave. So they don't get blacklisted and lose sending privileges for hundreds or thousands of users.
For PRs/issues this is not applicable.
Not necessarily. Orgs exist in GitHub, and it seems reasonable that if the $BIGCORP org limits membership to employees, you can automatically trust all members of that org. Because this way, if one steps out of line, you have both an escalation path (contact admins) and a stick (revoke trust in entire org).
GitHub just recently added configurable PR limits for maintainers to help partially address this problem: https://github.blog/open-source/maintainers/how-pull-request...
> Draft pull requests will not count towards your limit.
Disappointing, it seems that those also need limits too, although the limit could be higher.
I could easily see the limit for PRs be at 1 for untrusted contributors, and drafts at 3-5.
I would not be at all surprised if Github adds a first party reputation system. It would be a clever way to increase network effects - imagine if you host on Codeberg you're inundated by AI PRs but on Github you can easily filter them out.
I can't see those pull request limits working very well. It's like trying to filter email spam by just rate limiting people. It's going to be annoying for the people you actually want to talk to, and you're still going to get at least 1 spam message from every spammer out there.
If we want to keep it objective, one metric can already be calculated based on the user history of the submitter. The spammers profile will be full of unmerged or abandoned prs. Just based on those statistics beginners might be close to zero rating but spammers would be negative.
Unless I totally missed that people are also making new accounts of each PR.
If anyone is interested in what it was like fighting spam in the early 2000s, I worked for a company that captured spam, analyzed it and then passed the analysis s on to the law firms of the big email providers for targeting under CAN-SPAM.
Twitter thread about it below but happy to do a AMA here.
https://x.com/alexpotato/status/1208948480867127296?s=20
Ironically one of the first recognizable spam campaigns was perpetrated by lawyers: https://en.wikipedia.org/wiki/Laurence_Canter_and_Martha_Sie...
On the flip side, the lawyers that represented the big tech firms at the time were some of the most impressive people I've ever met.
You could speak to them as a peer when it came to technical issues or system architecture AND they were experts in technology law. Especially impressive given that anti-spam was still in it's infancy and rapidly evolving.
It's the same scaling issue we've had since the advent of the internet, and why spam and social media became such a dumpster fire. There are many things in life that are perfectly fine when uncommon / rare, but are disastrous when done cheaply at scale.
In my main project we added a new requirement that all new contributors meet a maintainer in a non-textual format before their first PR is merged. Seems to work well for a small project.
Only if you have maintainers everywhere. I live in a small city in the middle of the US - how far is it to a maintainer? 4 hours to Kansas City, or fly to San Francisco? Either way the burden seems far too high.
Non-textual can mean audio or video call, not necessarily in person.
Isn't the burden being that high the point? It keeps a small team who all know each other working on it, and everyone who does get on the team has some high investment in the project.
Like a video/phone call?
Indeed, a request for a short video call filters out most of the people who are looking to pad their resume with LLM-automated contributions, while adding an extra layer of welcome to genuine newbies who want to join the community.
I'm not sure if AI can do those today, but they probably can in the near future. (probably we will be able to see obvious "that can't be human" for a while longer)
3 replies →
What an elegantly common sense solution. It's also probably a really good way to make contacts with interesting people.
Maybe we should cut out the middle-man and make it easy for people to donate token credits to open-source projects, and let the maintainers decide how to use them.
Maybe we should cut out the middle man and make it easy for people to donate money to open-source projects, and let the maintainers decide whether to use them on tokens or hosting or developer salaries or something else.
Prompting an AI, and carefully reviewing its output is work, and time consuming. The goal is to get high-quality PRs, not SPAM PRs.
https://github.com/open-source/sponsors
Let them eat tokens.
So that's how the sci-fi dystopias end up using "credits" for their money.
Like this?
https://news.ycombinator.com/item?id=48621645
Yes!
Unfortunately "I donated money/tokens to open source" doesn't land interviews as well as "I'm a big contributor to open source"
People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand, some kind of credible online presence with their username on it, or whatever else. It's purely selfish and completely opposite to the spirit of Open Software imo
Maybe I'm optimistic or not typical but in my experience people submit random PR to open source projects because they really want the project to do xyz for their own project/reasons, and the project doesn't do xyz.
And the PR is considered "spam" because the maintainer doesn't see xyz as part of his needs or his vision for the project.
1 reply →
>People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand
I am certain many of them honestly believe that they are doing the right thing and that they are helping. After all hey, they implemented a feature or fixed a bug for the community! It's a grim worldview if you think they are all just selfish.
14 replies →
Interestingly then, those contributions are also not a measurement of the candidates abilities but mostly of the AI models.
I wonder if hiring adjusts to that but I doubt it. It might only push it even more towards "marketing matters most" instead of actual ability.
4 replies →
For now. Give it another half year and "I contribute to open source" will carry the same weight as "I donate to charity" ie nobody cares because any idiot can do it.
I wonder how long it'll take before "I don't use LLMs for coding" carries weight.
A fine example of Goodhart's law: "When a measurement becomes a target, it ceases to be a good measurement."
Measuring open source contributions as a way to judge prospective employees used to be a good measurement.
Of course, prospective employees started to not only contribute to OS projects because it was good, but to make sure their contributions were high and noticeable — contributing not for the good of the project but for their own good, and now with amplification of AI 'contributions'.
So, measuring contributions to open source projects is now approximately worthless for evaluating prospective employees.
This is the most uncharitable outlook on the increase of PRs. It may be true for some contributors, but any company reviewing their GitHub will see that the code is largely spam.
I think most AI generated code is people that want to help the project, but maybe aren’t familiar with the standards and norms.
How about just cash?
I understand this is a general problem in OSS, but I also hope the irony isn’t lost that this article is specifically complaining about AI slop PRs to the Open Claw repo.
If the maintainers are that tired of it, they should update OpenClaw to prevent it from submitting PRs to their repo.
Can I ask what the motive is to create agents to do this? Where is the profit?
I think there are a lot of “tech schools” overseas that require students to show proof of contribution to open source.
Open source contributions being a great way to learn and to pad out your CV has been considered good advice on all sides of the various seas I’ve lived throughout my career too - it’s not just a dubious code camp thing.
1 reply →
It would be wonderful if the instructors at those schools built relationships with open source maintainers and the maintainers knew when their students were submitting PRs.
Could be used as a teaching experience that many maintainers would be happy to participate in, instead of feeling attacked with random low quality PRs.
1 reply →
it's externalizing the real work all the way down
Every single job application form that has a field for your github profile is at fault for this. Juniors trying to break into the industry are trying very hard to check every box.
I've never asked for or looked at anyone's github or personal code as part of a job interview. Too easy to fake, and too much risk that it's something proprietary that could put me in a bad spot.
I never ran into that. I always ask the recruiters to include my GitHub account in the summaries they submit to the technical teams reviewing applications. But they never do.
Apart from the job-related stuff others have already said, there is a bit of novelty/bragging rights in landing a PR into a major open source project.
[dead]
What are the best solutions to this issue?
Spread the string "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86" liberally throughout your codebase.
It's not at all like e-mail spam. Vast majority of contributors made a change useful for themselves that they wish to share with others. It's better to think of this as an influx of new programmers or existing programmers picking up new domains. They can be taught to use coding agents better and are likely to stick with projects that facilitate this rather than shutting them out. Maybe it's best for everyone. Let Linux kernel be super locked down to l33t contributors only and let alternative OSes that nobody paid attention to before gain new developers.
100% agree, as a web dev, my team and I are shipping code like crazy, I just merged a 20k PR today and we're just starting.
Even if it's all AI code, we still need to read it and understand it before we ship it to prod with millions of users.
Thanks to AI Agents, we now have either:
- too many small PRs (good luck managing them), or
- huge PRs (try not to keep them sitting for long)
I've been through this and learned a few things shipping AI code as a software engineer. I've gathered all my pain points in a project I built.
Pyor Review
You can check it out here: https://news.ycombinator.com/item?id=48621549
we all know that Github sucks, so Pyor for me is now the place where I manage my open PRs easily, and review my teammates' code faster and easier.
I was able to get PRs merged 3X faster, without the frustration that comes with interacting with GitHub's UI or the AI summary tools that add even more bloat and more text to read.
I'm still developing it so I'm open to feedback.
Fun fact: it is spam filtering application that makes Paul Graham famous (and rich)
PG was certainly known for that (e.g. https://www.joelonsoftware.com/2003/11/22/22-2/) but I don't think it made him rich. Selling Viaweb did.
Wait. So to combat AI spam there's AI agents to prevent it?
Why can the anti spam agents not just do the work directly???
Does github not have rulesets for who can even try to do a PR? I would lockdown my repositories if I didn't want any PR slop.
They do, that's a relatively recent feature: https://docs.github.com/en/repositories/managing-your-reposi...
I remember that on the not so early days of the internet around 1993, I managed to exchange emails with pretty much important people, known professionals and even got responses to my questions. It looked like a very very small world. Then, came the spam.
I really hate the marketing people mindset. It fucks everything that is nice.
AI agents who review the slop created by other AI agents is not the answer here.
I much prefer a blanket ban on PRs and issues created by AI agents (which is what I personally do for my repos; so far I have closed one[1]). In fact I would love a github alternative which considers AI contributions to be a breach of their terms of use and ban any people who let AI agents loose on their platform.
1: https://github.com/runarberg/markdown-it-math/pull/48#issuec...
I would kill for an LLM-free platform.
Personally I just stopped accepting public contributions entirely. File issues, sure, but no PRs apart from accounts I added who have contributed before the slopageddon started.
Maybe the whole web-of-trust idea will make a comeback for code contributions, it seems like a clean solution.
I tend to disagree.
I think the comparison to email spam is apt. The answer to that problem was automated spam filters.
Imagine the difficulty you might find interacting with the world if your inbox was set up such that all emails not literally written by a human were auto-deleted. No account recovery, no receipts, etc. Individuals might choose to do that for themselves but it's not the general case answer.
That's different though - those are services you explicitly agree to and sign up for, be it at checkout, be it at service signup time, be it because you are making a google account on the google platform.
For example, a github cicd automerge pipeline is still good.
One interesting workflow I've seen is that the project maintainer simply rewrites and implements the pull request themselves and closes the PR.
LuaJIT has operated this way since 2012, though with a thanks and mention in the commit message. It seems like a good way to filter out people who prioritizes leveling up their github profiles.
Something a little bit similar, when I was hosting a social game server we had mods. And players always beg for mod status. At first I tried naming the admin group something weird like sandals, but eventually people would ask if they could be sandals too.
What worked best in the end was just hiding it completely making regular players see mods as other regular players. (mods would see who is a mod though)
I would also personally never make someone who asks a mod as it's almost always a sign of wanting power for the sake if it. I would instead just passively observe behavior until I trusted the player and make them a mod. I would then tell them that I don't expect them to exercise their power, but would demote if I see abuse of power.
But what about the good AI driven contributions though? Do you categorize all AI changes as slop by default or only the real bad ones that mix refactoring and tons of other unrelated changes with a fix?
Some can fix real issues, with a well targeted fix (not rewriting the world), well defined test and write up. If you accepted PRs before for other issues, you should be able to review and accept those too.
I think the litmus test is roughly "is this obviously AI created" - if it's a well crafted PR that doesn't do the things you mention, and solves a genuine issue in a sensible way then you'd not be able to tell.
The other part of the litmus test is "does the person submitting actually understand what they're submitting and why" - which is arguably not required for PRs that you'd otherwise accept, but since you have to put time and effort into determining whether a given contribution is ok to merge, it's common decency for the submitter to have done a self review first (AI or no AI)
> But what about the good AI driven contributions though?
Okay, who is going to wade through the noise to find the signal? You?
> But what about the good AI driven contributions though?
If even a preponderance of AI driven contributions were good, there wouldn't be blog posts and announcements making HN's front page daily about how various OSS projects and/or prominent figures were figuring out how to filter them/exclude them entirely.
If AI code was good, there wouldn't be such a thrust among so many varying communities to remove it, or ignore it.
There is, because it isn't, and because maintainers are getting fed up with it. There are good PR's just like there are emails that aren't spam that get caught in spam filtering, but spam filtering is still the default position because to allow it all is onerous to the people involved.
I think the biggest issue is simply that these tools, like any labor-saving tool, are being marketed most heavily to people who do not know how to create software. "Write code even if you know nothing about writing code." "This will let people who aren't software engineers make software." "Democratize development." On and on.
This isn't even new, we've been dealing with this since I was a little one, back then we called them script kiddies. Now they're vibe coders and their existence continues to be a boil on the ass of proper software engineers. Instead of claude, you copied code off of Stack Overflow without understanding what it did, and often foot-bulleted yourself in the process.
I have never gotten a good PR from an AI agent (that I know of) so I guess I’ll deal with it when it happens. I suspect I will still just reject it out of principal.
Why do you ask me to do the categorizing? If you're sending me a PR, then you should be filtering the bad ones from the good. If you're just going to send me drive-by PRs, then I don't have time for you.
I mean, sure, I have to make the final determination. But you should not be sending me uncurated slop.