The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have. I have a handful of open source contributions. All of them are for small-ish projects, and the complexity of my contributions is in the same ballpark as what I work on day-to-day. And even though I am relatively confident in my competency as a developer, these contributions are probably the most thoroughly tested and reviewed pieces of code I have ever written. I just really, really don't want to bother someone who graciously offers their time to work on open source stuff with low-quality "help".
Other people apparently don't have this feeling at all. Maybe I shouldn't have been surprised by this, but I've definitely been caught off guard by it.
It's because a lot of people who weren't skilful weren't on your path before.
Now that Pandora's box has been re-opened, those people feel they "get a second chance at life". It's not that they have no shame; they have no perspective in which to place that shame.
You, on the other hand, have honed your craft for many years. The more you learn, the more you discover there is to learn; that is, you realize how little you know.
They don't have this. _At all_.
They see this as a "free ticket to the front row", and when we politely push back (we should be way harsher about this; it's the only language they understand), all they hear is "he doesn't like _me_", which is an escape.
You know how much work you're asking of me when you open a PR on my project; they don't. They will just see it as "why don't you let me join, since I have AI I should have the same skill as you"... unironically.
In other words, these "other people" that we talk about haven't worked a day in the field in their life, so they simply don't understand much of it, and yet they feel they understand everything of it.
This is so completely spot on. It’s happening in other fields too, particularly non-coding (but still otherwise specialized or technical) areas. AI is extremely empowering but what’s happening is that people are now showing up in all corners of the world armed with their phone at the end of their outstretched arm saying “Well ChatGPT says…” and getting very upset when told that, no, many apologies, but ChatGPT is wrong here too.
That all makes sense. But the more I know, the more I realize that a lot of software engineering isn't about crazy algorithms and black magic. I'd argue a good 80% of it is the ability to pick up the broken glass, something even many students can pull off. Another 15% comes down to avoiding landmines in a large field as you pick up said glass.
But that care isn't even evident here. People are submitting PRs that don't even compile, and bug reports for issues that may not even exist. The minimum I'd expect is to check the work of whatever you vibe coded. We can't even get that. It's some odd form of clout chasing, as if repos are a factor of success rather than what you contribute to them.
I find that interesting because for the first 10 years of my career, I didn’t feel any confidence in contributing to open source at all because I didn’t feel I had the expertise to do so. I was even reluctant to file bugs because I always figured I was in the wrong and I didn’t want to cause churn for the maintainers.
This is easily the most spot-on comment I've read on HN in a long time.
The humility of understanding what you don't know, and the limitations that come with it, is out the window for many people now. I see time and time again the idea that "expertise is dead". Yet it's crystal clear it's not. But those people cannot understand why.
It all boils down to a simple reality: you can't understand why something is fundamentally bad if you don't understand it at all.
It's not as if there weren't that sort of people in our profession even before the rise of LLMs, as evidenced by the not infrequent comments about "gatekeeping" and "nobody needs to know academic stuff in a real day-to-day job" on HN.
> The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have.
ever had a client second-guess you by replying with a screenshot from GPT?
ever asked anything in a public group only to have a complete moron reply with a screenshot from GPT or - at least a bit of effort there - a copy/paste of the wall of text?
no, people have no shame. they have a need for a little bit of (borrowed) self importance and validation.
Which is why i applaud every code of conduct that has public ridicule as punishment for wasting everybody's time
Problem is people seriously believe that whatever GPT tells them must be true, because… I don't even know. Just because it sounds self-confident and authoritative? Because computers are supposed to not make mistakes? Because talking computers in science fiction do not make mistakes like that? The fact that LLMs ended up having this particular failure mode, out of all possible failure modes, is incredibly unfortunate and detrimental to society.
This sounds a bit like the "Asking vs. Guessing culture" discussion on the front page yesterday. With the "Guesser" being GP who's front-loading extra investigation, debugging and maintenance work so the project maintainers don't have to do it, and with the "Asker" being the client from your example, pasting the submission to ChatGPT and forwarding its response.
Not OP, but I don't consider these the same thing.
The client in your example isn't a (presumably) professional developer, submitting code to a public repository, inviting the scrutiny of fellow professionals and potential future clients or employers.
Keep in mind that many people also contribute to big open source projects just because they believe it will look good on their CV/GitHub and help them get a job. They don't care about helping anyone, they just want to write "contributed to Ghostty" in their application.
TBH I'm not sure if this is a "growing up in a good area" vibe. But over the last decade or so I have had to slowly learn that the people around me have no sense of shame. This wasn't their fault, but mine. Society has changed, and if you don't adapt you'll end up confused and abused.
I am not saying one has to lose their shame, but at best, understand it.
Like with all things in life shame is best in moderation.
Too little or too much shame can lead to issues.
Problem is, no one tells you what too little or too much actually is, and there are many different situations where you need to figure it out on your own.
So I think sometimes people just get it wrong but ultimately everyone tries their best. Truly malicious shameless people are extremely rare in my experience.
For the topic at hand, I think a lot of these “shameless” contributions come from kids.
The adaptation is going to be that competent, knowledgeable people will begin forming informal and formal networks of people they know are skilled and intelligent, and begin to scorn the people who aren't. They will be less willing to work with people who don't have a proven record of competence. This results in greater stratification and makes it harder for people who aren't already part of the in-group to break in.
It doesn't help that it seems like society has been trending to reward individuals with a lack of shame. Fortune favors the bold, that is.
Think of a lot of the inflammatory content on social media, how people have made whole careers and fortunes over outrage, and they have no shame over it.
It really does begin to look like having a good sense of shame isn't rewarded in the same way.
I worked for a major open-source company for half a decade. Everyone thinks their contribution is a gift and you should be grateful. To quote Bo Burnham, "you think your dick is a gift, I promise it's not".
Sounds like everyone's got some main character syndrome, the cure for that is to be a meaningless cog in the enterprise wheels for a while. But then I suspect a lot of open source contributions are done exactly by those people - they don't really matter in their day job, but in open source they can Make A Difference.
Of course, the vast majority of OS work is the same cog-in-a-machine work, and with low effort AI assisted contributions, the non-hero-coding work becomes more prevalent than ever.
Kind of by definition we will not see the people who do not submit frivolous PRs that waste the time of other people. So keep in mind that there's likely a huge amount of survivor bias involved.
Just like with email spam I would expect that a big part of the issue is that it only takes a minority of shameless people to create a ton of contribution spam. Unlike email spam these people actually want their contributions to be tied to their personal reputation. Which in theory means that it should be easier to identify and isolate them.
"Other people" might also just be junior devs - I have seen time and again how (over-)confident newbies can be in their code. (I remember one case where a student suspected a bug in the JVM when some Java code of his caused an error.)
It's not necessarily maliciousness or laziness, it could simply be enthusiasm paired with lack of experience.
Funny, I had a similar experience TAing “Intro to CS” (a first-semester C programming course). The student was certain he had encountered a compiler bug, pushing back on my assumption that there was something wrong with their code (while compilers do have bugs, they are probably not in the code generation of a nested for loop). After spending a few minutes parsing their totally unindented code, the off-by-one error revealed itself.
Our postgres replication suddenly stopped working and it took three of us hours - maybe days - of looking through the postgres source before we actually accepted it wasn't us or our hosting provider being stupid and submitted a ticket.
I can't imagine the level of laziness or entitlement required for a student (or any developer) to blame their tools so quickly without conducting a thorough investigation.
I have found bugs in the native parts of the JVM; it usually takes some effort, though. Printing the assembly is the easiest approach. (I don't consider bugs in java.lang/util/io/etc. code an interesting case.)
Memory leaks and issues with the memory allocator are a months-long process to pin on the JVM...
In the early days (Bug Parade times), bugs were a lot more common; nowadays I'd say it would be extreme naivete to consider the JVM the culprit from the get-go.
It's good to regularly see such policies and the discussions around them, to remind me how staggeringly shameless some people can be and how many such people are out there. Interacting mostly with my peers, friends, and acquaintances, I tend to forget that they don't represent the average population, and after some time I start to assume all people are reasonable and act in good faith.
Yep, this. You can just look at the state of FOSS licensing across GitHub to see it in action: licenses are routinely stripped or changed to remove the original developers, even on trivial items, even on forked projects where the action is easily visible, even on licenses that allow for literally everything else. State "You can do everything except this" and loads of people will still actively do it, because they have no shame (or because they enjoy breaking someone else's rules? Because it gives them a power trip? Who knows).
Some people just want their name in the contributor list, whether it's for ego, to build a portfolio, etc. I think that's what it comes down to. Many projects, especially high profile ones, have to deal with low effort contributions - correcting spelling mistakes, reformatting code, etc. It's been going on for a long time. The Linux contributor guidelines - probably a lot of other projects too - specifically call this stuff out and caution people not to do it lest they suffer the wrath of the LKML. AI coding tools open up all kinds of new possibilities for these types of contributors, but it's not AI that's the problem.
A subset of open source contributors are only interested in getting something accepted so they can put it on their resume.
Any smart interviewer knows that you have to look at actual code of the contributions to confirm it was actually accepted and that it was a non-trivial change (e.g. not updating punctuation in the README or something).
In my experience this is where the PR-spammers fall apart in interviews. When they proudly tell you they’re a contributor to a dozen popular projects and you ask for direct links to their contributions, they start coming up with excuses for why they can’t find them or their story changes.
There are of course lazy interviewers who will see the resume line about having contributed to popular projects and take it as strong signal without second guessing. That’s what these people are counting on.
You just have to go take a look at what people write in social media, using their real name and photo, to conclude that no, some people have no shame at all.
I would imagine there are a lot of "small nice to haves" that people submit because they are frustrated about the mere complexity of submitting changes. Minor things that involve a lot of complexity merely in terms of changing some config or some default etc. Something where there is a significant probability of it being wrong but also a high probability of someone who knows the project being able to quickly see if it's ok or not.
i.e. imagine a change that is literally a small diff, that is easy to describe as a mere user and not a developer, and that requires quite a lot of deep understanding merely to submit as a PR (build the project! run the tests! write the template for the PR!).
Really, a lot of this stuff ends up being a kind of failure mode of various projects that we all fall into at some point, where "config" is in the code and what could be a simple change and test involves a lot of friction.
Obviously not all submissions are going to be like this but I think I've tried a few little ones like that where I would normally just leave whatever annoyance I have alone but think "hey maybe it's 10 min faff with AI and a PR".
The structure of the project incentives kind of creates this. Increasing the cost of contribution is a valid strategy of course, but from a holistic project point of view it is not always a good one, especially assuming you are not dealing with adversarial contributors but only slightly incompetent ones.
To have that shame, you need to know better. If you don’t know any better, having access to a model that can make code and a cursory understanding of the language syntax probably feels like knowing how to write good code. Dunning-Kruger strikes again.
I’ll bet there are probably also people trying to farm accounts with plausible histories for things like anonymous supply chain attacks.
When it comes to enabling opportunities, I don't think it's a matter of shame for them anymore.
A lot of people (especially in regions where living is tough and competition is fierce) will do anything by hook or crook to get ahead in competition. And if github contributions is a metric for getting hired or getting noticed then you are going to see it become spammed.
Funny enough, reading this makes me feel a little more confident and less... shame.
I've been deep-diving into AI code generation for more niche platforms, to see if it can either fill the coding gap in my skillset, or help me learn more code. And without writing my whole blog post(s) here, it's been fairly mediocre but improving over time.
But for the life of me I would never submit PRs of this code. Not if I can't explain every line and why it's there. And in preparation of publishing anything to my own repos I have a readme which explicitly states how the code was generated and requesting not to bother any upstream or community members with issues from it. It's just (uncommon) courtesy, no?
This is one thing I find funny about all the discussion around AI watermarking. Yes for absolutely nefarious bad actors it is incredibly important, but what seems clear is that the majority of AI users do absolutely nothing to conceal obvious tells of AI generation. Turns out people are shameless!
Two immediate ones I can think of:
- The yellow hue/sepia tone of any image coming out of ChatGPT
- People responding to text by starting with "Good Question!" or inserting hard-to-memorize-or-type unicode symbols like → into text where they obviously wouldn't have used that and have no history of using it.
> The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have
My guess is that those people have different incentives. They need to build a portfolio of open-source contributions, so shame is not of their concern. So, yeah, where you stand depends on where you sit.
To put this another way, shame is only effective if it's coupled with other repercussions with long standing effects.
An example I have of this is from high school where there were guys that were utterly shameless in asking girls for sex. The thing is it worked for them. Regardless of how many people turned them down they got enough of a hit rate it was an effective strategy. Simply put there was no other social mechanism that provided enough disincentive to stop them.
And to take the position as devil's advocate, why should they feel shame? Shame is typically a moral construct of the culture you're raised in and what to be ashamed for can vary widely.
For example, if you're raised in the culture of Abrahamic religions, it's very likely you're told to be ashamed of being gay. Whereas a non-religious upbringing is more likely to say: why the hell would you be ashamed of being gay?
TL;DR: shame is not an effective mechanism on the internet because you're dealing with far too many cultures that have wildly different views on shame, and any particular viewpoint on shame is apt to have millions to billions of people who don't believe the same.
It's because the AI is generating code better than they would write, and if you don't like it then that's fine... they didn't write it
it's easy to not have shame when you have no skin in the game... this is similar to how narcissists think so highly of themselves, it's never their fault
I am seeing the doomed future of AI math: just received another set theory paper by a set theory amateur with an AI workflow and an interest in the continuum hypothesis.
At first glance, the paper looks polished and advanced. It is beautifully typeset and contains many correct definitions and theorems, many of which I recognize from my own published work and from work by people I know to be experts. Between those correct bits, however, are sprinkled whole passages of claims and results with new technical jargon. One can't really tell at first, but upon looking into it, it seems to be meaningless nonsense. The author has evidently hoodwinked himself.
We are all going to be suffering under this kind of garbage, which is not easily recognizable for the slop it is without effort. It is our regrettable fate.
Lots of people cosplay as developers, and "contributing" to open source is a box they must check. It's like they go through the moves without understanding they're doing the opposite of what they should be doing. Same with having a tech blog, they don't understand that the end goal is not "having a blog" but "producing and sharing quality content"
> Other people apparently don't have this feeling at all.
I think this is interesting too. I've noticed the difference in dating/hook-up contexts. The people you're talking about also end up getting laid more, but that group also has a very large intersection with sex pests and other shitty people. The thing they have in common, though, is that they just don't care what other people think about them. That leads some of them to be successful if they are otherwise good people... or to become borderline or actual criminals if not. I find it fascinating actually: how does this difference come about, and can it actually be changed, or is it something we get early in life or from the genetic lottery?
The Internet (and developer communities) used to be a high trust society - mostly academics and developers, everyone with shared experiences of learning when it was harder to get resources, etc.
The grift culture has changed that completely, now students face a lot of pressure to spam out PRs just to show they have contributed something.
"The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have."
And this is one half of why I think
"Bad AI drivers will be [..] ridiculed in public."
isn't a good clause. The other half is that ridiculing others, no matter what, is just not decent behavior. Putting it as a rule in your policy document only makes it worse.
> The other half is that ridiculing others, no matter what, is just not decent behavior.
Shaming people for violating valid social norms is absolutely decent behaviour. It is the primary mechanism we have to establish social norms. When people do bad things that are harmful to the rest of society, shaming them is society's first-level corrective response to get them to stop doing bad things. If people continue to violate norms, then society's higher levels of corrective behaviour can involve things like establishing laws and fining or imprisoning people, but you don't want to start with that level of response. Although putting these LLM spammers in jail does sound awfully enticing to me in a petty way, it's probably not the most constructive way to handle the problem.
The fact that shamelessness is taking over in some cultures is another problem altogether, and I don't know how you deal with that. Certain cultures have completely abdicated the ability to influence people's behaviour socially without resorting to heavy-handed intervention, and on the internet, this becomes everyone in the world's problem. I guess the answer is probably cultivation of spaces with strict moderation to bar shameless people from participating. The problem could be mitigated to some degree if a Github-like entity outright banned these people from their platform so they could not continue to harass open-source maintainers, but there is no platform like that. It unfortunately takes a lot of unrewarding work to maintain a curated social environment on the internet.
No society can function without enforced rules. Most people do the pro-social thing most of the time. But for the rest, society must create negative experiences that help train people to do the right thing.
What negative experience do you think should instead be created for people breaking these rules?
Getting to live by the rules of decency is a privilege now denied us. I can accept that but I don't have to like it or like the people who would abuse my trust for their personal gain.
On a tangent: the origin of the problems with low-quality drive-by requests is github's social nature. That might have been great when GitHub started, but nowadays many use it as portfolio padding and/or social proof.
"This person contributed to a lot of projects" heuristic for "they're a good and passionate developer" means people will increasingly game this using low-quality submissions. This has been happening for years already.
Of course, AI just added kerosene to the fire, but re-read the policy and omit AI and it still makes sense!
A long-term fix for this is to remove the incentive. Paradoxically, AI might help here, because this can be gamed so trivially that it's obvious it's no longer any kind of signal.
Your point about rereading without AI makes so much sense.
The economics of it have changed, human nature hasn’t. Before 2023 (?) people also submitted garbage PRs just to be able to add “contributed to X” to their CV. It’s just become a lot cheaper.
Let's not forget Hacktoberfest, the scourge of open source for over a decade now, the driver of low-quality "contribution" spam by hordes of people doing it for a goddamn free t-shirt.
No, this problem isn't fundamentally about AI; it's about the "social" structure of GitHub and the incentives it creates (fame, employment).
Mailing lists essentially solve this by introducing friction: only those who genuinely care about the project will bother to git send-email and defend a patch over an email thread. The incentive for low-quality drive-by submissions also evaporates as there is no profile page with green squares to farm. The downside is that it potentially reduces the number of contributors by making it a lot harder for new contributors to onboard.
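For anyone who hasn't used that flow, it looks roughly like this (a minimal sketch; the branch name and list address are made up):

    # turn the commits on top of the upstream branch into patch files
    git format-patch origin/master
    # mail them to the project's list for review (requires send-email to be configured)
    git send-email --to=project-devel@lists.example.org *.patch

Just getting that far means you have cloned the repo, built it, and (presumably) convinced yourself the patch works, which is exactly the friction being described.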
I can see this becoming a pretty generally accepted AI usage policy. Very balanced.
Covers most of the points I'm sure many of us have experienced here while developing with AI. Most importantly, AI generated code does not substitute human thinking, testing, and clean up/rewrite.
On that last point, whenever I've gotten Codex to generate a substantial feature, usually I've had to rewrite a lot of the code to make it more compact even if it is correct. Adding indirection where it does not make sense is a big issue I've noticed LLMs make.
> AI generated code does not substitute human thinking, testing, and clean up/rewrite.
Isn't that the end goal of these tools and companies producing them?
According to the marketing[1], the tools are already "smarter than people in many ways". If that is the case, what are these "ways", and why should we trust a human to do a better job at them? If these "ways" keep expanding, which most proponents of this technology believe will happen, then the end state is that the tools are smarter than people at everything, and we shouldn't trust humans to do anything.
Now, clearly, we're not there yet, but where the line is drawn today is extremely fuzzy, and mostly based on opinion. The wildly different narratives around this tech certainly don't help.
> Isn't that the end goal of these tools and companies producing them?
It seems to be the goal. But they seem very far away from achieving that goal.
One thing you should probably account for is that most of the proponents of these technologies are trying to sell you something. That doesn't mean there is no value to these tools, but the wild claims about their capabilities are just that: claims.
This is such a good write-up and something I'm struggling with very hard. Does quality of code in the traditional sense even matter anymore if, e.g., CC can work with said code anyway? I haven't had imposter syndrome in a long time, but it's spiking hard now. Whenever I read or write code I feel like I'm an incompetent dev doing obsolete things.
Everything except the first provision is reasonable. IMO it's none of your damn business how I wrote the code, only that I understand it, and am responsible for it.
It's one of those provisions that seem reasonable but really have no justification. It's an attempt to allow something while extracting a cost. If I am responsible for my code, and am considered the author in the PR, then you as the recipient don't have a greater interest in knowing than my own personal preference not to disclose. There's never been any other requirement to disclose anything of this nature before. We don't require engineers to attest to the operating system or the licensing of the tools they use, so materially, outside your own curiosity, how does it matter?
It's a signal vs noise filter, because today, AI can make more mistakes. Your operating system or IDE cannot lead you to make a similar level or amount of mistakes while writing code.
It is of course your responsibility, but the maintainer may also want to change their review approach when dealing with AI-generated code. And currently, as the AI Usage Policy also states, because bad actors are sending pull requests without reviewing them or taking responsibility themselves, this acts as a filter to separate out your PR, which you have taken responsibility for.
Maintenance, for one. I imagine contributions that are 100% AI generated are more likely to have a higher maintenance burden and lower follow-up participation from the author in case fixes are needed.
I think I’m going to use it as a guide for our own internal AI guideline. We hire a lot of contractors and the amount of just awful code we get is really taking a toll and slowing site buildouts.
> Bad AI drivers will be banned and ridiculed in public. You've been warned. We love to help junior developers learn and grow, but if you're interested in that then don't use AI, and we'll help you. I'm sorry that bad AI drivers have ruined this for you.
Finally an AI policy I can agree with :) Jokes aside, it might sound a bit too aggressive, but it's also true that some people have really no shame in overloading you with AI-generated shit. You need to protect your attention as much as you can; it's becoming the new currency.
Well, this is explicitly public ridicule. The penalty isn't just feeling shamed. It's reputational harm, immortalized via Google.
One of the theorized reasons for junk AI submissions is reputation boosting. So maybe this will help.
And I think it will help with people who just bought into the AI hype and are proceeding without much thought. Cluelessness can look a lot like shamelessness at first.
I think it makes sense, both for this, and for curl.
Presumably people want this for some kind of prestige, so they can put it on their CV (contributed to ghostty/submitted a security issue to curl).
If we change that equation so they think "wait, if I do this, then when employers Google me they'll see a blog post saying I'm incompetent", the calculation shifts from neutral/positive (if their slop gets accepted) to negative/positive.
Doubly so as these bad AI drivers are trading away even the possibility of having attention. It's very possible to render yourself senseless through a habit of deference. Even if you're coming up with ways to optimize AI responses, you are just trying to make a more superior superior to defer to.
If you have good tests, certain types of change can be merged without manual testing. One problem specific to AI is that it has a tendency to game/bypass/nerf/disable tests, as opposed to actually making the code do the correct thing.
The quality of that verification matters; people who use AI tend to cut corners. This does not completely solve the problem of AI slop and solution quality, IMO. Ask Claude Code to go and implement a new feature in a complex code base and it will, and the code might even work, but the implementation might have subtle issues and might be missing the broader vision of the repo.
AI is so smart these days that I typically just ask Claude to verify the code for me.
This sort of request may have made sense in the old days, but as the quality of generated code rapidly increases, the necessity of human intervention decreases.
If you're going to put something on someone else's desk, you're going to have to own it.
If you don't check it yourself, then you're going to own whatever your tooling misses, and also own the amount of others' time you waste through what the project has decided to categorize as negligence, which will make you look worse than if you simply made an honest mistake.
You still live under the impression that they are able to reason about and verify it. Verification is done by doing it in The Real, not in your or some LLM's imagination (which is all their output is, at best).
I literally just scrolled past a thread discussing the psychology of shamelessness and undeserved self-confidence that creates all this drive-by AI pull request slop wasting everyone else’s time and see this comment…
If you care so little, why are you even prompting at all? Surely you can leave it to its own devices without troubling it with your wishes? It seems like the farther you go down this path, the more likely it is that it'll have something better to do.
I really like the phrase "bad AI drivers"...AI is a tool, and the stupid drive-by pull requests just mean you're being inconsiderate and unhelpful in your usage of the tool, similar to how "bad drivers" are a nightmare to encounter on a highway...so stop it or you'll end up on the dashcam subreddit of programming.
The experience of using a coding agent is that you're more of a "backseat driver" though. The AI acts as your driver and you tell it where to go, sometimes making corrections if it's going the wrong way.
The experience is what you make of it. Personally I'm quite enjoying using AI as a way to generate code I can disagree with and refactor into what I want.
A factor that people have not considered is that the copyright status of AI-generated text is not settled law, and precedent or new law may retroactively change the copyright status of a whole project.
Maybe a bit unlikely, but still an issue no one is really considering.
There has been a single ruling (I think) that AI-generated code is uncopyrightable. There has been at least one affirmative fair use ruling. Both of these are from the lower courts. I'm still of the opinion that generative AI is not fair use because it's clearly substitutive.
I agree with you that generative AI is clearly not fair use.
However, at this point, the economic impact of trying to untangle this mess would be so large that the courts likely won't do anything about it. You and I don't get to infringe on copyright; Microsoft, Facebook, and Google sure do, though.
I think the usage is so widespread now that the law will adapt to custom. It is untenable now to say that generated code is uncopyrightable, IMO. Maybe copyright as it is defined right now is not enough, but then legislators will change it. There is enough pressure on them from the business community to do so.
Some take that into consideration. I did when I was, until recently, in a CTO role, and I've come across companies that take compliance seriously and have decided against such code synthesis due to the unclear legal status.
I never thought of this; you are right. What happens if, let's say, AI-generated text/code is "illegal"? Especially, what happens with all the companies that have been using it for their products? Do they need to roll back? It would be a shit show, but super interesting to see it unfold...
> Ultimately, I want to see full session transcripts, but we don't have enough tool support for that broadly.
I have a side project, git-prompt-story, to attach Claude Code sessions to commits via GitHub git notes. Though it is not that simple to do automatically (e.g. I need to redact credentials).
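The underlying mechanism is plain git notes, so a manual version is possible without any tooling (a rough sketch; the notes ref name and file name are arbitrary examples):

    # attach a (redacted) session transcript to the current commit
    git notes --ref=ai-sessions add -F session-redacted.md HEAD
    # notes refs are not pushed by default, so push them explicitly
    git push origin refs/notes/ai-sessions
    # and fetch them on another machine
    git fetch origin refs/notes/ai-sessions:refs/notes/ai-sessions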
Not sure how I feel about transcripts. Ultimately I do my best to make any contributions I make high quality, and that means taking time to polish things. Exposing the tangled mess of my thought process leading up to that either means I have to "polish" that too (whatever that ends up looking like), or put myself in a vulnerable position of showing my tangled process to get to the end result.
I've thought about saving my prompts along with project development and even done it by hand a few times, but eventually I realized I don't really get much value from doing so. Are there good reasons to do it?
For me it's increasingly the work. I spend more time in Claude Code going back and forth with the agent than I do in my text editor hacking on the code by hand. Those transcripts ARE the work I've been doing. I want to save them in the same way that I archive my notes and issues and other ephemera around my projects.
It's not for you. It's so others can see how you arrived at the code that was generated. They can learn better prompting for themselves from it, and also how you think. They can see which cases got considered, or not. All sorts of good stuff that would be helpful for reviewing giant PRs.
If the AI generated most of the code based on these prompts, it's definitely valuable to review the prompts before even looking at the code. Especially in the case where contributions come from a wide range of devs at different experience levels.
At a minimum it will help you to be skeptical at specific parts of the diff so you can look at those more closely in your review. But it can inform test scenarios etc.
We're just not going to see any code written entirely without AI except in specialist niches, just as we don't see handwritten assembly and binaries. So the disclosure part is going to become boilerplate.
In the old era, the combination 'it works' + 'it uses a sophisticated language' + 'it integrates with a complex codebase' implied that this was an intentional effort by someone who knew what they were doing, and therefore probably safe to commit.
We can no longer make that social assumption. So then, what can we rely on to signal 'this was thoroughly supervised and reviewed and understood and tested?' That's going to be hard and subjective.
Personal reputations, track records, pedigrees, and brands are going to become more important in the industry, and the meritocratic "code talks no matter where you came from" ethos is at risk.
> No AI-generated media is allowed (art, images, videos, audio, etc.). Text and code are the only acceptable AI-generated content, per the other rules in this policy.
I find this distinction between media and text/code so interesting. To me it sounds like they think "text and code" are free from the controversy surrounding AI-generated media.
But judging from how AI companies grabbed all the art, images, videos, and audio they could get their hands on to train their LLMs it's naive to think that they didn't do the same with text and code.
> To me it sounds like "text and code" are free from the controversy surrounding AI-generated media.
It really isn't, don't you recall the "protests" against Microsoft starting to use repositories hosted at GitHub for training their own coding models? Lots of articles and sentiments everywhere at the time.
Seems to have died down though, probably because most developers seemingly at this point use LLMs in some capacity today. Some just use it as a search engine replacement, others to compose snippets they copy-paste and others wholesale don't type code anymore, just instructions then review it.
I'm guessing Ghostty feels like if they'd ban generated text/code, they'd block almost all potential contributors. Not sure I agree with that personally, but I'm guessing that's their perspective.
Right, that's what I'm thinking too (I'll update my statement a bit to make that more clear), but I constantly hear this perspective that it's all good for text and code but when it's media, then it's suddenly problematic. It's equally problematic for text and code.
It's not that code is distinct or "less than" art. It's an authority and boundaries question.
I've written a fair amount of open source code. On anything like a per-capita basis, I'm way above median in terms of what I've contributed (without consent) to the training of these tools. I'm also specifically "in the crosshairs" in terms of work loss from automation of software development.
I don't find it hard to convince myself that I have moral authority to think about the usage of gen AI for writing code.
The same is not true for digital art.
There, the contribution-without-consent, aka theft (I could frame it differently when I was the victim, but here I can't), is entirely from people other than me. The current and future damages won't be borne by me.
Alright, if I understand correctly, what you're saying is they make this distinction because they operate in the "text and code" space but not in the media space.
I've written _a lot_ of open source MIT licensed code, and I'm on the fence about that being part of the training data. I've published it as much for other people to use for learning purposes as I did for fun.
I also build and sell closed source commercial JavaScript packages, and more than likely those have ended up in the training data as well. Obviously without consent. So this is why I feel strong about making this separation between code and media, from my perspective it all has the same problem.
I’m starting to think AI will kill open source... and maybe even platforms like GitHub/GitLab as we know them.
What I’m seeing: a flood of new repos appearing on GitHub with huge codebases and "extensive" documentation, often produced in two or three commits. The problem is that nobody uses them, nobody reads the docs, and many of these projects don’t provide real value. But the infrastructure cost is real: storing it all, indexing it, scanning it, backing it up, mirroring it....
Licensing is another issue. Licenses protect against copying, but AI totally changes the game: someone can download a repo, ask Claude to analyze and understand it, and then generate a similar solution with no verbatim copying. That's likely legal... so the GPL becomes irrelevant...
If that becomes normal, I can easily imagine companies pulling back from open source. Why publish your best work if anyone can cheaply reimplement it? Code will move back to closed source and become the "secret sauce" again. A black box is harder to vibe-code than an open source repo...
> No AI-generated media is allowed (art, images, videos, audio, etc.). Text and code are the only acceptable AI-generated content, per the other rules in this policy
What's the reason for this?
Media is the most likely thing I'd consider using AI for as part of a contribution to an open source project.
My code would be hand crafted by me. Any AI use would be similar to Google use: a way to search for examples and explanations if I'm unclear on something. Said examples and explanations would then be read, and after I understand what is going on I'd write my code.
Any documentation I contributed would also be hand written. However, if I wanted to include a diagram in that documentation I might give AI a try. It can't be worse than my zero talent attempts to make something in OmniGraffle or worse a photograph of my attempt to draw a nice diagram on paper.
I'd have expected this to be the least concerning use of AI.
Very very few companies today have zero lines of AI generated code in their codebases. You can't copyright or patent specific code structures or ways of solving common problems.
At the Zulip open-source project, we've had a significant onslaught of AI slop in the past few months. It gets as absurd as PR descriptions with AI-generated "screenshots" of the app to "demonstrate" the changes. We've had to start warning contributors that we won't be able to review their work if they continue misusing AI, and occasionally banning repeat offenders. It feels draining -- we want to spend our time mentoring people who'll actually learn from feedback, not interacting with contributors who are just copy-pasting LLM responses without thought.
it's really strange, I maintain a lot of OSS and I just don't see it. I've had a bit of slop, and the # of PRs I receive has 10x'ed, but the quality is generally quite good. I wonder if maybe it's because I make dev tool CLIs that are just easier for AI to work with?
A well-crafted policy that, I think, will be adopted by many OSS projects.
You'd need rules that sharp to contend with unhinged (or drunken) AI drivers, and that's unfortunate. But at the same time, letting people DoS maintainers' time at essentially no cost is not an option either.
I wholeheartedly agree. It's the people that are the problem, not the technology. In the hands of people who understand its utility and limitations, AI becomes an assistant you can't imagine life without. In the hands of people who aren't so intellectually curious, it helps them run into a brick wall much faster.
My former boss often shared the words of his father: "A fool with a tool...is still a fool!"
Until now code was something costly to make and could only be created by our monkey brains.
But now we have some kind of electronic brains that can also generate code, not at the level of the best human brains out there but good enough for most projects. And they are quicker and cheaper than humans, for sure.
So maybe in the end this will reduce the need for human contributions to opensource projects.
I just know that as a solo developer, AI coding agents enable me to tackle projects I wouldn't have thought about even starting before.
It is important to write the code yourself so you understand how it functions. I tried vibe coding a little bit. I totally felt like I was reading someone else's code base.
Sanitization practices of AI are bad too.
Let me be clear: nothing wrong with AI in your workflow, just be an active participant in your code. Code is not meant to be one-and-done.
You will go through iteration after iteration, security fix after fix. This is how development is.
sounds reasonable to me. i've been wondering about encoding detailed AI disclosure in an SBOM.
on a related note: i wish we could agree on rebranding the current LLM-driven never-gonna-AGI generation of "AI" to something else… now i'm thinking of when i read the in-game lore definition for VI (Virtual Intelligence) back when i played mass effect 1 ;)
I think a social norm of disclosing AI use at all times would be great. People and companies should also be held 100% accountable for anything created using AI.
Ultimately what's happening here is AI is undermining trust in remote contributions, and in new code. If you don't know somebody personally, and know how they work, the trust barrier is getting higher. I personally am already ultra vigilant for any github repo that is not already well established, and am even concerned about existing projects' code quality into the future. Not against AI per se (which I use), but it's just going to get harder to fight the slop.
The problem is that most aren’t good, and bad ones can take a lot of effort to distinguish, if they look plausible on the surface. So the potentially good ones aren’t worth all the bad ones.
I agree with most of them being bad, I disagree with them taking lots of effort to distinguish, and I am a maintainer unfortunately receiving more and more of them made with AI.
the endless issue with coding policy statements like this is that the people who need to read them the most are the ones who couldn't care less and don't read anything.
I think that a warning of public ridicule may be fine. However, actually doing it is quite low brow IMO. I'm sad to see more and more otherwise admirable projects step down to that (assuming they actually do it).
An unenforced threat is toothless. Publicly stating that we do not appreciate XYZ PR that was AI-generated, low effort, and in bad faith is perfectly acceptable.
> Issues and discussions can use AI assistance but must have a full human-in-the-loop. This means that any content generated with AI must have been reviewed and edited by a human before submission.
I can see this being a problem. I read a thread here a few weeks ago where someone was called out on submitting an AI slop article they wrote with all the usual tells. They finally admitted it but said something to the effect they reviewed it and stood behind every line.
The problem with AI writing is at least some people appear incapable of critically reviewing it. Writing something yourself eliminates this problem because it forces you to pick your words (there could be other problems of course).
So the AI-blind will still submit slop under the policy but believe themselves to have reviewed it and “stand behind” it.
Honestly I don't care how people come with the code they create, but I hold them responsible for what they try to merge.
I work in a team of 5 great professionals; there hasn't been a single instance since Copilot launched in 2022 where anybody, in any single modification, did not take full responsibility for what was committed.
I know we all use it, to different extent and usage, but the quality of what's produced hasn't dipped a single bit, I'd even argue it has improved because LLMs can find answers easier in complex codebases. We started putting `_vendor` directories with our main external dependencies as git subtrees, and it's super useful to find information about those directly in their source code and tests.
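For reference, the subtree setup is roughly this (a sketch; the repository URL, prefix, and tags are examples, not our actual dependencies):

    # vendor a dependency under _vendor/, squashing its history into one commit
    git subtree add --prefix=_vendor/somelib https://github.com/example/somelib.git v1.2.3 --squash
    # later, pull in an upstream update
    git subtree pull --prefix=_vendor/somelib https://github.com/example/somelib.git v1.3.0 --squash

The upside over submodules is that the dependency's source and tests are right there in the working tree for the LLM (and humans) to grep.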
It's really as simple as that. If your teammates are producing slop, that's a human and professional problem and these people should be fired. If you use the tool correctly, it can help you a lot in finding information and connecting dots.
Any person with a brain can clearly see the huge benefit of these tools, but also the great danger of not reviewing their output line by line and forfeiting the constant work of resolving design tensions.
Of course, open source is a different beast. The people committing may not be professionals and have no real stakes so they get little to lose by producing slop whereas maintainers are already stretched in their time and attention.
> It's really as simple. If you or your teammates are producing slop, that's a human and professional problem and these people should be fired.
Agree. Slop isn't "the tool is so easy to use I can't review the code I'm producing"; slop is the symptom of "I don't care how it's done, as long as it looks correct", and that's been a problem before LLMs too. The difference is how quickly you reach the "slop" state now, not whether you have to gate your codebase and reject shit code.
As always, most problems in "software programming" aren't about software or programming but about everything around it, including communication and workflows. If your workflow allows people to not be responsible for what they produce, and if it allows shitty code to get into production, then that's on you and your team, not on the tools that individuals use.
I mean this policy only applies to outside contributors and not the maintainers.
> Ghostty is written with plenty of AI assistance, and many maintainers embrace AI tools as a productive tool in their workflow. As a project, we welcome AI as a tool!
> Our reason for the strict AI policy is not due to an anti-AI stance, but instead due to the number of highly unqualified people using AI. It's the people, not the tools, that are the problem.
Basically don't write slop and if you want to contribute as an outsider, ensure your contribution actually is valid and works.
Another idea is to simply promote the donation of AI credits instead of output tokens. It would be better to donate credits, not outputs, because people already working on the project would be better at prompting and steering AI outputs.
>people already working on the project would be better at prompting and steering AI outputs.
In an ideal world sure, but I've seen the entire gamut from amateurs making surprising work to experts whose prompt history looks like a comedy of errors and gotchas. There's some "skill" I can't quite put my finger on when it comes to the way you must speak to an LLM vs another dev. There's more monkey-paw involved in the LLM process, in the sense that you get what you want, but do you want what you'll get?
This is the most well-informed and reasonable AI policy I've seen so far. Neither kneejerk-hostile nor laissez faire; just a mature understanding of the limitations of LLMs, and an insistence on transparency and accountability when using them.
TL;DR: don't be an asshole and produce good stuff. But I have the feeling that this is not the right direction for the future. Distrust the process: only trust the results.
Moreover this policy is strictly unenforceable because good AI use is indistinguishable from good manual coding. And sometimes even the reverse. I don't believe in coding policies where maintainers need to spot if AI is used or not. I believe in experienced maintainers that are able to tell if a change looks sensible or not.
As someone who has recently picked up some 'legacy' code: AI has been really good at summing up what is going on. In many cases it finds things I had no idea were wrong (because I do not know the code very well yet). This is so-called 'battle-hardened code'. I review it and say 'yeah, it is wildly broken, and I see how the original developer ended up here'. Sometimes the previous dev would be nice enough to leave a comment; for some devs, 'the code is the comments'. I have also had AI go wildly off the rails and do very dumb things. It is an interesting tool for sure, one you have to keep an eye on or it will confidently make a foot gun for you. It is also nice for someone like me who has some sort of weird social anxiety thing about bugging my fellow devs, in that I can create options tables and pick good ideas out of them.
I'm not sure I agree it's completely unenforceable: a sloppy, overly verbose PR, maybe without an attached issue, is pretty easy to pick out.
There are some sensible, easily-judged-by-a-human rules in here. I like the spirit of it and it's well written (I assume by Mitchell, not Claude, given the brevity).
This doesn't work in the age of AI where producing crappy results is much cheaper than verifying them. While this is the case, metadata will be important to understand if you should even bother verifying the results.
The time needed to spot an AI patch written with the prompt "now make it as small as possible, clean, and human coded" is as big as the time needed to review the patch itself.
Banned I understand but ridiculed? I would say that these bad drive by spammers are analogous to phishing emails. Do you engage with those? Are they worth any energy or effort from you? I think ghostty should just ghost them :)
EDIT: I'm getting downvoted with no feedback, which is fine I guess, so I am just going to share some more colour on my opinion in case I am being misunderstood
What I meant by analogous to phishing is that the intent of the work is likely one of personal reward, and perhaps less a desire to contribute. I was thinking they want their name on the contributors list, they want the credit, they want something, and they don't want to put effort into it.
Do they deserve to be ridiculed for doing that? Maybe. However, I like to think humans deserve kindness sometimes. It's normal to want something, and I agree that it is not okay to be selfish and lazy about it (ignoring contribution rules and whatnot), so at minimum I think respect applies.
Some people are ignorant, naive, and still maturing and growing. Bullying them may not help (though it could), and mockery is a form of aggression.
I think some genuine false positives will fall into that category and pay the price for those who are truly ill-intentioned.
Lastly, to ridicule is to care. To hate or attack requires caring about the thing. It requires effort, energy, and time from the maintainers. I think this just adds more waste.
Maybe those wordings are there just to 'scare' people away and maintainers won't bother engaging, though I find it is just compounding the amount of garbage at this point and nobody benefits from it.
Anyways, would appreciate some feedback from those of you that seem to think otherwise.
> You must state the tool you used (e.g. Claude Code, Cursor, Amp)
Interesting requirement! Feels a bit like asking someone what IDE they used.
There shouldn't be that meaningful of a difference between the different tools/providers unless you'd consistently see a few underperform and would choose to ban those or something.
The other rules feel like they might discourage AI use due to more boilerplate needed (though I assume the people using AI might make the AI fill out some of it), though I can understand why a project might want to have those sorts of disclosures and control. That said, the rules themselves feel quite reasonable!
The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have. I have a handful of open source contributions. All of them are for small-ish projects and the complexity of my contributions are in the same ball-park as what I work on day-to-day. And even though I am relatively confident in my competency as a developer, these contributions are probably the most thoroughly tested and reviewed pieces of code I have ever written. I just really, really don't want to bother someone with low quality "help" who graciously offers their time to work on open source stuff.
Other people apparently don't have this feeling at all. Maybe I shouldn't have been surprised by this, but I've definitely been caught off guard by it.
It's because a lot of people that werent skilful werent on your path before. Now that pandora's box has been re-opened, those people feel "they get a second chance at life". It's not that they have no shame, they have no perspective to put that shame.
You on the other hand, have for many years honed your craft. The more you learn, the more you discover to learn aka , you realize how little you know. They don't have this. _At all_. They see this as a "free ticket to the front row" and when we politely push back (we should be way harsher in this, its the only language they understand) all they hear is "he doesn't like _me_." which is an escape.
You know how much work you ask of me, when you open a PR on my project, they don't. They will just see it as "why don't you let me join, since I have AI I should have the same skill as you".... unironically.
In other words, these "other people" that we talk about haven't worked a day in the field in their life, so they simply don't understand much of it, however they feel they understand everything of it.
This is so completely spot on. It’s happening in other fields too, particularly non-coding (but still otherwise specialized or technical) areas. AI is extremely empowering but what’s happening is that people are now showing up in all corners of the world armed with their phone at the end of their outstretched arm saying “Well ChatGPT says…” and getting very upset when told that, no, many apologies, but ChatGPT is wrong here too.
That all makes sense. But the more I know, the more I realize that a lot of software engineering isn't about crazy algorithms and black magic. I'd argue a good 80% of it is the ability to pick up the broken glass, something even many students can pull off. 15% of that comes down to avoiding landmines in a large field as you pick up said glass.
But that care isn't even evident here. People submitting prs that don't even compile, bug reports for issues that may not even exist. The minimum I'd expect is to check the work of whatever you vibe coded. We can't even get that. It's some. Odd form of clout chasing as if repos are a factor of success, not what you contribute to them.
I find that interesting because for the first 10 years of my career, I didn’t feel any confidence in contributing to open source at all because I didn’t feel I had the expertise to do so. I was even reluctant to file bugs because I always figured I was on the wrong and I didn’t want to cause churn for the maintainers.
This is easily the most spot-on comment I've read on HN in a long time.
The humility of understanding what you don't know and the limitations of that is out the window for many people now. I see time and time again the idea that "expertise is dead". Yet it's crystal clear it's not. But those people cannot understand why.
It all boils down to a simple reality: you can't understand why something is fundamentally bad if you don't understand it at all.
It's not as if that sort of person didn't exist in our profession even before the rise of LLMs, as evidenced by the not-infrequent comments about "gatekeeping" and "nobody needs to know academic stuff in a real day-to-day job" on HN.
> The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have.
ever had a client second-guess you by replying with a screenshot from GPT?
ever asked anything in a public group only to have a complete moron reply with a screenshot from GPT or - at least a bit of effort there - a copy/paste of the wall of text?
no, people have no shame. they have a need for a little bit of (borrowed) self-importance and validation.
Which is why I applaud every code of conduct that has public ridicule as the punishment for wasting everybody's time.
Problem is people seriously believe that whatever GPT tells them must be true, because… I don't even know. Just because it sounds self-confident and authoritative? Because computers are supposed to not make mistakes? Because talking computers in science fiction do not make mistakes like that? The fact that LLMs ended up having this particular failure mode, out of all possible failure modes, is incredibly unfortunate and detrimental to the society.
This sounds a bit like the "Asking vs. Guessing culture" discussion on the front page yesterday, with the "Guesser" being the GP, who front-loads extra investigation, debugging, and maintenance work so the project maintainers don't have to do it, and the "Asker" being the client from your example, pasting the submission into ChatGPT and forwarding its response.
I've also had the opposite.
I raise an issue or PR after carefully reviewing someone else's open source code.
They asked Claude to answer me; neither they nor Claude understood the issue.
Well, at least it's their repo, they can do whatever.
Not OP, but I don't consider these the same thing.
The client in your example isn't a (presumably) professional developer, submitting code to a public repository, inviting the scrutiny of fellow professionals and potential future clients or employers.
Our CEO chiming in on a technical discussion between engineers: by the way, this is what Claude says: *some completely made-up bullshit*
It hasn't happened to me yet.
I'm not looking forward to it...
Random people don’t do this. Your boss however…
Keep in mind that many people also contribute to big open source projects just because they believe it will look good on their CV/GitHub and help them get a job. They don't care about helping anyone; they just want to write "contributed to Ghostty" in their application.
I think this falls under the "have no shame" comment that they made
It's worse. Some of them are required to contribute to an existing project of their choice for some course they're taking.
From my experience, it's not about helping anyone or CV building. I just ran into a bug or a missing feature that is blocking me.
TBH I'm not sure if this is a "growing up in a good area" vibe. But over the last decade or so I have had to slowly learn that the people around me have no sense of shame. This wasn't their fault, but mine. Society has changed, and if you don't adapt you'll end up confused and abused.
I am not saying one has to lose their shame, but at the very least, understand it.
Like with all things in life, shame is best in moderation.
Too little or too much shame can lead to issues.
The problem is that no one tells you what too little or too much actually is, and there are many different situations where you need to figure it out on your own.
So I think sometimes people just get it wrong, but ultimately everyone tries their best. Truly malicious, shameless people are extremely rare in my experience.
As for the topic at hand, I think a lot of these "shameless" contributions come from kids.
The adaptation is going to be that competent, knowledgeable people will begin forming informal and formal networks of people they know are skilled and intelligent, and begin to scorn the people who aren't. They will be less willing to work with people who don't have a proven record of competence. This results in greater stratification and makes it harder for people who aren't already part of the in-group to break in.
Shame is a good thing: it shows one has a conscience and positive self-regard.
Just like pain is a good thing: it signals you to remove your hand from the stove.
It doesn't help that it seems like society has been trending to reward individuals with a lack of shame. Fortune favors the bold, that is.
Think of a lot of the inflammatory content on social media, how people have made whole careers and fortunes over outrage, and they have no shame over it.
It really does begin to look like having a good sense of shame isn't rewarded in the same way.
Lack of shame, and antisocial behavior in general are also directly economically rewarded nowadays thanks to the attention economy.
I worked for a major open-source company for half a decade. Everyone thinks their contribution is a gift and you should be grateful. To quote Bo Burnham, "you think your dick is a gift, I promise it's not".
> To quote Bo Burnham, "you think your dick is a gift, I promise it's not".
For those curious:
https://www.youtube.com/watch?v=llGvsgN17CQ
Sounds like everyone's got some main character syndrome, the cure for that is to be a meaningless cog in the enterprise wheels for a while. But then I suspect a lot of open source contributions are done exactly by those people - they don't really matter in their day job, but in open source they can Make A Difference.
Of course, the vast majority of OS work is the same cog-in-a-machine work, and with low effort AI assisted contributions, the non-hero-coding work becomes more prevalent than ever.
Kind of by definition we will not see the people who do not submit frivolous PRs that waste the time of other people. So keep in mind that there's likely a huge amount of survivor bias involved.
Just like with email spam I would expect that a big part of the issue is that it only takes a minority of shameless people to create a ton of contribution spam. Unlike email spam these people actually want their contributions to be tied to their personal reputation. Which in theory means that it should be easier to identify and isolate them.
All email is spam.
"Other people" might also just be junior devs - I have seen time and again how (over-)confident newbies can be in their code. (I remember one case where a student suspected a bug in the JVM when some Java code of his caused an error.)
It's not necessarily maliciousness or laziness, it could simply be enthusiasm paired with lack of experience.
Funny, I had a similar experience TAing "Intro to CS" (a first-semester C programming course). The student was certain he had encountered a compiler bug, pushing back on my assumption that there was something wrong with his code (while compilers do have bugs, they are probably not in the code generation for a nested for loop). After spending a few minutes parsing his totally unindented code, the off-by-one error revealed itself.
Our postgres replication suddenly stopped working and it took three of us hours - maybe days - of looking through the postgres source before we actually accepted it wasn't us or our hosting provider being stupid and submitted a ticket.
I can't imagine the level of laziness or entitlement required for a student (or any developer) to blame their tools so quickly without conducting a thorough investigation.
I have found bugs in the native JVM; it usually takes some effort, though. Printing the assembly is the easiest approach. (I consider bugs in the java.lang/util/io/etc. code not an interesting case.)
Memory leaks and issues with the memory allocator are a months-long process to pin on the JVM...
In the early days (bug parade times), bugs were a lot more common; nowadays I'd say it'd be extreme naivete to consider the JVM the culprit from the get-go.
It's good to regularly see such policies and the discussions around them to remind me how staggeringly shameless some people can be and how many such people are out there. Interacting mostly with my peers, friends, and acquaintances, I tend to forget that they don't represent the average population, and after some time I start to assume all people are reasonable and act in good faith.
Yep, this. You can just look at the state of FOSS licensing across GitHub to see it in action: licenses are routinely stripped or changed to remove the original developers, even on trivial items, even on forked projects where the action is easily visible, even on licenses that allow for literally everything else. State "You can do everything except this" and loads of people will still actively do it, because they have no shame (or because they enjoy breaking someone else's rules? Because it gives them a power trip? Who knows).
Some people just want their name in the contributor list, whether it's for ego, to build a portfolio, etc. I think that's what it comes down to. Many projects, especially high profile ones, have to deal with low effort contributions - correcting spelling mistakes, reformatting code, etc. It's been going on for a long time. The Linux contributor guidelines - probably a lot of other projects too - specifically call this stuff out and caution people not to do it lest they suffer the wrath of the LKML. AI coding tools open up all kinds of new possibilities for these types of contributors, but it's not AI that's the problem.
A subset of open source contributors are only interested in getting something accepted so they can put it on their resume.
Any smart interviewer knows that you have to look at actual code of the contributions to confirm it was actually accepted and that it was a non-trivial change (e.g. not updating punctuation in the README or something).
In my experience this is where the PR-spammers fall apart in interviews. When they proudly tell you they’re a contributor to a dozen popular projects and you ask for direct links to their contributions, they start coming up with excuses for why they can’t find them or their story changes.
There are of course lazy interviewers who will see the resume line about having contributed to popular projects and take it as strong signal without second guessing. That’s what these people are counting on.
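One low-effort way to run that check, by the way, is GitHub's own search qualifiers instead of whatever links the candidate provides; something along these lines (the username and repository are placeholders):

    is:pr is:merged author:some-candidate repo:someorg/someproject

That surfaces every merged PR by that account in the project, so a "contributor to a dozen popular projects" claim takes about a minute to verify or debunk.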
You just have to go take a look at what people write in social media, using their real name and photo, to conclude that no, some people have no shame at all.
I would imagine there are a lot of "small nice-to-haves" that people submit because they are frustrated by the sheer friction of submitting changes: minor things that are complicated only in the sense of changing some config or some default, where there is a significant probability of the change being wrong but also a high probability that someone who knows the project can quickly see whether it's OK or not.
i.e. imagine a change that is literally a small diff, that is easy to describe as a mere user rather than a developer, and that requires quite a lot of deep understanding merely to submit as a PR (build the project! run the tests! fill out the PR template!).
Really, a lot of this ends up being a failure mode of various projects that we all fall into at some point, where "config" lives in the code and what could be a simple change and test instead involves a lot of friction.
Obviously not all submissions are going to be like this, but I've tried a few little ones like that where I would normally just leave whatever annoyance I have alone but think "hey, maybe it's a 10-minute faff with AI and a PR".
The structure of project incentives kind of creates this. Increasing the cost of contribution is a valid strategy of course, but from a holistic project point of view it is not always a good one, especially assuming you are not dealing with adversarial contributors but only slightly incompetent ones.
To have that shame, you need to know better. If you don't know any better, having access to a model that can produce code, plus a cursory understanding of the language syntax, probably feels like knowing how to write good code. Dunning-Kruger strikes again.
I’ll bet there are probably also people trying to farm accounts with plausible histories for things like anonymous supply chain attacks.
When it comes to enabling opportunities, I don't think it's a matter of shame for them anymore. A lot of people (especially in regions where living is tough and competition is fierce) will do anything, by hook or by crook, to get ahead of the competition. And if GitHub contributions are a metric for getting hired or getting noticed, then you are going to see them get spammed.
Funny enough, reading this makes me feel a little more confident and less... shame.
I've been deep-diving into AI code generation for more niche platforms, to see if it can either fill the coding gap in my skillset, or help me learn more code. And without writing my whole blog post(s) here, it's been fairly mediocre but improving over time.
But for the life of me I would never submit PRs of this code. Not if I can't explain every line and why it's there. And in preparation for publishing anything to my own repos, I have a readme which explicitly states how the code was generated and requests that people not bother any upstream or community members with issues from it. It's just (uncommon) courtesy, no?
This is one thing I find funny about all the discussion around AI watermarking. Yes for absolutely nefarious bad actors it is incredibly important, but what seems clear is that the majority of AI users do absolutely nothing to conceal obvious tells of AI generation. Turns out people are shameless!
Two immediate ones I can think of:
- The yellow hue/sepia tone of any image coming out of ChatGPT
- People responding to text by starting with "Good Question!" or inserting hard-to-memorize-or-type unicode symbols like → into text where they obviously wouldn't have used that and have no history of using it.
> how little shame people apparently have
You can expand this sentiment to everyday life. The things some people are willing to say and do in public are a never-ending supply of surprises.
The major companies that made available the very tools they use to create this spam code, applied the exact same ethics.
> The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have
My guess is that those people have different incentives. They need to build a portfolio of open-source contributions, so shame is not their concern. So, yeah, where you stand depends on where you sit.
Shamelessness is very definitely in vogue at the moment. It will pass; let's hope it leaves more than ruins.
To put this another way, shame is only effective if it's coupled with other repercussions that have long-standing effects.
An example I have of this is from high school, where there were guys who were utterly shameless in asking girls for sex. The thing is, it worked for them. Regardless of how many people turned them down, they got enough of a hit rate that it was an effective strategy. Simply put, there was no other social mechanism that provided enough disincentive to stop them.
And to play devil's advocate, why should they feel shame? Shame is typically a moral construct of the culture you're raised in, and what to be ashamed of can vary widely.
For example, if you're raised in the culture of the Abrahamic religions, it's very likely you're told to be ashamed of being gay, whereas a non-religious upbringing is more likely to ask why the hell you would be ashamed of being gay.
TL;DR: shame is not an effective mechanism on the internet, because you're dealing with far too many cultures that have wildly different views on shame, and any particular viewpoint on shame is apt to have millions to billions of people who don't share it.
It's because the AI is generating code better than they would write, and if you don't like it then that's fine... they didn't write it
it's easy to not have shame when you have no skin in the game... this is similar to how narcissists think so highly of themselves, it's never their fault
I'm not surprised. Lower barrier of entry -- thanks to AI in this case -- often leads to a decrease in quality in most things.
https://x.com/JDHamkins/status/2014085911110131987
I am seeing the doomed future of AI math: just received another set theory paper by a set theory amateur with an AI workflow and an interest in the continuum hypothesis.
At first glance, the paper looks polished and advanced. It is beautifully typeset and contains many correct definitions and theorems, many of which I recognize from my own published work and in work by people I know to be expert. Between those correct bits, however, are sprinkled whole passages of claims and results with new technical jargon. One can't really tell at first, but upon looking into it, it seems to be meaningless nonsense. The author has evidently hoodwinked himself.
We are all going to be suffering under this kind of garbage, which is not easily recognizable for the slop it is without effort. It is our regrettable fate.
Lots of people cosplay as developers, and "contributing" to open source is a box they must check. It's like they go through the motions without understanding that they're doing the opposite of what they should be doing. Same with having a tech blog: they don't understand that the end goal is not "having a blog" but "producing and sharing quality content".
> The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have.
My guess is it's mostly people from countries with a culture that reward shameless behavior.
> Other people apparently don't have this feeling at all.
I think this is interesting too. I've noticed the difference in dating/hook-up contexts. The people you're talking about also end up getting laid more, but that group also has a very large intersection with sex pests and other shitty people. The thing they have in common, though, is that they just don't care what other people think about them. That leads some of them to be successful if they are otherwise good people... or to become borderline or actual criminals if not. I find it fascinating, actually: how does this difference come about, and can it actually be changed, or is it something we get early in life or from the genetic lottery?
The Internet (and developer communities) used to be a high trust society - mostly academics and developers, everyone with shared experiences of learning when it was harder to get resources, etc.
The grift culture has changed that completely, now students face a lot of pressure to spam out PRs just to show they have contributed something.
If you are from poor society you can't afford to have shame. You either succeed or fail, again and again, and keep trying.
In other news, wet roads cause rain.
"The biggest surprise to me with all this low-quality contribution spam is how little shame people apparently have."
And this is one half of why I think
"Bad AI drivers will be [..] ridiculed in public."
isn't a good clause. The other is that ridiculing others, no matter what, is just not decent behavior. Putting it as a rule in your policy document only makes it worse.
> The other is that ridiculing others, no matter what, is just not decent behavior.
Shaming people for violating valid social norms is absolutely decent behaviour. It is the primary mechanism we have to establish social norms. When people do bad things that are harmful to the rest of society, shaming them is society's first-level corrective response to get them to stop doing bad things. If people continue to violate norms, then society's higher levels of corrective behaviour can involve things like establishing laws and fining or imprisoning people, but you don't want to start with that level of response. Although putting these LLM spammers in jail does sound awfully enticing to me in a petty way, it's probably not the most constructive way to handle the problem.
The fact that shamelessness is taking over in some cultures is another problem altogether, and I don't know how you deal with that. Certain cultures have completely abdicated the ability to influence people's behaviour socially without resorting to heavy-handed intervention, and on the internet, this becomes everyone in the world's problem. I guess the answer is probably cultivation of spaces with strict moderation to bar shameless people from participating. The problem could be mitigated to some degree if a Github-like entity outright banned these people from their platform so they could not continue to harass open-source maintainers, but there is no platform like that. It unfortunately takes a lot of unrewarding work to maintain a curated social environment on the internet.
No society can function without enforced rules. Most people do the pro-social thing most of the time. But for the rest, society must create negative experiences that help train people to do the right thing.
What negative experience do you think should instead be created for people breaking these rules?
Getting to live by the rules of decency is a privilege now denied us. I can accept that but I don't have to like it or like the people who would abuse my trust for their personal gain.
Tit for tat
On a tangent: the origin of the problems with low-quality drive-by requests is github's social nature. That might have been great when GitHub started, but nowadays many use it as portfolio padding and/or social proof.
"This person contributed to a lot of projects" heuristic for "they're a good and passionate developer" means people will increasingly game this using low-quality submissions. This has been happening for years already.
Of course, AI just added kerosene to the fire, but re-read the policy and omit AI and it still makes sense!
A long-term fix for this is to remove the incentive. Paradoxically, AI might help here, because this can be so trivially gamed that it's obvious it's no longer any kind of signal.
Your point about rereading without ai makes so much sense.
The economics of it have changed, human nature hasn’t. Before 2023 (?) people also submitted garbage PRs just to be able to add “contributed to X” to their CV. It’s just become a lot cheaper.
Let's not forget Hacktoberfest, the scourge of open source for over a decade now, the driver of low-quality "contribution" spam by hordes of people doing it for a goddamn free t-shirt.
No, this problem isn't fundamentally about AI, it's about "social" structure of Github and incentives it creates (fame, employment).
Mailing lists essentially solve this by introducing friction: only those who genuinely care about the project will bother to git send-email and defend a patch over an email thread. The incentive for low-quality drive-by submissions also evaporates as there is no profile page with green squares to farm. The downside is that it potentially reduces the number of contributors by making it a lot harder for new contributors to onboard.
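For anyone who has never used that workflow, here's a rough sketch of what a mailing-list contribution involves (the list address and patch file name are placeholders):

    git clone https://example.org/project.git
    cd project
    # ...hack, test, commit...
    git format-patch -1 HEAD
    git send-email --to=project-devel@lists.example.org 0001-your-change.patch

You then defend the patch in the reply thread and resend a v2 if needed. Every step costs a little attention, which is exactly the friction being described.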
I can see this becoming a pretty generally accepted AI usage policy. Very balanced.
Covers most of the points I'm sure many of us have experienced here while developing with AI. Most importantly, AI generated code does not substitute human thinking, testing, and clean up/rewrite.
On that last point, whenever I've gotten Codex to generate a substantial feature, I've usually had to rewrite a lot of the code to make it more compact even if it is correct. Adding indirection where it does not make sense is a big issue I've noticed with LLMs.
I agree with you on the policy being balanced.
However:
> AI generated code does not substitute human thinking, testing, and clean up/rewrite.
Isn't that the end goal of these tools and companies producing them?
According to the marketing[1], the tools are already "smarter than people in many ways". If that is the case, what are these "ways", and why should we trust a human to do a better job at them? If these "ways" keep expanding, which most proponents of this technology believe will happen, then the end state is that the tools are smarter than people at everything, and we shouldn't trust humans to do anything.
Now, clearly, we're not there yet, but where the line is drawn today is extremely fuzzy, and mostly based on opinion. The wildly different narratives around this tech certainly don't help.
[1]: https://blog.samaltman.com/the-gentle-singularity
> Isn't that the end goal of these tools and companies producing them?
It seems to be the goal. But they seem very far away from achieving that goal.
One thing you probably should account for is that most of the proponents of these technologies are trying to sell you something. That doesn't mean there is no value in these tools, but the wild claims about their capabilities are just that.
Intern generated code does not substitute for tech lead thinking, testing, and clean up/rewrite.
This is such a good write-up and something I'm struggling with very hard. Does quality of code in the traditional sense even matter anymore if, e.g., CC can work with said code anyway? I haven't had imposter syndrome in a long time, but it's spiking hard now. Whenever I read or write code I feel like I'm an incompetent dev doing obsolete things.
Everything except the first provision is reasonable. IMO it's none of your damn business how I wrote the code, only that I understand it, and am responsible for it.
It's one of those provisions that seem reasonable but really have no justification. It's an attempt to allow something while extracting a cost. If I am responsible for my code, and am considered the author of the PR, then you as the recipient don't have a greater interest in knowing than my own personal preference not to disclose. There's never been any other requirement to disclose anything of this nature before. We don't require engineers to attest to the operating system or the licensing of the tools they use, so materially, outside your own prurient interests, how does it matter?
It's a signal-vs-noise filter, because today, AI can make more mistakes. Your operating system or IDE cannot lead you to make a similar kind or number of mistakes while writing code.
It is of course your responsibility, but the maintainer may also want to change their review approach when dealing with AI-generated code. And currently, as the AI usage policy also states, because bad actors send pull requests without reviewing them or taking responsibility themselves, the disclosure acts as a filter that separates out PRs like yours, which you have taken responsibility for.
Maintenance, for one. I imagine contributions that are 100% AI generated are more likely to have a higher maintenance burden and lower follow-up participation from the author in case fixes are needed.
I think I’m going to use it as a guide for our own internal AI guideline. We hire a lot of contractors and the amount of just awful code we get is really taking a toll and slowing site buildouts.
I agree this could be a template that services like GitHub should propose, the same way as they suggest contributing and code of conduct templates.
> Bad AI drivers will be banned and ridiculed in public. You've been warned. We love to help junior developers learn and grow, but if you're interested in that then don't use AI, and we'll help you. I'm sorry that bad AI drivers have ruined this for you.
Finally, an AI policy I can agree with :) Jokes aside, it might sound a bit too aggressive, but it's also true that some people really have no shame in overloading you with AI-generated shit. You need to protect your attention as much as you can; it's becoming the new currency.
I don't think ridicule is an effective threat for people with no shame to begin with.
Well, this is explicitly public ridicule. The penalty isn't just feeling shamed. It's reputational harm, immortalized via Google.
One of the theorized reasons for junk AI submissions is reputation boosting. So maybe this will help.
And I think it will help with people who just bought into the AI hype and are proceeding without much thought. Cluelessness can look a lot like shamelessness at first.
I think it makes sense, both for this, and for curl.
Presumably people want this for some kind of prestige, so they can put it on their CV (contributed to ghostty/submitted security issue to curl).
If we change that equation so they think "wait, if I do this, then when employers Google me they'll see a blog post saying I'm incompetent", it turns a calculation that was neutral (if the slop is rejected) or positive (if it gets accepted) into one that is negative or positive.
Seems like it's addressing the incentives to me.
Doubly so as these bad AI drivers are trading away even the possibility of having attention. It's very possible to render yourself senseless through a habit of deference. Even if you're coming up with ways to optimize AI responses, you are just trying to make a more superior superior to defer to.
"Pull requests created by AI must have been fully verified with human use." should always be a bare minimum requirement.
> "Pull requests [] must have been fully verified with human use."
I would expect this is entirely uncontroversial and the AI qualifier redundant.
If you have good tests, certain types of change can be merged without manual testing. One problem specific to AI is that it has a tendency to game/bypass/nerf/disable tests, as opposed to actually making the code do the correct thing.
I would hope that actually testing the changes is done regardless of whether or not AI is used
> verified with human use
Quality of that verification matters; people who might use AI tend to cut corners. This does not completely solve the problem of AI slop and solution quality, IMO. Ask Claude Code to go and implement a new feature in a complex code base and it will, and the code might even work, but the implementation might have subtle issues and might miss the broader vision of the repo.
> people who might use AI tend to cut corners
People do this all the time too, and it is one source of the phrase "tech debt".
It's also a biased statement. I use AI and I cut fewer corners now, because the AI can spam out that boring stuff for me.
AI is so smart these days that I typically just ask Claude to verify the code for me.
This sort of request may have made sense in the old days, but as the quality of generated code rapidly increases, the necessity of human intervention decreases.
If you're going to put something on someone else's desk, you're going to have to own it.
If you don't check it yourself, then you're going to own whatever your tooling misses, and also own the amount of others' time you waste through what the project has decided to categorize as negligence, which will make you look worse than if you simply made an honest mistake.
You still live under the impression that they are able to reason and verify it. Verification is done by doing it in The Real, not in your or some LLM's imagination (which is all their output is, at best).
I literally just scrolled past a thread discussing the psychology of shamelessness and undeserved self-confidence that creates all this drive-by AI pull request slop wasting everyone else’s time and see this comment…
If you care so little, why are you even prompting at all? Surely you can leave it to its own devices without troubling it with your wishes? It seems like the farther you go down this path, the more likely it is that it'll have something better to do.
I really like the phrase "bad AI drivers"...AI is a tool, and the stupid drive-by pull requests just mean you're being inconsiderate and unhelpful in your usage of the tool, similar to how "bad drivers" are a nightmare to encounter on a highway...so stop it or you'll end up on the dashcam subreddit of programming.
The experience of using a coding agent is that you're more of a "backseat driver" though. The AI acts as your driver and you tell it where to go, sometimes making corrections if it's going the wrong way.
The experience is what you make of it. Personally I'm quite enjoying using AI as a way to generate code I can disagree with and refactor into what I want.
A factor that people have not considered is that the copyright status of AI-generated text is not settled law, and precedent or new law may retroactively change the copyright status of a whole project.
Maybe a bit unlikely, but still an issue no one is really considering.
There has been a single ruling (I think) that AI-generated code is uncopyrightable. There has been at least one affirmative fair-use ruling. Both of these are from the lower courts. I'm still of the opinion that generative AI is not fair use because it's clearly substitutive.
I agree with you that generative AI is clearly not fair use.
However, at this point, the economic impact of trying to untangle this mess would be so large that the courts likely won't do anything about it. You and I don't get to infringe on copyright; Microsoft, Facebook and Google sure do though.
I think the usage is so widespread now that the law will adapt to custom. It is untenable now to say generated code is uncopyrightable, IMO. Maybe copyright as it is defined right now is not enough, but then the legislation will change. There is enough pressure on legislators from the business community to do so.
Some take that into consideration. I did when I was, until recently, in a CTO role, and I've come across companies that take compliance seriously and have decided against such code synthesis due to the unclear legal status.
This only matters if you get sued for copyright violation, though.
No? Licenses still apply even if you _don't_ get sued?
At what time in the future does this not become an issue?
If you're a big enough target, that is inevitable.
You may become a big enough target only when it's too late to undo it.
I never thought of this; you are right. What happens if, let's say, AI-generated text/code is "illegal"? Especially, what happens to all the companies that have been using it for their products? Do they need to roll back? It would be a shit show, but super interesting to see it unfold...
See x thread for rationale: https://x.com/mitchellh/status/2014433315261124760?s=46&t=FU...
“ Ultimately, I want to see full session transcripts, but we don't have enough tool support for that broadly.”
I have a side project, git-prompt-story, for attaching Claude Code sessions to GitHub via git notes. Though it is not that simple to do automatically (e.g. I need to redact credentials).
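For context, the underlying mechanism is just plain git notes, so a minimal hand-rolled version looks roughly like this (the notes ref and file name are arbitrary, and a real transcript would still need the redaction pass mentioned above):

    # attach a session transcript to the current commit under a custom notes ref
    git notes --ref=prompts add -F session-transcript.txt HEAD
    # push the notes ref alongside the branch so the remote keeps it
    git push origin refs/notes/prompts
    # read it back later
    git notes --ref=prompts show HEAD

Automating this per commit and scrubbing credentials is where the actual work is.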
Not sure how I feel about transcripts. Ultimately I do my best to make any contributions I make high quality, and that means taking time to polish things. Exposing the tangled mess of my thought process leading up to that either means I have to "polish" that too (whatever that ends up looking like), or put myself in a vulnerable position of showing my tangled process to get to the end result.
I've thought about saving my prompts along with project development and even done it by hand a few times, but eventually I realized I don't really get much value from doing so. Are there good reasons to do it?
For me it's increasingly the work. I spend more time in Claude Code going back and forth with the agent than I do in my text editor hacking on the code by hand. Those transcripts ARE the work I've been doing. I want to save them in the same way that I archive my notes and issues and other ephemera around my projects.
My latest attempt at this is https://github.com/simonw/claude-code-transcripts which produces output like this: https://gisthost.github.io/?c75bf4d827ea4ee3c325625d24c6cd86...
It's not for you. It's so others can see how you arrived at the code that was generated. They can learn better prompting for themselves from it, and also see how you think. They can see which cases got considered, or not. All sorts of good stuff that would be helpful for reviewing giant PRs.
Using them for evals at a future date.
I save all of mine, including their environment, and plan to use them for iterating on my various system prompts and tool instructions.
If the AI generated most of the code based on these prompts, it's definitely valuable to review the prompts before even looking at the code. Especially in the case where contributions come from a wide range of devs at different experience levels.
At a minimum it will help you to be skeptical at specific parts of the diff so you can look at those more closely in your review. But it can inform test scenarios etc.
>I want to see full session transcripts, but we don't have enough tool support for that broadly
I think AI could help with that.
simonw wrote a tool that does this for Claude Code:
https://simonw.substack.com/p/a-new-way-to-extract-detailed-...
You should be able to attach the plan file that you and Claude develop in Plan mode before even starting to code. This should be the source of truth.
On our team, we have discussed attaching claude transcripts to jira tickets, not github PRs (though the PRs are attached to tickets)
We're just not going to see any code written entirely without AI except in specialist niches, just as we don't see handwritten assembly and binaries. So the disclosure part is going to become boilerplate.
In the old era, the combination 'it works' + 'it uses a sophisticated language' + 'it integrates with a complex codebase' implied that this was an intentional effort by someone who knew what they were doing, and therefore probably safe to commit.
We can no longer make that social assumption. So then, what can we rely on to signal 'this was thoroughly supervised and reviewed and understood and tested?' That's going to be hard and subjective.
Personal reputations, track records, pedigrees, and brands are going to become more important in the industry; and the meritocratic "code talks, no matter where you came from" ethos is at risk.
> No AI-generated media is allowed (art, images, videos, audio, etc.). Text and code are the only acceptable AI-generated content, per the other rules in this policy.
I find this distinction between media and text/code so interesting. To me it sounds like they think "text and code" are free from the controversy surrounding AI-generated media.
But judging from how AI companies grabbed all the art, images, videos, and audio they could get their hands on to train their LLMs it's naive to think that they didn't do the same with text and code.
> To me it sounds like "text and code" are free from the controversy surrounding AI-generated media.
It really isn't; don't you recall the "protests" against Microsoft starting to use repositories hosted on GitHub for training their own coding models? Lots of articles and sentiment everywhere at the time.
Seems to have died down though, probably because most developers seemingly use LLMs in some capacity at this point. Some just use them as a search-engine replacement, others to compose snippets they copy-paste, and others wholesale don't type code anymore; they just write instructions and then review the output.
I'm guessing Ghostty feels like if they'd ban generated text/code, they'd block almost all potential contributors. Not sure I agree with that personally, but I'm guessing that's their perspective.
Right, that's what I'm thinking too (I'll update my statement a bit to make that more clear), but I constantly hear this perspective that it's all good for text and code but when it's media, then it's suddenly problematic. It's equally problematic for text and code.
I bet they aren't honoring the terms of the MIT license I use for my repos. It's pretty lenient and I bet they're still not compliant.
It's not that code is distinct or "less than" art. It's an authority and boundaries question.
I've written a fair amount of open source code. On anything like a per-capita basis, I'm way above median in terms of what I've contributed (without consent) to the training of these tools. I'm also specifically "in the crosshairs" in terms of work loss from automation of software development.
I don't find it hard to convince myself that I have moral authority to think about the usage of gen AI for writing code.
The same is not true for digital art.
There, the contribution-without-consent, aka theft (I could frame it differently when I was the victim, but here I can't), is entirely from people other than me. The current and future damages won't be borne by me.
Alright, if I understand correctly, what you're saying is they make this distinction because they operate in the "text and code" space but not in the media space.
I've written _a lot_ of open source MIT licensed code, and I'm on the fence about that being part of the training data. I've published it as much for other people to use for learning purposes as I did for fun.
I also build and sell closed-source commercial JavaScript packages, and more than likely those have ended up in the training data as well. Obviously without consent. So this is why I feel strongly about this separation being drawn between code and media: from my perspective it all has the same problem.
I’m starting to think AI will kill open source... and maybe even platforms like GitHub/GitLab as we know them.
What I’m seeing: a flood of new repos appearing on GitHub with huge codebases and "extensive" documentation, often produced in two or three commits. The problem is that nobody uses them, nobody reads the docs, and many of these projects don’t provide real value. But the infrastructure cost is real: storing it all, indexing it, scanning it, backing it up, mirroring it....
Licensing is another issue. Licenses protect against copying, but AI totally changes the game: someone can download a repo, ask Claude to analyze and understand it, and then generate a similar solution with no verbatim copying. That's likely legal... So the GPL becomes irrelevant...
If that becomes normal, I can easily imagine companies pulling back from open source. Why publish your best work if anyone can cheaply reimplement it? Code will move back to closed source and become the "secret sauce" again. A black box is harder to vibe-code than an open source repo...
> No AI-generated media is allowed (art, images, videos, audio, etc.). Text and code are the only acceptable AI-generated content, per the other rules in this policy
What's the reason for this?
Media is the most likely thing I'd consider using AI for as part of a contribution to an open source project.
My code would be hand crafted by me. Any AI use would be similar to Google use: a way to search for examples and explanations if I'm unclear on something. Said examples and explanations would then be read, and after I understand what is going on I'd write my code.
Any documentation I contributed would also be hand written. However, if I wanted to include a diagram in that documentation I might give AI a try. It can't be worse than my zero talent attempts to make something in OmniGraffle or worse a photograph of my attempt to draw a nice diagram on paper.
I'd have expected this to be the least concerning use of AI.
AI generated media is a copyright gray zone.
Very very few companies today have zero lines of AI generated code in their codebases. You can't copyright or patent specific code structures or ways of solving common problems.
At the Zulip open-source project, we've had a significant onslaught of AI slop in the past few months. It gets as absurd as PR descriptions with AI-generated "screenshots" of the app to "demonstrate" the changes. We've had to start warning contributors that we won't be able to review their work if they continue misusing AI, and occasionally banning repeat offenders. It feels draining -- we want to spend our time mentoring people who'll actually learn from feedback, not interacting with contributors who are just copy-pasting LLM responses without thought.
Our evolving AI policy is in the same spirit as ghostty's, with more detail to address specific failure modes we've experienced: https://zulip.readthedocs.io/en/latest/contributing/contribu...
Honestly, yours looks nothing like Mitchell's to me, and that's a good thing.
It's actually reasonable, and the guidance you provide on how best to use AI when contributing to Zulip is :chef's kiss:
Truly, I'm going to copy yours as a thank you!
It's really strange: I maintain a lot of OSS and I just don't see it. I've had a bit of slop, and the # of PRs I receive has 10x'ed, but the quality is generally quite good. I wonder if maybe it's because I make dev-tool CLIs that are just easier for AI to work with?
A well-crafted policy that, I think, will be adopted by many OSS projects.
You need those kinds of sharp rules to contend with unhinged (or drunken) AI drivers, and that's unfortunate. But at the same time, letting people DoS maintainers' time at essentially no cost is not an option either.
I wholeheartedly agree. It's the people that are the problem, not the technology. In the hands of people who understand its utility and limitations, AI becomes an assistant you can't imagine life without. In the hands of people who aren't so intellectually curious, it helps them run into a brick wall much faster.
My former boss often shared the words of his father: "A fool with a tool...is still a fool!"
Even more true with AI.
Until now code was something costly to make and could only be created by our monkey brains.
But now we have some kind of electronic brains that can also generate code, not at the level of the best human brains out there but good enough for most projects. And they are quicker and cheaper than humans, for sure.
So maybe in the end this will reduce the need for human contributions to opensource projects.
I just know that, as a solo developer, AI coding agents enable me to tackle projects I wouldn't even have thought about starting before.
It is important to write the code yourself so you understand how it functions. I tried vibe coding a little bit. I totally felt like I was reading someone else's code base.
Sanitization practices of AI are bad too.
Let me be clear: there's nothing wrong with AI in your workflow, just be an active participant in your code. Code is not meant to be one-and-done.
You will go through iteration after iteration, security fix after fix. This is how development is.
sounds reasonable to me. i've been wondering about encoding detailed AI disclosure in an SBOM.
on a related note: i wish we could agree on rebranding the current LLM-driven never-gonna-AGI generation of "AI" to something else… now i'm thinking of when i read the in-game lore definition for VI (Virtual Intelligence) back when i played mass effect 1 ;)
I recently had to do a similar policy for my TUI feed reader, after getting some AI slop spammy PRs: https://github.com/CrociDB/bulletty?tab=contributing-ov-file...
The fact that some people will straight up lie after submitting you a PR with lots of _that type_ of comment in the middle of the code is baffling!
I think a social norm of disclosing AI use at all times would be great. People and companies should also be held 100% accountable for anything created using AI.
I wish HN had a title tag for AI-generated posts, like it used to have for PDF’s and still does for year-of-publication.
How about autocomplete with LLMs? Should it be disclosed too? (scratching my balding head).
nah, this is just rhetorical polemic..
Ultimately what's happening here is AI is undermining trust in remote contributions, and in new code. If you don't know somebody personally, and know how they work, the trust barrier is getting higher. I personally am already ultra vigilant for any github repo that is not already well established, and am even concerned about existing projects' code quality into the future. Not against AI per se (which I use), but it's just going to get harder to fight the slop.
A good PR using AI should be impossible to distinguish from a non-AI one.
The problem is that most aren’t good, and bad ones can take a lot of effort to distinguish, if they look plausible on the surface. So the potentially good ones aren’t worth all the bad ones.
I agree that most of them are bad; I disagree that they take lots of effort to distinguish, and I am a maintainer unfortunately receiving more and more of them written with AI.
the endless issue with coding policy statements like this is that the people who need to read them the most are the ones who couldn't care less and don't read anything.
These are unbelievably reasonable terms for AI.
I might copy it for my company.
I think that a warning of public ridicule may be fine. However, actually doing it is quite lowbrow, IMO. I'm sad to see more and more otherwise admirable projects stoop to that (assuming they actually do it).
An unenforced threat is toothless. Publicly stating that we do not appreciate XYZ PR that was AI-generated, low-effort, and made in bad faith is perfectly acceptable.
I agree with the second statement but public ridicule can be understood to have a stronger effect, and be in bad faith itself.
> Issues and discussions can use AI assistance but must have a full human-in-the-loop. This means that any content generated with AI must have been reviewed and edited by a human before submission.
I can see this being a problem. I read a thread here a few weeks ago where someone was called out on submitting an AI slop article they wrote with all the usual tells. They finally admitted it but said something to the effect they reviewed it and stood behind every line.
The problem with AI writing is that at least some people appear incapable of critically reviewing it. Writing something yourself eliminates this problem because it forces you to pick your words (there could be other problems, of course).
So the AI-blind will still submit slop under the policy but believe themselves to have reviewed it and “stand behind” it.
Honestly, I don't care how people come up with the code they create, but I hold them responsible for what they try to merge.
I work on a team of 5 great professionals, and there hasn't been a single instance since Copilot launched in 2022 of anybody, in any single modification, not taking full responsibility for what was committed.
I know we all use it, to different extents and in different ways, but the quality of what's produced hasn't dipped a single bit; I'd even argue it has improved, because LLMs can find answers more easily in complex codebases. We started putting `_vendor` directories with our main external dependencies as git subtrees, and it's super useful to find information about those directly in their source code and tests (a rough sketch of that setup is at the end of this comment).
It's really that simple: if your teammates are producing slop, that's a human and professional problem and these people should be fired. If you use the tool correctly, it can help you a lot in finding information and connecting dots.
Any person with a brain can clearly see the huge benefit of these tools, but also the great danger of not reviewing their output line by line and forfeiting the constant work of resolving design tensions.
Of course, open source is a different beast. The people committing may not be professionals and have no real stakes, so they have little to lose by producing slop, whereas maintainers are already stretched in their time and attention.
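As an aside, here is the rough sketch of the `_vendor` setup mentioned above (names, URLs, and versions are placeholders):

    # vendor a dependency as a squashed subtree under _vendor/
    git subtree add --prefix=_vendor/somelib https://github.com/example/somelib.git v1.2.3 --squash
    # later, pull a newer upstream version into the vendored copy
    git subtree pull --prefix=_vendor/somelib https://github.com/example/somelib.git v1.3.0 --squash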
> It's really as simple. If you or your teammates are producing slop, that's a human and professional problem and these people should be fired.
Agreed. Slop isn't "the tool is so easy to use I can't review the code I'm producing"; slop is the symptom of "I don't care how it's done, as long as it looks correct", and that's been a problem since before LLMs too. The difference is how quickly you reach the "slop" state now, not whether you have to gate your codebase and reject shit code.
As always, most problems in "software programming" aren't about software or programming but about everything around it, including communication and workflows. If your workflow allows people to not be responsible for what they produce, and if it allows shitty code to get into production, then that's on you and your team, not on the tools that the individuals use.
I mean this policy only applies to outside contributors and not the maintainers.
> Ghostty is written with plenty of AI assistance, and many maintainers embrace AI tools as a productive tool in their workflow. As a project, we welcome AI as a tool!
> Our reason for the strict AI policy is not due to an anti-AI stance, but instead due to the number of highly unqualified people using AI. It's the people, not the tools, that are the problem.
Basically don't write slop and if you want to contribute as an outsider, ensure your contribution actually is valid and works.
Another project simply paused external contributions entirely: https://news.ycombinator.com/item?id=46642012
Another idea is to simply promote the donation of AI credits instead of output tokens. It would be better to donate credits, not outputs, because people already working on the project would be better at prompting and steering AI outputs.
>people already working on the project would be better at prompting and steering AI outputs.
In an ideal world sure, but I've seen the entire gamut from amateurs making surprising work to experts whose prompt history looks like a comedy of errors and gotchas. There's some "skill" I can't quite put my finger on when it comes to the way you must speak to an LLM vs another dev. There's more monkey-paw involved in the LLM process, in the sense that you get what you want, but do you want what you'll get?
At the moment I have 20 subagents fixing stuff throughout my own codebase.
But I've never had the gall to let my AI agent do stuff on other people's projects without my direct oversight.
This is the most well-informed and reasonable AI policy I've seen so far. Neither kneejerk-hostile nor laissez faire; just a mature understanding of the limitations of LLMs, and an insistence on transparency and accountability when using them.
with limited training data that llm generated code must be atrocious
shaming doesn't work.
Very quaint of them to exempt themselves because they've "proven themselves" already.
Surely they are incapable of producing slop because they are just so much smarter than everyone else so the rules shouldn't apply to them, surely.
TLDR don't be an asshole and produce good stuff. But I have the feeling that this is not the right direction for the future. Distrust the process: only trust the results.
Moreover this policy is strictly unenforceable because good AI use is indistinguishable from good manual coding. And sometimes even the reverse. I don't believe in coding policies where maintainers need to spot if AI is used or not. I believe in experienced maintainers that are able to tell if a change looks sensible or not.
As someone who has recently picked up some 'legacy' code, AI has been really good at summing up what is going on. In many cases it finds things I had no idea were wrong (because I do not know the code very well yet). This is so-called 'battle-hardened code'. I review it and say 'yeah, it is wildly broken, and I see how the original developer ended up here'. Sometimes the previous dev was nice enough to leave a comment; with other devs, 'the code is the comments'. I have also had AI go wildly off the rails and do very dumb things. It is an interesting tool for sure, one you have to keep an eye on or it will confidently build a foot gun for you. It is also nice for someone like me who has some sort of weird social anxiety thing about bugging my fellow devs, in that I can create options tables and pick good ideas out of them.
I'm not sure I agree it's completely unenforceable: a sloppy, overly verbose PR, maybe without an attached issue, is pretty easy to pick out.
There are some sensible, easily-judged-by-a-human rules in here. I like the spirit of it, and it's well written (I assume by Mitchell, not Claude, given the brevity).
This doesn't work in the age of AI where producing crappy results is much cheaper than verifying them. While this is the case, metadata will be important to understand if you should even bother verifying the results.
The time needed to produce an AI patch with the prompt "now make it as small as possible, clean, and human coded" is as long as reviewing the patch itself.
That's really nice - and fast ui!
It gets even better when you click on "raw", IMO... which is what you also get when clicking on "raw" on Github.
Not sure why you are getting downvoted, given that the original site is such a jarringly user-hostile mess.
Without using a random 3rd party, and without the "jarring user-hostile mess":
https://raw.githubusercontent.com/ghostty-org/ghostty/refs/h...
Whatever your opinion on the GitHub UI may be, at least the text formatting of the markdown is working, which can't be said for that alternative site.
Banned I understand, but ridiculed? I would say that these bad drive-by spammers are analogous to phishing emails. Do you engage with those? Are they worth any energy or effort from you? I think ghostty should just ghost them :)
EDIT: I'm getting downvoted with no feedback, which is fine I guess, so I am just going to share some more colour on my opinion in case I am being misunderstood
What I meant by analogous to phishing is that the intent behind the work is likely personal reward rather than a genuine desire to contribute. I was thinking they want their name on the contributors list, they want the credit, they want something, and they don't want to put effort into it.
Do they deserve to be ridiculed for doing that? Maybe. However, I like to think humans deserve kindness sometimes. It's normal to want something, and I agree that it is not okay to be selfish and lazy about it (ignoring contribution rules and whatnot), so at minimum I think respect applies.
Some people are ignorant, naive, and still maturing and growing. Bullying them may not help (though it could), and mockery is a form of aggression.
I think some genuine false positives will fall into that category and pay the price for those who are truly ill-intentioned.
Lastly, to ridicule is to care. To hate or attack requires caring about the thing. It requires effort, energy, and time from the maintainers. I think this just adds more waste.
Maybe those wordings are there just to 'scare' people away and maintainers won't bother engaging, though I find it just compounds the amount of garbage at this point, and nobody benefits from it.
Anyways, would appreciate some feedback from those of you that seem to think otherwise.
Thanks!
PS: What I meant with ghostty should "ghost" them was this: https://en.wikipedia.org/wiki/Shadow_banning
It always amazes me that images/videos/audio generated by AI are treated differently from code.
Are images somehow better? Is someone who draws better than someone who writes code? Why protect one and not the other? Or why protect any form at all?
> You must state the tool you used (e.g. Claude Code, Cursor, Amp)
Interesting requirement! Feels a bit like asking someone what IDE they used.
There shouldn't be that meaningful of a difference between the different tools/providers unless you'd consistently see a few underperform and would choose to ban those or something.
The other rules feel like they might discourage AI use because of the extra boilerplate required (though I assume the people using AI might just make the AI fill some of it out), but I can understand why a project might want those sorts of disclosures and control. That said, the rules themselves feel quite reasonable!