Comment by manbitesdog

2 months ago

I cringe every time I see Claude trying to co-author a commit. The git history is expected to track accountability and ownership, not your Bill of Tools. Should I also co-author my PRs with my linter, intellisense and IDE?

If those tools are writing the code then in general I do expect that to be included in the PR! Through my whole career I've seen PRs where people noted that code was generated (people have been generating code since long before LLMs). It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself (which in my experience is the case where it's obvious boilerplate or the generated section is small).

Needing to flag nontrivial code as generated was standard practice for my whole career.

  • > It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

    If this is not the case you should not be sending it to public repos for review at all. It is rude and insulting to expect the people maintaining these repos to review code that nobody bothered to read.

    • Sometimes code generation is a useful tool, and maybe people have read and reviewed the generator.

      The difference here is that the generator is a non-deterministic LLM and you can't reason about its output the same way.

    • "Here's what AI came up with and it mostly worked the one time I tested it. Might need improving".

      No. I don't want to test and pick through your shitty LLM generated code. If I wanted the entire code base to be junk, it'd say so in the readme.

  • Usually, pre-LLM generated code is flagged because people aren't expected to modify it by hand. If you find a bug and track it to the generated code, you are expected to fix the sources and re-generate.

    This is not at all the case with LLM-generated code - mostly because you can't regenerate it even if you wanted to, as it's not deterministic.

    That said, I do agree that LLM code is different enough from human code (even just in regards to potential copyright worries) that it should be mentioned that LLMs were used to create it.

  • > If those tools are writing the code then in general I do expect that to be included in the PR!

    How about compiler?

    • Compilers don't usually write the code that ends up in a PR. But compilers do (and should) generally leave behind some metadata in the end result saying what tools were used, see for example the .comment section in ELF binaries.

    • Are you checking in compiled artifacts? Then yeah, we should have a chain of where that binary blob came from.

    • Do you check in binaries into your git history? If so, you should mark a commit as generated, and the commit message (plus repository state) should be enough to recreate it 1:1.

      Similarly, if I use e.g. jextract or uniffi to generate Java interfaces from C code and check that in, I'll create tooling to automatically run those, and the commit will be attributed to that tooling.

    • Compiler versions are usually included in the package manifest. Generally you embed commit info, compiler version, compilation date, and platform in the binaries that compilers produce.

  • Absolutely. Let's say I have a problem with gRPC and traced it to code generated using the gRPC compiler. I can reproduce it, highlight it and I'm pretty sure the gRPC team would address the issue.

    Replace gRPC compiler with LLM. Can you reproduce? (probably not 100%). Can anybody fix it short of throwing more English phrases like "DO NOT", "NEVER", "Under No Circumstances"?

    Probably not.

  • >It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

    I thought the argument was that AI-users were reviewing and understanding all of the code?

  • > people have been generating code since long before LLMs

    How? LSTM?

    • There are many techniques. You're most likely to come across things like declarative DSLs and macros, then there are things like JAXB and similar tooling that generates code from data schemas, and some people script around data sources to glue boilerplate and so on.

      Arguably snippet collections belong to this genre.

    • See, for example, this blog post from 2014: https://go.dev/blog/generate

      The following comment in the blog post

          //go:generate stringer -type=Pill
      

      generates a .._string.go file which contains a '.String()' method.

      I would find it very reasonable to commit that with 'Co-Authored-By: stringer v0.1.0' or such.

      Or 'sed s/a/b/g' and 'Co-Authored-By: sed'

  • You assemble all your machine code using a magnetized needle?

    • I am not against the general use of AI code. Quite simply, my view is that all relevant context for a review should be disclosed in the PR.

      AI and humans are not the same as authors of PRs. As an obvious example: one of the important functions of the PR process is to teach the writer about how to code in this project but LLMs fundamentally don't learn the same way as humans so there's a meaningful difference in context between humans and AIs.

      If a human takes the care to really understand and assume authorship of the PR then it's not really an issue (and if they do, they could easily modify the Claude messages to remove "generated by Claude" notes manually) but instead it seems that Claude is just hiding relevant context from the reviewer. PRs without relevant context are always frustrating.

    • You don't generally commit compiled code to your VCS. If you do need to commit a binary for whatever reason, yeah it makes sense to explain how the binary was generated.

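
To make the `//go:generate stringer -type=Pill` example from the replies above concrete, here is a hand-written sketch of the kind of method such a generator emits. The `Pill` constants are assumed for illustration, and the `String` body is a simplified stand-in, not the actual output of `stringer`.

```go
package main

import "fmt"

// Pill and its constants are assumed names, used only for illustration.
type Pill int

const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
	Paracetamol
)

// pillNames is the table a generator would derive from the constant block.
var pillNames = [...]string{"Placebo", "Aspirin", "Ibuprofen", "Paracetamol"}

// String returns the constant's name, or a numeric fallback for unknown values.
func (p Pill) String() string {
	if p < 0 || int(p) >= len(pillNames) {
		return fmt.Sprintf("Pill(%d)", int(p))
	}
	return pillNames[p]
}

func main() {
	fmt.Println(Aspirin)  // Aspirin
	fmt.Println(Pill(42)) // Pill(42)
}
```

Because the table is derived mechanically from the constant block, a bug in it is fixed by changing the source and re-running the generator, which is the property several replies contrast with LLM output.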

A whole lot of people find LLM code to be strictly objectionable, for a variety of reasons. We can debate the validity of those reasons, but I think that even if those reasons were all invalid, it would still be unethical to deceive people by a deliberate lie of omission. I don't turn it off, and I don't think other people should either.

  • For the purpose of disclosure, it should say “Warning: AI generated code” in the commit message, not an advertisement for a specific product. You would never accept any of your other tools injecting themselves into a commit message like that.

  • My tools just don't add such comments. I don't know why I would care to add that information. I want my commits to be what and why, not what editor someone used. It seems like cruft to me. Why would I add noise to my data to cater to someone's neuroticism?

    At least at my workplace though, it's just assumed now that you are using the tools.

    • What editor you are using has no effect on things like copyright, while software that synthesises code might.

      In commercial settings you are often required to label your produce and inform about things like 'Made in China' or possible adverse effects of consumption.

    • well if I know a specific LLM has certain tendencies (eg. some model is likely to introduce off-by-one errors), I would know what to look for in code-review

      I mean, of course I would read most of the code during review, but as a human, I often skip things by mistake

  • If a whole lot of people thought that running code through a linter or formatter was objectionable, I'd probably just dismiss their beliefs as invalid rather than adding the linter or formatter as a co-author to every commit.

    • Linters and formatters are different tools than LLMs. There is a general understanding that linters and formatters don’t alter the behavior of your program. And even still, most projects require a particular linter and formatter to pass before a PR is accepted, and will flag a PR as part of the CI pipeline if that linter or formatter fails on the code you wrote. That linter and formatter are very likely to be mentioned somewhere in the configuration or at least in the README of the project.

    • Like frying a veggie burger in bacon grease. Just because somebody's beliefs are dumb doesn't mean we should be deliberately tricking them. If they want to opt out of your code, let them.

  • Can you see a world where everyone has an AI Persona based on their prior work that acts like a RAG to inform how things should be coded? Meaning this is patent qualified code because, despite being AI configured, it is based on my history of coding?

  • Likewise. I don’t mind that people use LLMs to generate text and code. But I want any LLM generated stuff to be clearly marked as such. It seems dishonest and cheap to get Claude to write something and then pretend you did all the work yourself.

    • You can disclose that you used an LLM in the process of writing code in other ways, though. You can just tell people, you can mention it in the PR, you can mention it in a ticket, etc.

    • So if I use Claude to write the first pass at the code, make a few changes myself, ask it to make an additional change, change another thing myself, then commit it — what exactly do you expect to see then?

If you accept the code generated by them nearly verbatim, absolutely.

I don't understand why people consider Claude-generated code to be their own. You authored the prompts, not the code. Somehow this was never a problem with pre-LLM codegen tools, like macro expanders, IPC glue, or type bundle generators. I don't recall anybody desperately removing the "auto-generated do not edit" comments those tools would nearly always slap at the top of each file or taking offense when someone called that code auto-generated. Back in the day we even used to publish the "real" human-written source for those, along with build scripts!

  • It's weird, because they should not consider it as their own, but they should take accountability for it.

    Ideally, if I contribute to any codebase, what needs to be judged is the resulting code. Is it up to the project's standards? Does the maintainer have design objections?

    What tool you use shouldn't matter, be it your IDE or your LLM.

    But that also means you should be accountable for it: you shouldn't hide behind "But Claude did this poorly, not me!". I don't care (in a friendly way), just fix the code if you want to contribute.

    The big caveat to this is not wanting AI-Generated code for ideological reasons, and well, if you want that you can make your contributors swear they wrote it by themselves in the PR text or whatever.

    I'm not really sure how to feel about this, but I stand by my "the code is what matters" line.

  • Some differences with the human source for those kinds of tools: (1) the resultant generated code was deterministic (2) it was usually possible to get access to the exact version of the tool that generated it

    Since AI tools are constantly obsoleted, generate different output each run, and it is often impossible to run them locally, the input prompts are somewhat useless for everyone but the initial user.
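
The "auto-generated do not edit" header mentioned in this sub-thread is a documented convention in Go: `go help generate` describes a marker line matching `^// Code generated .* DO NOT EDIT\.$` that must appear before the first non-comment text in the file. A minimal sketch of a checker follows, under the assumption that only `//` line comments precede the marker; `isGenerated` and the sample strings are hypothetical names for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the marker line described in `go help generate`.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the marker before its first
// line of code. Simplification: only // line comments are understood,
// so a leading /* ... */ block ends the search early.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimRight(sc.Text(), "\r")
		if generatedRE.MatchString(line) {
			return true
		}
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !strings.HasPrefix(trimmed, "//") {
			return false // reached non-comment text without seeing the marker
		}
	}
	return false
}

func main() {
	generated := "// Code generated by stringer -type=Pill; DO NOT EDIT.\n\npackage painkiller\n"
	handwritten := "// Package painkiller models pills.\npackage painkiller\n"
	fmt.Println(isGenerated(generated))   // true
	fmt.Println(isGenerated(handwritten)) // false
}
```

A reviewer or CI step could use a check like this to skip generated files during review, which only works because the marker is machine-readable and the generator can be re-run.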

Well is it actually being used as a tool where the author has full knowledge and mental grasp of what is being checked in, or has the person invoked the AI and ceded thought and judgment to the AI? I.e., I think in many cases the AI really is the author, or at least co-author. I want to know that for attribution and understanding what went into the commit. (I agree with you if it's just a tool.)

  • I have worked with quite a few people committing code they didn't fully understand.

    I don't mean this as a drive-by bazinga either; the practice of copying code or thinking you understand it when you don't is nothing new.

    • Yes and if they copy and paste code they don’t understand then they should disclose that in the commit message too!

Yes, it sets the reviewer's expectations around how much effort was spent reviewing the code before it was sent.

I regularly have tool-generated commits. I send them out with a reference to the tool, what the process is, how much it's been reviewed and what the expectation is of the reviewer.

Otherwise, they all assume "human authored" and "human sponsored". Reviewers will then send comments (instead of proposing the fix themselves). When you're wrangling several hundred changes, that becomes unworkable.

> Should I also co-author my PRs with my linter, intellisense and IDE?

Absolutely. That would be hilarious.

Tools do author commits in my code bases, for example during a release pipeline. If I had commits being made by Claude I would expect that to be recorded too. It isn't for recording a bill of tools, just to help understand a project's evolution.
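
For readers who want to see how such a record can be read back out of history: co-authorship is conventionally stored as a `Co-authored-by: Name <email>` trailer in the last paragraph of the commit message. The sketch below extracts those trailers from a message string. It is a simplification of what `git interpret-trailers` does, and `coAuthors` plus the sample message are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// coAuthors returns the values of Co-authored-by trailers found in the
// last paragraph of a commit message. Simplification: real trailer
// parsing (git interpret-trailers) has more rules than this.
func coAuthors(msg string) []string {
	msg = strings.TrimSpace(strings.ReplaceAll(msg, "\r\n", "\n"))
	paragraphs := strings.Split(msg, "\n\n")
	last := paragraphs[len(paragraphs)-1]

	var authors []string
	for _, line := range strings.Split(last, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if ok && strings.EqualFold(strings.TrimSpace(key), "Co-authored-by") {
			authors = append(authors, strings.TrimSpace(value))
		}
	}
	return authors
}

func main() {
	// Hypothetical commit message; the name and address are placeholders.
	msg := "Fix off-by-one in pager\n\n" +
		"The last page was dropped when the total was a multiple of the page size.\n\n" +
		"Co-authored-by: Example Bot <bot@example.com>\n"
	fmt.Println(coAuthors(msg)) // [Example Bot <bot@example.com>]
}
```

Keeping the information in a trailer rather than in the subject line leaves the "what and why" of the message untouched while still making tool involvement queryable later.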

I suspect vibe coders might actually want you to consider turning to Claude for accountability and ownership rather than the human orchestrator.

If your linter is able to action requests, then it probably makes sense to add too.

Eh, there are some very good reasons[0] that you would do better to track your usage of LLM derived code (primarily for legal reasons)

[0]: https://www.jvt.me/posts/2026/02/25/llm-attribute/

  • legally speaking.. if you're not sure of the risk- you don't document it.

    • >legally speaking.. if you're not sure of the risk- you don't document it.

      Ah, so you kinda maybe sorta absolve yourself of culpability (but not really — "I didn't know this was copyrighted material" didn't grant you copyright), and simultaneously make fixing the potentially compromised codebase (someone else's job, hopefully) 100x harder because the history of which bits might've been copied was never kept.

      Solid advice! As ethical as it is practical.

      By the same measure, junkyards should avoid keeping receipts on the off chance that the catalytic converters some randos bring in after midnight are stolen property.

      Better not document it.

      One little trick the legal folks don't want you to know!

Yea in my Claude workflow, I still make all the commits myself.

This is also useful for keeping your prompts commit-sized, which in my experience gives much better results than just letting it spin or attempting to one-shot large features.

No, because those things don't change the logical underpinnings of the code itself. LLM-written code does act in ways different enough from a human contributor that it's worth flagging for the reviewer.

> The git history is expected to track accountability and ownership, not your Bill of Tools.

The point isn't to hijack accountability. It's free publicity, like how Apple adds "Sent from my iPhone."

> Should I also co-author my PRs with my linter, intellisense and IDE?

Kinda, yeah. If I automatically apply lint suggestions, I would title my commit "apply lint suggestions".

  • Huh? Unless the sole purpose of the commit was to lint code, it would be unnecessary fluff to append the names of the linting tools that ran automatically in a pre-commit hook to every commit.

well maybe?

co-authoring doesn't hide your authorship

if I see someone committing blatantly wrong code, I would wonder what tool they actually used

You have copyright to a commit authored by you. You (almost certainly) don't have copyright (nobody has) to a commit authored by Claude.

  • Where is there any legal precedent for that?

    In some jurisdictions (e.g. the UK) the law is already clear that you own the copyright. In the US it is almost certain that you will be the author. The cases reported as saying otherwise have been misreported - the courts found the AI could not own the copyright.

    • >Where is there any legal precedent for that?

      Thaler v. Perlmutter: The D.C. Circuit Court affirmed in March 2025 that the Copyright Act requires works to be authored "in the first instance by a human being," a ruling the Supreme Court left intact by declining to hear the case in 2026.

      And in the US constitution,

      https://constitution.congress.gov/browse/article-1/section-8...

      Authors and inventors, courts have ruled, means people. Only people. A monkey taking a selfie with your camera doesn't mean you own a copyright. An AI generating code with your computer is likewise, devoid of any copyright protection.

    • It's beyond obvious that an LLM cannot have copyright, any more than a cat or a rock can. The question is whether anyone has or if whatever content generated by an LLM simply does not constitute a work and is thus outside the entire copyright law. As far as I can see, it depends on the extent of the user's creative effort in controlling the LLM's output.

  • Anthropic could at least make a compelling case for the copyright.

    It becomes legally challenging with regards to ownership if I ever use work equipment for a personal project. If it later takes off they could very well try to claim ownership in its entirety simply because I ran a test once (yes, there's a whole Silicon Valley season about it).

    I don't know if they'd win, but Anthropic absolutely would be able to claim the creation of that code was done on their hardware. Obviously we aren't employees of theirs, though we are customers that very likely never read what we agreed to in a signup flow.

    • Using work equipment for a personal project only matters because you signed a contract giving all of your IP to your employer for anything you did with (or sometimes without) your employer's equipment.

      Anthropic's user agreement does not have a similar clause.
