Openrsync: An implementation of rsync, by the OpenBSD team

1 day ago (github.com)

I've been using openrsync here and there since it was announced and it's definitely improved over time. I'm looking forward to when I can use it exclusively.

The one place in my usage where it doesn't match Samba rsync is with the following:

openrsync --rsync-path=openrsync -av -e ssh /etc/services example.com:/tmp/services

I would expect openrsync to create a remote file /tmp/services, but instead it creates /tmp/services/services.

Normal directory mirroring as in -av -e ssh /path/to/src/ example.com:/path/to/dst/ works as it does with Samba rsync.
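
A minimal Python sketch of the path rule under discussion, assuming Samba rsync's usual trailing-slash convention. The name `planned_target` and its parameters are hypothetical, and the single-file branch models the expected behaviour described above; it is not taken from either codebase.

```python
import posixpath


def planned_target(src: str, dst: str, src_is_dir: bool, dst_is_dir: bool = False) -> str:
    """Return the path where src (or its contents) would land under this model."""
    if src_is_dir:
        if src.endswith("/"):
            # Trailing slash on the source: copy what is inside it,
            # so the contents land directly in dst.
            return dst.rstrip("/") or "/"
        # No trailing slash: copy the directory itself, created inside dst.
        return posixpath.join(dst, posixpath.basename(src))
    # Single file: it goes inside dst only if dst is (or is written as) a directory.
    if dst.endswith("/") or dst_is_dir:
        return posixpath.join(dst, posixpath.basename(src))
    # Otherwise dst is taken as the name of the new file.
    return dst


print(planned_target("/etc/services", "/tmp/services", src_is_dir=False))
print(planned_target("/path/to/src/", "/path/to/dst/", src_is_dir=True))
```

Under this model, /tmp/services/services would only appear if /tmp/services already existed as a directory on the destination.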

  • > The one place in my usage where it doesn't match Samba rsync is with the following:

    > openrsync --rsync-path=openrsync -av -e ssh /etc/services example.com:/tmp/services

    This appears to match "normal" `rsync` behavior as well. I think you need a trailing slash after `services` to sync only the contents.

    EDIT: actually my "normal" rsync is openrsync on macOS...

  • Was there already a /tmp/services directory on the dest?

    One of the biggest points of confusion with rsync is how directories and trailing slashes are handled.

    • I hear that a lot, but I familiarized myself with it once and ever since it makes a lot of sense to me.

      Source ending in “/”: You want what’s inside. Source not ending in “/”: You want the thing (i.e. the directory itself). For the destination, it does not matter whether it ends in “/” or not, but for consistency I like adding a “/” anyway (I want to put the thing inside the directory).

    • It's a big source of confusion with cp. One of the UI reasons to use rsync (for mundane non-remote copying) is that it doesn't do different things based on what's present on the target.

    • > Was there already a /tmp/services directory on the dest?

      No. And just to make sure, I ran a quick 'rm -rf /tmp/services' on the remote host, then re-ran openrsync on the client. Same result. This is OpenBSD 7.9 on both sides.

      And I 100% agree about trailing slashes.

  • If you use a trailing slash on the source it copies from the directory, if you omit the trailing slash it copies the directory itself. AFAIK this is pretty standard across POSIX tools

  • > I would expect openrsync to create a remote file /tmp/services, but instead it creates /tmp/services/services.

    As someone who has also suffered uncountable years of abuse from rsync, I understand the impulse, but I think it makes a lot more sense (and is a safer default) to create a second “services”.

    If we have a chance to change rsync defaults to something less insane and save future generations from this mess I think we should.

    • We don't, since we're not implementing a UI from scratch, we're matching something else.

      Of the two possible worlds, one where this reimplementation matches what some see as annoyances in the interface, and another where it mostly matches the interface except for a few cases where it purposefully diverges (for no good technical reason), IMO the latter is far worse and causes more unexpected behavior.

      At most, add a special flag to opt into different default behavior so nobody is surprised by running the same command on different systems and getting different behavior.

Given the sudden spike in vibe-coded commits to the rsync codebase, and regressions that’s introduced, this is very good news.

  • I was prepared to dismiss your comment because rsync has always been rock solid, but indeed upgrading broke my backup script. The latest issue on GitHub documents plenty of bugs introduced in the last 2 patches, including a monstrous ~9k LOC commit that was probably pointless.

    LLMs make writing code faster/easier, but the thinking was always the important bit. I’ve no idea why you’d muck up such a long-standing, reliable piece of software.

    • When an LLM allows you to produce code n% faster, it also allows you to introduce bugs n% faster.

      I find it quite strange that people do not seem to be aware of that... I think many started worshipping the tool as if it was some kind of divinity and lost all objectivity. This doesn't bode well for the future if people aren't able to review code anymore.

  • Yes, I also don't know how license critical distributions like Debian can even ship rsync now because it contains a) laundered code and b) Tridgell cannot claim copyright over the Claude additions.

    So rsync should be pinned to an older version and just get security updates.

> If the source or destination is on a remote server, the client then fork(2)s and starts the server openrsync on the remote host over ssh(1). The client and the server subsequently communicate over socketpair(2) pipes.

How is that supposed to work? I guess they mean something like, the forked child forwards the connection to the parent using a socket pair, or just connects its stdout/stdin to the socketpair "pipe" (socket) and execs ssh.

But that's like saying you're going to Australia by car when you mean you're driving to the airport.
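
One way to read the README sentence is sketched below in Python, under the second interpretation above: the child attaches one end of a socketpair to its stdin/stdout and then execs the transport command. The helper names `spawn_over_socketpair` and `round_trip` are hypothetical, and `cat` stands in for the ssh command so the sketch runs without a remote host; this is not how openrsync itself is written.

```python
import os
import socket


def spawn_over_socketpair(argv):
    """Fork, wire the child's stdin/stdout to a socketpair, and exec argv."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: keep only its end and make it fd 0 and fd 1.
        parent_sock.close()
        os.dup2(child_sock.fileno(), 0)
        os.dup2(child_sock.fileno(), 1)
        child_sock.close()
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)
    # Parent: talk to the child through parent_sock.
    child_sock.close()
    return pid, parent_sock


def round_trip(payload: bytes) -> bytes:
    """Send payload to a `cat` child and collect whatever it echoes back."""
    pid, sock = spawn_over_socketpair(["cat"])
    chunks = []
    with sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal EOF so the child exits
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

In this reading the socketpair only covers the local hop between the client and the ssh process; the network leg is still ssh's job.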

The actual work of porting is matching the security features provided by OpenBSD's pledge(2) and unveil(2). These are critical elements to the functionality of the system. Without them, your system accepts arbitrary data from the public network.

https://justine.lol/pledge/

I am not seeing pledge on Alpine Linux in edge. Have people been testing Pledge on Linux? Did I perhaps misunderstand the risk of using Openrsync without pledge? Or is this article just for OpenBSD users?

  • Linux has no such features as pledge or unveil, nor capsicum. it has cgroups, namespaces and a mess of other things you need to combine to try and do similar things. (it was built iteratively as many systems interacting and being combined to form 'sandboxing' or isolation/limiting of capabilities rather than specific isolation as an entire concept with specific system calls and kernel paths to enable it).

    there might be newer stuff in linux land now i see comments about landlock but i assume those will build on the linux primitives rather than whole new ones. - total assumption there but it would seem logical to reuse rather than make new.

    part of what they likely mean by 'mess' is that it's all over the place. many different ways to try and lock things down. hard to pick what is best etc. without thoroughly diving into the different subsystems entirely. (as opposed to just having 1 or 2 relatively simple system calls)

  • From above your quote:

    > The only officially-supported operating system is OpenBSD, as this has considerable security features.

    And below your quote:

    > This is possible (I think?) with FreeBSD's Capsicum, but Linux's security facilities are a mess, and will take an expert hand to properly secure.

    It is portable in the sense that it compiles and runs, not in the sense that it has the same security features.

    I'd love to see pledge/unveil on (upstream) Linux - but I'm not holding my breath.

    • > I'd love to see pledge/unveil on (upstream) Linux - but I'm not holding my breath

      There is Landlock now, I believe it would be possible to implement unveil and pledge on top of that.

  • that quote seems to be a bit of an oversimplification to the point of being completely wrong.

    > Without them, your system accepts arbitrary data from the public network.

    Neither of these features changes whether you are accepting arbitrary data from the public network. They limit what an exploited process can do. It's explained properly in the 'Security' section, so I'm not sure where this came from.
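
To illustrate "limit what an exploited process can do", here is a rough Python sketch that calls OpenBSD's pledge(2) and unveil(2) through ctypes. The function name `restrict_process` and the chosen promises ("stdio rpath") are assumptions for illustration, not what openrsync actually requests, and on any other platform the function simply returns False.

```python
import ctypes
import sys


def restrict_process(readable_dir: str) -> bool:
    """Confine the current process on OpenBSD; return False elsewhere."""
    if not sys.platform.startswith("openbsd"):
        # Assumption: only OpenBSD's libc exports pledge(2) and unveil(2).
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    # unveil(path, permissions): the filesystem shrinks to readable_dir, read-only.
    if libc.unveil(readable_dir.encode(), b"r") != 0:
        raise OSError(ctypes.get_errno(), "unveil failed")
    # unveil(NULL, NULL): no further unveil calls are allowed.
    if libc.unveil(None, None) != 0:
        raise OSError(ctypes.get_errno(), "unveil lock failed")
    # pledge(promises, execpromises): keep only basic I/O and read-only file access.
    if libc.pledge(b"stdio rpath", None) != 0:
        raise OSError(ctypes.get_errno(), "pledge failed")
    return True
```

After such a call the process still reads whatever the network sends it; what changes is that a compromised process cannot open sockets, write files, or exec other programs.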

All software anneals into a final form that is incontrovertibly correct. Each of these OpenBSD rewrites feels like the realization of such a final form!

Code is like a math proof: sketched first on a napkin, then on a blackboard, and then finally typeset in a paper. Each step tidies up the ideas to improve the communication of intent, and the final version should be self evidently stable and/or correct.

In the old days you’d see code stability emerge as the “v2” edition of some piece of software. Mozilla to Phoenix to Firebird/fox. Linux 2.2 to 2.4. Python 2.6 to 3.x. The design patterns are carried over but the implementations are revamped for more stable, more legible, and more maintainable code.

I don’t mind that vibe coding is the latest form of this phenomenon. We have all been “vibe coding” for decades really. Code like this crap:

  T = "hello world"

  def foo_2():
    FONT = "Perpetua.ttf"
    w = text(T, FONT)
    w2 = w.translate((50,0,0))
    w3 = w.translate((0,0,20))
    show(w3)

…gets eventually rewritten, once it works, into:

  def render(phrase: str, font: Font) -> Shape:
    return w.text(phrase, font)

  font = Fonts.load("Perpetua")
  text = render("hello world", font).translate((50, 0, 20))
  show(text)

Before, we’d hack a v0, tidy it up sufficiently for it to be worthy of review as v1, then come back much later and rearrange the innards (in a far more sensible way) as v2.

With LLMs — especially in the hands of those who can’t read or won’t read the actual code — we are seeing a lot more version zeroes in the world. Thank you OpenBSD for giving us, albeit surprisingly for rsync, a nice v2.

They don’t support any recent rsync protocol, so there’s no 64-bit timestamp support, so you can never actually sync metadata across newer filesystems.
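
As a rough illustration of what a non-64-bit timestamp field costs, the Python sketch below computes the last moment a 32-bit seconds counter can represent. Whether the older protocol treats the field as signed or unsigned is an assumption here, so both cases are shown; the helper name `last_representable` is hypothetical.

```python
from datetime import datetime, timezone


def last_representable(signed: bool = True) -> datetime:
    """Latest UTC time that fits in a 32-bit count of seconds since 1970."""
    limit = 2**31 - 1 if signed else 2**32 - 1
    return datetime.fromtimestamp(limit, tz=timezone.utc)


print(last_representable(signed=True).isoformat())
print(last_representable(signed=False).isoformat())
```

Any mtime past those limits, and any sub-second precision a newer filesystem stores, would not survive a transfer that only carries 32-bit whole seconds.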

This is the version used in macOS since 15.0.

  • Was it 15.0? I seem to recall it coming in one of the minor point releases in the 15.x line - and I remember it breaking some scripts mysteriously.

    EDIT: ah, fun: they did include it in 15.0, but they decided to save the breaking change that removed backwards compatibility for 15.4. https://apple.stackexchange.com/a/479297

What's the deal with the name? Openrsync implies to me that it's an open source alternative to a closed source program. But the original Rsync is GPL? Is this just the pushover license making it "more open"?

  • OpenBSD folks would consider the GPL to be less open due to the requirement to apply the GPL to any derivative works.

    • And GNU folks would say the GPL is actually the more open choice because it forces the project to stay open.

      Two different ways of thinking about it I guess... it's nice to have choices and I don't think one is more or less "correct", more a matter of opinion/taste I guess.

  • Many projects closely associated with OpenBSD start with "open"... openssh, openbgpd, openntpd, opensmtpd etc.

A few comments here suggest rsync is undergoing some "churn" which tbh is highly undesirable for a command line utility. Might switch over.

As an aside I really love the stuff OpenBSD puts out. If they ever succeed in making a modern journalling filesystem I will probably switch over.

This attempt to avoid things that use AI is increasingly looking like some weird kind of reverse whack-a-mole where each targeted hole becomes radioactive after. Just grabbing some popcorn to watch.

  • Thanks for the heads-up! I wasn't aware that Tridge is using Claude. I shall use Openrsync from now on.

  • How about the attempt to avoid things that use AI promiscuously and start exhibiting bugs? :-(

    • Push for better guardrails and QA structures. Avoidance helps nobody in the long run, and isn't possible anyway without going completely cold turkey. Like literally in a few months every project worth using will directly or indirectly involve AI.

  • I feel bad for people with the real name Claude.

    • It took me quite some time to realize what an utterly presumptuous product name Claude Code actually is, but only because Shannon is rarely mentioned with his first name. It's golden calf levels of hubris, even more so if you consider how incapable it was on release. It's like renaming calc.exe Einstein. Incredibly poor taste, but entirely in line with AI tech bro mentality.

  • This is a recent trend in open-source though:

    - people avoiding systemd like it's the plague

    - people avoiding wayland because it is devil's work

    - people avoiding rsync because someone used AI on the testcases

    - ...

  • Wasting their precious limited time on this planet for performative hand wringing.

    AI is only going to get better and better. Eventually manually writing software by hand with programming languages will be thought of as the punch-card phase of software development.

    Do these people think we'll be writing software in 200 years time? That anybody will be maintaining rsync, let alone this "moral human hands only" version of it?

    The anti-AI lot are trying to make all AI content wear a scarlet letter. I wish they would wear one themselves so that we could filter them from our timeline.

    This "effort" is entirely wasted.

    • It is not "performative hand wringing" to observe that a tool sucks and to reject its use. You cannot, at present, write quality software with AI tools. At best you get something you could've made yourself, slower than you could've made it yourself. Only a fool insists on using a tool when it has been proven to not work.

Since I switched my VPS to OpenBSD (base only), I've been "forced" to handle openrsync and it has been mostly a drop-in. Except a problem where I couldn't mix `--exclude glob` and `--delete` on the client, no problem to report.

Thanks for the various vibecoding posts, I wasn't aware of the extent of it. Some prior HN discussion: https://news.ycombinator.com/item?id=48334021

> rsync has specific running modes for the super-user. It also pumps arbitrary data from the network onto your file-system. openrsync is about 10 000 lines of C code: do you trust me not to make mistakes?

No, but that's why almost nobody runs it outside of strict trust boundaries. This security section would make more sense if rsync was like curl, which routinely deals with hostile counterparties. If the other side of your rsync is hostile, you probably have bigger problems!

(I'm not an rpki person so I don't know if there's some part of that problem domain that changes this equation. I'm not dunking on the project, just saying this snagged me in the README).

  • > No, but that's why almost nobody runs it outside of strict trust boundaries. This security section would make more sense if rsync was like curl, which routinely deals with hostile counterparties. If the other side of your rsync is hostile, you probably have bigger problems!

    I disagree. While rsync is most often used to transfer data between "friendly" systems, it's inherently crossing a security boundary. It's important to make sure that an attacker can't leverage it to transform the breach of one system into the breach of multiple systems.

  • > almost nobody runs it outside of strict trust boundaries.

    I guess you can define "strict" however you want, but from what I saw ~10 years ago, most linux distros handled mirroring with rsync. That's a lot of usage in a pretty core part of the foundational open source ecosystem.

    • Many distros use rsync for that but also support unencrypted HTTP.

      They’re layering on checksums and signing such that they mostly don’t think about the trustworthiness of mirrors or the networks between them.
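
The checksum half of that layering can be sketched in a few lines of Python: a file fetched from an untrusted mirror is accepted only if its digest matches one obtained through a trusted channel. The function name `matches_trusted_digest` is hypothetical, and the signature check on the checksum list itself is assumed to happen elsewhere.

```python
import hashlib
import hmac


def matches_trusted_digest(payload: bytes, trusted_sha256_hex: str) -> bool:
    """Accept payload only if its SHA-256 equals the trusted hex digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, trusted_sha256_hex.lower())


trusted = hashlib.sha256(b"package contents").hexdigest()
print(matches_trusted_digest(b"package contents", trusted))
print(matches_trusted_digest(b"tampered contents", trusted))
```

With a check like this in place, neither the mirror nor the transport (rsync or plain HTTP) has to be trusted for integrity.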

I have not checked with OpenBSD 7.9, but as of 7.8 it did not support --exclude or -z. But outside of that openrsync works great.

(EDIT: --exclude is now supported on 7.9. Not sure when that was added, nice!)

But seems avoiding "slop" is getting very hard. I saw postfix now has a bit of AI code in it.

https://mastodon.sdf.org/@mrmasterkeyboard@mastodon.social/1...

  • Somewhat ironic: Postfix has a record of no root/RCE in the default install, where opensmtpd hasn't (CVE-2020-7247). Time will tell if it stays that way.

  • Exclude is very commonly used in automation jobs to avoid duplicating big git repos and other big files. I think that would be a show stopper for a number of people.

    • I just tried openrsync(1) on OpenBSD 7.9, --exclude now works.

      I have not tried using exclude in openrsync in a while, but I can see it now works on OpenBSD 7.9!

I'm going to ask a question. I could ask chatgpt. I could Google it. I am asking a question because it is human to do so.

Ubuntu's packaged rsync, is it Samba rsync? Why reimplement it?

  • Ubuntu's rsync is samba rsync. It's not part of the samba project per se, but it is made by the same guy and the official url is https://rsync.samba.org/ so it's entirely fair to call it samba rsync in my opinion.

  • it's tridge rsync; samba is another project by the same guy. (rsync was originally a PhD thesis...)

    • tridge is mighty in the Linux world, he arguably greatly contributed to Linux as a server (via Samba to save on Windows NT client user licenses), rsync, and IIRC reverse-engineering the BitKeeper protocol thereby starting the gitstorm.

Might be offtopic, but i think when looking for alternatives the options should be broad, questioning the category of the app, not just clones and forks.

If you are considering migrating away from github, don't just consider gitlab and gitea, consider just git, and so on..

If you are considering migrating away from rsync, consider dd. You need to configure the folders you want to back up as mountable partitions or disks, but it does byte for byte copies instead of 100kloc fuckery.

I'm confused. Isn't rsync already free software? What are we doing here. Why are we trying to cuck ourselves for capital.

I like OpenBSD but this just seems like burning cash.

  • My understanding is that much of the point of openrsync is to create a second implementation of a protocol so the standards bodies don't balk at including it in their standards.

    Or to put it more concretely, people working on the rpki standard (who happened to also be openbsd devs) wanted to use rsync to transfer bulk data. The standards body was hesitant: while rsync is ostensibly a documented protocol, there was only one implementation. So in true openbsd fashion they rolled up their sleeves and wrote that second implementation.

    On use, there is nothing wrong with openrsync. However, it may never hit feature parity with rsync; that is not a goal of the project. They want a specific subset of rsync features to support their rpki needs. If anyone else finds this useful, that is great. So I suspect users will be those who want a bsd licensed rsync (apple) or those who are willing to give up features for openbsd quality code (myself).

There is also a (stub) web page:

https://www.openrsync.org/

The problem with this fragmentation of rsync is that Apple and Android will prefer it, but the Linux and greater GPL world will adhere to the original implementation due to inertia. Power users will just have to know the quirks of each version.

The only way to stop this is for the original author(s) to release this under a BSD license.

Edit: For those assuming equivalent/identical behavior, study these words carefully: "accepts only a subset of rsync's command-line arguments."

  • > The only way to stop this is for the original author(s) to release this under a BSD license.

    Would that stop it? My understanding was that at least OpenBSD tended to redo things for technical reasons, not just licensing.

    • Jeremy Allison (assuming that my memory serves me) is the foremost to decide if this should be done.

  • It's really no different than every other BSD utility (and SysV utility, if you're running one of those) being different than the GNU ones. We've coped with it for fifty years at this point.

  • The only option should not be to take away user freedoms. BSD licenses are popular with proprietary software writers because they can use it without any of the restrictions that seek to preserve the rights of the end user. Instead you get proprietary software stacks like Apple and Android that seek to lock the users out of anything not granted by the company.

    The correct way to stop this is to file bugs against the software for not matching the de-facto standard of the copied software.

  • Basically like GNU Tar/CPIO and BSD Tar/CPIO. I've largely standardised towards using the bsd variant everywhere (especially since now even Windows ships it and it handles lots of other archive formats using the `tar` command) but it's always a pain to install it everywhere

  • > The only way to stop this is for the original author(s) to release this under a BSD license

    No, then you get proprietary forks of the BSD codebase.

    Apple doesn't like GPLv3, but this is by choice.

    Sometimes, inventions by OpenBSD team (often using Open as prefix) become standard, such as OpenSSH and PF.

  • > The only way to stop this is for the original author(s) to release this under a BSD license.

    That is likely not possible even if they wanted to - unless all contributors have signed over rights to their contributions.

    Even then if the new project is specifically wanting to simplify things, and/or a change in language is important, reimplementation might still be preferable for them.

  • > Apple and Android will prefer it,

    My thought upon reading this is why would Apple or Android bother including rsync? I've noticed that I've needed to install it manually on fresh installs of Debian, FreeBSD...

    But then, I just checked a recent Mac that I don't use much and haven't put much on, and it's installed.