Tar Files Created on macOS Display Errors When Extracting on Linux (2024)

4 days ago (aruljohn.com)

Ex-Apple engineer here. This is, for better or worse, just the way Apple approaches this type of problem. From Apple's perspective, this is the way to preserve Finder / Gatekeeper / metadata semantics. It avoids silent data loss when round-tripping archives between Macs. This behavior also maintains consistency with copyfile(3) (as well as the Archive Utility behavior).

Apple treats tar less like “portable Unix interchange” and more like “archive this filesystem object faithfully.” That is very Apple, and very libarchive. ;-)

This is probably going to get worse (as Apple continues to add macOS-specific metadata), so your workaround is very helpful.

I haven't tested it in a while, but at one point, setting the COPYFILE_DISABLE=1 env variable would disable the inclusion of macOS-specific metadata.
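
For anyone driving tar from a build script, here is a minimal sketch of the environment-variable approach in Python. It assumes a `tar` binary on PATH; the function name `create_archive` is made up for illustration. On Apple's tar the variable is expected to skip the AppleDouble `._*` entries, and as far as I know other tar implementations simply ignore it, so it should be harmless elsewhere. It is as untested on current macOS as the comment above.

```python
import os
import subprocess
from pathlib import Path


def create_archive(src_dir: Path, out_file: Path) -> None:
    """Run the system tar with COPYFILE_DISABLE=1 set in its environment."""
    # Copy the current environment and add the variable mentioned above.
    env = dict(os.environ, COPYFILE_DISABLE="1")
    # Archive src_dir by its base name so paths inside the archive stay relative.
    subprocess.run(
        ["tar", "-czf", str(out_file), "-C", str(src_dir.parent), src_dir.name],
        env=env,
        check=True,
    )
```

Calling `create_archive(Path("dist"), Path("dist.tar.gz"))` mirrors `COPYFILE_DISABLE=1 tar czf dist.tar.gz dist`, without changing the environment of the calling process.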

  • Arguably, principle of least surprise is very Apple.

    If I point "tape archive" at a file system, I want that file system archived to tape. And so, tar does.

    If I don't, well, that's a fine option, and there's a fine option for that.

    So it's less of a "workaround" or something that "gets worse", than, "No, I don't really want a tape archive of this filesystem, only of some of it." And that's supported.

    That said, never seeing another .DS_Store should be a system-wide option!

    • > Arguably, principle of least surprise is very Apple.

      Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.

      I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.

    • > I want that file system archived to tape. And so, tar does.

      The traditional UNIX tar and cpio utilities cannot archive the modern Linux file systems without loss of metadata.

      Most modern tar programs implement various file format extensions as a workaround for this, but the extensions may be incompatible between distinct tar programs and frequently they are very poorly documented.

      Some years in the past, libarchive was the only archiver available on Linux that guaranteed lossless backups for the Linux file systems, e.g. xfs or ext4 (and also lossless file transfers between Linux file systems and FreeBSD file systems). Therefore that is what I have been using on Linux since then.

      Presumably since then GNU tar and other tar programs should have caught up with it, but I have not verified this.

      Whichever tar program was used in TFA, it was an obsolete tar program, and that was the real problem, not that the archives had been created on an Apple computer.

    • If you think that most people who run the tar command are assuming it will work like a tape archive, you'll probably be the one surprised.

  • It's a good attitude to have, in my opinion. Portability is overrated. Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms. Let the people who actually care about those platforms worry about such things.

    • It would at least be nice if there were a way to keep Apple users from shitting all over the filesystem with remote mounts and .DS_Store files. Perhaps by automatically unmounting if one is detected.

    • Portability of tar archives at least. We should have some formats, like .zip, that are standardised, and let others, like tar, be faithful replicas of exactly how the OS stores data.

    • > Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms

      Linux developers already do. Using a BSD can already be a pain in the arse, thanks to (often poorly thought out) Linux-isms cropping up everywhere.

  • Yes, I completely disagree with TFA.

    The problem described in TFA is not specific to Apple. The same problem appears when archiving any decent filesystem designed during the last three decades rather than half a century ago, including all Linux file systems.

    The problem described in TFA is not caused by Apple, but by the author using an obsolete tar program and not being aware of this.

    The traditional tar file format cannot store a lot of the metadata that is contained in modern file systems (e.g. high resolution timestamps, access control lists, extended file attributes), so it is useless for such file systems.

    Most modern "tar" implementations have added extensions to the tar file format, to make it usable with modern file systems, such as Linux XFS or Linux EXT4. But many of these extensions are incompatible between themselves, so certain tar files can be fully extracted only with the same tar program that has created them.

    I strongly recommend against using the old tar or cpio file formats. Even with various extensions it is not guaranteed that they always work correctly.

    I always use only the pax file format, which has also required extensions in order to work with the modern file systems, but the pax extensions are cleaner than those for tar, because the file format is better designed.

    Libarchive, which was mentioned in TFA, is available in most Linux distributions or it can be built from source on any Linux computer. It provides an executable that is preferable to tar (better invoked as "bsdtar --format=pax") for the backup or transfer of any Linux files.

    I have not recently checked GNU tar or the other tar programs available on Linux, and I hope that they have since been upgraded to archive Linux file systems losslessly, but some years ago that was not true, so using tar or cpio on Linux could easily corrupt the archived files.

  • Funnily enough, I got the error message and asked Claude Code, and it replied:

        The warning can be suppressed by `--no-xattrs --no-mac-metadata`.

    then just edited the code as

        -  tar czf dist.tar.gz dist
        +  COPYFILE_DISABLE=1 tar czf dist.tar.gz dist

  • To me, the big question is why Apple needs all these file attributes. If the files are extracted OK, just ignore the errors :)

    • Apple has had multiple streams per file since the very beginning, and they can store useful and necessary information (the latter is quite rare now, as most things have sane defaults, but losing the extended attributes can still lose things in annoying ways).

The title seems misleading.

These are not errors. They are simply warnings about extended attributes being ignored when extracting files, which seems completely fine to me, and creating the tar without those extended attributes has exactly the same outcome, but throws away the metadata at archive time instead of extraction time.

Furthermore, this is not an Apple/macOS issue. The tool used is bsdtar, so it would also affect every BSD variant that defaults to bsdtar/libarchive, as well as Linux systems where bsdtar is used. Those systems have extended attributes too, e.g., SELinux labels on Linux, which would get added to the tar file.

I use these settings when creating a tar file for deploy:

    tar --no-xattrs --no-mac-metadata -czf
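
Since those flags are not understood by every tar, a deploy script may want to pick them per implementation. A rough sketch in Python, under a few assumptions: the first line of `tar --version` contains "bsdtar" for libarchive's tar and "GNU tar" for GNU tar, bsdtar accepts both flags, and GNU tar accepts only `--no-xattrs`. The helper names are hypothetical, and the flag sets are worth checking against the man page of the tar you actually ship with.

```python
import subprocess


def metadata_free_flags(version_banner: str) -> list[str]:
    """Pick flags that drop xattrs, based on the first line of `tar --version`."""
    banner = version_banner.lower()
    if "bsdtar" in banner:
        # bsdtar (libarchive): the two flags quoted in the comment above.
        return ["--no-xattrs", "--no-mac-metadata"]
    if "gnu tar" in banner:
        # GNU tar has --no-xattrs but no --no-mac-metadata option.
        return ["--no-xattrs"]
    # Unknown implementation: add nothing rather than guess.
    return []


def deploy_tar_command(archive: str, source: str, tar_bin: str = "tar") -> list[str]:
    """Build (but do not run) a tar command line for a deploy archive."""
    result = subprocess.run(
        [tar_bin, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    banner = lines[0] if lines else ""
    return [tar_bin, *metadata_free_flags(banner), "-czf", archive, source]
```

Returning the argument list instead of running it keeps the flag choice easy to log and to test separately from the actual archiving step.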

Per this 2018 page, GNU tar seems to work with SCHILY.* encoded xattrs, but not LIBARCHIVE.* ones:

* https://mgorny.pl/articles/portability-of-tar-features.html#...

* Via: https://github.com/mxmlnkn/ratarmount/issues/145

bsdtar ≥3.7.2 apparently adds both types to its files for maximum portability:

* https://github.com/libarchive/libarchive/pull/691/files#diff...

AFAICT, bsdtar will default to "ustar" format, but will auto-switch to "pax" if needed.
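
To see which of the two key styles a given archive actually carries, the per-file pax headers can be listed with Python's standard `tarfile` module. This is only a sketch: the sample archive is built in memory, and the attribute name `user.demo` and its value are made up for illustration, not taken from a real macOS archive.

```python
import base64
import io
import tarfile

XATTR_PREFIXES = ("SCHILY.xattr.", "LIBARCHIVE.xattr.")


def xattr_keys_by_member(archive_bytes: bytes) -> dict[str, list[str]]:
    """Map each member name to the xattr-style pax keywords stored for it."""
    found = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tf:
        for member in tf:
            keys = sorted(k for k in member.pax_headers if k.startswith(XATTR_PREFIXES))
            if keys:
                found[member.name] = keys
    return found


def build_sample_archive() -> bytes:
    """Create an in-memory pax archive whose single member carries both key styles."""
    payload = b"hello\n"
    info = tarfile.TarInfo(name="dist/app.txt")
    info.size = len(payload)
    # Made-up attribute name and value, for illustration only.
    info.pax_headers = {
        "SCHILY.xattr.user.demo": "sample",
        "LIBARCHIVE.xattr.user.demo": base64.b64encode(b"sample").decode("ascii"),
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tf:
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()
```

For a real file, reading it with `open(path, "rb").read()` and passing the bytes to `xattr_keys_by_member` should show whether the archive has `SCHILY.*` keys, `LIBARCHIVE.*` keys, or both. The `"r:"` mode expects an uncompressed tar, so a `.tar.gz` would need `"r:gz"` instead.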

  • I wonder how come GNU tar never added them. I have to assume someone has brought the problem to their attention before.

I don't see errors, just warnings about unknown metadata. It's annoying, yes. But they aren't errors.

Why switch to a completely different tar and rewire the PATH when you could just set a shell alias? You'll need to edit .bashrc both times but there's no need to install a second tar to /opt to solve this.

We might also ask, why doesn't Linux also track such meta-data? Are Linux users not also subject to drive-by downloads impersonating valid files? Should we be one chmod a+x away from compromise?

  • Yes, we should be.

    My computer should run programs when I tell it to run them.

    Don’t blunt _every_ tool just to make them harder to cut yourself on.

    • I hope you're in the very small minority of people who rigorously manage untrusted downloads and whitelist every binary, because you're operating an appliance from the 1970s, sticking a metal fork into an un-earthed toaster. Most people need help from their operating system.

    • Increased metadata isn't tool blunting in itself though, even if macOS uses it for being... annoying is one way of saying it.

      Provenance information bundled into a file is not the worst idea in the world IMO. We have created/modified timestamps on files already, right? There's definitely the question of "why" but hey if more of my binaries just had at least a tag about who put them there that would be a win in my book.

      Not an argument for doing what macOS does, just an argument that the info would be nice to have.

    • It’s not blunting a tool, it’s sheathing it. Modern software requires too much proxied trust for this attitude to work.

    • I sincerely agree. By the way, thanks for lending your machine for my "Network-Retransmission-and-Compute-as-a-service" network.

  • > Are Linux users not also subject to drive-by downloads impersonating valid files?

    Linux users generally install software with apt or rpm. Or steam.

    The existence of any executable file outside the system dirs is a red flag in itself.

  • Should I be able to run files I download on my own computer? I think yes, I should. I hate fighting macOS to do simple tasks because Apple engineers assume the end user has the average intelligence of an ostrich.

Homebrew installs GNU tar as "gtar". On my M4 MacBook:

  $ which gtar
  gtar is /opt/homebrew/bin/gtar

  • I've installed the gtar formula and aliased it to tar. Can't be bothered to memorize the differences between macOS tar and GNU tar, especially when the latter is considered to be the de facto standard.

Would this ever affect me if I don't use many of macOS's built-in tools? I brew install the GNU equivalents and make them all the default. Just like how I also don't use most of their desktop environment stuff, and instead use Rectangle, Hammerspoon, and Karabiner to make it feel more like the Linux desktop I wish I could use at work.

Oh Arul John, just because you don't understand it doesn't mean it's an error.

It's also horrible advice to download a different tar version for something that should just be explained properly.

If it weren't for the "2024" in the title, I would have thought this to be a result from AI.

But it's not artificial intelligence. It's real stupidity.

I'll admit that if I don't care about extended attributes (I never really do) I just use zip instead.

You can either send stderr to /dev/null or use GNU tar's --warning=no-unknown-keyword to suppress them cleanly.
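
A small sketch of the second option as it might look in a Python deploy script. It assumes the extracting tar is GNU tar, since `--warning=no-unknown-keyword` is a GNU tar option and bsdtar is not expected to accept it. The function names and the `tar_bin` parameter are hypothetical.

```python
import subprocess


def quiet_extract_command(archive: str, dest: str, tar_bin: str = "tar") -> list[str]:
    """Build a GNU tar command that hides only the unknown-keyword warnings."""
    return [tar_bin, "--warning=no-unknown-keyword", "-xf", archive, "-C", dest]


def extract_quietly(
    archive: str, dest: str, tar_bin: str = "tar"
) -> subprocess.CompletedProcess:
    """Run the extraction; other warnings and real errors still reach stderr."""
    return subprocess.run(
        quiet_extract_command(archive, dest, tar_bin),
        capture_output=True,
        text=True,
        check=False,
    )
```

Compared with redirecting stderr to /dev/null (or `stderr=subprocess.DEVNULL` here), this keeps genuine failures visible in `result.stderr` and `result.returncode`. On a Mac with Homebrew's GNU tar, passing `tar_bin="gtar"` would be the likely choice.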

Still, it's interesting why they are added.

> If you are using a Mac with an Apple Silicon M1, M2, M3 or M4 processor,

Don't forget the MacBook Neo's A18 Pro :)