Comment by scrollaway

11 years ago

Speaking as someone who has worked a lot on mime types: Do not ever use `file` as a means to identify a file type. Use the shared-mime-info database.

http://standards.freedesktop.org/shared-mime-info-spec/share...

`file` has its own internal database. It can do more than the MIME type db and is great for quickly identifying and getting lots of info about various file types, but it is AWFUL when used within other apps. Please don't use it that way.

Not sure I agree with this recommendation... the shared-mime-info database usually trails the file(1) database by several years. Case in point: the pcap-ng file format was added to file(1) in 2011; it was added to the shared-mime-info database in 2013.

Also, file --mime FILENAME gets you the mime type.
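For anyone who does want to call it from a script, here is a minimal Python sketch of wrapping that command. The function names `parse_file_mime` and `mime_via_file` are made up for illustration, and it assumes a `file` binary on PATH that accepts `--brief` and `--mime` and prints output of the form `type/subtype; charset=...`.

```python
import subprocess
from typing import Optional, Tuple


def parse_file_mime(output: str) -> Tuple[str, Optional[str]]:
    """Split `file --mime` style output into (mime_type, charset)."""
    mime, _, params = output.strip().partition(";")
    charset = None
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "charset" and value:
            charset = value
    return mime.strip(), charset


def mime_via_file(path: str) -> Tuple[str, Optional[str]]:
    """Ask file(1) about `path`. Hypothetical helper; needs `file` on PATH."""
    # --brief drops the leading "FILENAME: " prefix from the output.
    result = subprocess.run(
        ["file", "--brief", "--mime", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_file_mime(result.stdout)
```

Spawning a process per file is part of why this gets slow inside other apps; the parsing half at least can be exercised without `file` installed.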

  • The file database is not extensible. This means it will not recognize file formats specific to your setup, etc. With the shared-mime-info database, you can have user-wide and system-wide extensions to the database, so installed programs can (and do) install their own mime types. If you need to deal with, for example, a custom image or archive format you can also set that up.

    Additionally, the shared-mime-info database is much better curated than the file mime database; file has a lot of wrong mime types (I have gotten a dozen or so fixed so far). If there are file types that are missing from shared-mime-info, please report them on https://bugs.freedesktop.org/ (there are guidelines for good mime type reports, follow those and you can get a fix merged in very fast).
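To make the extension point concrete, here is a rough Python sketch that generates a shared-mime-info package file for a custom format. The MIME type, glob and magic string in the usage example are invented, and `build_mime_package` is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# XML namespace used by shared-mime-info source files.
NS = "http://www.freedesktop.org/standards/shared-mime-info"


def build_mime_package(mime_type: str, comment: str, glob: str, magic: str) -> str:
    """Return the XML for a one-type package (hypothetical helper)."""
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}mime-info")
    entry = ET.SubElement(root, f"{{{NS}}}mime-type", {"type": mime_type})
    ET.SubElement(entry, f"{{{NS}}}comment").text = comment
    # Match by file name pattern...
    ET.SubElement(entry, f"{{{NS}}}glob", {"pattern": glob})
    # ...and by a literal string at the start of the file.
    rule = ET.SubElement(entry, f"{{{NS}}}magic", {"priority": "50"})
    ET.SubElement(
        rule,
        f"{{{NS}}}match",
        {"type": "string", "offset": "0", "value": magic},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Invented example format, purely for illustration.
    print(build_mime_package(
        "application/x-example-archive", "Example archive", "*.exar", "EXAR"
    ))
```

For a user-wide extension, the output would be saved under `~/.local/share/mime/packages/` and picked up by running `update-mime-database ~/.local/share/mime`.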

Crazy... wouldn't it make sense to combine both DBs?

  • No, the databases are very different. `file` can do some very advanced file processing, while xdg-mime has a regex-ish syntax for scanning part of the file.

    More to the point though, the xdg database is extensible, while `file` requires playing with the source.

    • But then one could combine the good parts of both? Very advanced file processing + extensible database, sounds awesome. ;)

      I mean seriously, once I wrote a simple web app in Go and the content-type for static serving was determined via file. (It's just so easy.) But nobody in the world wants to have Desktop stuff installed on her server.


nice!

  ~$ xdg-mime query filetype /etc/passwd
  text/plain

  • Yeah, I'm not sure about that. According to the xdg-mime(1) man page:

    "The query option is for use inside a desktop session only."

    Why wouldn't it work from a console session? Or even a non-login session (e.g. cron job)? I'm not sure I want to trust a command-line program that only works as part of a desktop session. To me it sounds like the sort of fragile doodad that makes weird assumptions about its environment, assumptions which might one day no longer hold even in that environment.

    Can anyone shed any further light on that crazy usage restriction?

    • What it actually means is that it needs the various XDG basedir variables to be set, which they sometimes aren't in a non-desktop session (this is becoming less common).

      http://standards.freedesktop.org/basedir-spec/basedir-spec-l...

      xdg-mime merely looks at the shared-mime-info db, processes files accordingly and spits out the result.

      Also, I did not actually say you should use xdg-mime; I said you should use the shared-mime-info database. xdg-mime is a (very crappy) interface for it, but using an xdg library is a lot, lot more efficient.
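As a rough illustration of the basedir dependency, here is a Python sketch of how the MIME database directories can be derived from the XDG variables, falling back to the defaults the basedir spec gives when a variable is unset or empty. `mime_database_dirs` is a hypothetical name, and real xdg libraries do considerably more than this.

```python
import os
from typing import List, Mapping, Optional


def mime_database_dirs(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """List candidate shared-mime-info directories, highest priority first."""
    env = os.environ if environ is None else environ
    home = env.get("HOME") or os.path.expanduser("~")
    # Basedir spec defaults apply when a variable is unset or empty.
    data_home = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    data_dirs = env.get("XDG_DATA_DIRS") or "/usr/local/share/:/usr/share/"
    bases = [data_home] + [d for d in data_dirs.split(":") if d]
    return [os.path.join(base, "mime") for base in bases]


if __name__ == "__main__":
    for directory in mime_database_dirs():
        print(directory)
```

With the fallbacks in place, the lookup still resolves to sensible paths in a cron job or console session where none of the XDG variables are set.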
