Comment by riquito

11 years ago

In the end the error was in the program "file", which was erroneously detecting the PostScript file as an Erlang JAM file.

(see https://bugs.launchpad.net/ubuntu/+source/file/+bug/248619)

Speaking as someone who has worked a lot on mime types: Do not ever use `file` as a means to identify a file type. Use the shared-mime-info database.

http://standards.freedesktop.org/shared-mime-info-spec/share...

`file` has its own internal database. It can do more than the MIME type database and is great for quickly identifying various file types and getting lots of info about them, but it is AWFUL when used within other apps. Please don't use it that way.

  • Not sure I agree with this recommendation... The shared-mime-info database usually trails the file(1) database by several years. Case in point: the pcap-ng file format was added to file(1) in 2011, but only made it into the shared-mime-info database in 2013.

    Also, file --mime FILENAME gets you the mime type.

    • The file database is not extensible. This means it will not recognize file formats specific to your setup, etc. With the shared-mime-info database, you can have user-wide and system-wide extensions to the database, so installed programs can (and do) install their own mime types. If you need to deal with, for example, a custom image or archive format you can also set that up.

      Additionally, the shared-mime-info database is much better curated than the file mime database; file has a lot of wrong mime types (I have gotten a dozen or so fixed so far). If there are file types that are missing from shared-mime-info, please report them on https://bugs.freedesktop.org/ (there are guidelines for good mime type reports; follow those and you can get a fix merged very fast).

  • Crazy... wouldn't it make sense to combine both DBs?

    • No, the databases are very different. `file` can do some very advanced file processing, while xdg-mime has a regex-ish syntax for scanning part of the file.

      More to the point, though, the xdg database is extensible, while `file` requires playing with the source.


  • nice!

      ~$ xdg-mime query filetype /etc/passwd
      text/plain

    • Yeah, I'm not sure about that. According to the xdg-mime(1) man page:

      "The query option is for use inside a desktop session only."

      Why wouldn't it work from a console session? Or even a non-login session (e.g. a cron job)? I'm not sure I want to trust a command-line program that only works as part of a desktop session. To me, it sounds like the sort of fragile doodad that makes weird assumptions about its environment, assumptions which might one day no longer hold even in that environment.

      Can anyone shed any further light on that crazy usage restriction?

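
For anyone who wants to compare the two tools from the thread above on their own files, here is a minimal Python sketch. It shells out to `file --brief --mime-type` and `xdg-mime query filetype` and reports what each one says. The function name `detect_mime` and the 10-second timeout are my own choices for illustration, not anything either tool prescribes, and whether `xdg-mime` behaves sensibly outside a desktop session is exactly the open question raised above, so treat a missing result as "unknown" rather than as an error.

```python
import shutil
import subprocess


def detect_mime(path):
    """Ask `file` and `xdg-mime` for the MIME type of `path`.

    Returns a dict mapping tool name to the reported type, or None when
    the tool is missing, fails, times out, or prints nothing.
    (`detect_mime` is a hypothetical helper, not part of either tool.)
    """
    commands = {
        "file": ["file", "--brief", "--mime-type", path],
        "xdg-mime": ["xdg-mime", "query", "filetype", path],
    }
    results = {}
    for name, argv in commands.items():
        # Skip tools that are not installed on this machine.
        if shutil.which(argv[0]) is None:
            results[name] = None
            continue
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=10
            )
        except (OSError, subprocess.TimeoutExpired):
            results[name] = None
            continue
        out = proc.stdout.strip()
        results[name] = out if proc.returncode == 0 and out else None
    return results
```

Calling `detect_mime("/etc/passwd")` returns something like a two-entry dict, one entry per tool. Where the two disagree, that is a good candidate for a bug report to one of the databases.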

`file` is funny; I just ran it on a standard JS file:

  >file main.js 
  >main.js: ASCII C++ program text, with very long lines

  • Why does file always passive aggressively note the fact that you're using 'very long lines'? I'm not sure it's relevant to the format of the file.

    • `file` does more than identify the file format; it also looks for properties. It can detect sizes and color properties for most image and video formats, various audio properties for audio formats etc.

      "Very long lines" is sort of a property of text files. It means it's unlikely to have been written by hand for one thing.

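
To make the "very long lines" property concrete, here is a rough Python sketch of that kind of check. It is not `file`'s actual code: the 300-byte threshold and the name `has_very_long_lines` are assumptions for illustration, so check the source of your `file` version for the real cutoff.

```python
def has_very_long_lines(data: bytes, threshold: int = 300) -> bool:
    """Return True if any line in `data` is longer than `threshold` bytes.

    The default threshold is an assumed value for illustration only.
    Line endings (\\n, \\r, \\r\\n) are not counted toward the length.
    """
    return any(len(line) > threshold for line in data.splitlines())
```

A minified `main.js` is typically one enormous line, so a check like this would flag it, while hand-written source usually would not trip it.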