Comment by sokoloff

3 years ago

If you’ve been using your mailbox (and deleting mail) for a while, have over 50K mails in it now, and see (what you think is) a UID of 51950 on the most recent email, the chances that it’s “U” are extremely low, meaning there’s a gap in understanding or in implementation.

Every time I see that, I’m floored by it. The fact that message IDs in IMAP change when you delete messages has got to be one of the worst design choices in any in-use protocol. I’m flabbergasted by it.

The sooner everyone moves to JMAP the better.

  • IMAP UIDs are guaranteed to be static and unique within a mailbox and a UIDVALIDITY. They don't just change by themselves while you're working on them, because if they did you would get this bridge issue. They aren't necessarily message IDs (those exist and are part of a different email standard) but clients shouldn't need them to be. If you delete a message, the ID doesn't normally change because all you do is add a flag.

    Many mail clients have a recycle bin/trash folder where "deleted" email will go. When you click delete, the application lies to you about what it's about to do and starts a move to another folder/mailbox.

    This completely ignores the standard method IMAP has for deleting email. You can mark email as deleted (the \Deleted flag), but that email will remain stored until you call EXPUNGE (or UID EXPUNGE) on the server to empty the trash/actually delete the message. In other words, a virtual trash folder doesn't need to be stored on the server as a mailbox at all; the protocol already has a solution for this.

    There are mechanisms for globally unique mail identifiers, like Message-ID, but those are part of the emails themselves and not the protocol. IMAP deals with a combined primary key of (mailbox, UID, UIDVALIDITY) and that works just fine in my opinion. If the UID ever changes, your mail client will know about this because the UIDVALIDITY changes and the cache needs to be refetched. I see my mail client as a view for the backend data, because that's how IMAP was designed.
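To make the (mailbox, UID, UIDVALIDITY) key concrete, here is a minimal sketch of a client-side cache that drops its entries for a mailbox when the server reports a new UIDVALIDITY. The `MessageCache` class and its method names are hypothetical and not taken from any real mail client; a real client would also have to refetch the data it dropped.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical cache key: (mailbox name, UIDVALIDITY, UID).
CacheKey = Tuple[str, int, int]


@dataclass
class MessageCache:
    """Toy local cache; the layout is illustrative only."""
    validity: Dict[str, int] = field(default_factory=dict)
    entries: Dict[CacheKey, str] = field(default_factory=dict)

    def select(self, mailbox: str, uidvalidity: int) -> bool:
        """Record the UIDVALIDITY the server reports when a mailbox is selected.

        Returns True if cached entries for this mailbox had to be dropped.
        """
        old = self.validity.get(mailbox)
        self.validity[mailbox] = uidvalidity
        if old is not None and old != uidvalidity:
            # Old UIDs may now point at different messages: discard them.
            for key in [k for k in self.entries if k[0] == mailbox]:
                del self.entries[key]
            return True
        return False

    def put(self, mailbox: str, uid: int, summary: str) -> None:
        self.entries[(mailbox, self.validity[mailbox], uid)] = summary

    def get(self, mailbox: str, uid: int) -> Optional[str]:
        return self.entries.get((mailbox, self.validity[mailbox], uid))
```

With this shape, a lookup by UID can only hit entries stored under the current UIDVALIDITY, so a stale UID never resolves to the wrong message.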

  • To forestall confusion: there are two different ID mechanisms in IMAP.

    The first mechanism is message sequence numbers. If you have a mailbox with N messages in it, these messages are numbered 1-N (inclusive); a new message gets N+1; and deleting message, say, 3 causes 4-N to be renumbered to 3-(N-1). Note that the server can be the one to delete the message (say another connected client deletes it), but server-to-client notifications of message deletion can only happen at specified times in the protocol. However, the client might still have stale sequence numbers when it sends you the next command, because IMAP is pipelined. And if you think this sounds like a recipe for lots of weird bugs, you are indeed correct.

    So what everyone uses instead are UIDs, which are stable IDs for a message (kind of). UIDs are monotonically increasing (a message with UID 100 is newer than one with UID 95, but there's no guarantee that UID 97 exists), and are not impacted by message creation or deletion. One way to think of them is as offsets in the underlying mbox file of where the message lives. If UIDs need to be renumbered (... yay mbox), the server changes the UIDVALIDITY, which means that previous UIDs are no longer necessarily valid.

    The message sequence numbers kind of make sense, if you imagine that IMAP clients are very, very thin clients that don't maintain any local information, and if you imagine that servers are not expected to support two clients connecting to the same mailbox at the same time. But modern email clients need to maintain their own local database of email metadata, which means that the IMAP protocol has become, in effect, a database synchronization protocol, even though it wasn't originally designed as such (later IMAP extensions added features that made some elements of synchronization much faster).
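The difference between the two numbering schemes can be shown with a toy in-memory model. This is only a sketch: `ToyMailbox` is a hypothetical class, it does not speak IMAP, and it ignores UIDVALIDITY and concurrent sessions entirely.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ToyMailbox:
    uids: List[int] = field(default_factory=list)  # position i holds sequence number i + 1
    deleted: Set[int] = field(default_factory=set)  # UIDs flagged as deleted
    next_uid: int = 1

    def append(self) -> int:
        """Add a message; it gets the next UID and sequence number N + 1."""
        uid = self.next_uid
        self.next_uid += 1
        self.uids.append(uid)
        return uid

    def seq_of(self, uid: int) -> int:
        """Current sequence number of the message with this UID."""
        return self.uids.index(uid) + 1

    def mark_deleted(self, uid: int) -> None:
        """Only sets a flag; nothing is removed or renumbered."""
        self.deleted.add(uid)

    def expunge(self) -> List[int]:
        """Remove flagged messages; later sequence numbers shift down."""
        removed = [u for u in self.uids if u in self.deleted]
        self.uids = [u for u in self.uids if u not in self.deleted]
        self.deleted.clear()
        return removed


box = ToyMailbox()
for _ in range(5):
    box.append()            # UIDs 1..5, sequence numbers 1..5

box.mark_deleted(3)         # flag only
seq_before = box.seq_of(4)  # 4
box.expunge()               # the message with UID 3 is gone
seq_after = box.seq_of(4)   # 3: sequence numbers above the gap shift down
new_uid = box.append()      # 6: UID 3 is not reused
```

The message with UID 4 keeps its UID throughout, while its sequence number changes from 4 to 3 as soon as the expunge happens.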

    • There's arguably a third: Message-Id. It's meant to be globally unique and comes from the message itself. I don't know how workable it is for email synchronization with IMAP; IMAP servers will provide them (they're just another header) but I'm not sure the client commands exist for a message-id approach to be as efficient or ergonomic.

  • > The fact that message IDs in IMAP change when you delete messages

    IDs can also change between sessions, and can be different between simultaneous sessions.

  • It's an artifact of IMAP having evolved in the presence of older mailstore technologies like mbox and maildir that couldn't easily accommodate long-term stable message IDs.

    It's fine though, as long as the MUA developers understand this and don't expect UIDs to be more stable than they actually are.