
Comment by jerf

10 years ago

"with each code point having 1/#code_point entropy."

That requires that users be uniformly-randomly selecting Unicode characters. There are a number of problems with this idea, most notably that the resulting password would have an insanely high "difficulty to type"/"bit of entropy" ratio. By the time you're through your third keyboard mode switch or third character typed in via generic Unicode hex entry, a 4-word passphrase user has already logged in and opened their browser.

Mixing a single Unicode character into your password might be sorta clever, but you probably shouldn't rely on getting a lot more "bits" out of it.

Users don't uniformly select ASCII characters but generally we accept 1 char of password length === 8 bits of entropy.

  • No, we do not. Six bits is a much better estimate (26 × 2 + 10 = 62 alphanumeric characters, close to 2^6 = 64), and that's still for a uniformly-random selection, which many passwords are not even close to.

  • > 1 char of password length === 8 bits of entropy

    Oh hell no. https://xkcd.com/936/

    The "little obscure tricks" to increase the entropy of a password do NOT work well with human memory. If your template is "Uncommon Word + Emoji + 5 tweaks", the number of possible passwords is 50,000 (the uncommon word) x (number of Emojis) x 5 x 8 (there are roughly 8 ways to "tweak" a word).

    There are no more than 500 Emojis that people use. You're not getting much entropy by choosing one. Now if you start choosing obscure Chinese words and Arabic symbols, maybe you'd be getting somewhere (It requires mastery of multiple languages to really exercise that UTF-8 dataset).

    But honestly, an English speaker will get far more entropy by just adding two more common words (top 5000) to their password. A new common word is worth a hell of a lot more than an Emoji. A phrase of 8 words (i.e. a sentence) is also very easy to memorize and contains a ton of entropy as well.

    Even a simple sentence is impossible to brute force. The following sentence has probably never been said in the history of humanity:

    "My long password to gmail.com is a passphrase, the current sentence that I just typed, lulz!"

    That sentence is virtually unhackable and easy as heck to memorize. Sure, the entropy is only a few bits per character, but the length makes it better. And since it uses common letters, it is extremely quick to type.

    So unless you plan on learning a new language to hit those obscure Unicode symbols, I think it's best to just stick with what your brain is already wired to memorize: Words. Common English Words.

    • The only downside to that is when you're trying to enter it on your phone. I do use sentences, but generally not that long... I usually wind up with 15-20 characters, which is long enough. LastPass helps in some instances.

      "F34r is the mind killer." as an example, does use replacement, but only in one of the words. It's short enough that phone entry isn't too bad, and it's easy enough to remember. Granted, it's a phrase from a movie/book, but it's probably good enough.

      That said, I probably wouldn't have thought to use an emoji. I know some people hate it, but I do filter whitespace at the beginning/end of protected entries (reset codes, etc.), as copy-paste whitespace errors are more common than intentional leading/trailing whitespace in a password.
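
The estimates argued over in the replies above (about 6 bits per alphanumeric character, a common word versus an emoji) can be checked with a little arithmetic. Here is a minimal Python sketch, assuming every choice is uniformly random and independent, which the thread itself points out real users rarely manage. The helper name `bits` is made up for illustration, and the set sizes are the ones quoted in the comments (62 alphanumerics, roughly 500 emoji, top-5000 words).

```python
import math

def bits(choices: int, picks: int = 1) -> float:
    """Entropy in bits of `picks` independent, uniformly random selections
    from a set of `choices` equally likely items (an idealized assumption)."""
    return picks * math.log2(choices)

# One character from a-z, A-Z, 0-9: 26 * 2 + 10 = 62 symbols.
per_alnum_char = bits(26 * 2 + 10)       # just under 6 bits, not 8

# Figures quoted in the thread: ~500 emoji in real use, top-5000 common words.
one_emoji = bits(500)                    # about 9 bits
one_common_word = bits(5000)             # about 12.3 bits
two_common_words = bits(5000, picks=2)   # about 24.6 bits
eight_word_phrase = bits(5000, picks=8)  # about 98 bits

for label, value in [
    ("alphanumeric char", per_alnum_char),
    ("one emoji", one_emoji),
    ("one common word", one_common_word),
    ("two common words", two_common_words),
    ("eight-word phrase", eight_word_phrase),
]:
    print(f"{label:>18}: {value:5.1f} bits")
```

Under these assumptions, two extra common words add well over twice the bits of one emoji. A grammatical sentence is not eight independent random words, so the last figure is better read as an upper bound than as a measurement.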
