Comment by echelon
4 years ago
Better solution: only allow ASCII, maybe dashes, and up to twelve characters. Problem solved.
Enforce this in LDAP.
Strict convention is better than flexibility and predicting obscure edge cases that can fail.
In my case, and for many people writing desktop software, and for absolutely everybody writing open-source tools or libraries, unfortunately you can't control the environment.
Non-ASCII paths are extremely common (e.g. the user's home directory on Windows, for the large majority of users outside the English-speaking world), and spaces, punctuation and weirder characters will definitely show up when you least expect them.
Yes, if you can avoid it then absolutely, that's great, but I don't think most people can.
It's also not usually very difficult to deal with, as long as you actually spot the issue in the first place.
> only allow ASCII, maybe dashes, and up to twelve characters. Problem solved
...and only hire people from the exact same background as you, who will never have unusual characters or accents in their name. And also make sure not to have any users who aren't exactly like you, and conform to this very narrow requirement. Surely, excluding 90% of the world won't hurt revenue in any way.
Snarky, but I'll take it.
Use a strict schema for hardware interfaces, networking, and the physical stuff the user never sees. Microservice names don't need to be non-Latin; neither do database replicas, infrastructure, etc. And you're not going to piss off employees by giving them ASCII LDAP/email addresses.
Use utf8mb4 or similar for storing names. Don't split them into "first" and "last" fields. I've been through this rodeo too many times. You're not surprising anyone.
How is this excluding anyone? I just use an ASCII-canonicalized version of my name and it works fine.
UTF-8 strings aren't reproducible anyway. A user ID should be strictly for identification; make it a random alphanumeric string if necessary.
You can use an "ASCII-fied" version of the name. Only ~27% of mine can be typed in ASCII letters that look similar; the rest is just phonetically or visually close-enough letters. People did this for decades, and nowadays even government IDs carry an ASCII-fied (well, Latin-fied) version of the name.
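For what it's worth, here is a rough sketch of both points: why raw UTF-8 strings are awkward as identifiers, and what the naive "ASCII-fied" fallback looks like. It uses only Python's standard unicodedata module; the function names ascii_fold and same_text are made up for illustration, and the folding is deliberately crude.

```python
import unicodedata


def ascii_fold(name: str) -> str:
    """Best-effort ASCII approximation (hypothetical helper, not a real transliterator)."""
    # NFKD splits "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (letters with no decomposition, such as "ø",
    # or non-Latin scripts) is silently discarded here.
    return stripped.encode("ascii", "ignore").decode("ascii")


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # "é" as one code point
combining = "e\u0301"    # "e" followed by a combining acute accent

print(precomposed == combining)           # False: same glyph, different code points
print(same_text(precomposed, combining))  # True once both are normalized
print(ascii_fold("Téléchargements"))      # Telechargements
```

Letters without a decomposition just vanish under this approach, which is why real ASCII-fied names tend to be chosen by the person (or a per-language transliteration table) rather than computed.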
Ugh, we have the 15-character Active Directory limit now with hostnames, and a previous IT administration imposed a convention that every name had to follow [prod|dev]-[ph|vm]-[service]-[nn]. So basically every production service is prod-vm-owtf-01: you get exactly four characters to actually describe what the machine does. Works great when the service is "jira" or "wiki", but there are a lot that are pretty mystical-sounding, like jkns, jwrk, cntr, hrbr, etc., where you kind of just have to know.
Do they at least allow you to set up CNAMEs?
Yes, and for many of the web-serving machines that's what happens: they're jenkins.example.com or containers.example.com or similar. But often a single service is backed by hidden worker nodes, databases, whatever else, and it seems silly to give those machines that level of indirection vs. just using the hostname as their sole identifier.
I kind of like that honestly. No doubt you need some documentation so everyone knows what the service abbreviations are, but after you've been working there for a month you get it. Makes everything clean, consistent, and informational. You can quickly ascertain what a specific host is doing just from the name.
Oh, absolutely it makes sense to have a standard, and being able to tell at a glance whether something is a VM or a physical machine is of value too. But dedicating two-thirds of the character budget to such a scheme is madness. If the prod-vm- prefix simply became pv-, then you'd at least be able to do pv-jenkins-01 again.
Anyway, all this was fine when we were on LDAP rather than Active Directory. So basically it's all Windows' fault.
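To put numbers on the character budget, here is a small sketch. The 15-character limit and the [prod|dev]-[ph|vm]-[service]-[nn] pattern come from the comments above; the constant and function names, the assumption that service names are lowercase alphanumerics, and the two-letter generalization of pv- are all made up for illustration.

```python
import re

MAX_HOSTNAME_LEN = 15  # the Active Directory limit mentioned above

# The imposed convention: [prod|dev]-[ph|vm]-[service]-[nn]
LONG_SCHEME = re.compile(r"(prod|dev)-(ph|vm)-([a-z0-9]+)-(\d{2})")
# Hypothetical shortened scheme, e.g. pv-jenkins-01 (p/d = prod/dev, p/v = physical/VM)
SHORT_SCHEME = re.compile(r"([pd][pv])-([a-z0-9]+)-(\d{2})")


def service_budget(prefix: str, suffix: str = "-01") -> int:
    """Characters left for the service name after the fixed prefix and suffix."""
    return MAX_HOSTNAME_LEN - len(prefix) - len(suffix)


def is_valid(name: str, scheme: re.Pattern) -> bool:
    """True if the name fits the length limit and matches the whole pattern."""
    return len(name) <= MAX_HOSTNAME_LEN and scheme.fullmatch(name) is not None


print(service_budget("prod-vm-"))                   # 4
print(service_budget("pv-"))                        # 9
print(is_valid("prod-vm-owtf-01", LONG_SCHEME))     # True
print(is_valid("prod-vm-jenkins-01", LONG_SCHEME))  # False: 18 characters
print(is_valid("pv-jenkins-01", SHORT_SCHEME))      # True
```

With the long prefix, 11 of the 15 characters go to boilerplate, which is where the four-letter service abbreviations come from.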
Ah, that's the enterprise edition.
But then your program will crash hard and unexpectedly when a user decides to save under "~/house plans" or ~/Téléchargements.
I think it's better to exercise this in CI; that's what CI is for.
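A minimal sketch of what such a CI check could look like, assuming the code under test takes a target directory. save_report is a hypothetical stand-in for whatever the real program writes to disk, and the list of directory names is just a starting point (the first two are the examples from the comment above).

```python
import tempfile
from pathlib import Path

# Directory names that tend to break naive path handling: spaces, accents,
# shell metacharacters, and a non-Latin script.
AWKWARD_NAMES = ["house plans", "Téléchargements", "semi;colon & amp", "日本語"]


def save_report(directory: Path, text: str) -> Path:
    """Hypothetical stand-in for the code under test."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / "report.txt"
    target.write_text(text, encoding="utf-8")
    return target


def check_awkward_paths() -> list:
    """Write and read back a file under each awkward directory name."""
    checked = []
    with tempfile.TemporaryDirectory() as root:
        for name in AWKWARD_NAMES:
            target = save_report(Path(root) / name, "ok")
            assert target.read_text(encoding="utf-8") == "ok"
            checked.append(name)
    return checked


if __name__ == "__main__":
    print(check_awkward_paths())
```

The interesting failures usually come from shelling out or from string concatenation rather than from pathlib itself, so the real value is in pointing the actual save and load code at these directories.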