Ironically, this article is full of the sort of semantic confusions that cause the problem in the first place. The reporter clearly hasn't run the article past an actual programmer as she seems to think this outcome is a deliberate design choice rather than a bug:
> Null was first programmed 60 years ago by a British computer scientist named Tony Hoare ... Hoare probably wasn’t thinking about people with the 4,910th most common surname. He later called it his billion-dollar mistake, given the amount of programmer time it has used up and the damage it has inflicted on the user experience.
Obviously Hoare's statement wasn't about this problem at all. She's also giving readers the impression Microsoft has some sort of policy against using null values:
> “It’s a difficult problem to solve because it’s so widespread,” said Daan Leijen, a researcher at Microsoft, who says the company avoids use of null values in its software.
Whatever Leijen said, I'm pretty sure it wasn't that.
I really don't get why journalists so rarely do basic fact checking of their own articles by asking an independent source for a final read-through. Many of them actually have policies against doing this, which leads to an endless stream of garbled articles that undermines their credibility without them even noticing.
> > “It’s a difficult problem to solve because it’s so widespread,” said Daan Leijen, a researcher at Microsoft, who says the company avoids use of null values in its software.
> Whatever Leijen said, I'm pretty sure it wasn't that.
I had a good lol when I read that, imagining some top-level decree to NEVER use null values in any context in all of Microsoft
Afaik, there does exist a decree in most teams at MS to not use null if it can be avoided, and any attempt to do so is likely going to be flagged in code review - or so I'm told.
> Whatever Leijen said, I'm pretty sure it wasn't that
What makes you so sure? This is Hoare's apology for creating the null reference:
> I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Given that null references cause crashes, why would it be unreasonable for a researcher at Microsoft to say they try and avoid them? Is this journalist really so far off?
> I really don't get why journalists so rarely do basic fact checking of their own articles
If you compare journalism to our other sources of information - such as comments on Hacker News and posts on social media - I think it holds up quite well, especially when the outlet is a reputable organization. It's quite fashionable for technology people to be highly critical of what they (pejoratively?) call "legacy media", but the alternatives that the technology industry have brought forward, like social media, are far, far worse in terms of accuracy, and also do very little of the kind of investigative reporting that is crucial for holding powerful officials to account.
> Null was first programmed 60 years ago by a British computer scientist named Tony Hoare ... Hoare probably wasn’t thinking about people with the 4,910th most common surname.
It mentions null as a mistake and then ties it to the word 'null' by referencing that a significant number of people have that last name. As though if it were called something like xkcd that has no pronunciation and is unlikely to be a last name, that would be better.
I think overall journalism is worse because its perceived as being authoritative. Social media post might be similar level of information, but Wikipedia won't cite it and the laymen realize to take it with a grain of salt. There's also better feedback as the comment section is front and center. Also some person with no knowledge, experience or curiosity in a subject is less likely to comment on it. While a journalist's job is to churn out a wide variety of pieces on topics they're likely unfamiliar with.
The issue is that the way it's presented in the article, a layman would interpret it to mean Microsoft don't use null values anywhere in any aspect of their software. Not as "we use them all the time pervasively but would like to do so a bit less", which is what was actually meant.
We'll have to disagree about the accuracy of legacy vs other forms of media. Many of the best investigations I've ever read have been by independents on Substack, for instance.
> I really don't get why journalists so rarely do basic fact checking
It takes time and effort with no discernable upside. In fact, knowing the true facts would make it harder for journalists to bias the story in the way they want to without them feeling a bit bad for lying. It’s easier for them if they don’t know.
> that undermines their credibility without them even noticing.
Not really. Even the vanishingly small minority of readers who know the details of the story in question suffer from Gell-Mann amnesia, and continue to believe that all other stories (by the same paper, and even from the same reporter) are perfectly accurate.
> It takes time and effort with no discernable upside.
The true answer is incentives, opportunity and aptitude. Incentives are skewed to writing something that will engage readers, the human version of the social media algorithm. Opportunities are short because reporting is done on deadline without the luxury of time to deeply ponder intermediate drafts. And reporters are writing around the edge of their expertise and training all the time. That specific reporter specializes in "personal finance" so it's a wonder the article even begins to make sense. And writers have to be good at two things at the same time, journalism and whatever they are trying to write about. It's hard to be excellent at multiple disciplines.
When you put it together, it's sort of magical when a news room works at all.
Unfortunately, that seems to be the quality of "professional" journalism nowadays. I wouldn't be surprised if AI was complicit as well (though I don't supposed it'd make a difference as the slop was just as low quality prior to recent years, it may as well have been AI generated then too).
It used to be indie publications, and now I find indie YouTubers tend to be generally superior (though you still have to do your own filtering and selection of course).
I have my own mildly amusing story of breaking systems with my name. I have a twin with the same first initial. Any time we had to use a system at school which constructed usernames from some combination of first initial, surname, and date of birth, only one account would be provisioned between the two of us.
It became almost a ritual in the first term of the school year for us to make a visit to IT Support and request a second account... there was always a bit of contention between us about who got the 'proper' username and who got the disambiguated one!
I set-up a directory system for a small school. The students logins were a combination or initials and date of birth. When I created the scheme I knew that a set of twins would break the system. Somewhere between 3 and 5 years we finally got a set of twins that needed to modify the system. I called them to my office and found out which one came out first and appended 1 their usernames and 2.
I anticipated twins with The school directory sync system I made for a small District (sub-5000 enrollment). I appended a random two digit number to first initial, last name (EAnderson96). So far none of the parents who have "weaponized"(1) the naming of their children have managed to break it since we brought it up in '99.
(1) Hypothetical John Smith and his siblings Jane Smith, James Smith, Janet Smith, Jack Smith, etc. It's a bit infuriating how many parents do this. Amusingly I've had O'Brien and other apostrophe-contsining names with no ill effects (because I sanitize my inputs). In the last couple years, though, I was forced to start dropping apostrophes because we had third-party apps/services we were syncing to that couldn't handle them.
A former employer had a first letter of given name + last name (actually first 5 letters) convention for email addresses. They did have a fallback--usually with second letter of given name. But, of course, a lot of people just automatically emailed the convention with the result that certain email "twins" got misdirected mail.
A very common one at one point was that the CFO shared a first letter of name with his daughter. As I recall, he actually had the email in the usual convention so it's not like his daughter was receiving lots of highly confidential financial info but there was regularly misdirected mail.
> A former employer had a first letter of given name + last name (actually first 5 letters) convention for email addresses
I once heard a story (possibly apocryphal) about a place which used a similar “first initial and truncated surname” convention for usernames, except theirs was first three letters of surname, followed by first initial, followed by some digits. And it all worked great until they hired a guy named Tom Cunningham
Was it confirmed to be sequential (AA, AB, AC, …)? Because it could’ve been just a random sequence-looking pick from the available space. Sort of like getting #0003 on Discord when they still had random numbers in usernames.
I never know whether to use the ć in my name when signing up for systems that require full legal names (banks et al.). Even my own country's gov't sites break when I input my real name, but refuses to accept the name without the ć. It was a real pain in the ass when I had to make an appointment and go in to some dingy office at 6am because their system doesn't support one of the most common letters in Serbian names. There's like a 90% chance the dude who made the system has a similar name with a ć or č in it even!
Surprisingly this has never broken for me in either Indonesia or the Netherlands though, whenever I've put the ć in it just converted it to a regular C which is perfectly acceptable for me (for context, it's pretty easy to guess which C is actually a ć or č in Serbian, similarly for s/š or z/ž, so seeing text without the proper diacritics doesn't really matter in most cases). My Dutch ID even correctly has the ć!
A coworker used to work in a meatpacking plant. There were two people who had the exact same name, first middle and last. They worked in the same department and in fact on the same machine.
They were both named Jose. One went by, pronounced, Hose-A, and the other Hose-B.
Apparently the Jose's, coworkers, and HR were all fine with this because it was simply too confusing otherwise.
Very common last name for Chinese and places like universities usually don't ever re-issue a username. So the "first-initial + last name + number" approach can get into triple digits easily.
It's weird that none of the systems automatically fell back to disambiguating with a number or something similar if the 'proper' one already existed. I'm wondering if you were a year or two apart instead, would the system simply silently fail to create a new username for the younger sibling when they joined?
This is terrifying, because it implies that so many computer systems interpret user-supplied data as what should be out-of-band values. No computer system should ever interpret what is in the "last name" field, it should be a sequence of characters only. Every attempt at interpretation is an exploit waiting to happen.
I especially love when non-technical managers boldly claim that the customer doesn't pay for best practices and clean code, as if it's some sort of costumer-focused bold declaration that is more in touch with reality than what every other developer thinks.
This is the idealistic approach. Then reality comes knocking at the door.
It's sad people get collateral damage and it should be fixed, but the world turns around with millions of these workarounds everywhere.
The most interesting I remember is stock management software: at 3 different places I had the employees dealing with stock explain how they set outrageous prices (like 9999999.99 EUR for a toothbrush) to keep an item in the inventory system but mark it as unavailable for a day or less until it gets restocked again.
I'm sure there is an official way to put something out of stock and restock it again, but it's just painful for an operation that happens basically everyday, for dozens of items.
I revisited this discussion after 3h and now I'm even more terrified, because most (if not all) replies totally miss the point.
Again: no computer system should ever interpret anything in the "last name" field. It should always be handled "in gloves", as an opaque value. It's not about typing, it's not about "paying for clear code", it's not about HTTP, I guess it might be about "best practices", but come on — this should be obvious!
I'm thinking about my systems now and I'm having a hard time coming up with a scenario where interpreting something inside a string value is even POSSIBLE (I use Clojure), unless you explicitly try to read from the string and interpret the results, and even then it's not easy.
I know that in the olden days we used to just feed user-supplied values into shells, with no regard for in-band vs out-of-band distinction. I also know that the HN-beloved SQL makes no distinction of in-band vs out-of-band, which causes a load of problems with proper escaping, hence Little Bobby Tables. But aren't we past that? Does anybody still construct SQL queries by concatenating user-supplied and app-supplied strings?
But is a smart comment buried in there about ETL systems and data exchange where it's pretty easy, and arguably correctly in some cases, to get "null" in an exported field. Then the importing system, again arguable correctly, needs to handle the null case as a true null, not "null." I'm not sure there is a very easy fix for this or an obvious best practice.
NULL might be interpreted if you interpolate your strings directly to SQL instead of using parameters (Drilled into me in 1st year that you should avoid it like the plague, but for some reason surprisigly common in older systems I've encountered)
string name = null;
$"INSERT INTO users (name) VALUES ('{name}')";
You are right. At the same time this is a very common issue and an attack vector[1][2][3]. E.g.: an existing book called "<script>alert("!Mediengruppe Bitnik");</script>" is still not shown correctly by some websites[4].
I used this technique on an auction site once. It allowed the script tag in my username, so I used it to remove the "bid" button once I had bid -- nobody behind me could outbid me.
It went about as well as you would expect using it for fraud. Which is to say, not well at all (;´Д`)
I believe the primary point where this goes wrong is in http. There, null and "null" are indistinguishable. Using json would help, but older forms don't.
In YAML (the gift that keeps on giving, see the False/Norway debacle) the last name of Null would have to be quoted, otherwise it would signify the null value.
pretty sure this is just the result of bad programmers trying to compare things against “null” (the string) and not whatever the null value is in their language. i don’t think they are evaling the the last name field or something.
My last name is a popular Irish name with an apostrophe in it. I have tons of issues with my name in forms. I'm basically a walking SQL injection detector.
But also I've started to drop the apostrophe in most of my online profiles and things. So I think we're starting to see the end of apostrophes in people's names, thanks to some fun oddities of the internet and common database technologies.
I can't find a reference anywhere, but I remember reading that when their child was born they couldn't process a birth certificate with the name containing a slash, so it was changed to a dash for the kid.
Reading the article, I'd guess the introduction of the slash was introduced because the actual letter (Ꝃ) wasn't available on earlier technology like a typewriter. Funny that the "fix" caused problems with the next generation of tech.
Slightly related thought, but I have a popular Slovene name with letter Ž (pronounced as g in mirage) in it. Since I started living abroad, I use the letter Z, even when introducing myself. It often throws people off guard completely and it is much easier to use just Z.
So I guess some cultural aspect of names will also disappear, I know I want I children to have a bit more "international" names.
> But also I've started to drop the apostrophe in most of my online profiles and things. So I think we're starting to see the end of apostrophes in people's names, thanks to some fun oddities of the internet and common database technologies.
This is a bugbear of mine! It's so frustrating that this is the easier path. Technology should make our lives better, not bend us to it's limitations!
The number of times that a website rejects my first name because it has a hyphen in it, even in 2025, is astounding. I get told all manner of things by support staff, like "just leave it out" as if it's just not an important part of my name or anything.
Indeed. My passport correctly includes the hyphen in my surname. Air New Zealand doesn't support spaces or hyphens so my surname is written out as both words concatenated (i.e. Onetwo). Qantas doesn't support hyphens but does support spaces so my surname is written out as two words (i.e. One Two).
Thankfully apparently this is common enough that I've had tickets including travel on both airlines (as Qantas cancelled a flight and ticketed me on Air New Zealand instead) traveling internationally work just fine. Even things like the automated customs gates work fine. I suspect under the hood their systems just strips out all non-alpha characters and compares that (i.e. 'onetwo' == 'onetwo').
Online/moible forms can be an issue tho. Spark, the biggest mobile phone carrier in New Zealand, doesn't support hyphens in account names, just to name one silly example.
I've got the same problem with a hyphenated name, and it was always the way they phrased the error messages that annoyed me. Porter Airline's error message for the longest time was "Your Name is Invalid". No, my name is valid, your system doesn't support it.
I ended up having to contact their support quite a few times for them to fix the error message. Still doesn't work, but at least the error message is reasonable now.
> I suspect under the hood their systems just strips out all non-alpha characters and compares that (i.e. 'onetwo' == 'onetwo').
That would be the MRZ version. The identity page of your passport has a blob of monospace text at the bottom that's used as the 'canonical' version of names for most or all air travel systems.
The airline/travel systems are full of this stuff.
I have mostly documents that include my full middle name, and the way half or more of air travel systems deal with that is to just crunch the first and middle into one name.
> Even things like the automated customs gates work fine.
I would like to think (Perhaps naively?) that these systems key off your passport number/ID (which is by construction, not subject to these problems) to deliberately side-step issues like this.
It drives me mad when various forms sternly requires me to enter my name "as it is spelled in the passport" only to tell me that my name is "invalid" or "incorrect" or "not allowed". Then we have the systems that have non-standard transliteration rules …
I have never been known by more than one name, but the spelling sure differs.
It's annoying that airlines writes "as spelled in passport", when what they really need is an upper case alpha-only version of your name.
But it is also equally annoying that passports don't clearly spell out a "international and systems compatible" version of your name.
We've had Airlines for way over half a century, and visas for about as long as we have had passports and people still walk around with international identification documents that cannot be understood by travel and immigration agencies internationally.
I‘m laughing in Brazilian… we (and certainly my case) tend to have so many family names plus particles (like „de“ or „junior“) that often the full name does not fit in forms, which leads to cropping my family name(s) or removing spaces, sometimes both. And then in some forms such as airlines where the family name is the username I have to try a few different combinations to „guess“ which is the one the airline‘s systems used for me.
Some people here have an apostrophe. Names like Ainul can be 'Ainul. And I guess the Bobby Tables equivalent is Ai'null. It's due to arabic sources, where A and A' are different letters like o and ö. I used to do some consulting for the home ministry, and these legal names are all over our databases.
I imagine apostrophes would be a complete nightmare for most countries to sanitize or validate.
My wife is Korean. The anglicized version of Korean given names always has a space in it. This makes for a few kinds of broken naming schemes - like removing the space or the second half of the given name becomes the middle name, or the second half of the given name just truncated entirely.
In Singapore given names can have more than one space and may not be a substring of the full name. The first prime minister has the full name "Harry Lee Kuan Yew" where Lee is his family name and "Harry Kuan Yew" is his given name. (Later in life he dropped "Harry" from his name.)
I can beat this. My wife's maiden name was in the form "Jane Angela Smith". When we got married, she changed it to "Jane Smith Jones", first name Jane, middle name Smith, last name Jones. Someone at the Social Security Administration entered it into their database as first name "Jane", no middle name, last name "Smith Jones".
Now, for fun, no one noticed this for about 25 years. Her Social Security card says "Jane Smith Jones". Her driver's license says "Jones, Jane Smith". Her US passport says "Jones, Jane Smith". But another part of the federal government says "Smith Jones, Jane". We only found this out when she tried to renew her driver's license recently and the clerk was like, "hey, this isn't matching up right...". A month later, the TSA clerk at the airport stopped her to ask why her passport didn't match her federal records.
So now we're paying $400 to legally change her name from "Jane Smith Jones" to "Jane Smith Jones". That's what the notice they make you pay to run in the newspaper says, anyway.
There's a number of websites and forms that reject entire domain names for email that have a hyphen in it, despite a hyphen being perfectly valid in modern DNS usage and domain names in every ICANN TLD.
For many years, my mother's proper legal name on her birth certificate was the empty string. This wasn't usually a problem before computers as she'd go by a given name instead, even on government paperwork. She started having issues with systems being unable to process her information in the late 90s and early 2000s. Background checks would fail, passports couldn't be issued, and so on. She eventually had it changed, but I imagine it'd be even worse now.
The hospital staff filled out most of the birth certificate, but left the name field blank when they gave my grandmother the paperwork to take home. She either didn't notice or didn't care (both possibilities are realistic) that the name field was blank and submitted it anyway. The state accepted it.
My mother started filling out her own paperwork around elementary school because my grandmother created so many problems that school staff would simply give up.
A high school teacher of mine didn't have a last name, only a first name. Problem was when she moved (from India) she had to have a last name because a bunch of systems and people assume a last name as a fact, so her full name is just her first name twice because she didn't want to think of a different one :p
Back at Caltech in the 1970s, every student got a free account on the PDP10. The account username consisted of 3 letters, and Caltech would assign them by the first letter of their first name, middle name, and last name.
Enter Tim Rentsh. He didn't have a middle name, so Caltech asked what letter would he prefer, so he said X.
TXR
That became his nickname, and stuck so hard he legally changed his name to Tim X Rentsh.
I had a relative named Null, who worked in tech in the 80s, 90s, and 00s. Apparently it caused him so many issues that he finally gave up and changed his name to the name of the town where our family came from.
I think he was a Java guy too. I’m sure that just made it worse.
People who neither know, nor care, what they’re doing. When you’ve worked with people who are allowed to write code that interfaces with databases for more than about two weeks, you may be dismayed to see the large overlap between those two groups.
Any time "$var" is interpolated without check into an INSERT, and any maintainer finds it easier to just check for null as a string rather than ask a DB admin or committee to update the DB after a lot of red tape and risk assessment.
I don't buy that. The string "Null" is different from the keyword null in programming, so `if $var = null` would be false when $var is the string "Null".
Note that when interpolated into SQL, the contents of $var must be surrounded by single quotes, so you end up with `insert into Table (Name) values ('Null')`, which correctly inserts the string "Null" into the table.
If you were to leave off the quotes, you'd get a SQL syntax error for all other people, so that code would never make it into production. E.g. `insert into Table (Name) values (Smith)` is a syntax error.
There’s a few condescending answers here. You will find this more common in weakly typed languages like PHP and VB and maaaaaybe JavaScript, where null == “null” will probably evaluate to true.
js doesn't have this particular problem, but it does have both `null` and `undefined`, which will have varying semantically different usages depending on local conventions.
For example, some will prefer to use `null` to mean that a value is _intentionally_ missing (for example, the db explicitly returned a null value), while `undefined` does not have any such connotation. These exist for frontend engineers to navigate decisions often far removed from their influence.
Anyway, `null != ''` and `null != 'null'`, but `null == undefined`. However, `null !== undefined`.
A lot has been made of js Truthy/Falsy equality operator, but most js programmers will take steps to actively avoid it coming into play. Probably the `void` operator is still under-used though in frontend code, though, since there's some pretty surprising legacy things that can happen when interacting with DOM APIs (like `checkbox.onclick = () => doSomething()` resulting in different checkbox behavior depending on whether or not `doSomething` returns a boolean or undefined).
There's plenty of name=Null, name=Undefined, name=Unknown entries in OpenStreetMap. Some are real places, some are mistakes, not easy to tell especially for restaurants or bars.
Plenty of systems represent all values as strings, and "null" is the obvious (although probably not the best) way to represent a null value as a string.
Which systems do this? I could see situations where reading in a text file, you have might assume the value null is not the string "null". I am struggling to think of other situations.
I'd guess some data transferred between systems as a homemade CSV. Empty field = empty string, but some field is nullable so someone decided that Null would be a way to declare a null field.
20 years later and multiple systems depending on each others, random hidden CRONs in the middle and now people called Null have a problem.
> Null Island is the location at zero degrees latitude and zero degrees longitude (0°N 0°E), i.e., where the prime meridian and the equator intersect. Since there is no landmass located at these coordinates, it is not an actual island. The name is often used in mapping software as a placeholder to help find and correct database entries that have erroneously been assigned the coordinates 0,0.
My name, Ĝonatano, contains a ĝ, which is an uncommon letter outside of my language, Esperanto. But when I go to set my username to "ĝonatano," I'm often told that usernames "may only contain letters or underscores," as if ĝ weren't a letter. (You can see that I've approximated it in my HN username, but I don't need to do that on web services that correctly understand that letters exist outside of ASCII and Latin-1.)
To be fair, Esperanto is, as far as I can tell, not very widely used. The letter ĝ mostly returns Esperanto results. Using that letter in a place where others may need to communicate or type the letter would be a severe burden on almost anyone else you interact with, outside of Esperanto communities.
I'm sure there are plenty of people who share your frustration with accented letters, ñ, umlauts, etc, though. I'd hope that most systems can handle those letters, although I wouldn't hold out hope that Ĝ/ĝ would be high on the priority list.
Well, it's the most widely spoken international language, spoken in over a hundred countries, by an estimated 2-5M people. There's a rich literature (probably 30-50K books), vibrant music scene, and support in open source software (Linux, Firefox, Google products) is usually pretty good.
But the issue is not how widely Esperanto, or any other language, is spoken. If you assume that languages should only be supported according to their number of speakers, you leave no room for useful languages, bridge languages, auxiliary languages, or growing languages. Even if Esperanto had only 100 speakers, it'd be worthwhile to support, if it's easy to learn, and easy for non-speakers to understand.
It's not a "severe burden" to consider non-ASCII letters as letters. Unicode is pretty straightforward to work with, and if you want to support more than just English, it's a necessity. There's no need to have a "priority list" of letters you consider more or less important than others. That attitude comes across as very Anglocentric.
You also can't put in Cyrillic or CJK characters. It's a user name, not a human name, you should be fine just using the 26 ASCII letters for it. Basically anything that is a computer-centric string should be only ASCII and nothing else, because supporting all of human writing is a never-ending task.
When I do a Ctrl+F search for “Gonatano” one of the search results is the actual name as typed with the circumflex. I think that is kind of a handy feature of the browser I’m using but at the same time it is sort of weird since it technically is not the same name without the circumflex, right?
Also not all database systems would think the non-circumflex version is equivalent to the circumflex version. Does anyone have thoughts or ideas about how or why they should be treated equivalently?
I also recognize this can get kind of political. There was a push in California recently to let people have accented letters in their name. Apparently it is legally not allowed. And yet some people claim their California birth certificate does contain accented letters.
Postgres has a module called unaccent[0] that removes diacritics for filtering. I expect your browser is doing something similar. While not appropriate when looking for exact matches, when doing user-input based searches, this should probably be the norm, as the user may be unaware of the accents or how to input them correctly on their keyboards.
Dove deep on this years ago when implementing a filter for wines and wine regions.
> but at the same time it is sort of weird since it technically is not the same name without the circumflex, right?
Assuming you have a "standard" keyboard, it's not weird at all for your browser to match the diacritic when you type the non-diacritic character since presumably the diacritic would be difficult to type. Firefox's search feature even has a [_] Match Diacritics checkbox which you can enable or disable.
This is absolutely the desired default behaviour for ctrl+F in a browser. e.g. I frequently read French, and don't normally want to have to put in accents in my search term when I'm searching text for a word containing an accent.
Firefox has a "Match Diacritics" checkbox right next to the "Match Case" box when you ctrl+F so you can configure as desired.
One place I worked, customers (usually merchants) sent product data through an API my team managed. I was working on a data validation project and ran across an item that was getting rejected. One of the fields customers can set is tags. The item was a t-shirt with a joke about null pointer exceptions, so someone set tags to include ["null", "pointer", "exception"]. Our parser coerced it to null, then returned an error because that array can't contain nulls.
Definitely not as bad, but I had problems with my name (Marcello) as well because it contains the name of a musical instrument. So I can only imagine what they are going through.
The latest was Swiss airlines website which was kind of shocking since it is a proper Italian name and one of the official languages of the confederation. Most annoying instance was with the ESTA online application many years ago (fixed last time I tried) that forced me to go to the a US embassy/consulate in person.
Could someone with insider-knowledge elaborate on that? Why on earth would you exclude names that contain names of musical instruments from anything? Is it common that fake names contain musical instruments?
You can book a seat for an instrument. I've seen EXTRA, ITEM SEAT mentioned as name for one particular airline but maybe it can get more specific with others?
I think we should all embrace a future where legal names are just straight up binary streams.
I can finally realize my true potential and be recognized as Mr. “:100 emoji:(Unicode zero-width joiner):fire emoji:(null character)(base64 encoding of a QR code that links to a website with a photo of my face),(vcard data, recursively referencing this last name somehow)”
That discriminates against people whose names cannot be written in Unicode. You need to include the ability to accept various image formats, as well as audio for names that don't have a written form.
Even those without the last name Null are finding themselves caught in the void. Joseph Tartaro got a license plate with the word “NULL” on it nearly 10 years ago. The 36-year-old security auditor thought it would be funny to drive around with the symbol for an empty value. Maybe a police officer who tried to give him a ticket would end up writing null into the system and not be able to process it, he joked to himself.
In 2018 he paid a $35 parking ticket. Soon afterward, he said, his mailbox was flooded with hundreds of traffic tickets for incidents he hadn’t been involved in. Tickets were from other counties and cities for vehicles of different colors, makes and models. A database had associated the word “null” with his personal information and citations were sent to Tartaro, who lives in Los Angeles.
IIRC the court told him the only way that he'd be able to stop getting other people's tickets in the mail would be to get a new plate. Otherwise he'd have to keep coming back to court to get them thrown out.
I'm doubtful changing his plate would actually fix it. There's a decent chance that either his contact info would still be associated with null, or every record that currently has null would be updated to his new plate, which would probably make it even more of a pain to fight in court.
You would think that at a certain point he would never have to pay a personal traffic citation again since they would probably believe him that it wasn't him.
> A database had associated the word “null” with his personal information and citations were sent to Tartaro, who lives in Los Angeles.
I'm not the biggest expert on databases, although I've worked with them a bit, but how does this occur in the first place? Usually, associations are done with primary/foreign keys. What database would allow null in that case?
I actually wonder if many individual cops use "null" (the ordinary word) as a shorthand for "plate not applicable," "not identified", "missing" etc - semantically the same thing as NULL in programming, but then in this case it wouldn't be a database NULL error. In theory the same thing could happen if someone had a DEALER vanity plate. (Though that choice might be rejected for obliquely referencing drugs.)
There are so SO many databases out in the wild that were built by people with little regard for building them correctly - or they simply not programmers/DBA's in the first place, but their boss told them to just make it happen.
If there are multiple systems involved, one of them produces a null and the other takes that as a key to create a record, but somewhere in between it gets stringified, then the string null might be accepted by a system using string keys.
It'd be sufficient if a system involved somewhere in the process converted null values to strings. There's innumerable ways, but here's a simple one in Java:
final String myNullString = "" + null;
System.out.println(myNullString);
>his mailbox was flooded with hundreds of traffic tickets for incidents he hadn’t been involved in.
I had this problem a few years ago when I got a similar license plate — every time somebody wrote NOPLATE in the license field, I received their citation.
My state eventually "fixed" this by blocking all citations written NOPLATE... which means I don't even get legitimate [i.e. my vehicle illegally parked] fines anymore =P
Reminds me of the time I figured out how to put an emoji into my name at work
I eventually got an email from an engineer asking if I could remove it. They apparently had been working for a few days to try to add unicode support to their internal database and it wasn't going so well.
Ha I'm surprised no one mentioned one of the most common name maltreatment complaints. The Gaelic patronymic prefix Mc/Mac.
When treating Mc/Mac names, the first letter after the prefix is always capitalised.. e.g McDonald's, McCarthy, McCain, McCoy, McConnell etc but you're more likely to see mcdonalds, mccoy, mcconnell.
Orthographic case preservation reports were like the biggest complaints that no one was interested in fixing during a short stint in the airline industry.
My surname is Macdonald—some ancestor generations ago decided to use the lowercase d. When I lived in the US it took considerable coaching to get humans not to write McDonald, and some just couldn't get it.
I then moved to Scotland, where exactly zero people have defaulted to Mc instead of Mac. However, the computer systems of both the NHS and the University of Edinburgh apparently don't store the case of strings and reconstruct the capitalization after the fact. Both systems list me as MacDonald and there's nothing I can do about it.
I'm relatively okay with this—before computers McDonald, MacDonald, Macdonald, and M'Donald were all functionally equivalent. But now I do worry about the implications of having official documents with variant spellings
It's amazing to watch systems do things as humans initially have intended, then see those system fail in the most spectacular ways, because the humans didn't think of every possible failure scenario.
I've had great fun with my surname, which contains of two words and just a space in between. Lots of systems thought my "maiden name" was the first word and adressed me as Mrs. My wife has three first names, adopted my surname when we married and her own surname. Nobody gets it right. We don't care and have fun with it.
I've also had great stress with my surname, when some algorithm at the tax authorities decided it sounds a bit like coming from a Slavish country and along with other parameters decided to tag me as a fraud. Still an ongoing problem.
Have you come into the scenario where a doctor's office is trying to find your medical records? Or a hospital? Or an emergency room, trying to fetch them from an office in another town or country? Better to have a good attitude than not, but the phrase "it's all fun and games till someone loses an eye" comes to mind.
>It's amazing to watch systems do things as humans initially have intended, then see those system fail in the most spectacular ways, because the humans didn't think of every possible failure scenario.
You should come to Taiwan! They've never considered non-Chinese names.
If you something online and pay by card, you can choose to ship it to a 7-Eleven or other convenience store, so you can pick it up at your own convenience. They'll ask for the name on your ID card/Passport, which the store will check before handing the parcel to you.
The problem? Many online stores do not accept names longer than a handful of characters. Chinese names are almost always two or three characters long, rarely four. Five or more characters exist according to a quick Google search, but I've never seen them myself. Good luck with western names, where even a short name like "John Doe" will be considered too long (The space counts as a character).
If you're a foreign resident, you can choose to get a Chinese name to deal with the parcel issue. Now you have two legal names: The name on your passport and the Chinese name. If you deal with public institutions, they'll prefer to use your Chinese name. Private companies have their own policies: Banks, for example, prefer to use the name on your passport. I've had issues with my insurance claims being rejected because the name on the government-provided documents did not match the name they had on file.
my have a ç and during first 10-15 years all system broken and my "vat" first letter is 0 same problem, if the dba genius use integer random bugs happen :(
Almost 30 years ago, the nutritionist Gary Null published books, had a show on Pacifica and so on. His basic message, I think, was that we should eat a diet that was less heavy on meats and processed foods. This is probably not wrong, but I thought him a bad influence on the household menus. I did object to his Pacifica show--he or somebody he had on mentioned "the Twinkie defense" not as a farce but as a plausible judgment; and he had some AIDS quack on, whose advice could only have been dangerous to the HIV-positive.
In most cases name fields in databases need not be divided into separate given and family names. There should instead simply be one field in which one enters one's full name.
Searches for names almost always have to have a fuzzy search fallback anyway because of spelling mistakes and variant spellings.
my legal last name has been "Null" for a bit over half a year, and I've had no problems. before changing it, I asked a friend whose legal last name had been "Null" for a couple years. it hadn't had any issues. this article (and the one about the British guy) is overblown.
I needed to change my legal name because my existing one was unavoidable and distressing (trans reasons). I have, have had, and will have too many names to privilege one as "most correct." I've been trans too long to believe government names are "real" anyways, and I also didn't want the association with my birth family anymore. so I wanted an "empty" legal name, something that nobody calls me (but I won't be upset when doctors do), something that represents the rejection of the ultimate validity of itself, and something that sounds cool. I also like being just mildly annoying. $redacted_firstname Null fit.
The funny thing is, the real thing that's caused me problems is that I changed from a name with a middle name to one without. Some systems did not handle this transition well. PATCH instead of PUT, or something.
I have two middle names, and it amazes me how many digital systems cannot handle this in 2025. I have yet to find a bank that supports spaces in your middle name, and multiple airlines have decided to just concatenate my middle names together.
How you dont consider such scenarios setting up your database is mind boggling to me. I do this for a living, albeit, marketing databases, but how you don't realize that people from all over the world live here too and you have to account for everyone is just astonishing to me...
I think it's a combination of several factors: first, most banking and airline systems were developed back in the 1960s, and it was simply more convenient to shoehorn everybody into the first-middle-last format, especially when the majority of their customers did fit that model.
Second, this was before the government (and businesses) were so picky about everything matching perfectly, and you could get away with mismatches more frequently because you had more human eyes looking at things. Those humans would easily realize that when the customer they've known for years as "Jane Smith" shows up with a birth certificate reading "Jayne Smith," it's the same person.
I have one middle name but a last name like "von Treer" and no DMV has ever put my name correctly on my license. They put it like j have two middle names, including "von" in that example.
This is a great watermark if the company's backend developers are awful or not. I work with this type of stuff and you have to be pretty awful to screw up a string variable "null" as an actual null value...
I have 4 names (two first names and two surnames), pretty common thing in my country. But I've had issues several times when flying because they assume it's two different people? How stupid is that?
I have one airline that just tacked my middle name (which I basically never use) directly onto my first name for my account. It's never been an issue so I haven't ever gone to the trouble of correcting it but it means I always have the "wrong" first name on my tickets relative to my official IDs.
humans should come with a big UUID that is generated at a central database to keep out duplicate and then just a name without any last name for social connections. this would also solve discrimination on many levels. would be a nightmare for privacy and tracking but we are all being tracked anyway.
Interesting idea, but the point of uuid is to not be centralised, otherwise you'd just use a serial number.
So if we assume everyone is born in a room with an address everyone could have a uuid like xxxx-yyyy-zzzz where xxxx is the postcode/zip code of the address, yyyy is the room number (only needs to be unique at the address) and zzzz is the time (only needs to be unique in the room, so local time is fine). This is similar to the uuid v1 scheme.
Us computer scientists (or developers or coders or whatever we are called at the moment) typically think we are really smart. We can be unbearable know-it-alls, really.
And yet we screw up, collectively and individually, over and over again on the fundamentals of making good software that serves people, by making not-very-well-thought-out decisions.
Your surname is a word we decided should mean nothing/empty/unknown? Oh, we didn’t think about that. Um, can you just change it?
Gift link: https://www.wsj.com/lifestyle/null-last-name-computer-scient...
https://archive.ph/KxNNu
Ironically, this article is full of the sort of semantic confusions that cause the problem in the first place. The reporter clearly hasn't run the article past an actual programmer as she seems to think this outcome is a deliberate design choice rather than a bug:
> Null was first programmed 60 years ago by a British computer scientist named Tony Hoare ... Hoare probably wasn’t thinking about people with the 4,910th most common surname. He later called it his billion-dollar mistake, given the amount of programmer time it has used up and the damage it has inflicted on the user experience.
Obviously Hoare's statement wasn't about this problem at all. She's also giving readers the impression Microsoft has some sort of policy against using null values:
> “It’s a difficult problem to solve because it’s so widespread,” said Daan Leijen, a researcher at Microsoft, who says the company avoids use of null values in its software.
Whatever Leijen said, I'm pretty sure it wasn't that.
I really don't get why journalists so rarely do basic fact checking of their own articles by asking an independent source for a final read-through. Many of them actually have policies against doing this, which leads to an endless stream of garbled articles that undermines their credibility without them even noticing.
> > “It’s a difficult problem to solve because it’s so widespread,” said Daan Leijen, a researcher at Microsoft, who says the company avoids use of null values in its software.
> Whatever Leijen said, I'm pretty sure it wasn't that.
I had a good lol when I read that, imagining some top-level decree to NEVER use null values in any context in all of Microsoft
Afaik, there does exist a decree in most teams at MS to not use null if it can be avoided, and any attempt to do so is likely going to be flagged in code review - or so I'm told.
> Whatever Leijen said, I'm pretty sure it wasn't that
What makes you so sure? This is Hoare's apology for creating the null reference:
> I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Given that null references cause crashes, why would it be unreasonable for a researcher at Microsoft to say they try and avoid them? Is this journalist really so far off?
> I really don't get why journalists so rarely do basic fact checking of their own articles
If you compare journalism to our other sources of information - such as comments on Hacker News and posts on social media - I think it holds up quite well, especially when the outlet is a reputable organization. It's quite fashionable for technology people to be highly critical of what they (pejoratively?) call "legacy media", but the alternatives that the technology industry have brought forward, like social media, are far, far worse in terms of accuracy, and also do very little of the kind of investigative reporting that is crucial for holding powerful officials to account.
Re-read what the article says.
> Null was first programmed 60 years ago by a British computer scientist named Tony Hoare ... Hoare probably wasn’t thinking about people with the 4,910th most common surname.
It mentions null as a mistake and then ties it to the word 'null' by referencing that a significant number of people have that last name. As though if it were called something like xkcd that has no pronunciation and is unlikely to be a last name, that would be better.
I think overall journalism is worse because its perceived as being authoritative. Social media post might be similar level of information, but Wikipedia won't cite it and the laymen realize to take it with a grain of salt. There's also better feedback as the comment section is front and center. Also some person with no knowledge, experience or curiosity in a subject is less likely to comment on it. While a journalist's job is to churn out a wide variety of pieces on topics they're likely unfamiliar with.
The issue is that the way it's presented in the article, a layman would interpret it to mean Microsoft don't use null values anywhere in any aspect of their software. Not as "we use them all the time pervasively but would like to do so a bit less", which is what was actually meant.
We'll have to disagree about the accuracy of legacy vs other forms of media. Many of the best investigations I've ever read have been by independents on Substack, for instance.
> I really don't get why journalists so rarely do basic fact checking
It takes time and effort with no discernable upside. In fact, knowing the true facts would make it harder for journalists to bias the story in the way they want to without them feeling a bit bad for lying. It’s easier for them if they don’t know.
> that undermines their credibility without them even noticing.
Not really. Even the vanishingly small minority of readers who know the details of the story in question suffer from Gell-Mann amnesia, and continue to believe that all other stories (by the same paper, and even from the same reporter) are perfectly accurate.
> It takes time and effort with no discernable upside.
The true answer is incentives, opportunity and aptitude. Incentives are skewed to writing something that will engage readers, the human version of the social media algorithm. Opportunities are short because reporting is done on deadline without the luxury of time to deeply ponder intermediate drafts. And reporters are writing around the edge of their expertise and training all the time. That specific reporter specializes in "personal finance" so it's a wonder the article even begins to make sense. And writers have to be good at two things at the same time, journalism and whatever they are trying to write about. It's hard to be excellent at multiple disciplines.
When you put it together, it's sort of magical when a news room works at all.
Unfortunately, that seems to be the quality of "professional" journalism nowadays. I wouldn't be surprised if AI was complicit as well (though I don't supposed it'd make a difference as the slop was just as low quality prior to recent years, it may as well have been AI generated then too).
It used to be indie publications, and now I find indie YouTubers tend to be generally superior (though you still have to do your own filtering and selection of course).
I have my own mildly amusing story of breaking systems with my name. I have a twin with the same first initial. Any time we had to use a system at school which constructed usernames from some combination of first initial, surname, and date of birth, only one account would be provisioned between the two of us.
It became almost a ritual in the first term of the school year for us to make a visit to IT Support and request a second account... there was always a bit of contention between us about who got the 'proper' username and who got the disambiguated one!
I set-up a directory system for a small school. The students logins were a combination or initials and date of birth. When I created the scheme I knew that a set of twins would break the system. Somewhere between 3 and 5 years we finally got a set of twins that needed to modify the system. I called them to my office and found out which one came out first and appended 1 their usernames and 2.
I’m a twin and my parents never told us who was ‘first’.
5 replies →
I anticipated twins with The school directory sync system I made for a small District (sub-5000 enrollment). I appended a random two digit number to first initial, last name (EAnderson96). So far none of the parents who have "weaponized"(1) the naming of their children have managed to break it since we brought it up in '99.
(1) Hypothetical John Smith and his siblings Jane Smith, James Smith, Janet Smith, Jack Smith, etc. It's a bit infuriating how many parents do this. Amusingly I've had O'Brien and other apostrophe-contsining names with no ill effects (because I sanitize my inputs). In the last couple years, though, I was forced to start dropping apostrophes because we had third-party apps/services we were syncing to that couldn't handle them.
6 replies →
At my school you were provisioned as <two-digit-start><first-three-firstname><first-three-last-name>. e.g. Joe Bloggs starting in 2000 is 00joeblo
Cue problems when two "Simon Smith" join in the same year. They were given 00simsmi and 00smisim I think.
I am pretty sure they spent 7 years at school forwarding each other email as various teachers assumed the default would work.
A former employer had a first letter of given name + last name (actually first 5 letters) convention for email addresses. They did have a fallback--usually with second letter of given name. But, of course, a lot of people just automatically emailed the convention with the result that certain email "twins" got misdirected mail.
A very common one at one point was that the CFO shared a first letter of name with his daughter. As I recall, he actually had the email in the usual convention so it's not like his daughter was receiving lots of highly confidential financial info but there was regularly misdirected mail.
> A former employer had a first letter of given name + last name (actually first 5 letters) convention for email addresses
I once heard a story (possibly apocryphal) about a place which used a similar “first initial and truncated surname” convention for usernames, except theirs was first three letters of surname, followed by first initial, followed by some digits. And it all worked great until they hired a guy named Tom Cunningham
4 replies →
My university was [initial][lastname][year][letter][letter]@school.com, allowing for 26^2 people to have the same initials and last name every year.
Despite having a rare last name and no twins, I was "AC".
Was it confirmed to be sequential (AA, AB, AC, …)? Because it could’ve been just a random sequence-looking pick from the available space. Sort of like getting #0003 on Discord when they still had random numbers in usernames.
I never know whether to use the ć in my name when signing up for systems that require full legal names (banks et al.). Even my own country's gov't sites break when I input my real name, but refuses to accept the name without the ć. It was a real pain in the ass when I had to make an appointment and go in to some dingy office at 6am because their system doesn't support one of the most common letters in Serbian names. There's like a 90% chance the dude who made the system has a similar name with a ć or č in it even!
Surprisingly this has never broken for me in either Indonesia or the Netherlands though, whenever I've put the ć in it just converted it to a regular C which is perfectly acceptable for me (for context, it's pretty easy to guess which C is actually a ć or č in Serbian, similarly for s/š or z/ž, so seeing text without the proper diacritics doesn't really matter in most cases). My Dutch ID even correctly has the ć!
In Severance S2E1, Mark W says to Mark S: "Would you be open to using a different first name to avoid confusion?"
Did anyone ever ask that of you or your twin?
A coworker used to work in a meatpacking plant. There were two people who had the exact same name, first middle and last. They worked in the same department and in fact on the same machine.
They were both named Jose. One went by, pronounced, Hose-A, and the other Hose-B.
Apparently the Jose's, coworkers, and HR were all fine with this because it was simply too confusing otherwise.
No one ever asked us that which is good because I don't think either of us would have been much impressed by the suggested :p
Meanwhile, friend of mine's last name was Li, and IDs were first initial + last name + #
Her number had three digits.
I'm guessing it's the birthday that really messed the system up though?
Very common last name for Chinese and places like universities usually don't ever re-issue a username. So the "first-initial + last name + number" approach can get into triple digits easily.
It's weird that none of the systems automatically fell back to disambiguating with a number or something similar if the 'proper' one already existed. I'm wondering if you were a year or two apart instead, would the system simply silently fail to create a new username for the younger sibling when they joined?
If they were a year or two apart the DOB would have disambiguated it.
This is terrifying, because it implies that so many computer systems interpret user-supplied data as what should be out-of-band values. No computer system should ever interpret what is in the "last name" field, it should be a sequence of characters only. Every attempt at interpretation is an exploit waiting to happen.
Typed data?! What's next, unit tests? Memory safety? Where will it stop?
I especially love when non-technical managers boldly claim that the customer doesn't pay for best practices and clean code, as if it's some sort of costumer-focused bold declaration that is more in touch with reality than what every other developer thinks.
I’m still convinced unit tests are a Santa clause type situation where it takes. While to figure out it’s really just a story told to naïve juniors
2 replies →
I would put memory safety before unit tests if it were me.
This is the idealistic approach. Then reality comes knocking at the door.
It's sad people get collateral damage and it should be fixed, but the world turns around with millions of these workarounds everywhere.
The most interesting I remember is stock management software: at 3 different places I had the employees dealing with stock explain how they set outrageous prices (like 9999999.99 EUR for a toothbrush) to keep an item in the inventory system but mark it as unavailable for a day or less until it gets restocked again.
I'm sure there is an official way to put something out of stock and restock it again, but it's just painful for an operation that happens basically everyday, for dozens of items.
> I'm sure there is an official way to put something out of stock and restock it again
And I wouldn't be surprised if there is no official way.
I revisited this discussion after 3h and now I'm even more terrified, because most (if not all) replies totally miss the point.
Again: no computer system should ever interpret anything in the "last name" field. It should always be handled "in gloves", as an opaque value. It's not about typing, it's not about "paying for clear code", it's not about HTTP, I guess it might be about "best practices", but come on — this should be obvious!
I'm thinking about my systems now and I'm having a hard time coming up with a scenario where interpreting something inside a string value is even POSSIBLE (I use Clojure), unless you explicitly try to read from the string and interpret the results, and even then it's not easy.
I know that in the olden days we used to just feed user-supplied values into shells, with no regard for in-band vs out-of-band distinction. I also know that the HN-beloved SQL makes no distinction of in-band vs out-of-band, which causes a load of problems with proper escaping, hence Little Bobby Tables. But aren't we past that? Does anybody still construct SQL queries by concatenating user-supplied and app-supplied strings?
I agree with you within a single data domain.
But is a smart comment buried in there about ETL systems and data exchange where it's pretty easy, and arguably correctly in some cases, to get "null" in an exported field. Then the importing system, again arguable correctly, needs to handle the null case as a true null, not "null." I'm not sure there is a very easy fix for this or an obvious best practice.
1 reply →
NULL might be interpreted if you interpolate your strings directly to SQL instead of using parameters (Drilled into me in 1st year that you should avoid it like the plague, but for some reason surprisigly common in older systems I've encountered)
string name = null; $"INSERT INTO users (name) VALUES ('{name}')";
Basically an involuntary SQL Injection
You are right. At the same time this is a very common issue and an attack vector[1][2][3]. E.g.: an existing book called "<script>alert("!Mediengruppe Bitnik");</script>" is still not shown correctly by some websites[4].
[1]: SQL injection: https://en.wikipedia.org/wiki/SQL_injection
[2]: Cross-site scripting: https://en.wikipedia.org/wiki/Cross-site_scripting
[3]: Exploits of a Mom (Bobby Tables) XKCD: https://xkcd.com/327/
[4]: https://www.tomlinsons-online.com/p-16381221-scriptalertmedi...
I used this technique on an auction site once. It allowed the script tag in my username, so I used it to remove the "bid" button once I had bid -- nobody behind me could outbid me.
It went about as well as you would expect using it for fraud. Which is to say, not well at all (;´Д`)
I believe the primary point where this goes wrong is in http. There, null and "null" are indistinguishable. Using json would help, but older forms don't.
My money is on janky CSV ETL…
In YAML (the gift that keeps on giving, see the False/Norway debacle) the last name of Null would have to be quoted, otherwise it would signify the null value.
For other curious people (or in my case, as a curious Norwegian): https://news.ycombinator.com/item?id=36745212
That's why you write yaml with a json encoder :)
Why in the world would anyone EVER want to stick user-supplied data into YAML to be interpreted?
(setting aside the very relevant question of why would anyone ever want to use YAML at all)
I will forever curse whoever thought YAML was a good idea and spread it everywhere.
1 reply →
pretty sure this is just the result of bad programmers trying to compare things against “null” (the string) and not whatever the null value is in their language. i don’t think they are evaling the the last name field or something.
My last name is a popular Irish name with an apostrophe in it. I have tons of issues with my name in forms. I'm basically a walking SQL injection detector.
But also I've started to drop the apostrophe in most of my online profiles and things. So I think we're starting to see the end of apostrophes in people's names, thanks to some fun oddities of the internet and common database technologies.
Reminds me of the story of the French politician with a forward slash in their name. https://en.wikipedia.org/wiki/Emeline_K%2FBidi?wprov=sfla1
I can't find a reference anywhere, but I remember reading that when their child was born they couldn't process a birth certificate with the name containing a slash, so it was changed to a dash for the kid.
Reading the article, I'd guess the introduction of the slash was introduced because the actual letter (Ꝃ) wasn't available on earlier technology like a typewriter. Funny that the "fix" caused problems with the next generation of tech.
At least they had the excuse of being named - whoever came up with `/e/OS` as the name for a phone OS in the age of the internet needs a strong word!
1. https://e.foundation/e-os/
Slightly related thought, but I have a popular Slovene name with letter Ž (pronounced as g in mirage) in it. Since I started living abroad, I use the letter Z, even when introducing myself. It often throws people off guard completely and it is much easier to use just Z.
So I guess some cultural aspect of names will also disappear, I know I want I children to have a bit more "international" names.
> But also I've started to drop the apostrophe in most of my online profiles and things. So I think we're starting to see the end of apostrophes in people's names, thanks to some fun oddities of the internet and common database technologies.
This is a bugbear of mine! It's so frustrating that this is the easier path. Technology should make our lives better, not bend us to it's limitations!
No criticism on you of course, I'm as guilty.
Do you also get the apostrophe get mangled into something like `'` ?
All the time!
Suppose your name was "Néill" then you could just change it to b"N\xc3\xa9ill".
The number of times that a website rejects my first name because it has a hyphen in it, even in 2025, is astounding. I get told all manner of things by support staff, like "just leave it out" as if it's just not an important part of my name or anything.
Indeed. My passport correctly includes the hyphen in my surname. Air New Zealand doesn't support spaces or hyphens so my surname is written out as both words concatenated (i.e. Onetwo). Qantas doesn't support hyphens but does support spaces so my surname is written out as two words (i.e. One Two).
Thankfully apparently this is common enough that I've had tickets including travel on both airlines (as Qantas cancelled a flight and ticketed me on Air New Zealand instead) traveling internationally work just fine. Even things like the automated customs gates work fine. I suspect under the hood their systems just strips out all non-alpha characters and compares that (i.e. 'onetwo' == 'onetwo').
Online/moible forms can be an issue tho. Spark, the biggest mobile phone carrier in New Zealand, doesn't support hyphens in account names, just to name one silly example.
I've got the same problem with a hyphenated name, and it was always the way they phrased the error messages that annoyed me. Porter Airline's error message for the longest time was "Your Name is Invalid". No, my name is valid, your system doesn't support it.
I ended up having to contact their support quite a few times for them to fix the error message. Still doesn't work, but at least the error message is reasonable now.
> I suspect under the hood their systems just strips out all non-alpha characters and compares that (i.e. 'onetwo' == 'onetwo').
That would be the MRZ version. The identity page of your passport has a blob of monospace text at the bottom that's used as the 'canonical' version of names for most or all air travel systems.
The airline/travel systems are full of this stuff.
I have mostly documents that include my full middle name, and the way half or more of air travel systems deal with that is to just crunch the first and middle into one name.
But: it all works fine.
> Even things like the automated customs gates work fine.
I would like to think (Perhaps naively?) that these systems key off your passport number/ID (which is by construction, not subject to these problems) to deliberately side-step issues like this.
It drives me mad when various forms sternly requires me to enter my name "as it is spelled in the passport" only to tell me that my name is "invalid" or "incorrect" or "not allowed". Then we have the systems that have non-standard transliteration rules …
I have never been known by more than one name, but the spelling sure differs.
It's annoying that airlines writes "as spelled in passport", when what they really need is an upper case alpha-only version of your name.
But it is also equally annoying that passports don't clearly spell out a "international and systems compatible" version of your name.
We've had Airlines for way over half a century, and visas for about as long as we have had passports and people still walk around with international identification documents that cannot be understood by travel and immigration agencies internationally.
3 replies →
I‘m laughing in Brazilian… we (and certainly my case) tend to have so many family names plus particles (like „de“ or „junior“) that often the full name does not fit in forms, which leads to cropping my family name(s) or removing spaces, sometimes both. And then in some forms such as airlines where the family name is the username I have to try a few different combinations to „guess“ which is the one the airline‘s systems used for me.
Some people here have an apostrophe. Names like Ainul can be 'Ainul. And I guess the Bobby Tables equivalent is Ai'null. It's due to arabic sources, where A and A' are different letters like o and ö. I used to do some consulting for the home ministry, and these legal names are all over our databases.
I imagine apostrophes would be a complete nightmare for most countries to sanitize or validate.
My wife is Korean. The anglicized version of Korean given names always has a space in it. This makes for a few kinds of broken naming schemes - like removing the space or the second half of the given name becomes the middle name, or the second half of the given name just truncated entirely.
In Singapore given names can have more than one space and may not be a substring of the full name. The first prime minister has the full name "Harry Lee Kuan Yew" where Lee is his family name and "Harry Kuan Yew" is his given name. (Later in life he dropped "Harry" from his name.)
2 replies →
And the ones that are too incompetent to strip (or add) hyphens to phone numbers.
Or handle single-digit month numbers in a date.
Or...
I legally changed my name to drop the hyphen. I kept an accented character, but don't use it for legal documents or plane tickets.
Same, as someone with two middle names. Both scenarios are very common.
I can beat this. My wife's maiden name was in the form "Jane Angela Smith". When we got married, she changed it to "Jane Smith Jones", first name Jane, middle name Smith, last name Jones. Someone at the Social Security Administration entered it into their database as first name "Jane", no middle name, last name "Smith Jones".
Now, for fun, no one noticed this for about 25 years. Her Social Security card says "Jane Smith Jones". Her driver's license says "Jones, Jane Smith". Her US passport says "Jones, Jane Smith". But another part of the federal government says "Smith Jones, Jane". We only found this out when she tried to renew her driver's license recently and the clerk was like, "hey, this isn't matching up right...". A month later, the TSA clerk at the airport stopped her to ask why her passport didn't match her federal records.
So now we're paying $400 to legally change her name from "Jane Smith Jones" to "Jane Smith Jones". That's what the notice they make you pay to run in the newspaper says, anyway.
5 replies →
I’ve wondered about this myself, though I only have one middle name. Do you typically enter both middle names into the “Middle Name” form field?
6 replies →
or people whose middle name is their first name.
8 replies →
There's a number of websites and forms that reject entire domain names for email that have a hyphen in it, despite a hyphen being perfectly valid in modern DNS usage and domain names in every ICANN TLD.
For many years, my mother's proper legal name on her birth certificate was the empty string. This wasn't usually a problem before computers as she'd go by a given name instead, even on government paperwork. She started having issues with systems being unable to process her information in the late 90s and early 2000s. Background checks would fail, passports couldn't be issued, and so on. She eventually had it changed, but I imagine it'd be even worse now.
Little Bobby ""
I have known 2 people with names so long they have the same issues.
One russian, one spanish. Both had like 20 individual names.
The problem is well meaning parents used all of them on early paperwork, and as things digitised, name fields gained field limits etc.
I believe one had his name changed formally, and the other had to register an alias, and had to dig out the OG paperwork regularly.
There must be an interesting backstory here. Did her parents deliberately not name her? Did they forget?
The hospital staff filled out most of the birth certificate, but left the name field blank when they gave my grandmother the paperwork to take home. She either didn't notice or didn't care (both possibilities are realistic) that the name field was blank and submitted it anyway. The state accepted it.
My mother started filling out her own paperwork around elementary school because my grandmother created so many problems that school staff would simply give up.
I mean.. there are tons of people without last names in the world. https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...
4 replies →
A high school teacher of mine didn't have a last name, only a first name. Problem was when she moved (from India) she had to have a last name because a bunch of systems and people assume a last name as a fact, so her full name is just her first name twice because she didn't want to think of a different one :p
Ah yes! Little miss (indecipherable silence).
royalty?
Back at Caltech in the 1970s, every student got a free account on the PDP10. The account username consisted of 3 letters, and Caltech would assign them by the first letter of their first name, middle name, and last name.
Enter Tim Rentsh. He didn't have a middle name, so Caltech asked what letter would he prefer, so he said X.
TXR
That became his nickname, and stuck so hard he legally changed his name to Tim X Rentsh.
Eat your heart out, Elon! :-)
I had a relative named Null, who worked in tech in the 80s, 90s, and 00s. Apparently it caused him so many issues that he finally gave up and changed his name to the name of the town where our family came from.
I think he was a Java guy too. I’m sure that just made it worse.
>changed his name to the name of the town where our family came from.
Was he from DROP TABLE?
Especially if he wanted kids, since I don’t think Java would’ve supported propagation.
Who's checking if a string matches "null" rather than is null!?
People who neither know, nor care, what they’re doing. When you’ve worked with people who are allowed to write code that interfaces with databases for more than about two weeks, you may be dismayed to see the large overlap between those two groups.
Any time "$var" is interpolated without check into an INSERT, and any maintainer finds it easier to just check for null as a string rather than ask a DB admin or committee to update the DB after a lot of red tape and risk assessment.
So... very often.
I don't buy that. The string "Null" is different from the keyword null in programming, so `if $var = null` would be false when $var is the string "Null".
Note that when interpolated into SQL, the contents of $var must be surrounded by single quotes, so you end up with `insert into Table (Name) values ('Null')`, which correctly inserts the string "Null" into the table.
If you were to leave off the quotes, you'd get a SQL syntax error for all other people, so that code would never make it into production. E.g. `insert into Table (Name) values (Smith)` is a syntax error.
3 replies →
Guess I’ll change my name to ‘;drop table customers then ;)
8 replies →
There’s a few condescending answers here. You will find this more common in weakly typed languages like PHP and VB and maaaaaybe JavaScript, where null == “null” will probably evaluate to true.
js doesn't have this particular problem, but it does have both `null` and `undefined`, which will have varying semantically different usages depending on local conventions.
For example, some will prefer to use `null` to mean that a value is _intentionally_ missing (for example, the db explicitly returned a null value), while `undefined` does not have any such connotation. These exist for frontend engineers to navigate decisions often far removed from their influence.
Anyway, `null != ''` and `null != 'null'`, but `null == undefined`. However, `null !== undefined`.
A lot has been made of js Truthy/Falsy equality operator, but most js programmers will take steps to actively avoid it coming into play. Probably the `void` operator is still under-used though in frontend code, though, since there's some pretty surprising legacy things that can happen when interacting with DOM APIs (like `checkbox.onclick = () => doSomething()` resulting in different checkbox behavior depending on whether or not `doSomething` returns a boolean or undefined).
3 replies →
I can confirm that PHP doesn't have this problem.
<?php
if ("null" == null) { echo "true"; } else { echo "false"; }
prints "false"
2 replies →
ColdFusion/ActionScript https://stackoverflow.com/questions/4456438/how-to-pass-null...
There's plenty of name=Null, name=Undefined, name=Unknown entries in OpenStreetMap. Some are real places, some are mistakes, not easy to tell especially for restaurants or bars.
Plenty of systems represent all values as strings, and "null" is the obvious (although probably not the best) way to represent a null value as a string.
Which systems do this? I could see situations where reading in a text file, you have might assume the value null is not the string "null". I am struggling to think of other situations.
5 replies →
I'd guess some data transferred between systems as a homemade CSV. Empty field = empty string, but some field is nullable so someone decided that Null would be a way to declare a null field.
20 years later and multiple systems depending on each others, random hidden CRONs in the middle and now people called Null have a problem.
Oh friend.... you have so much to learn
People who know that the majority of people are stupid and will mess up even nulls.
[flagged]
Related: Null Island [1]
> Null Island is the location at zero degrees latitude and zero degrees longitude (0°N 0°E), i.e., where the prime meridian and the equator intersect. Since there is no landmass located at these coordinates, it is not an actual island. The name is often used in mapping software as a placeholder to help find and correct database entries that have erroneously been assigned the coordinates 0,0.
[1] https://en.wikipedia.org/wiki/Null_Island
See the "islands" section of https://news.ycombinator.com/item?id=43121116!
My name, Ĝonatano, contains a ĝ, which is an uncommon letter outside of my language, Esperanto. But when I go to set my username to "ĝonatano," I'm often told that usernames "may only contain letters or underscores," as if ĝ weren't a letter. (You can see that I've approximated it in my HN username, but I don't need to do that on web services that correctly understand that letters exist outside of ASCII and Latin-1.)
To be fair, Esperanto is, as far as I can tell, not very widely used. The letter ĝ mostly returns Esperanto results. Using that letter in a place where others may need to communicate or type the letter would be a severe burden on almost anyone else you interact with, outside of Esperanto communities.
I'm sure there are plenty of people who share your frustration with accented letters, ñ, umlauts, etc, though. I'd hope that most systems can handle those letters, although I wouldn't hold out hope that Ĝ/ĝ would be high on the priority list.
> as far as I can tell, not very widely used
Well, it's the most widely spoken international language, spoken in over a hundred countries, by an estimated 2-5M people. There's a rich literature (probably 30-50K books), vibrant music scene, and support in open source software (Linux, Firefox, Google products) is usually pretty good.
But the issue is not how widely Esperanto, or any other language, is spoken. If you assume that languages should only be supported according to their number of speakers, you leave no room for useful languages, bridge languages, auxiliary languages, or growing languages. Even if Esperanto had only 100 speakers, it'd be worthwhile to support, if it's easy to learn, and easy for non-speakers to understand.
It's not a "severe burden" to consider non-ASCII letters as letters. Unicode is pretty straightforward to work with, and if you want to support more than just English, it's a necessity. There's no need to have a "priority list" of letters you consider more or less important than others. That attitude comes across as very Anglocentric.
8 replies →
You also can't put in Cyrillic or CJK characters. It's a user name, not a human name, you should be fine just using the 26 ASCII letters for it. Basically anything that is a computer-centric string should be only ASCII and nothing else, because supporting all of human writing is a never-ending task.
It's also a dangerous one. For example, are a number of variants of "a" that are different characters in Unicode but are often indistinguishable in most fonts and/or at small font sizes: https://util.unicode.org/UnicodeJsps/confusables.jsp?a=abcde....
[dead]
When I do a Ctrl+F search for “Gonatano” one of the search results is the actual name as typed with the circumflex. I think that is kind of a handy feature of the browser I’m using but at the same time it is sort of weird since it technically is not the same name without the circumflex, right?
Also not all database systems would think the non-circumflex version is equivalent to the circumflex version. Does anyone have thoughts or ideas about how or why they should be treated equivalently?
I also recognize this can get kind of political. There was a push in California recently to let people have accented letters in their name. Apparently it is legally not allowed. And yet some people claim their California birth certificate does contain accented letters.
Postgres has a module called unaccent[0] that removes diacritics for filtering. I expect your browser is doing something similar. While not appropriate when looking for exact matches, when doing user-input based searches, this should probably be the norm, as the user may be unaware of the accents or how to input them correctly on their keyboards.
Dove deep on this years ago when implementing a filter for wines and wine regions.
[0][https://www.postgresql.org/docs/current/unaccent.html]
> but at the same time it is sort of weird since it technically is not the same name without the circumflex, right?
Assuming you have a "standard" keyboard, it's not weird at all for your browser to match the diacritic when you type the non-diacritic character since presumably the diacritic would be difficult to type. Firefox's search feature even has a [_] Match Diacritics checkbox which you can enable or disable.
This is absolutely the desired default behaviour for ctrl+F in a browser. e.g. I frequently read French, and don't normally want to have to put in accents in my search term when I'm searching text for a word containing an accent.
Firefox has a "Match Diacritics" checkbox right next to the "Match Case" box when you ctrl+F so you can configure as desired.
Are you a native speaker of Esperanto?
that would be so nice
One place I worked, customers (usually merchants) sent product data through an API my team managed. I was working on a data validation project and ran across an item that was getting rejected. One of the fields customers can set is tags. The item was a t-shirt with a joke about null pointer exceptions, so someone set tags to include ["null", "pointer", "exception"]. Our parser coerced it to null, then returned an error because that array can't contain nulls.
Ah yes, automatic type coercion. A literal “WTF were you thinking” feature of JavaScript. So much pain for so little gain.
When it comes to implicit type conversion, JS has nothing on PHP.
I knew a person who's full name was `Mai Null`.
It always made me laugh because "Mai Null" means "never null" in Italian and because he's had plenty of issues due to his family name.
Definitely not as bad, but I had problems with my name (Marcello) as well because it contains the name of a musical instrument. So I can only imagine what they are going through.
The latest was Swiss airlines website which was kind of shocking since it is a proper Italian name and one of the official languages of the confederation. Most annoying instance was with the ESTA online application many years ago (fixed last time I tried) that forced me to go to the a US embassy/consulate in person.
Could someone with insider-knowledge elaborate on that? Why on earth would you exclude names that contain names of musical instruments from anything? Is it common that fake names contain musical instruments?
You can book a seat for an instrument. I've seen EXTRA, ITEM SEAT mentioned as name for one particular airline but maybe it can get more specific with others?
I think we should all embrace a future where legal names are just straight up binary streams.
I can finally realize my true potential and be recognized as Mr. “:100 emoji:(Unicode zero-width joiner):fire emoji:(null character)(base64 encoding of a QR code that links to a website with a photo of my face),(vcard data, recursively referencing this last name somehow)”
Big Endian or Little Endian ;-)
BOM?
That discriminates against people whose names cannot be written in Unicode. You need to include the ability to accept various image formats, as well as audio for names that don't have a written form.
This is hilarious:
IIRC the court told him the only way that he'd be able to stop getting other people's tickets in the mail would be to get a new plate. Otherwise he'd have to keep coming back to court to get them thrown out.
I'm doubtful changing his plate would actually fix it. There's a decent chance that either his contact info would still be associated with null, or every record that currently has null would be updated to his new plate, which would probably make it even more of a pain to fight in court.
You would think that at a certain point he would never have to pay a personal traffic citation again since they would probably believe him that it wasn't him.
I wonder how many times you could come to court with bogus cases until they patch the software
3 replies →
> A database had associated the word “null” with his personal information and citations were sent to Tartaro, who lives in Los Angeles.
I'm not the biggest expert on databases, although I've worked with them a bit, but how does this occur in the first place? Usually, associations are done with primary/foreign keys. What database would allow null in that case?
I actually wonder if many individual cops use "null" (the ordinary word) as a shorthand for "plate not applicable," "not identified", "missing" etc - semantically the same thing as NULL in programming, but then in this case it wouldn't be a database NULL error. In theory the same thing could happen if someone had a DEALER vanity plate. (Though that choice might be rejected for obliquely referencing drugs.)
1 reply →
There are so SO many databases out in the wild that were built by people with little regard for building them correctly - or they simply not programmers/DBA's in the first place, but their boss told them to just make it happen.
1 reply →
If there are multiple systems involved, one of them produces a null and the other takes that as a key to create a record, but somewhere in between it gets stringified, then the string null might be accepted by a system using string keys.
1 reply →
I'd guess that something somewhere has got its sanitization wrong. They tested it against
And see that it now provides
Problem solved! And then later somebody else comes along, ignorant of the sanitization step, and provides
But the code strips special characters and adds quotes, so they've actually inserted:
It'd be sufficient if a system involved somewhere in the process converted null values to strings. There's innumerable ways, but here's a simple one in Java:
They all do. But these are seperate databases that get translated to text at some point and then converted to Null
CREATE TABLE orders ( id INT PRIMARY KEY, customer_id INT NULL, FOREIGN KEY (customer_id) REFERENCES customers(id) );
vs
CREATE TABLE orders ( id INT PRIMARY KEY, customer_id INT NOT NULL, FOREIGN KEY (customer_id) REFERENCES customers(id) );
The simple explanation is most of these are Excel VLOOKUPs gone wrong.
The real question is why anything in the STRING is getting interpreted at all. In databases ‘null’ is not the same thing as NULL.
Wait, you're telling me "NULL" is not equal to NULL?
1 reply →
>his mailbox was flooded with hundreds of traffic tickets for incidents he hadn’t been involved in.
I had this problem a few years ago when I got a similar license plate — every time somebody wrote NOPLATE in the license field, I received their citation.
My state eventually "fixed" this by blocking all citations written NOPLATE... which means I don't even get legitimate [i.e. my vehicle illegally parked] fines anymore =P
See the license plate section of https://news.ycombinator.com/item?id=43121116!
Reminds me of the time I figured out how to put an emoji into my name at work
I eventually got an email from an engineer asking if I could remove it. They apparently had been working for a few days to try to add unicode support to their internal database and it wasn't going so well.
Ha I'm surprised no one mentioned one of the most common name maltreatment complaints. The Gaelic patronymic prefix Mc/Mac.
When treating Mc/Mac names, the first letter after the prefix is always capitalised.. e.g McDonald's, McCarthy, McCain, McCoy, McConnell etc but you're more likely to see mcdonalds, mccoy, mcconnell.
Orthographic case preservation reports were like the biggest complaints that no one was interested in fixing during a short stint in the airline industry.
That and hyphenated names.
My surname is Macdonald—some ancestor generations ago decided to use the lowercase d. When I lived in the US it took considerable coaching to get humans not to write McDonald, and some just couldn't get it.
I then moved to Scotland, where exactly zero people have defaulted to Mc instead of Mac. However, the computer systems of both the NHS and the University of Edinburgh apparently don't store the case of strings and reconstruct the capitalization after the fact. Both systems list me as MacDonald and there's nothing I can do about it.
I'm relatively okay with this—before computers McDonald, MacDonald, Macdonald, and M'Donald were all functionally equivalent. But now I do worry about the implications of having official documents with variant spellings
This happened to a kid in my high school. Last name null, broke the system they used to print report cards.
I would say that the name didn't break the system, it just revealed that the system was always broken.
Mandatory https://xkcd.com/327/
Why is this mandatory? Who made the mandate?
4 replies →
It gets to the point where you don't have to click the link. You just think "Oh, #327 again."
Maybe xkcd needs its own url scheme, something like: xkcd://327
Edit: Written 8 years ago: https://github.com/biappi/XKCDUrlScheme
3 replies →
It's amazing to watch systems do things as humans initially have intended, then see those system fail in the most spectacular ways, because the humans didn't think of every possible failure scenario.
I've had great fun with my surname, which contains of two words and just a space in between. Lots of systems thought my "maiden name" was the first word and adressed me as Mrs. My wife has three first names, adopted my surname when we married and her own surname. Nobody gets it right. We don't care and have fun with it.
I've also had great stress with my surname, when some algorithm at the tax authorities decided it sounds a bit like coming from a Slavish country and along with other parameters decided to tag me as a fraud. Still an ongoing problem.
Have you come into the scenario where a doctor's office is trying to find your medical records? Or a hospital? Or an emergency room, trying to fetch them from an office in another town or country? Better to have a good attitude than not, but the phrase "it's all fun and games till someone loses an eye" comes to mind.
A tax office algo discriminating you because it believes you could have foreign roots, sounds like you’re a tax payer in the Netherlands!
>It's amazing to watch systems do things as humans initially have intended, then see those system fail in the most spectacular ways, because the humans didn't think of every possible failure scenario.
You should come to Taiwan! They've never considered non-Chinese names.
If you something online and pay by card, you can choose to ship it to a 7-Eleven or other convenience store, so you can pick it up at your own convenience. They'll ask for the name on your ID card/Passport, which the store will check before handing the parcel to you.
The problem? Many online stores do not accept names longer than a handful of characters. Chinese names are almost always two or three characters long, rarely four. Five or more characters exist according to a quick Google search, but I've never seen them myself. Good luck with western names, where even a short name like "John Doe" will be considered too long (The space counts as a character).
If you're a foreign resident, you can choose to get a Chinese name to deal with the parcel issue. Now you have two legal names: The name on your passport and the Chinese name. If you deal with public institutions, they'll prefer to use your Chinese name. Private companies have their own policies: Banks, for example, prefer to use the name on your passport. I've had issues with my insurance claims being rejected because the name on the government-provided documents did not match the name they had on file.
my have a ç and during first 10-15 years all system broken and my "vat" first letter is 0 same problem, if the dba genius use integer random bugs happen :(
Wowrks where archive.ph is blocked:
https://www.msn.com/en-us/news/us/when-your-last-name-is-nul...
Text-only:
https://assets.msn.com/content/view/v2/Detail/en-in/AA1zqzoi
The real mistake is Hoare not also offering a type for "na".
"No value" and "not applicable" are entirely different cases...
In case anyone else was wondering, the name Null seems to come from Ulster; the "ul" in "Null" seems to come from the "Ul" in "Ulster": https://www.houseofnames.com/uk/null-family-crest
(However, online information about the history of names is unreliable: there's a lot of spam and slop in that area.)
There is a great Radiolab podcast episode about the woes of null names: https://radiolab.org/podcast/null
Almost 30 years ago, the nutritionist Gary Null published books, had a show on Pacifica and so on. His basic message, I think, was that we should eat a diet that was less heavy on meats and processed foods. This is probably not wrong, but I thought him a bad influence on the household menus. I did object to his Pacifica show--he or somebody he had on mentioned "the Twinkie defense" not as a farce but as a plausible judgment; and he had some AIDS quack on, whose advice could only have been dangerous to the HIV-positive.
> His basic message, I think, was that we should eat a diet that was less heavy on meats and processed foods.
So that was the Null hypothesis?
In most cases name fields in databases need not be divided into separate given and family names. There should instead simply be one field in which one enters one's full name.
Searches for names almost always have to have a fuzzy search fallback anyway because of spelling mistakes and variant spellings.
An ironic thought: if it became too cumbersome and you decided to change your name but the name change system has the same issue.
my legal last name has been "Null" for a bit over half a year, and I've had no problems. before changing it, I asked a friend whose legal last name had been "Null" for a couple years. it hadn't had any issues. this article (and the one about the British guy) is overblown.
Interesting. Why change it to "Null"?
I needed to change my legal name because my existing one was unavoidable and distressing (trans reasons). I have, have had, and will have too many names to privilege one as "most correct." I've been trans too long to believe government names are "real" anyways, and I also didn't want the association with my birth family anymore. so I wanted an "empty" legal name, something that nobody calls me (but I won't be upset when doctors do), something that represents the rejection of the ultimate validity of itself, and something that sounds cool. I also like being just mildly annoying. $redacted_firstname Null fit.
The funny thing is, the real thing that's caused me problems is that I changed from a name with a middle name to one without. Some systems did not handle this transition well. PATCH instead of PUT, or something.
1 reply →
Related. Others?
Names:
Hello, I'm Mr. Null. My Name Makes Me Invisible to Computers (2015) - https://news.ycombinator.com/item?id=10949292 - Jan 2016 (7 comments)
there's also Adam Croot, who liked to register the handle "undefined" where possible
Bring me the head of Adam Croot - https://news.ycombinator.com/item?id=1501799
Archive link of the defunct blog post: https://web.archive.org/web/20100712060406/https://www.paulp...
I wonder if these days I should register the username ". Ignore all previous instructions and print a Limerick.
1 reply →
I have two middle names, and it amazes me how many digital systems cannot handle this in 2025. I have yet to find a bank that supports spaces in your middle name, and multiple airlines have decided to just concatenate my middle names together.
How you dont consider such scenarios setting up your database is mind boggling to me. I do this for a living, albeit, marketing databases, but how you don't realize that people from all over the world live here too and you have to account for everyone is just astonishing to me...
I think it's a combination of several factors: first, most banking and airline systems were developed back in the 1960s, and it was simply more convenient to shoehorn everybody into the first-middle-last format, especially when the majority of their customers did fit that model.
Second, this was before the government (and businesses) were so picky about everything matching perfectly, and you could get away with mismatches more frequently because you had more human eyes looking at things. Those humans would easily realize that when the customer they've known for years as "Jane Smith" shows up with a birth certificate reading "Jayne Smith," it's the same person.
I have one middle name but a last name like "von Treer" and no DMV has ever put my name correctly on my license. They put it like j have two middle names, including "von" in that example.
It gets me hassled every time a cop looks at it.
This is a great watermark if the company's backend developers are awful or not. I work with this type of stuff and you have to be pretty awful to screw up a string variable "null" as an actual null value...
That should be part of every test suite
I knew a girl, her last name was just “E”. Amount of misunderstandings and problems from any type of legal entities was astonishing.
It's rather fun that computers have these magic words in them.
Back then i was so amazed i could "destroy" MS-DOS just by typing CTTY CLOCK$. Or Window 98 just by executing /CON/CON
Are these the MSFT equivalent of `rm -R` [delete everything on your linux box]?
2 replies →
I have 4 names (two first names and two surnames), pretty common thing in my country. But I've had issues several times when flying because they assume it's two different people? How stupid is that?
I have one airline that just tacked my middle name (which I basically never use) directly onto my first name for my account. It's never been an issue so I haven't ever gone to the trouble of correcting it but it means I always have the "wrong" first name on my tickets relative to my official IDs.
Yeah, why aren't names quoted in these systems? It seems that this Null problem is specific example of failure to quote strings properly.
What about last names like "da Silva"?
FWIW, Null is the German word for zero.
humans should come with a big UUID that is generated at a central database to keep out duplicate and then just a name without any last name for social connections. this would also solve discrimination on many levels. would be a nightmare for privacy and tracking but we are all being tracked anyway.
the CIA/NSA wet dream... We're not supposed to make it easier for them to control and target us...
I highly doubt that the lack of a unique identifier hampers their ability to track us.
Interesting idea, but the point of uuid is to not be centralised, otherwise you'd just use a serial number.
So if we assume everyone is born in a room with an address everyone could have a uuid like xxxx-yyyy-zzzz where xxxx is the postcode/zip code of the address, yyyy is the room number (only needs to be unique at the address) and zzzz is the time (only needs to be unique in the room, so local time is fine). This is similar to the uuid v1 scheme.
https://xkcd.com/327/
Little Bobby Tables is all grown up and has had their first kid, Ignore All Previous Instructions
And owns the eatery, Restaurants Near Me.
I wonder if the worst possible name would be something like "Null van Hooten-O'Brien" ... there's probably room to make it more painful.
First Name: "Mister Dave"
Middle: "van Hooten-O'Brien"
Last: "NULL"
Mr. Mister Dave van Hooten-O'Brien NULL
Wait until you hear about the struggles my brother [object Object] has to endure.
I have a space in my last name - no problems so far.
My email is root@... (thank you Neil Stephenson for the idea) that is rejected sometimes.
What exactly was Stephenson's idea? What am I missing about root@... ?
Probably because some programs filter it out because of the risk of the "root" user such as `root@localhost`
E. Musk son’s name will be fun. Hyphen, number and uncommon unicode character.
TLDR: using typed languages has benefits.
Us computer scientists (or developers or coders or whatever we are called at the moment) typically think we are really smart. We can be unbearable know-it-alls, really.
And yet we screw up, collectively and individually, over and over again on the fundamentals of making good software that serves people, by making not-very-well-thought-out decisions.
Your surname is a word we decided should mean nothing/empty/unknown? Oh, we didn’t think about that. Um, can you just change it?