Comment by jwr

2 days ago

I revisited this discussion after 3h and now I'm even more terrified, because most (if not all) replies totally miss the point.

Again: no computer system should ever interpret anything in the "last name" field. It should always be handled "in gloves", as an opaque value. It's not about typing, it's not about "paying for clear code", it's not about HTTP, I guess it might be about "best practices", but come on — this should be obvious!

I'm thinking about my systems now and I'm having a hard time coming up with a scenario where interpreting something inside a string value is even POSSIBLE (I use Clojure), unless you explicitly try to read from the string and interpret the results, and even then it's not easy.

I know that in the olden days we used to just feed user-supplied values into shells, with no regard for in-band vs out-of-band distinction. I also know that the HN-beloved SQL makes no distinction of in-band vs out-of-band, which causes a load of problems with proper escaping, hence Little Bobby Tables. But aren't we past that? Does anybody still construct SQL queries by concatenating user-supplied and app-supplied strings?

I agree with you within a single data domain.

But is a smart comment buried in there about ETL systems and data exchange where it's pretty easy, and arguably correctly in some cases, to get "null" in an exported field. Then the importing system, again arguable correctly, needs to handle the null case as a true null, not "null." I'm not sure there is a very easy fix for this or an obvious best practice.

  • Again, there is a misunderstanding.

    "null" is not the same as null. One is an opaque string, the other is a language value. One should NEVER build systems that confuse the two, or allow one to be interpreted as the other.

    If you export data, export nulls as nulls, not "nulls". If you import data, NEVER interpret what's inside strings.

NULL might be interpreted if you interpolate your strings directly to SQL instead of using parameters (Drilled into me in 1st year that you should avoid it like the plague, but for some reason surprisigly common in older systems I've encountered)

string name = null; $"INSERT INTO users (name) VALUES ('{name}')";

Basically an involuntary SQL Injection