Comment by jwr

2 days ago

This is terrifying, because it implies that so many computer systems interpret user-supplied data as what should be out-of-band values. No computer system should ever interpret what is in the "last name" field, it should be a sequence of characters only. Every attempt at interpretation is an exploit waiting to happen.

Typed data?! What's next, unit tests? Memory safety? Where will it stop?

  • I especially love when non-technical managers boldly claim that the customer doesn't pay for best practices and clean code, as if it's some sort of customer-focused bold declaration that is more in touch with reality than what every other developer thinks.

This is the idealistic approach. Then reality comes knocking at the door.

It's sad that people become collateral damage, and it should be fixed, but the world keeps turning with millions of these workarounds everywhere.

The most interesting one I remember is stock management software: at 3 different places I had the employees dealing with stock explain how they set outrageous prices (like 9999999.99 EUR for a toothbrush) to keep an item in the inventory system but mark it as unavailable for a day or less until it gets restocked again.

I'm sure there is an official way to put something out of stock and restock it again, but it's just painful for an operation that happens basically every day, for dozens of items.

  • > I'm sure there is an official way to put something out of stock and restock it again

    And I wouldn't be surprised if there is no official way.

I revisited this discussion after 3h and now I'm even more terrified, because most (if not all) replies totally miss the point.

Again: no computer system should ever interpret anything in the "last name" field. It should always be handled "in gloves", as an opaque value. It's not about typing, it's not about "paying for clear code", it's not about HTTP, I guess it might be about "best practices", but come on — this should be obvious!

I'm thinking about my systems now and I'm having a hard time coming up with a scenario where interpreting something inside a string value is even POSSIBLE (I use Clojure), unless you explicitly try to read from the string and interpret the results, and even then it's not easy.

I know that in the olden days we used to just feed user-supplied values into shells, with no regard for in-band vs out-of-band distinction. I also know that the HN-beloved SQL makes no distinction of in-band vs out-of-band, which causes a load of problems with proper escaping, hence Little Bobby Tables. But aren't we past that? Does anybody still construct SQL queries by concatenating user-supplied and app-supplied strings?

  • I agree with you within a single data domain.

    But there is a smart comment buried in there about ETL systems and data exchange, where it's pretty easy, and arguably correct in some cases, to get "null" in an exported field. Then the importing system, again arguably correctly, needs to handle the null case as a true null, not "null." I'm not sure there is a very easy fix for this or an obvious best practice.

    • Again, there is a misunderstanding.

      "null" is not the same as null. One is an opaque string, the other is a language value. One should NEVER build systems that confuse the two, or allow one to be interpreted as the other.

      If you export data, export nulls as nulls, not "nulls". If you import data, NEVER interpret what's inside strings.
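
A minimal sketch of the export/import advice above, assuming JSON as the exchange format (the thread doesn't name one) and two made-up records: JSON has a null literal that is distinct from any string, so both values survive a round trip without anyone looking inside the string.

```python
import json

# Hypothetical records: one person whose surname is the string "Null",
# and one person whose surname is genuinely unknown.
records = [
    {"id": 1, "last_name": "Null"},
    {"id": 2, "last_name": None},
]

# Export: None becomes the JSON literal null, "Null" stays a quoted string.
exported = json.dumps(records)

# Import: the literal null becomes None again, the string is left untouched.
imported = json.loads(exported)
```

Formats without a null literal (plain CSV, for instance) can't carry this distinction by themselves, which is probably where the ETL trouble mentioned above comes from.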

  • NULL might be interpreted if you interpolate your strings directly into SQL instead of using parameters (drilled into me in 1st year that you should avoid it like the plague, but for some reason surprisingly common in older systems I've encountered):

    string name = null; $"INSERT INTO users (name) VALUES ('{name}')";

    Basically an involuntary SQL Injection
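
For contrast, a sketch of the parameterized version, using Python's built-in sqlite3 module rather than whatever driver those older systems use. The `users` table and the `insert_user` helper are made up for illustration. The value travels separately from the SQL text, so the surname "Null" is stored as a string, a real None is stored as NULL, and a Bobby Tables style name is just data.

```python
import sqlite3

def insert_user(conn, name):
    # The "?" placeholder passes the value out-of-band; it is never spliced
    # into the SQL text, so its content is not parsed as SQL.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

for name in ["Null", None, "Robert'); DROP TABLE users;--"]:
    insert_user(conn, name)

# Read everything back in insertion order.
rows = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY rowid")]

# Only the genuine None was stored as SQL NULL.
null_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE name IS NULL"
).fetchone()[0]
```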

You are right. At the same time this is a very common issue and an attack vector[1][2][3]. E.g.: an existing book called "<script>alert("!Mediengruppe Bitnik");</script>" is still not shown correctly by some websites[4].

[1]: SQL injection: https://en.wikipedia.org/wiki/SQL_injection

[2]: Cross-site scripting: https://en.wikipedia.org/wiki/Cross-site_scripting

[3]: Exploits of a Mom (Bobby Tables) XKCD: https://xkcd.com/327/

[4]: https://www.tomlinsons-online.com/p-16381221-scriptalertmedi...
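
A small sketch of how a site could show that book title correctly, assuming the page is assembled as a plain string (real sites would more likely rely on a template engine that escapes by default). `render_title` is a hypothetical helper; `html.escape` is from the Python standard library.

```python
import html

def render_title(title):
    # Treat the title as opaque text and escape it at the point of output,
    # so the browser displays the characters instead of running them.
    return "<h1>" + html.escape(title) + "</h1>"

title = '<script>alert("!Mediengruppe Bitnik");</script>'
rendered = render_title(title)
```

The stored title is never altered; only its representation in HTML changes.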

  • I used this technique on an auction site once. It allowed the script tag in my username, so I used it to remove the "bid" button once I had bid -- nobody behind me could outbid me.

    It went about as well as you would expect using it for fraud. Which is to say, not well at all (;´Д`)

I believe the primary point where this goes wrong is in HTTP. There, null and "null" are indistinguishable. Using JSON would help, but older forms don't.
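
To illustrate, assuming "older forms" means URL-encoded form bodies: every value in such a body arrives as a string, so there is nothing to tell a missing value from the surname. A JSON body keeps the two apart. The field name `last_name` is just an example.

```python
import json
from urllib.parse import parse_qs

# A URL-encoded form body carries only strings.
form = parse_qs("last_name=null")
form_value = form["last_name"][0]  # the four-character string "null"

# JSON has a null literal that is distinct from the quoted string.
json_string = json.loads('{"last_name": "null"}')["last_name"]
json_null = json.loads('{"last_name": null}')["last_name"]
```

Any code that turns `form_value` into a missing value has to guess, and that guess is exactly the interpretation the original comment warns against.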

In YAML (the gift that keeps on giving, see the False/Norway debacle) the last name of Null would have to be quoted, otherwise it would signify the null value.

pretty sure this is just the result of bad programmers trying to compare things against “null” (the string) and not whatever the null value is in their language. i don’t think they are evaling the last name field or something.
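
A sketch of what that bug might look like next to the fix. Both function names are made up, and the buggy version is a guess at the pattern, not code from any real system.

```python
def is_missing_buggy(last_name):
    # Hypothetical bug: any spelling of the word "null" counts as missing,
    # so a person named Null is treated as having no surname.
    return last_name is None or str(last_name).strip().lower() == "null"

def is_missing(last_name):
    # Only the language-level null counts; the string is never inspected.
    return last_name is None
```

With these, `is_missing_buggy("Null")` is true while `is_missing("Null")` is false, and both agree that `None` is missing.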