← Back to context

Comment by tomjakubowski

7 hours ago

I've seen it quite a lot in my career: even when prepared statements are available and easy to use from a SQL client library, many programmers will simply not use them, in favor of format strings and string concatenation (maybe with an attempt to quote/escape user input).

Just having support for the right way isn't enough. You have to put up roadblocks when people try to go the wrong way.

Why is a format string or string concatenation (or interpolation, what I would use) the “wrong way” when all user input (more precisely: all string literals) are properly escaped?

  • The main reason is that a lot of the reason comes around that it is incredibly difficult to do this in a general case just because of the grammar of SQL. Especially with the very different dialects, in the worst case you can get unintended remote code execution[1]

    There's an incidental performance benefit on some database engines as well. When you write a SQL query, in general the database engine has to compile this to a form it can use

    If you use raw string concatenation, "SELECT USERS FROM table WHERE id=1" might compile to something like (pseudocode below)

        def prepstatement1():
            ...
    

    So if you use an explicit prepared statement[1], something like "SELECT USERS FROM table WHERE id=?" might compile to something like

        def prepstatement2(id: int):  # <--- notice the new parameter here
           ...
    

    Some database engines also have the ability to cache a prepared statement and so these are a lil bit faster. Remember, your database has to still compile the string concatenated case, it's just a little bit hidden.

    [1]: For example SQL Server has xp_cmdshell: https://learn.microsoft.com/en-us/sql/relational-databases/s...

    [2]: https://en.wikipedia.org/wiki/Prepared_statement