Sunday, November 19, 2023

My answer to the question on r/sql: "What is the most common SQL mistake you seen get made by folks?"

Concatenating user inputs instead of binding parameters. Then when a bug ticket comes in for customer O'Reilly (made-up example), instead of replacing the technique of concatenation with binding parameters (aka parameterization), writing a bug fix that escapes quotes. 🤦

I give talks at my company about SQL and how to use it effectively - I gave one just about this subject recently. The short version of an analogy I like (the long version involves a live demo) is that in Java coding, if you wanted to write a basic, command line program that takes an argument, you would use Java's String[] args parameter to the main method. You would NOT bypass this correct way to do it, instead using a special macro in the source code, whose value you replace with a tool such as sed before compilation. It's intuitively silly, and it unnecessarily forces you to enter a minefield that you must navigate - checking for dangerous special characters such as semicolons and comment characters that could lead to bugs, or even worse, the dreaded "Java injection". (Better sanitize that input! /s) In the live demo, I show that by using this ridiculous technique, one could delete a file by passing some file deletion code as an argument. Of course, you shouldn't have to do this - just pass your arbitrary argument to the program by the perfectly safe means provided to us by Java: String[] args and be done with it. Doesn't matter if the arg contains semicolons, comments, file.delete(); or anything you can think of. It is setting the value for a variable, as opposed to dynamically mixing the value with the source code. It is safe.

Somehow, it's less widely known, but it's the same situation in SQL. Think of your SQL query like source code - it gets parsed and a binary query object is produced, just like a Java source file being compiled to a class file. (Details vary per DBMS.) To plug in parameters, you should use the means provided to us: namely, parameterization via JDBC, ODBC or whatever. Do NOT modify the SQL "source code" to plug in parameters, just as you wouldn't do so in the Java analogy above. When you parameterize correctly, you do not have to worry about escaping quotes, handling semicolons, nor do you have to check for special keywords like DROP TABLE, or otherwise "sanitize your inputs" (as the famous Bobby Tables xkcd comic unfortunately suggests) in any way whatsoever. It's the exact same thing as the Java case. "SQL injection" should be as ridiculous of a phrase as "Java injection" was above. (I'm aware of dynamic SQL but I'm omitting nuance for the sake of brevity, and dynamically constructing queries based on arbitrary user inputs is far less common than passing in parameter values.) Parameterization is setting the value for a variable, as opposed to dynamically mixing the value with the source code. It is safe.

That's in the domain of writing application SQL. As someone who reviews MySQL and Snowflake schema changes for about 200 developers, in that department the most common general mistake is a lax attitude toward testing. When I ask a developer to provide evidence of what the state of the relevant table(s) looked like before and after their proposed change, I often get pushback: "What? I'm just modifying a varchar column's max length, why do I need to test that?"

So here's a demo of the single most common error I see regarding THAT specific thing in MySQL. Accidentally wiping out column attributes because those attributes must be re-specified! https://dbfiddle.uk/vmpGShfp

But this isn't the same in every SQL database. (Another general attitude that is totally mistaken is that SQL is SQL and is the same everywhere. I could rant about THIS subject for hours.) But let me just say for now, the above MySQL "gotcha" does not exist in, for example, Postgres. Here is a similar demo using Postgres. Note that in Postgres, extending the varchar length DOES NOT destroy the NOT NULL constraint, default value, etc.: https://dbfiddle.uk/OB8hRoJv

 

Source, myself on: https://old.reddit.com/r/SQL/comments/17wo0tr/what_is_the_most_common_sql_mistake_you_seen_get/k9i99x0/

No comments: