Saturday, February 03, 2024

A Bonkers SQL Feature!

It’s time for a “Bonkers SQL Feature” post:

A (IMO) bonkers standard SQL behavior is that newlines between string constants cause implicit concatenation. Spaces between them do not. And if you try a similar thing but with column names or variables names (i. e. not constants), there is no implicit concatenation. Use two column names in this manner, and you end up with the second column treated as an alias for the first. :)

A quick demo follows:

/* newlines concatenate */
postgres=# select 'a'
'b'
'c'
;
 ?column?
----------
 abc
(1 row)

/* spaces do not concatenate - causes syntax error */
postgres=# select 'a' 'b' 'c';
ERROR:  syntax error at or near "'b'"
LINE 1: select 'a' 'b' 'c';
                   ^

/* let's test newlines but with column names */
postgres=# create table t (a varchar, b varchar);
CREATE TABLE

postgres=# insert into t values('aaa','bbb');
INSERT 0 1
postgres=#
postgres=# select a
b
from t;
  b
-----
 aaa
(1 row)


Per the Postgres documentation, the behavior defined by standard SQL is, and kudos to them for calling it bizarre:

Two string constants that are only separated by whitespace with at least one newline are concatenated and effectively treated as if the string had been written as one constant.

(This slightly bizarre behavior is specified by SQL; PostgreSQL is following the standard.)

A potential danger in this behavior is if you break up a list using newlines but forget to write a comma, such as:

/* basic IN check */
postgres=# SELECT 'foo' IN ('foo','bar');
 ?column?
----------
 t
(1 row)

/* here 'bar' is on the second line and I forgot to add a comma, so the check becomes `'foo' IN ('foobar')` essentially, resulting in false
postgres=# SELECT 'foo' IN ('foo'
postgres(#  'bar');

 ?column?
----------
 f
(1 row)

MySQL does similarly, BUT horrifyingly to me, they decided to change “Two string constants that are only separated by whitespace with at least one newline” to “Two string constants that are only separated by whitespace.” In other words, it takes this oddball behavior one step further and does it for spaces as well, despite what the standard says. (Although perhaps it makes more intuitive sense not to differentiate between different kinds of whitespace, standard be damned??)

MySQL:

/* newline test is consistent with Standard and Postgres */
mysql> select
    -> 'a'
    -> 'b'
    -> 'c'
    -> ;
+-----+
| a   |
+-----+
| abc |
+-----+
1 row in set (0.00 sec)

/* space test - WAT!! */
mysql> select 'a' 'b' 'c';
+-----+
| a   |
+-----+
| abc |
+-----+
1 row in set (0.00 sec)

/* column name test is the same */
mysql> create table t (col1 varchar(3), col2 varchar(3));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t (col1, col2) values ('aaa', 'bbb');
Query OK, 1 row affected (0.01 sec)

mysql>
mysql> select col1
    -> col2
    -> from t;
+------+
| col2 |
+------+
| aaa  |
+------+
1 row in set (0.00 sec)

Not sure what else to say except...Be careful out there. :)