Thursday, November 10, 2016

MySQL - numeric types don't let you specify "length"

In MySQL I often see developers asking themselves, "Should I make this column an int(10) or int(11)? Should this bigint be bigint(15) or bigint(20)?" The answer is it probably doesn't matter! That number in parentheses does NOT indicate the maximum length of the number. For example, smallint(3) does not max out at 999, but rather it has a range of -32768 to -32767, just like any other smallint (which is 16 bits). The ONLY thing this parenthetical number has anything to do with is an esoteric, display-related feature called ZEROFILL.

ZEROFILL, if enabled, pads the number with zeroes for display purposes -- for example a int(10) with a value of 12345 will come out in your resultset as 0000012345. That's it. That's all it does. Read up more on it here:  

So, in my opinion you should NEVER even bother to define a column as "int(10)" or "bigint(20)" etc., etc., but rather just define an "int" or a "bigint". The only way to express a max length is in terms of which type to use in the first place: tinyint, smallint, int, bigint. (I think that's all of them - int types anyway. :)) So save yourself the clutter of useless additional information, and save yourself the waste of brainpower thinking about max lengths that aren't even real.

Thursday, July 07, 2016

Say NO to Venn Diagrams When Explaining JOINs

Nice post on the jOOQ blog: Say NO to Venn Diagrams When Explaining JOINs ...It's about how visualizing joins as Venn diagrams isn't the best fit -- it's much better for set operators.

Saturday, June 11, 2016

Neat graphical summarization of SQL joins

I'm not sure who to credit for it, I just saw it linked to on reddit with no source.... Note the first one (with the comma) is also called a CROSS JOIN.

Edit: Apologies to the color blind...

Generating a series of integers as a set of rows

In my last post, Cumulative row count over date ranges, I take advantage of a postgres function called generate_series(start, stop) - this generates a series of values, from start to stop. So, for example, generate_series(2,4) returns:

SELECT * FROM generate_series(1,4);

A question that’s been on my mind is how to achieve this in other database systems?


Oracle has a way to generate a series as well, but it’s just not quite as simple as calling a function:

SELECT level i
FROM dual
CONNECT BY level <= 4

Less intuitive but it works nicely. What it’s doing is it uses the CONNECT BY clause on the dual table - this means the 1 column, 1 row dual table is being “connected” to itself recursively — this is called a hierarchical query ( “Level” is a special pseudocolumn that tells you the level of recursion depth. So, while this method is more complicated, it does work nicely!


The MySQL way….

I didn’t know how to achieve generate_series in MySQL off the top of my head, so off to google! Unfortunately the method I found depends on selecting from a table of at least n size, where n is the number of values you wish to generate. Join on a subquery that declares a variable @num, and in your SELECT clause, increment this value per row. I found this solution here, they are using dates but it can be simplified to generate integers.

SELECT @num:=@num+1

| @num:=@num+1 |
|            1 |
|            2 |
|            3 |
|            4 |

4 rows in set, 1 warning (0.02 sec)

Works great but with the caveat that the "dummy" table you select from, in this example INFORMATION_SCHEMA.COLUMNS, must have at least as many rows as you want values. A little scary, in my opinion, to use in a generalized way with this limitation.

MS SQL Server

Oracle-like recursion can be done in SQL Server as well, and wouldn't you know it, some nice developer wrote a generate_series function for SQL Server and shared it on github

Why is generating a series useful, anyway?

There are probably more use cases for generating a series than I can come up with! But as in the previous post I linked to at the top of this one, a series can be useful for implementing something like a for loop. Join against the series as if it were another table, and use its value (call it i) as a parameter to a function, then voila, suddenly you're calling function f for each value of i: f(1), f(2), ..., f(n). This is a great way to create more elegant queries by avoiding copied/pasted repetition. (Again as see in my last post.) 

That's all for now!

Saturday, June 04, 2016

Cumulative row count over date ranges

I recently helped a coworker (Mr. M) with his SQL query that gets a cumulative count over a set of ranges of dates. He wrote a query that functionally works just fine, but he didn't like it because there was a lot of repetition happening. It looked something like:


Imagine in place of [datex] there is some mathy date expression -- I'm leaving it out for now (because I'm lazy) but the idea was subtract a week from today for date1, subtract 14 days from today for date2, etc.

So Mr. M did not like that his copy/pasted query parts violated DRY - Don't Repeat Yourself. I think that's less bad than all this UNIONing and repeated checking against CREATED_ON. It's running 5 different queries that are kind of redundant as they are looking up by the same column repeatedly. How can we improve this query? My first instinct was to use windowing functions, but alas, this is MySQL which lacks them (remember, MySQL is a hair band). I'm sure we can do better, but here's what I came up with as a first hack:

SELECT SUM(bucket1) AS week1,
 SUM(bucket2) AS week2,
 SUM(bucket3) AS week3,
 SUM(bucket4) AS week4,
 SUM(bucket5) as week5
 SELECT CASE WHEN CREATED_ON <= ADDDATE(current_date(), INTERVAL -1-(0*7) DAY) THEN 1 ELSE 0 END AS bucket1,
 CASE WHEN CREATED_ON <= ADDDATE(current_date(), INTERVAL -1-(1*7) DAY) THEN 1 ELSE 0 END AS bucket2,
 CASE WHEN CREATED_ON <= ADDDATE(current_date(), INTERVAL -1-(2*7) DAY) THEN 1 ELSE 0 END AS bucket3,
 CASE WHEN CREATED_ON <= ADDDATE(current_date(), INTERVAL -1-(3*7) DAY) THEN 1 ELSE 0 END AS bucket4,
 CASE WHEN CREATED_ON <= ADDDATE(current_date(), INTERVAL -1-(4*7) DAY) THEN 1 ELSE 0 END AS bucket5

Well, I wasn't able to remove copy/pasted repetition, but I do think removing the UNION ALLs is a positive. What does this query do?

Step 1) The subquery. This is the bulk of the query, which passes over the whole table just once, and puts each row in an bucket from 1 through 5. It will perform 1 scan per row and 5 comparisons per row, vs. scanning each row 5 scans per row and 5 comparison per row.

Step 2) Sum them up!

Now in this specific instance the performance gain was pretty negligible, and we didn't really neaten the query all that much. I wouldn't blame Mr. M if he decides to stick with old query. As far as performance goes, the issue is this table was very small with only about 50k rows, therefore the cost of repeating 5 queries was negligible compared to all the added logic of those CASE WHEN expressions that I added.

Is there a way to avoid the repetition? 
The answer is, for MySQL, I don't know! But I wonder if I can figure something out with Postgres...

So first I made a sample table MYTABLE with 100,001 rows of randomly generated data.

To write the query, I decided to define the 5 dates using a common table expression, this way we can reference it as an imaginary table, so to speak:

mwrynn=# WITH weeks AS (SELECT now() - i * interval '1 week' AS date_point from generate_series(0,4) i)
SELECT * FROM weeks;

 2016-06-04 13:14:00.560272-04
 2016-05-28 13:14:00.560272-04
 2016-05-21 13:14:00.560272-04
 2016-05-14 13:14:00.560272-04
 2016-05-07 13:14:00.560272-04
(5 rows)

Next, let's join that to our table so that for each date_point, get the rows where created_on is less than or equal to the date_point...

mwrynn=# WITH weeks AS (SELECT now() - i * interval '1 week' AS date_point from generate_series(0,4) i)
SELECT * FROM weeks JOIN mytable ON mytable.created_on <= weeks.date_point;

          date_point           | created_on 
 2016-06-04 13:15:40.869402-04 | 2015-12-12
 2016-06-04 13:15:40.869402-04 | 2016-02-19
 2016-06-04 13:15:40.869402-04 | 2016-04-20
 2016-06-04 13:15:40.869402-04 | 2016-02-01
[snipping tons of rows]

Final step: get the counts per date_point:

mwrynn=# WITH weeks AS (SELECT now() - i * interval '1 week' AS date_point from generate_series(0,4) i)
SELECT COUNT(*), date_point
     mytable ON mytable.created_on <= weeks.date_point
GROUP BY date_point
ORDER BY date_point DESC;

 count  |          date_point          
 100001 | 2016-06-04 13:17:29.33812-04
  98232 | 2016-05-28 13:17:29.33812-04
  96290 | 2016-05-21 13:17:29.33812-04
  94313 | 2016-05-14 13:17:29.33812-04
  92350 | 2016-05-07 13:17:29.33812-04
(5 rows)

TA-DA!!! WE DID IT! We got the cumulative counts without copying/pasting several queries or several conditions - we just had to define the weeks we concerned about (weeks 0 through 4) in the call to generate_series(), put that in a CTE, and query that with a join... [Edit: Note, we should even be able to parameterize the number of weeks by doing something like generate_series(0,?) -- this way we could easily use any number of weeks.] My initial instinct of requiring window functions was wrong. But a big THANK YOU TO COMMON TABLE EXPRESSIONS. :) 

Question: Does anyone know how to do this in MySQL??

Day-After Edit: I guess a similar technique could be used in MySQL, only we must replace the CTE with a subquery, and replace generate_series with some way of selecting 0-4.... I guess this would work: (SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4) - but perhaps this would count as the type of repetition we want to avoid...Another edit: apparently there is a solution, albeit an ugly one:

Saturday, May 07, 2016

Postgres queries using Window Functions, Part Deux

Since my post Postgres query using window functions, recursive query is my most popular of all-time (maybe not saying so much for my little blog :)), I thought I would write more on the subject. How about some useful examples?

A Cumulative Sum

First time I saw this my mind was blown - I didn't realize SQL could be so flexible! Maybe I have a table that represents sales per day:

mwrynn=# create table sales (sales_date date, dept varchar, total_sales numeric(10,2));

/* after some inserts which I'll skip, we have 4 days per dept (maybe it's a department store database) */
mwrynn=# select * from sales;
 sales_date | dept  | total_sales 
 2016-01-01 | men   |      100.00
 2016-01-01 | women |      140.00
 2016-01-02 | men   |       85.00
 2016-01-02 | women |       70.00
 2016-01-03 | men   |      135.00
 2016-01-03 | women |      135.00
 2016-01-04 | men   |      200.00
 2016-01-04 | women |      240.00
(8 rows)

/* now let's get a cumulative sum for each dept per day */
mwrynn=# select sales_date, dept, total_sales, sum(total_sales) over (partition by dept order by sales_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) from sales;
 sales_date | dept  | total_sales |  sum   
 2016-01-01 | men   |      100.00 | 100.00
 2016-01-02 | men   |       85.00 | 185.00
 2016-01-03 | men   |      135.00 | 320.00
 2016-01-04 | men   |      200.00 | 520.00
 2016-01-01 | women |      140.00 | 140.00
 2016-01-02 | women |       70.00 | 210.00
 2016-01-03 | women |      135.00 | 345.00
 2016-01-04 | women |      240.00 | 585.00
(8 rows)

So simple! But the syntax is a little unintuitive, at least to me. Let's break down the key parts.

sum(total_sales): this is just your basic aggregate function sum()
over (): this means to apply the sum function OVER specific subsets, or partitions, of our data.
partition by: defines the partitions. Here we are telling Postgres, "split up the data by dept, aka into men and women, and apply our sum function to each as a completely separate subset (partition)"
order by: sort each partition by this clause, sales_date in our case.
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: means to apply the sum for the current row and all preceding rows. This is known as the windowing frame. This is the key to achieving a cumulative sum - for each row, we want all the previous rows + the current row, but not the subsequent rows...

We can also run a simpler version of the query, taking the whole dataset as one partition instead of splitting by men and women:

 mwrynn=# select sales_date, dept, total_sales, sum(total_sales) over (order by sales_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) from sales;
 sales_date | dept  | total_sales |   sum   
 2016-01-01 | men   |      100.00 |  100.00
 2016-01-01 | women |      140.00 |  240.00
 2016-01-02 | men   |       85.00 |  325.00
 2016-01-02 | women |       70.00 |  395.00
 2016-01-03 | men   |      135.00 |  530.00
 2016-01-03 | women |      135.00 |  665.00
 2016-01-04 | men   |      200.00 |  865.00
 2016-01-04 | women |      240.00 | 1105.00
(8 rows)

The rank() Function

We can riff off this same table/query and ask Postgres another query -- give me the rank of sales per dept, lowest to highest. I want to see on which date I sold the least, the most and everything in between. The key is in the rank() function...

mwrynn=# select sales_date, dept, total_sales, rank() over (partition by dept order by total_sales) from sales;
 sales_date | dept  | total_sales | rank 
 2016-01-02 | men   |       85.00 |    1
 2016-01-01 | men   |      100.00 |    2
 2016-01-03 | men   |      135.00 |    3
 2016-01-04 | men   |      200.00 |    4
 2016-01-02 | women |       70.00 |    1
 2016-01-03 | women |      135.00 |    2
 2016-01-01 | women |      140.00 |    3
 2016-01-04 | women |      240.00 |    4
(8 rows)

The ntile() Function

This function is called ntile(), perhaps as in a generic version of "quartile" or "percentile", in which you can specify any number. Let's say I want to put the above sales into buckets. I only have a small amount of data so we'll use a bucket number of 2 -- this allows us to bucket each row into either the lesser half of sales or the greater half.

mwrynn=# select sales_date, dept, total_sales, ntile(2) over (partition by dept order by total_sales) from sales;

 sales_date | dept  | total_sales | ntile 
 2016-01-02 | men   |       85.00 |     1
 2016-01-01 | men   |      100.00 |     1
 2016-01-03 | men   |      135.00 |     2
 2016-01-04 | men   |      200.00 |     2
 2016-01-02 | women |       70.00 |     1
 2016-01-03 | women |      135.00 |     1
 2016-01-01 | women |      140.00 |     2
 2016-01-04 | women |      240.00 |     2
(8 rows)

Simple enough, right? So again we've partitioned by the mens vs. women dept, and we're getting the lesser half of sales (bucket 1) vs. the greater half of sales (bucket 2).

Wrapping Up

Well that's enough of window functions for now. We've only begun to scratch the surface, so I may come back and add to this post. If you have any questions or suggestions, let me know!

Saturday, April 30, 2016

Comparing adding a column with default values between MySQL and Oracle, or, "Adding a new column is sloooow :("

In this post, I want to talk a little about something that seems to baffle some developers from time to time. It baffles us in the sense of, "I can't believe this simple operation is slow, what the hell is going on?" Happened today in fact with MySQL. That simple operation is nothing more than adding a column to a table! At least, adding column to a table that's full of data, perhaps millions of rows. Why should a column take so long to add? Well, it doesn't ALWAYS - it depends on whether your database has optimized column adding, and how much data your table has. Let's look into it a little...


In MySQL land, here's what happens:
mysql> CREATE TABLE t (a INT, b INT); 
Query OK, 0 rows affected (0.03 sec)
mysql> insert into t SELECT rand(), rand() FROM INFORMATION_SCHEMA.COLUMNS c1 
Query OK, 3825936 rows affected, 2 warnings (10.99 sec)
Records: 3825936  Duplicates: 0  Warnings: 2
Query OK, 3825936 rows affected (12.77 sec)
Records: 3825936  Duplicates: 0  Warnings: 0
12.8 seconds might not seem so bad, but what if this were a busy production table? What if there were 10 times as many rows? Quickly, just to prove that the time does increase linearly with number of rows, let's double the table size then try again:
mysql> INSERT INTO t SELECT * FROM t; Query OK, 3825936 rows affected (14.38 sec) Records: 3825936 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE t ADD COLUMN d INT DEFAULT NULL; Query OK, 7651872 rows affected (25.80 sec) Records: 7651872 Duplicates: 0 Warnings: 0
Yup, double the size, double the time. [Note: MySQL version is 5.5 -- in this version an alter table like the above will recreate the entire table!]


In Oracle land, as usual they're ahead of the curve and have a solution for version 12c. Let's quote Mr. Kyte's description of this feature from
More Online Operations: Better Column Addition. In Oracle Database 11g you were able to perform a fast add of a column to a table if it had a default value and was defined as NOT NULL. (Arup Nanda has written about this at However, if you attempted to add a column with a default value and that column permitted null values, the ADD COLUMN operation could take a significant amount of time, generate a large amount of undo and redo, and lock the entire table for the duration of the operation. In Oracle Database 12c, that time, volume, and locking are no longer part of the process.
To demonstrate this, I copy ALL_OBJECTS into a table and measure its space—in blocks and bytes—using the show_space utility, posted on
SQL> create table t
  2  as
  3  select *
  4    from all_objects;
Table created.

SQL> exec show_space('T')
Full Blocks        ....        1,437
Total Blocks...........        1,536
Total Bytes............   12,582,912
Total MBytes...........           12

PL/SQL procedure successfully completed.

Now I add a column to table T, and this column will have a large default value. Because the column I’m adding is a CHAR(2000), it will always consume the full 2,000 bytes, given that the CHAR type is always blank-padded and fixed-width. Table T has more than 87,000 records, so adding a column would typically take a significant amount of time, but as you can see, the addition is practically instantaneous in Oracle Database 12c
SQL> set timing on
SQL> alter table t 
add (data char(2000) default 'x');
Table altered.
Elapsed: 00:00:00.07

I perform the identical operation in Oracle Database 11g and observe the following timing: 
SQL> set timing on
SQL> alter table t 
add (data char(2000) default 'x');
Table altered.
Elapsed: 00:00:28.59

Clearly, that’s a significant difference in runtimes.
What magic is going on under Oracle's hood? How can it get away with a near instant column addition while MySQL (and others) need to write the new default value n times. I don't precisely know what Oracle is doing, but I can offer a conjecture. What would I do to solve the problem? I would store the default once as a property of the table, and any time the table is queried, if the column value is missing, just kind of inject the default (which we have ready to go in memory at the time) in place. I found some insight in this article by Mohamed Houri at Oracle:> alter table t1 add C_DDL number default 42 not null;

Table altered.
Elapsed: 00:00:00.04
The C_DDL column has been added instantaneously in 11gR2 database while it took almost 49 seconds in 10gR2. What is this new mechanism that allows such an extremely rapid execution time when adding a not null column with default value to an existing table?
How could 3 million of rows be updated in 4 milliseconds?
Let’s verify visually if the update has been really done (from now and on when the Oracle version is not specified it will then refer to
SQL> select count(1) from t1;


SQL> select count(1) from t1 where c_ddl = 42;


Although Oracle has altered the t1 table instantaneously, the query is showing that the whole bunch of C_DDL column has been updated with their default value set to 42. How could this be possible? Will the execution plan be of any help here?
SQL> select * from table(dbms_xplan.display_cursor);

| Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time          |
| 0 | SELECT STATEMENT  |      |      |       | 3016 (100) |               |
| 1 | SORT AGGREGATE    |      |    1 |     3 |            |               |
|*2 | TABLE ACCESS FULL | T1   | 2999K|  8788K| 3016   (5) | 00:00:10      |

 Predicate Information (identified by operation id):
     2 - filter(NVL("C_DDL",42)=42) 

Notice here again how the predicate part of the above execution plan can reveal vital information when trying to understand what is happening behind the scene. Despite I haven’t used the NVL function in my query, this one appears in the predicate part indicating that, internally, Oracle is still considering the C_DDL column as to be potentially able to contain null values (which means it has not been updated) and, as such, Oracle is replacing it with its default value 42.
So there you have it- it does pretty much what I guessed. But that's an 11g example that requires a NOT NULL constraint.

So what about when the default is 42 but NULLs are allowed? Can it optimize this situation? According to the Tom Kyte quote above, the answer appears to be yes. But how does it distinguish a "real" null with simply a non-set value due to the ADD COLUMN optimization? I don't have the answer, but perhaps Oracle writes an explicit NULL when you do something like "UPDATE t SET data=NULL" as opposed to when it is simply not set. NULL is different from missing. The Mohamed Houri article continues, with respect to default values on columns that allow NULLs:> alter table t1 add C_DDL_2 number default 84;

Table altered.

Elapsed: 00:00:58.25

12c> alter table t1 add C_DDL_2 number default 84;

Elapsed: 00:00:00.02

While adding the nullable C_DDL_2 column took 58 seconds to be honored in 11gR2 it has been instantaneously done in 12c.
This is a clear demonstration that in Oracle Database 12c, DDL optimization has been extended to include null columns having default values. Indeed, when you query t1 table to get the distinct values of the newly added column (C_DDL_2) you will realize that the entire table rows have seen their metadata (default value 84) updated as shown via the following query:
12c> select c_ddl_2, count(1) from t1 group by c_ddl_2;

   C_DDL_2    COUNT(1)
   -------   ----------
    84        3000000

SQL> select count(1) from t1 where c_ddl_2=84;


SQL> select * from table(dbms_xplan.display_cursor);

| Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time         |
| 0 | SELECT STATEMENT  |      |      |       | 3803 (100) |              |
| 1 | SORT AGGREGATE    |      |    1 |    13 |            |              |
|* 2| TABLE ACCESS FULL | T1   | 3538K|    43M|   3803 (1) |  00:00:01    |

Predicate Information (identified by operation id):
    2 - filter(DECODE(TO_CHAR(SYS_OP_VECBIT("SYS_NC00006$",0)),NULL,NVL("
[Mwrynn: WHOA, what is this funkiness??]
- dynamic statistics used: dynamic sampling (level=2) 

However, in order to ensure DDL optimization for null columns with default value, things became more complex than it used to be for not null columns in the preceding release. We went from a simple implicit use of the NVL function to a complex and exotic predicate part involving SYS_OP_VECBIT Oracle non documented function and a new internal column SYS_NC00006$ in order to honor the default value since this one has not been physically updated.

One more thought courtesy of my buddy Kevin. What happens if you change the default? So you have:

  • At time=0, table t has columns a,b with a million rows
  • At time=1, table t has columns a,b,c with c defaulting to 42. Therefore the million new c fields are set to 42 (though optimized)
  • At time=2, table t column c is changed to default to 41. Now the optimization would be broken, since all the values from time=1 should have 42. Does Oracle at this point decide to "fill in" all the 42s for existing rows?? If that's the case, this step should be slow. I think this calls for an experiment! Stay tuned...

Monday, April 04, 2016

Common Table Expressions: Postgres vs. Oracle vs. MySQL

The Gist of CTEs

SQL Common Table Expressions (CTEs) are a neat way to enhance the readability of your SQL queries. They can help you manage complexity by avoiding copied/pasted subqueries, and they remove the (what I feel is often an imagined) need for temporary tables to store temporary results for another query to work with. CTEs were defined in ANSI/ISO Standard SQL-99, so if your SQL database does not support CTEs, that means that like your favorite hair band, they still haven't caught up with the 90s.

Quick Examples of CTEs

The idea is you have this kind of ugliness:

FROM (SELECT ... WHERE ...) AS subqry

Maybe it's even multiple subquery levels deep:

FROM (SELECT ... FROM (SELECT ... WHERE ...) AS inner_subqry WHERE ...) AS outer_subqry

Hard to write, hard to edit, hard to look at. A CTE is simply the use of the "WITH" clause to separate that subquery from the main body, substituting the subquery with the assigned alias.

WITH subqry AS (SELECT ... WHERE ...)
SELECT ... FROM subqry WHERE ...

Another use case: If you ever find yourself saying "Hmm, if I put THIS resultset into a temporary table, then I can query it later..." Like so:

--in general, DON'T DO THE FOLLOWING if you can help it.
CREATE TABLE helper_table AS SELECT ... FROM ... WHERE ... ;
SELECT some_stuff
FROM ... helper_table 
--maybe you use it multiple times, to avoid recomputing helper_table's query!
SELECT some_other_stuff
FROM ... helper_table 
DROP TABLE helper_table;

^^^The above makes me sad. :( You're using another table when you don't need to. It's messier. It's slower. It's unnecessary DDL (and on most databases, DDL necessitates a commit, don't forget), or even if you avoid the CREATE TABLE by keeping helper_table table alive all the time (maybe you delete/insert with every usage), you're creating unnecessary maintenance; unnecessary transaction logging. It's more complex. And if you have CTEs there's (rarely) any good reason to do it.

The Postgres Optimization Fence

Unfortunately, in Postgres there is an infamous downside to the otherwise lovely CTE: The Optimization Fence. The Postgres planner will, upon seeing your CTE query, decide to resolve that WITH part first, then take the results of that, and work it into the "main" part of the query. So behind the curtain, it's actually ALMOST doing something like the "CREATE TABLE helper_table" example above -- resolve that query first, store the results in memory, use that data when it is asked for.

Here is the mention of this problem from the docs:
“…that the optimizer is less able to push restrictions from the parent query down into a WITH query than an ordinary sub-query. The WITH query will generally be evaluated as written, without suppression of rows that the parent query might discard afterwards. (But, as mentioned above, evaluation might stop early if the reference(s) to the query demand only a limited number of rows.)”
Ok! Let's observe by comparing two examples:

First I made a table called person_location that has 50,000,000 rows -- to represent a million people residing in each of the 50 states in the US.

1) CTE to get the number of people per state
WITH peeps_by_state AS 
  (SELECT COUNT(*) AS cnt, state FROM person_location GROUP BY state)
FROM peeps_by_state
WHERE state=13;
                                                                 QUERY PLAN                                                                  
 CTE Scan on peeps_by_state  (cost=971241.21..971245.71 rows=1 width=8) (actual time=28868.625..28868.645 rows=1 loops=1
   Filter: (state = 13)
   Rows Removed by Filter: 49
   CTE peeps_by_state
     ->  HashAggregate  (cost=971239.21..971241.21 rows=200 width=4) (actual time=28868.592..28868.603 rows=50 loops=1)
           Group Key: person_location.stat
           ->  Seq Scan on person_location  (cost=0.00..721239.14 rows=50000014 width=4) (actual time=0.016..9207.389 rows=50000000 loops=1)
Planning time: 0.168 ms
Execution time: 28868.783 ms
(9 rows)

2) replace that reference to the CTE with a subquery:

FROM (SELECT COUNT(*) AS cnt, state FROM person_location GROUP BY state) peeps_by_state
WHERE state=13;
                                                               QUERY PLAN                                                               
 Subquery Scan on peeps_by_state  (cost=0.00..851089.22 rows=1 width=8) (actual time=9925.626..9925.627 rows=1 loops=1)
   ->  GroupAggregate  (cost=0.00..851089.21 rows=1 width=4) (actual time=9925.624..9925.624 rows=1 loops=1)
         Group Key: person_location.state
         ->  Seq Scan on person_location  (cost=0.00..846239.20 rows=970000 width=4) (actual time=0.029..9750.655 rows=1000000 loops=1)
               Filter: (state = 13)
               Rows Removed by Filter: 49000000

 Planning time: 0.124 ms
 Execution time: 9925.680 ms
(8 rows)

Unfortunately, the uglier version of the query will be planned much better, so, sadly, Postgres gives you a performance incentive to use the ugly one. In this simple example above, it may not seem like much of a big deal to write the ugly one, but in a larger, more complex example, the ugly factor could be much more extreme.

In the above example, though, you could just apply the same filter to the CTE and probably get good results. But this is something extra to think about, something extra to deal with, and shouldn't be necessary in an ideal world. So let's try that out (looks good):

3) add the filter to the CTE:
mwrynn=# EXPLAIN ANALYZE WITH peeps_by_state AS (SELECT COUNT(*) AS cnt, state FROM person_location WHERE state=13 GROUP BY state)
FROM peeps_by_state
WHERE state=13;
                                                                QUERY PLAN                                                                
 CTE Scan on peeps_by_state  (cost=851089.21..851089.23 rows=1 width=8) (actual time=9895.863..9895.864 rows=1 loops=1)
   Filter: (state = 13)
   CTE peeps_by_state
     ->  GroupAggregate  (cost=0.00..851089.21 rows=1 width=4) (actual time=9895.851..9895.852 rows=1 loops=1)
           Group Key: person_location.state
           ->  Seq Scan on person_location  (cost=0.00..846239.20 rows=970000 width=4) (actual time=0.028..9722.994 rows=1000000 loops=1)
                 Filter: (state = 13)
                 Rows Removed by Filter: 49000000
 Planning time: 0.125 ms
 Execution time: 9895.923 ms

(10 rows)

Don't Worry, Oracle's Got This

Oracle, on the other hand, is quite a bit smarter about CTEs. It can "push down" the CTE and analyze the query as a whole. In other words, Oracle's query optimizer views the two queries as the same, and therefore they both resolve to the same (better) query plan. Good stuff. Let's observe...

--QRY 1
WITH peeps_by_state AS (SELECT COUNT(*) AS cnt, state FROM person_location GROUP BY state)
FROM peeps_by_state
WHERE state=13;

--QRY 2
FROM (SELECT COUNT(*) AS cnt, state FROM person_location GROUP BY state) peeps_by_state
WHERE state=13;

Ta-da - same plan either way. Same time to execute either way. This is what we want.

(Note that the fact that the Oracle queries took longer than Postgres doesn't mean anything. I'm running Postgres on my Macbook Pro, and Oracle is running on the tiniest RDS instance Amazon will let me use. :))

MySQL's CTE Implementation

Finally, let's check out how well MySQL (InnoDB) handles CTEs:

(Still hasn't caught up with the 90s.)

Thursday, March 31, 2016

MySQL, auto_increment and the magic number 0

Ran into a MySQL issue at work recently. My group created three static lookup tables which we populated via handcoded insert statements. For this kind of lookup table, we like having control over the primary key (`id`), so as a general rule we explicitly set this column, rather than relying on auto_increment. For example a table might look like the following (this is a made-up example, as I try to maintain separation of work and blog):

EMPLOYEE_TYPE (id, type)
0 Unknown
1 Manager
2 Grunt
3 Overlord

We create the table then insert like so:


insert into EMPLOYEE_TYPE (id, type) values (0, 'Unknown');
insert into EMPLOYEE_TYPE (id, type) values (1, 'Manager');
insert into EMPLOYEE_TYPE (id, type) values (2, 'Grunt');
insert into EMPLOYEE_TYPE (id, type) values (3, 'Overlord');

(We thought it would be a nice convention to use 0 to represent 'Unknown' across several tables.)

When we kicked off this script, it never made it beyond the second statement because:

mysql> insert into EMPLOYEE_TYPE (id, type) values (1, 'Manager');
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

MySQL first overrode our 0 value for 'Unknown', setting it to 1. Then we tried to insert 'Manager' with id 1 and BOOM! Primary Key violation. It turns out MySQL has an oddball special case for auto_increment columns. If you explicitly specify your id, it is used instead of the auto_increment. Fine. Perfect. UNLESS THAT NUMBER IS 0, IN WHICH CASE, 0 IS IGNORED AND THE AUTO_INCREMENT VALUE IS INSTEAD USED. WHAT THE--??

What is the meaning of this? From a page in the documentation:
No value was specified for the AUTO_INCREMENT column, so MySQL assigned sequence numbers automatically. You can also explicitly assign 0 to the column to generate sequence numbers. If the column is declared NOT NULL, it is also possible to assign NULL to the column to generate sequence numbers. When you insert any other value into an AUTO_INCREMENT column, the column is set to that value and the sequence is reset so that the next automatically generated value follows sequentially from the largest column value.
Ok so at least it's documented, but wow, there are such odd design decisions in MySQL sometimes! Fortunately they often provide configurable database parameters to disable them. And this strange case is no exception in that one can set sql_mode to NO_AUTO_VALUE_ON_ZERO to disable this loony behavior. Unfortunately I find nobody ever bothers to change these kinds of parameters from the defaults, and the defaults tend to err on the crazy side.

Saturday, March 26, 2016


I saw this BBC article yesterday - These unlucky people have names that break computers - it's about people with names such as Jennifer Null who have problems inputting their name into various systems because, you know, null.
But to any programmer, it’s painfully easy to see why “Null” could cause problems for a database. This is because the word “null” is often inserted into database fields to indicate that there is no data there. Now and again, system administrators have to try and fix the problem for people who are actually named “Null” – but the issue is rare and sometimes surprisingly difficult to solve.
Poor Jennifer Null - always at the mercy of poor programming. But the ghost of my favorite Tom Kyte post always returns when I hear of this kind of problem. JUST BIND!!

I don't know if it's difficult to deal with in Javascript or other parts of a modern system, but the database tier should have no problem. It doesn't matter if you have "NULL" in your name, it doesn't matter if your name is Bobby Tables or has any number of quotes, semicolons or reserved words. It doesn't matter if it contains the string DROP DATABASE or DELETE if...YOU. JUST. BIND.

Saturday, March 05, 2016

Dan Martensen's blog: SQL Performance of Join and Where Exists

Here's a neat little article on the blog of a fellow named Dan Martensen. I really enjoy performance experiments that show how one approach fares vs. another - it's something I do myself as well. :) This one shows how "where exists" performs vs. an inner join in Postgres 9.5.

Here is the article.

It turns out "where exists" is faster, because as the explain plan reveals, a join performs a Merge Join step, while "where exists" can get away with a Merge Semi Join step. What do these steps mean? Let's turn to the documentation (Source):
  • merge join: Each relation is sorted on the join attributes before the join starts. Then the two relations are scanned in parallel, and matching rows are combined to form join rows. This kind of join is more attractive because each relation has to be scanned only once. The required sorting might be achieved either by an explicit sort step, or by scanning the relation in the proper order using an index on the join key. 
As for Merge Semi Join, I can't find such a clear definition in the docs, but Dan Martensen's article (that I linked to above) mentions: "It’s worth noting Where Exists doesn’t need to return a value to the parent query in the SELECT. It’s WHERE clause compares parent to subquery ids using a boolean short circuit early exit."

So maybe that's a clue as for what goes on under the hood -- it just checks for the matching rows' existence then quits. No data retrieval required.

Perhaps I will soon check if MySQL behaves similarly, as my work will involve more MySQL soon!

Saturday, February 27, 2016

Your comments and questions wanted!

So once in a while I'll take a look at my blog statistics, and while I don't exactly have a vast readership, I do find that my most-read post by far is Postgres Query Using Window Functions. Here are the stats:


I'd like to write more on this apparently popular topic - I just need some inspiration for specific questions and/or use cases to experiment with. So if anyone reading this has a specific question on window functions (or anything else for that matter), I'd be happy to attempt to answer!

Feel free to just ask away in the comments!

Wednesday, February 24, 2016

From the "Databases at CERN" blog...

"My experience testing the Oracle In-Memory Column Store" The post is from 2014 but I just stumbled across it Googling for benchmarks for the Oracle In-Memory Column Store...Just purely out of curiosity. Cool to see that CERN made good use of it! Though I can't say I understand the physics analysis query they use as an example. :D

Friday, February 05, 2016

mwrynn PL/SQL table API now on github!

See title. :)

This inspires me to work on it some more - clean it up, document better, add new stuff. (Would really love to add column groupings/tags. i.e. tag a bunch of columns as "dimension" for example, and tell call an easy-peasy function to just "grab all columns tagged as 'dimension'"...

Thursday, January 07, 2016

Postgresql 9.5...released!

Check out some of these fabulous features of Postgres 9.5 if you haven't already:

We finally have upsert. Finally have CUBE, ROLLUP and GROUPING SETS (I really wanted these when I was playing around with Retrosheet baseball data). We also have BRIN indexes for indexing large-scale tables. And there's now even a way to interface with...dun dun dun...BIG DATA DATABASES.