Thursday, March 14, 2013

OpenJPA headaches

I'm met with some incredulity every time I comment that fiddling with/working around the problems with O/R Mapper xyz wastes more of my time than it would take for me to handcode all my SQL. Here's another experience that makes me say just that.

Once upon a time, our Spring application used OpenJPA 2.1. The database was Postgres. Everything was happy until We encountered this bug which broke some of our app's queries: https://issues.apache.org/jira/browse/OPENJPA-2056 ...So, we upgraded to OpenJPA 2.2 which includes this fix.

Along with that fix, we got a brand new wonderful feature that we didn't ask for, that can't be disabled! This feature is ID caching, via a new allocationSize parameter (to the @SequenceGenerator annotation), which caches [allocationSize] number of IDs on the application side. 

OpenJPA sends ALTER SEQUENCE commands to the database in order to update the sequence appropriately - this is necessary for its caching scheme to work. They decided this new parameter should default to 50, so that's what we were using. You see when we upgraded to OpenJPA 2.2, we neglected to account for new features that might attempt to screw around with database objects behind our backs. The problem is that ALTER SEQUENCE can only be run by the sequence's owner, the application user is not the owner, and so an error is thrown. The OpenJPA docs say you can set the value of the allocationSize parameter to 1 to avoid using this feature, but when we tried that, it instead sent a command with a completely invalid syntax: "ALTER SEQUENCE mytable_id_seq"!

This invalid statement is fixed in OpenJPA 2.3, which has not been released at this time. We tested a SNAPSHOT build of 2.3, and found while it does correct the invalid syntax, it's still a problem. It sends an ALTER SEQUENCE statement when we don't want it to send any. Can't we just turn this thing off? Well, OpenJPA now has this feature that when this particular ALTER SEQUENCE statement fails, OpenJPA will just ignore it and log a warning. That's great and all but:
1) Now I have unnecessary "ERROR"s in my Postgres log. I monitor the log for "ERROR" and now this junk is cluttering it
2) One of my tables requires a special lock before modifying, therefore before we persist on the Java side, we begin a new transaction. So, this failed ALTER SEQUENCE statement winds up in the same transaction as the UPDATE/INSERT. Now, Postgres doesn't like when there's an error in your transaction - it forces all subsequent statements to also throw errors, and the only option is to rollback. The end result: we can't modify the data at all. It doesn't matter that the application pretends the first statement wasn't an error.

We worked around the issue by writing our own custom class for sequences, that subclasses AbstractJDBCSeq. We tell it to us a favor and not cache IDs in a way that won't work for us.

These ORMs tend to operate with the assumption that:
1) The ORM has total control over every database object
2) I actually want it to have total control over every database object. You know in the past I've worked with database clusters, each node replicated, where the databases need control over the sequences. For example, imagine such a cluster with three nodes. In order to avoid ID collisions, Node 1 might use IDs 1, 4, 7, ... Node 2 uses 2, 5, 8, ... and finally Node 3 uses 3, 6, 9, ... (This kind of configuration is not the norm, but it isn't all that uncommon either.) Now if you decide it's OK to start adding numbers to these sequences without my permission, well there goes the neighborhood.

If you're going to introduce a feature that requires a new set of database privileges, or does anything non-standard with database objects, at least make it optional!

Sigh!

No comments: