Thursday, June 16, 2011

Pgpool-II Online Recovery (continued)

There is a caveat mentioned in the Pgpool-II manual, in the Online Recovery section:

"Note that there is a restriction about online recovery. If pgpool-II itself is installed on multiple hosts, online recovery does not work correctly, because pgpool-II has to stop all clients during the 2nd stage of online recovery. If there are several pgpool hosts, only one will have received the online recovery command and will block connections."

My question to the newsgroup (pgpool-general) was:

What should be the course of action if I run multiple pgpool-II instances, and I need to recover a database? For example, let my environment consist of multiple, redundant application servers, each with its own pgpool. Attached to each pgpool is the same set of Postgres clusters, of course. My guess would be to perform the following steps:

1) Shut down all pgpool/app server pairs except for one (thus, for a small period of time, all connections will go through the one remaning pgpool/app server - this is fine for me)
2) Perform online recovery on the remaining pgpool
3) Start up all the previously-shutdown pgpools/app servers.

I was told this is the right course to take, with one caveat noted. Pgpool keeps a status file indicating the status of each node (attached, detached, ...) This file should either be deleted before starting up the rest of the pgpool instance, OR when starting up, use the -D option (Discard pgpool_status file and do not restore previous status). Starting up pgpool without -D, then running pcp_attach_node is NOT a good solution, as transactions may be written while one or more nodes are considered detached.

No comments: