Wednesday, June 29, 2011

resizing SHMMAX

Anyone who has used Postgres and changed their memory settings to "greater than tiny" has surely run into this problem. (I'm sure this applies to other databases too - I believe I've run into this in Oracle also). The problem is the kernel won't let Postgres take as much shared memory as it is requesting.


postgres@mwrynnix:~$ service postgresql restart
 * Restarting PostgreSQL 9.0 database server                                                                                                                                                                    * The PostgreSQL server failed to start. Please check the log output:
4e0b32b0.2cd4.1(0) 2011-06-29 10:12:01.406 EDT @: FATAL:  could not create shared memory segment: Invalid argument
4e0b32b0.2cd4.2(0) 2011-06-29 10:12:01.406 EDT @: DETAIL:  Failed system call was shmget(key=5432001, size=139411456, 03600).
4e0b32b0.2cd4.3(0) 2011-06-29 10:12:01.406 EDT @: HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter.  You can either reduce the request size or reconfigure the kernel with larger SHMMAX.  To reduce the request size (currently 139411456 bytes), reduce PostgreSQL's shared_buffers parameter (currently 16384) and/or its max_connections parameter (currently 104).
        If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
        The PostgreSQL documentation contains more information about shared memory configuration.
                                                                                                                                                                                                        [fail]

(In case you didn't realize startup didn't work, it tells you [fail] at the end there.)

This page explains it all - http://www.dbatodba.com/db2/how-to-do/how-to-alter-shmall-and-shmmax-kernel-parameters-on-linux

I would just like to add that in addition to changing the conf file (mentioned on that page), you can modify the file /proc/sys/kernel/shmmax to set it to the value you want NOW. You'll still need to change the .conf to "persist" the change when you reboot. Example below - changing 32 MB shmmax to 256MB:

root@mwrynnix:~# cat /proc/sys/kernel/shmmax
33554432
root@mwrynnix:~# echo 268435456 > /proc/sys/kernel/shmmax
root@mwrynnix:~# cat /proc/sys/kernel/shmmax
268435456

And now we start up Postgres...

root@mwrynnix:~# service postgresql start
 * Starting PostgreSQL 9.0 database server                                                                                                                                                              [ OK ]
root@mwrynnix:~#

Thursday, June 16, 2011

More on Pgpool-II: Errors that can be ignored during online recovery

Question: In the pgpool log, why are there some file not found errors like below? (These errors occurred during online recovery.)

FATAL:  the database system is starting up

...
4ded4257.51ef.7(0) 2011-06-06 17:10:51.493 EDT @: LOG:  restored log file "0000000A00000000000000C8" from archive
scp_wrapper: arg1=mwrynnix:/var/lib/postgresql/9.0/main2/archive_log/0000000A00000000000000C9, arg2=pg_xlog/RECOVERYXLOG arg3=
scp: /var/lib/postgresql/9.0/main2/archive_log/0000000A00000000000000C9: No such file or directory
4ded4257.51ef.8(0) 2011-06-06 17:10:51.775 EDT @: LOG:  could not open file "pg_xlog/0000000A00000000000000C9" (log file 0, segment 201): No such file or directory

Answer I found from author of pgpool: I think you can disregard these errors. The files rsync complains did
exist once so it was counted in the list to be copied. While rsync was
doing the job, they are disappeard, probably removed by PostgreSQL.
This could happen if PostgreSQL actively processes transactions, I
think. (http://pgfoundry.org/pipermail/pgpool-general/2011-May/003687.html)

Pgpool-II Online Recovery (continued)

There is a caveat mentioned in the Pgpool-II manual, in the Online Recovery section:

"Note that there is a restriction about online recovery. If pgpool-II itself is installed on multiple hosts, online recovery does not work correctly, because pgpool-II has to stop all clients during the 2nd stage of online recovery. If there are several pgpool hosts, only one will have received the online recovery command and will block connections."

My question to the newsgroup (pgpool-general) was:

What should be the course of action if I run multiple pgpool-II instances, and I need to recover a database? For example, let my environment consist of multiple, redundant application servers, each with its own pgpool. Attached to each pgpool is the same set of Postgres clusters, of course. My guess would be to perform the following steps:

1) Shut down all pgpool/app server pairs except for one (thus, for a small period of time, all connections will go through the one remaning pgpool/app server - this is fine for me)
2) Perform online recovery on the remaining pgpool
3) Start up all the previously-shutdown pgpools/app servers.

I was told this is the right course to take, with one caveat noted. Pgpool keeps a status file indicating the status of each node (attached, detached, ...) This file should either be deleted before starting up the rest of the pgpool instance, OR when starting up, use the -D option (Discard pgpool_status file and do not restore previous status). Starting up pgpool without -D, then running pcp_attach_node is NOT a good solution, as transactions may be written while one or more nodes are considered detached.

Pgpool-II Online Recovery

I've spent quite a bit of time learning about Pgpool-II online recovery, and getting it working. To recover a broken database, run the recovery script which will sync up the database then attach it to pgpool. Unfortunately the examples in the Pgpool-II manual are not complete - they just give you the general idea. So I had to comb the internet to clear up some haziness, and modify examples to make them work for me. Below is what I've got. It's tested and works. Stress tested even - by bombarding pgpool with inserts before during and after recovery - the result is all databases are in sync. Some of this is borrowed code and I've lost who the author is. If it's your code let me know and I'll properly credit you.

###############################################
#copy_base_backup.sh - 1st stage of recovery
#XXX TODO: remove exit 0 - make exit out with error codes when any command fails
#!/bin/sh

PSQL=/usr/bin/psql
SCP=/usr/bin/scp
SSH=/usr/bin/ssh
LOGGER="log /var/log/postgresql/pgpool_copy_base_backup.log"
#"/usr/bin/logger -i -p local0.info -t pgpool"
RSYNC="/usr/bin/rsync --archive --quiet --compress --rsh=$SSH --delete"
BASENAME=`/usr/bin/basename $0`
HOSTNAME=`/bin/hostname`
ID=`/usr/bin/id -un`

# $1 = Database cluster path of a master node.
# $2 = Hostname of a recovery target node.
# $3 = Database cluster path of a recovery target node.

#PG_HOME=/var/lib/postgresql
SRC_ARCHIVE_LOG_DIR=/var/lib/postgresql/9.0/main2/archive_log

SRC_DATA=$1
DST_HOST=$2
DST_DATA=$3

SRC_DBNAME=postgres
SRC_PORT=5431

log()
{
echo $(date +"%D %T") $2 >> $1
}

$LOGGER "Executing $BASENAME as user $ID"

$LOGGER "Executing pg_start_backup"
$PSQL -p $SRC_PORT -d $SRC_DBNAME -c "select pg_start_backup('pgpool-recovery')"

$LOGGER "Creating file recovery.conf"
echo "restore_command = '$SCP $HOSTNAME:$SRC_ARCHIVE_LOG_DIR/%f %p'" > $SRC_DATA/recovery.conf

$LOGGER "Rsyncing directory base"
$RSYNC $SRC_DATA/base/ $DST_HOST:$DST_DATA/base/
$LOGGER "Rsyncing directory global"
$RSYNC $SRC_DATA/global/ $DST_HOST:$DST_DATA/global/
$LOGGER "Rsyncing directory pg_clog"
$RSYNC $SRC_DATA/pg_clog/ $DST_HOST:$DST_DATA/pg_clog/
$LOGGER "Rsyncing directory pg_multixact"
$RSYNC $SRC_DATA/pg_multixact/ $DST_HOST:$DST_DATA/pg_multixact/
$LOGGER "Rsyncing directory pg_subtrans"
$RSYNC $SRC_DATA/pg_subtrans/ $DST_HOST:$DST_DATA/pg_subtrans/
$LOGGER "Rsyncing directory pg_tblspc"
$RSYNC $SRC_DATA/pg_tblspc/ $DST_HOST:$DST_DATA/pg_tblspc/
$LOGGER "Rsyncing directory pg_twophase"
$RSYNC $SRC_DATA/pg_twophase/ $DST_HOST:$DST_DATA/pg_twophase/
$LOGGER "Rsyncing directory pg_xlog"
$RSYNC $SRC_DATA/pg_xlog/ $DST_HOST:$DST_DATA/pg_xlog/
$LOGGER "Rsyncing file recovery.conf (with source deletion)"
$RSYNC --remove-source-files $SRC_DATA/recovery.conf $DST_HOST:$DST_DATA/

$LOGGER "Executing pg_stop_backup"
$PSQL -p $SRC_PORT -d $SRC_DBNAME -c 'select pg_stop_backup()'

exit 0

###################################################
#pgpool_recovery_pitr.sh - 2nd stage of recovery
#XXX TODO: remove exit 0 - make exit out with error codes when any command fails
#SRC_PORT and SRC_DBNAME need to be modified according to your system
#!/bin/sh

DATADIR=$1
PSQL=/usr/bin/psql
LOGGER="log /var/log/postgresql/pgpool_recovery_pitr.log"
BASENAME=`/usr/bin/basename $0`
ID=`/usr/bin/id -un`
SRC_PORT=5431
SRC_DBNAME=postgres

log()
{
echo $(date +"%D %T") $2 >> $1
}

$LOGGER "Executing $BASENAME as user $ID"

$LOGGER "Flushing sequence values to xlog"
# Force to flush current value of sequences to xlog
psql -p $SRC_PORT -t -c 'SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn' template1|
while read i
do
if [ "$i" != "" ];then
psql -p $SRC_PORT -c "SELECT setval(oid, nextval(oid)) FROM pg_class WHERE relkind = 'S'" $i
fi
done

$LOGGER "Executing pgpool_switch_xlog"
$PSQL -p $SRC_PORT -d $SRC_DBNAME -c "select pgpool_switch_xlog('$DATADIR/archive_log')"

exit 0

###################
#pgpool_remote_start - 3rd stage (note there is no "3rd stage" in pgpool.conf - it seems hardcoded to run this script by name
#XXX TODO: remove exit 0 - make exit out with error codes when any command fails
#DST_DB_START should be modified - it is the command that is run on the recovery node - maybe service postgreql start is fine, maybe you need to run pg_ctl, for example

#!/bin/sh

SSH=/usr/bin/ssh
LOGGER="log /var/log/postgresql/pgpool_remote_start.log"
DST_DB_START="service postgresql start"

BASENAME=`/usr/bin/basename $0`
ID=`/usr/bin/id -un`

DST_HOST=$1
DST_DIR=$2

log()
{
echo $(date +"%D %T") $2 >> $1
}

$LOGGER "Executing $BASENAME as user $ID"
$LOGGER "Starting remote PostgreSQL server"
$SSH -T $DST_HOST '$DST_DB_START' 2>/dev/null 1>/dev/null

exit 0