Backup Troubleshooting - PostgreSQL iDataAgent

Backup Restore  
Table of Contents

PSQL0001: Unable to create a PostgreSQL instance

Symptom

Instance creation fails with the following error messages:

Invalid PostgreSQL User Name and Password.

OR

Connection to PostgreSQL server failed.

Cause 1

PostgresBrowseAgent.log shows the following error during instance creation:

7991 21550700 08/21 12:50:34 ### libpqInterface :: ConnectDb() - Connection Info string: port = '5432' dbname = 'postgres' user = 'pg_backup_agent' connect_timeout = '' password = **********
7991 21550700 08/21 12:50:34 ### libpqInterface :: ValidateConnection() - eRRoR -- PQreset failed to reset the connection. Error:FATAL: no
pg_hba.conf entry for host "[local]", user "pg_backup_agent", database "postgres", SSL off
FATAL: no pg_hba.conf entry for host "[local]", user "pg_backup_agent", database "postgres", SSL off

Here "[local]" relates to a UNIX socket connection, and thus the pg_hba.conf file lacks an entry to enable the socket connection from cvd that spawns psql -u postgres [username] from working. This can be tested from the client as root.

Resolution 1

Ensure pg_hba.conf file has an applicable line to match the user. The default line below will allow all users to connect to the client via the UNIX socket, as part of a default PostgreSQL install. It may be hashed out.

local all all md5

Alternatively, you can use the variation below to make it secure:

local all [username_entered_in_gui_for_instance_creation] md5

This will allow the user to connect to all databases locally via the UNIX socket.

Cause 2

PostgresBrowseAgent.log shows the following error during instance creation:

18041 1184f700 09/28 18:57:36 ### libpqInterface :: libpqInterface() - eRRoR -- Exception in [libpqInterface :: libpqInterface]
Type:[APPLICATION_EXCEPTION] Msg:[Lib dir is not present]
18041 1184f700 09/28 18:57:36 ### ExecutePostgre::init() - eRRoR --
Exception in [libpqInterface :: libpqInterface]
Type:[APPLICATION_EXCEPTION] Msg:[Lib dir is not present]
18041 1184f700 09/28 18:57:36 ### PostgresBrowseAgent::verifyUserInfo() -
connection failed
18041 1184f700 09/28 18:57:36 ### PostgresBrowseAgent::verifyUserInfo() -
exepg.postgre_exe failed

Resolution 2

Ensure the correct PostgreSQL lib directory is entered in the instance creation dialog. It must contain libpq.so library file (and other related library files). It is  used to connect to the database and execute queries.

  • To locate the lib directory, run the following command:

    bash-3.2# ./pg_config --libdir

    Example output:

    /PostgreSQL/9.2/lib

    If pg_config is not present which is installed as part of PostgreSQL install, run the following command (on UNIX clients):

    bash-3.2# locate libpq.so

    Example output:

    /PostgreSQL/9.2/lib/libpq.so

  • If the lib path contains libpq.so.x.x instead of libpq.so, create a symlink libpq.so to libpq.so.x.x before attempting instance creation.

    For example:

    On CentOS 5.8 using packaged PostgreSQL 8.2 or on CentOS 6.3 using custom PostgreSQL 9.2, libpq.so may not be present, but there will be a numbered version like libpq.so.4 or libpq.so.5.

    In this case, create a symlink version of libpq.so pointing to the numbered one:

    bash-3.2# ln -s /usr/lib64/libpq.so.x.x /usr/lib64/libpq.so

    You should be able to create instance successfully now.

Cause 3

PostgresBrowseAgent.log shows the following error during instance creation:

24576 1026b700 09/28 23:10:39 ### PostgresBrowseAgent::verifyUserInfo() -
m_PostgresServer->GetConfigFilePath() = SnapProtect
24576 1026b700 09/28 23:10:39 ### libpqInterface :: ConnectDb() -
Connection Info string: port = '5432' dbname = 'snapprotect' user = 'snapprotect'
connection_timeout = '' password = **********
24576 1026b700 09/28 23:10:39 ### libpqInterface :: IsSuperUser() - eRRoR
- Exception in [libpqInterface :: IsSuperUser]
Type:[APPLCATION_EXCEPTION] Msg:[The user is not a super user]
24576 1026b700 09/28 23:10:39 ### libpqInterface :: libpqInterface() -
eRRoR - Exception in [libpqInterface :: IsSuperUser]
Type:[APPLCATION_EXCEPTION] Msg:[The user is not a super user]
24576 1026b700 09/28 23:10:39 ###ExecutePostgre::init() - eRRoR -
Exception in [libpqInterface :: IsSuperUser] ]
Type:[APPLCATION_EXCEPTION] Msg:[The user is not a super user]
24576 1026b700 09/28 23:10:39 ### PostgresBrowseeAgent::verifyUseerInfo()
- connection failed
24576 1026b700 09/28 23:10:39 ###PostgresBrowseeAgent::verifyUseerInfo()
- exepg.postgre.exe failed
24576 1026b700 09/28 23:10:39 ### PostgresBrowseeAgent::BrowseDBs() -
verifyUser failed (err=)
24576 1026b700 09/28 23:10:39 ### PostgresBrowseeAgent::DoBrowse() -
BrowseDBs failed. M_errorNum = 1577058305
24576 1026b700 09/28 23:10:39 ###
PostgresBrowseeAgent::SendBrowseResponse() - Sending mstype =
POSTGRESQL_BROWSE_FAILED
24576 1026b700 09/28 23:10:39 ### Main() - DoBrowse() failed

Here, the user provided in the instance creation dialog box doesn't have the necessary usesuper or superuser attribute.

Resolution 3

Use the following steps to add the superuser attribute to the user:

  1. Once connected to PostgreSQL, run the following query to check if the usesuper attribute is incorrectly set:

    select * from pg_user;

    In the screenshot above, the snapprotect user has a  f set against usesuper attribute. Instance creation using this user will therefore generate the above error.

  1. When connected to PostgreSQL as superuser, run the following query to add the attribute:

    PGSQL> ALTER USER username WITH superuser;

  2. Run the following query again to check if the attribute now shows a t against usesuper attribute.

    select * from pg_user;

Cause 4

PostgresBrowseAgent.log shows the following error during instance creation due to the PostgreSQL server connection failure.

25925 ae3d6740 03/06 16:35:25 ### libpqInterface :: ConnectDb() - Connection Info string: port = '5433' dbname = 'postgres' user = 'postgres' connect_timeout = ' password = **********

25925 ae3d6740 03/06 16:35:25 ### libpqInterface :: ValidateConnection() - eRRoR -- PQreset failed to reset the connection. Error:FATAL: Peer authentication failed for user "postgres"

FATAL: Peer authentication failed for user "postgres"

25925 ae3d6740 03/06 16:35:25 ### libpqInterface :: ValidateConnection() - eRRoR -- Exception in [libpqInterface :: ValidateConnection] Type:[POSTGRE_EXCEPTION] Msg:[PQreset failed to reset the connection]

25925 ae3d6740 03/06 16:35:25 ### libpqInterface :: libpqInterface() - eRRoR -- Exception in [libpqInterface :: ValidateConnection] Type:[POSTGRE_EXCEPTION] Msg:[PQreset failed to reset the connection]

25925 ae3d6740 03/06 16:35:25 ### ExecutePostgre::init() - eRRoR -- Exception in [libpqInterface :: ValidateConnection] Type:[POSTGRE_EXCEPTION] Msg:[PQreset failed to reset the connection]

25925 ae3d6740 03/06 16:35:25 ### PostgresBrowseAgent::verifyUserInfo() - connection failed

25925 ae3d6740 03/06 16:35:25 ### PostgresBrowseAgent::verifyUserInfo() - exepg.postgre_exe failed

25925 ae3d6740 03/06 16:35:25 ### PostgresBrowseAgent::BrowseDBs() - verifyUser failed (err=)

Resolution 4

During instance creation, the software tries to connect to the PostgreSQL server as a root user. So, for successful peer authentication, add the following mappings in the pg_hba.conf and the pg_ident.conf files.

  • In the pg_hba.conf file, add the corresponding map name configured for the PostGres DB user as shown in the example below.

    # TYPE    DATABASE    USER    ADDRESS    METHOD

    # "local" is for Unix domain socket connections only

      local    all    postgres    peer    map=testmap

  • In the pg_ident.conf file, add mapping between the PostGres DB user and the system user 'root' as shown in the example below.

    # Put your actual configuration here

    # ----------------------------------

    # MAPNAME    SYSTEM-USERNAME    PG-USERNAME

      testmap    postgres    postgres

      testmap    root    postgres

Now reload the PostgreSQL server using the following example command.

./pg_ctl -D /opt/PostgreSQL/9.1/data reload

PSQL0002: Insufficient disk space due to transaction logs generated during File System based backups

Cause

During File System based backup, if you have only full backups scheduled, then the transaction logs get saved between two full backups. These logs are not pruned even if Archive Delete checkbox is selected under Log Backup tab in the instance properties. These logs are not pruned because the software does not backup these logs during the current backup. Hence, with time, the transaction logs fill up considerable space in the disk.

Resolution

To handle this situation, schedule several log only backups between two full backups. Running log backups between full backups helps in pruning the logs accumulated since the last full backup. This also facilitates support for Point-in-time restores.

PSQL0003: Log only backup operation from the CommCell Console fails for Postgres Plus® Advanced Server (PPAS) version 9.0

Cause

Backup operation from the CommCell Console fails if the WAL segment size is not configured at 16MB.

Resolution

For successful log only backups, it is mandatory that the WAL segment size for PostgreSQL is 16 MB both for Windows and UNIX clients. For step-by-step instructions to configure WAL segment size, click Performing Log Backups.

PSQL0004: PostgreSQL backup operation fails due to incorrect Archive Log directory configuration

Symptom

Backup operation fails with the following error messages:

Error Code: [94:13]
Description: PostgreSQL Database: [~WAL Archiving is NOT ENABLED. Please enable WAL Archiving for PostGres server.~] Data Backup Failed.
Source: kmlinsnap, Process: PostGresBackupParent

2388 b24 02/14 21:30:07 279 PGLogBackup :: GenerateCollectFile() - eRRoR -- Exception in [PGLogBackup :: GenerateCollectFile] Type:[POSTGRE_EXCEPTION] Msg:[Some log files are missing from the WAL dir]

2388 b24 02/14 21:30:07 279 PostGreLogBackupInterface() - eRRoR -- Exception in [PGLogBackup :: GenerateCollectFile] Type:[POSTGRE_EXCEPTION] Msg:[Some log files are missing from the WAL dir]

2388 b24 02/14 21:30:07 279 PostGresClBackupBase::Initialize() - Error generating collect file

Cause

The backup operation fails with the following errors in the PostGresBackupChild.log file:

6178 aed88300 01/24 11:01:21 247081 PostGreFsBackupInterface() - Enter
6178 aed88300 01/24 11:01:21 247081 libpqInterface :: ConnectDb() - Connection Info string: port = '5492' dbname = 'postgres' user = 'postgres' connect_timeout = '' password = **********
6178 aed88300 01/24 11:01:21 247081 PGFsBackup :: IsWALArchivingEnabled() - eRRoR -- Exception in [PGFsBackup :: IsWALArchivingEnabled] Type:[POSTGRE_EXCEPTION] Msg:[WAL archiving is not enabled]
6178 aed88300 01/24 11:01:21 247081 PGFsBackup :: PGFsBackup() - eRRoR -- Exception in [PGFsBackup :: IsWALArchivingEnabled] Type:[POSTGRE_EXCEPTION] Msg:[WAL archiving is not enabled]
6178 aed88300 01/24 11:01:21 247081 PostGreFsBackupInterface() - eRRoR -- Exception in [PGFsBackup :: IsWALArchivingEnabled] Type:[POSTGRE_EXCEPTION] Msg:[WAL archiving is not enabled]
6178 aed88300 01/24 11:01:21 247081 PostGresClBackupBase::Initialize() - Error generating collect file

2072 9b8 12/10 10:10:29 21565 libpqInterface :: AnalyzeResult() - eRRoR -- PQresultStatus indicates failure for SHOW archive_mode. Error:ERROR: unrecognized configuration parameter "archive_mode"
2072 9b8 12/10 10:10:29 21565 PGFsBackup :: IsWALArchivingEnabled() - Could not obtain the archive_mode setting.
2072 9b8 12/10 10:10:29 21565 PGFsBackup :: IsWALArchivingEnabled() - eRRoR -- Either provide a WAL archive command or enable EmptyWalArchiveCommand

Resolution

  1. Check the permission for WAL directory. Postgres user and group should have read and write permissions to the WAL directory.

    drwx-------  15    postgres postgres    4096    Jan 28 12:20    data
    drwxrwxrwx    2    postgres postgres    4096    Jan 28 12:20    wal

  2. Check whether the Archive Log Directory configured in the PostgreSQL instance properties and the postgresql.conf file is the same.
  3. Follow all the checkpoints listed under Configuring the Archive Log Directory.

PSQL0005: FS based backup operation hangs due to incorrect archive command

Cause

Incorrect archive command provided in the postgresql.conf file causes the pg_stop_backup process to hang resulting in the FS based backup operation to hang.

Resolution

Follow these steps for successful FS based backup operations,

  1. Execute the following query to identify the hung pg_stop_backup process.

    select datname,pid,usename,waiting,state,query from pg_stat_activity;

    For example,

    postgres=# select datname,pid,usename,waiting,state,query from pg_stat_activity;

    datname  | pid  | usename  | waiting | state | query
    postgres | 4903 | postgres | f | active | select pg_stop_backup();
    postgres | 5089 | postgres | f | active | select datname,pid,usename,wait

    For PostgreSQL version 8.4 or below, use the following query.

    select datname,procpid,waiting,usename,current_query from pg_stat_activity;

  2. Execute the following query to terminate the hung pg_stop_backup process.

    select pg_cancel_backend(<PID>);

    For example,

    postgres=# select pg_cancel_backend(4903);
    pg_cancel_backend
    -------------------
    t
    (1 row)

    For PostgreSQL version 8.4 or below, use the following query.

    select pg_cancel_backend(<procpid>);

  3. Correct the archive command in the postgresql.conf file.

    Example of a correct archive command,

    archive_command = 'cp %p /vkm/PostgreSQL/9.3/wl/%f'

  4. Restart the PostgreSQL server.
  5. Make sure that the archive command provided in the postgresql.conf file is correct. You can test this by executing the following commands successfully.
  6. Select pg_start_backup(‘Testing’);
    Select pg_stop_backup();

  7. Run a full PostgreSQL FS based backup operation.

PSQL0008: PostgreSQL dump based backup operation may fail on Ubuntu

Symptom

Dump based backup operation may fail on Ubuntu with the following error messages:

Error Code: [94:15]
Description: PostgreSQL Database: [~template1~] Data Backup Failed with PostgreSQL Error: [~pg_dump: [archiver (db)] connection to database "template1" failed: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? ~].
Source: pgubun124, Process: PostGresBackupChild

Cause

The backup operation may fail due to running third-party PostgreSQL packages. You will find the following errors in the PostGresBackupChild.log file:

9800 d2387700 02/06 14:35:11 259550 PostGresBackupDb::writeFifo() - sh -c "/opt/PostgreSQL/9.0/bin/pg_dump -U postgres -Fc -b --port=5432 'template1' > /opt/simpana/iDataAgent/jobResults/2/10087/pipe259550_9800" 1
9800 d2387700 02/06 14:35:11 259550 PostGresBackupDb::writeFifo() - System command [sh -c "/opt/PostgreSQL/9.0/bin/pg_dump -U postgres -Fc -b --port=5432 'template1' > /opt/simpana/iDataAgent/jobResults/2/10087/pipe259550_9800"] failed, ret = 1 out=[] err=[pg_dump: [archiver (db)] connection to database "template1" failed: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
]
9800 eddc8740 02/06 14:35:11 259550 PostGresBackupDb::DoBackupDb() - result = failed
9800 eddc8740 02/06 14:35:11 259550 ::PostGresClBackupBase::DoBackupPostGres() - PostGresClBackupBase::DoBackup* returned error

Resolution

Create a symbolic link to the socket file as shown in the below example.

root@pgubun124: # ln -s /tmp/.s.PGSQL.5432 /var/run/postgresql/.s.PGSQL.5432

PSQL0010: File System based backup operation may fail if user tablespace is created under PostgreSQL server data directory

Symptom

The operation goes to a pending state when the job progress bar displays 25%. You will find the following error in the PostGresBackupParent.log file:

3315 89a0a300 09/24 21:42:35 360381 PGFsBackup :: DumpFileList() - DirPath = [/var/lib/pgsql/data/pg_tblspc] : WalDir [/largevol/8.4/wal]: ClusterDir_m = [/var/lib/pgsql/data]
3315 89a0a300 09/24 21:42:35 360381 PGFsBackup :: DumpFileList() - DirPath = [/var/lib/pgsql/data/pg_tblspc] : WalDir [/largevol/8.4/wal]: ClusterDir_m = [/var/lib/pgsql/data]
3315 89a0a300 09/24 21:42:35 360381 PGFsBackup :: DumpFileList() - DirPath = [/var/lib/pgsql/data/pg_tblspc] : WalDir [/largevol/8.4/wal]: ClusterDir_m = [/var/lib/pgsql/data]
3315 89a0a300 09/24 21:42:35 360381 PGFsBackup :: DumpFileList() - DirPath = [/var/lib/pgsql/data/pg_tblspc] : WalDir [/largevol/8.4/wal]: ClusterDir_m = [/var/lib/pgsql/data]
3315 89a0a300 09/24 21:42:35 360381 finishedWorkWithoutExit() - Failed to release resource for JobID [360381]. ErrCode [-1]

Cause

The $PGDATA/pg_tblspc directory (data directory/pg_tblspc) contains symbolic links that point to each of the user created tablespaces defined in the cluster. Creating tablespace inside the data directory creates a circular symbolic link for each tablespace to pg_tblspc directory. This causes FS based backup operation to fail.

For example:

PGDATA = /var/lib/pgsql/data

postgres=# select spcname, spclocation from pg_tablespace;
spcname      | spclocation
------------+-------------------------------
pg_default    |
pg_global     |
test                 | /var/lib/pgsql/data/pg_tblspc
(3 rows)

-bash-3.2$ ls -l /var/lib/pgsql/data/pg_tblspc
total 4
lrwxrwxrwx 1 postgres postgres 29 Sep 24 21:38 16400 -> /var/lib/pgsql/data/pg_tblspc
-rw------- 1 postgres postgres 4 Sep 24 21:38 PG_VERSION

-bash-3.2$ ls -l 16400/
total 4
lrwxrwxrwx 1 postgres postgres 29 Sep 24 21:38 16400 -> /var/lib/pgsql/data/pg_tblspc
-rw------- 1 postgres postgres 4 Sep 24 21:38 PG_VERSION

-bash-3.2$ ls -l 16400/16400
lrwxrwxrwx 1 postgres postgres 29 Sep 24 21:38 16400/16400 -> /var/lib/pgsql/data/pg_tblspc

Resolution

Check if any user tablespace is created inside $PGDATA. Move the user tablespace outside of the PGDATA directory.

PSQL0011: Dump based backup operation fails for PostgreSQL version 8.2 on Windows

Symptom

Dump based backups for PostgreSQL version 8.2 on Windows fail with below error:

PostgreSQL Database: [~Tar Dump format is not supported for PostgreSQL version '8.2'. You need to set registry keys; 'sPGCompDump' , 'sPGStagingDir' to enable Compressed Dump Format. Please refer to the Documentation for more details.~] Data Backup Failed.

Cause

Dump based backup operation is performed using the –Ft option that provides the dump output in Tar format.

Resolution

Create the additional setting sPGCompDump and set its value as Y. This provides the dump in a compressed output format and the dump is staged at the job results folder by default.

You can change the staging location by creating additional setting sPGStagingDir and setting its value as the new staging location.

PSQL0012: Dump based backup operation fails for table size greater than 8GB

Symptom

PostgreSQL dump based backup for table size greater than 8GB fails with the below error:

PostgreSQL Database: [~Tar Dump format is not supported if table size is more than 8GB. You need to set registry keys; 'sPGCompDump' , 'sPGStagingDir' to enable compressed dump format. Please refer to the Documentation for more details.~] Data Backup Failed.

Cause

Tar format has a limit of 8 GB for backing up individual tables.

Resolution

Create the additional setting sPGCompDump and set its value as Y. This provides the dump in a compressed output format and the dump is staged at the job results folder by default during backups and restores.

You can change the staging location by creating additional setting sPGStagingDir and setting its value as the new staging location.