Best Practices - SnapProtect for VMware

General

  • Separate initiator groups must be used for the client and proxy computers, and LUNs should be added to both groups.
  • If the Virtual Server Agent is no longer required to run data protection operations, you can release the Virtual Server Agent license instead of uninstalling the agent. If the agent is uninstalled, you will not be able to:

    • Perform Live Browse.
    • Unmount snapshots mounted by any Virtual Server Agent.
  • In environments leveraging Fibre Channel storage (required for HDS), install the Virtual Server Agent and MediaAgent on a physical computer.
  • In array management, enter storage addresses for iSCSI and NFS in the same format used for ESX servers. For example, use an IP address for both entries.
  • When using NFS storage, enter each IP address that is used into array management. Entering only a single IP for a management interface is not sufficient.
  • The Virtual Server Agent proxy must have access to the storage network. If you have an isolated network, an additional network connection must be added to the proxy.
  • Use a short name for an NFS datastore. When you perform a SnapProtect backup of an NFS datastore, SnapProtect can append up to 20 characters to the name of the NFS datastore to create the volume label. The ESX server supports a volume label of 42 characters.
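
The length arithmetic in the last item can be checked before a datastore is named. The following is a minimal sketch, assuming only the two limits stated above (a 42-character volume label and a suffix of up to 20 characters). The constant and function names are hypothetical and are not part of any SnapProtect or VMware interface.

```python
# Limits taken from the guideline above; all names here are illustrative.
ESX_VOLUME_LABEL_MAX = 42  # volume label length supported by the ESX server
SNAP_SUFFIX_MAX = 20       # most characters SnapProtect may append


def max_datastore_name_length() -> int:
    """Longest NFS datastore name that still leaves room for the suffix."""
    return ESX_VOLUME_LABEL_MAX - SNAP_SUFFIX_MAX


def datastore_name_fits(name: str) -> bool:
    """Return True if the name plus the longest possible suffix fits the label."""
    return len(name) + SNAP_SUFFIX_MAX <= ESX_VOLUME_LABEL_MAX
```

Under these assumptions, an NFS datastore name of 22 characters or fewer always leaves room for the appended characters.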

Subclient Configuration and Datastores

Create user-defined subclients to optimize backup operations against virtual machine datastores or datastore clusters. The following guidelines can help increase the efficiency of backups:

  • Use separate datastores or datastore clusters for large, critical, and high transaction virtual machines, with a small number of virtual machines for each datastore.
  • Create separate subclients for different classes of virtual machines, where the VMs in each class have the same protection requirements and backup methods. In particular, create separate subclients for:
    • Regular streaming backups
    • SnapProtect backups with backup copy
  • For each user-defined subclient, include virtual machines that reside on the same datastore or datastore cluster as described in Adding Subclient Content. You can define automatic discovery rules to target virtual machines on specific datastores, hosts, or other entities, and filters to exclude virtual machines based on similar criteria.
  • Avoid having multiple subclients that address the same datastore, so that backups for different subclients do not have to take multiple hardware snapshots of the same datastore during SnapProtect backups.
  • Use the same storage policy for virtual machines in the same datastore to minimize the number of snapshots.
  • Minimize the number of LUNs that connect to the ESX proxy server to increase the efficiency of mounting snapshots.

The virtual machines in each subclient are processed in a single backup job. When different subclients address different datastores, backup jobs can run in parallel for better performance when backing up many virtual machines.

This approach provides greater scalability, increases the efficiency of backups, reduces the impact on production systems, and simplifies management.
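
The one-subclient-per-datastore guideline can be illustrated with a small planning helper. This is a sketch only: the inventory format, the `sc_` naming prefix, and the function name are assumptions made for illustration, and in practice subclient content is defined through discovery rules and filters rather than a script like this.

```python
from collections import defaultdict


def plan_subclients(vms):
    """Group VM names by datastore so that each datastore maps to one subclient.

    vms: iterable of (vm_name, datastore_name) pairs (hypothetical inventory).
    Returns a dict of subclient name -> sorted list of VM names.
    """
    by_datastore = defaultdict(list)
    for vm_name, datastore in vms:
        by_datastore[datastore].append(vm_name)
    # One subclient per datastore avoids repeated hardware snapshots
    # of the same datastore by different subclients.
    return {f"sc_{ds}": sorted(names) for ds, names in by_datastore.items()}


# Hypothetical inventory: critical VMs sit alone on dedicated datastores.
inventory = [
    ("sql01", "ds_tier1_a"),
    ("exch01", "ds_tier1_b"),
    ("web01", "ds_general"),
    ("web02", "ds_general"),
]
plan = plan_subclients(inventory)
```

Because no two subclients in the resulting plan share a datastore, their backup jobs can run in parallel without snapshotting the same datastore twice.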

Using ESX Server for Snapshot Mount

To avoid scanning all ESX servers in a cluster when unmounting snapshots, use a standalone ESX server to mount a snapshot.

To mount a hardware snapshot to a proxy ESX server that is part of a VMware cluster, configure the nClusterMount additional setting on the Virtual Server Agent proxy to ensure that snapshots are properly unmounted when the operation is completed. All ESX servers in the cluster are scanned when unmounting snapshots.

Note: The nClusterMount additional setting is valid for V11 but might not be found using the Lookup option. You can enter this additional setting manually with the following parameters: Name = nClusterMount; Category = VirtualServer; Type = Integer; Value = 1. If you see a warning that the setting is not valid for the current release, click Yes to continue, then make sure that the Value is set to 1 before saving the results.

Advantages of SnapProtect for Virtual Machine Protection

SnapProtect offers significant advantages for protecting critical virtual machines (VMs):

  • Point-in-time snapshots are taken in a fraction of the time needed for traditional streaming backups, using VMware software snapshots to create hardware snapshots on storage arrays.
  • Because the time applications need to be quiesced for backups is minimized for snapshots, the time needed to reconcile the production VM with transactions that occur during the backup is also minimized.
  • Snapshots can be taken as often as necessary, providing multiple daily recovery points for high-transaction VMs.
  • Critical virtual machines can be recovered and restored to service quickly.
  • Full VMs or files and folders can be restored from snapshots or backup copies.
  • Snapshots can be backed up to disk by a proxy (backup copy), limiting the impact of backups on production VM resources.
  • Based on policies, snapshots are automatically catalogued and indexed for restores and Backup Copy.

For incremental backups, only changes are written to backup media.

Considerations

  • For less critical VMs, use traditional streaming backups with the VMware vStorage APIs for Data Protection (VADP) to provide daily backup and recovery with reduced infrastructure costs.
  • For special cases, use application or file system agents in guest VMs to manage backups. This approach is useful when storage is presented directly to virtual machines, such as raw device mappings (RDMs), direct iSCSI, or network file system (NFS); large databases or large numbers of files; and applications that are not snapshot compatible.

Assign Virtual Machines to Tiers Based on Service Level Agreements (SLAs)

To design your protection deployment for virtual machines, classify VMs according to the recovery point objectives (RPOs) for different types of VMs.

Tier 1

Tier 1 VMs are the most critical VMs:

  • First priority for recovery
  • Multiple recovery points per day
  • 1 TB or more in size with multiple disks
  • High transaction and data change rates
  • Dedicated resources

Recommended Protection Method: Use SnapProtect backups for Tier 1 VMs.

Tier 2

  • Second priority for recovery
  • One or two recovery points per day
  • Moderate to low transaction rates
  • 200-500 GB in size with multiple disks

Recommended Protection Method: Whenever possible, use traditional streaming VADP backups. If necessary, schedule daily or twice-daily SnapProtect snapshots to support fast recovery. For special cases, use in-guest agents.

Tier 3

  • Lowest priority for recovery
  • One recovery point per day
  • Low transaction rates
  • Fairly static data

Recommended Protection Method: Use traditional streaming VADP backups with no snapshots.
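
One way to apply the tier definitions is a simple rule-based classifier. The sketch below is illustrative only: the `VmProfile` fields, the exact thresholds (1 TB read as 1024 GB, 200 GB as the lower bound for Tier 2), and the order of the rules are assumptions chosen to mirror the lists above, not a product feature.

```python
from dataclasses import dataclass


@dataclass
class VmProfile:
    """Hypothetical summary of one VM's protection requirements."""
    name: str
    size_gb: int
    recovery_points_per_day: int
    high_transaction: bool


# Recommended methods paraphrased from the tier descriptions above.
PROTECTION_METHOD = {
    1: "SnapProtect",
    2: "Streaming VADP (SnapProtect snapshots if necessary)",
    3: "Streaming VADP, no snapshots",
}


def assign_tier(vm: VmProfile) -> int:
    """Assign a tier using assumed thresholds based on the tier lists."""
    # Tier 1: high transaction rate, 1 TB or more, or more than two
    # recovery points per day.
    if vm.high_transaction or vm.size_gb >= 1024 or vm.recovery_points_per_day > 2:
        return 1
    # Tier 2: mid-sized VMs, or VMs that need two recovery points per day.
    if vm.size_gb >= 200 or vm.recovery_points_per_day == 2:
        return 2
    # Tier 3: everything else (small, fairly static, one recovery point).
    return 3
```

For example, `assign_tier(VmProfile("file01", 50, 1, False))` falls through to Tier 3, and `PROTECTION_METHOD[3]` then gives the matching recommendation.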

Protection Methods

SnapProtect Plus Backup Copy

When SnapProtect is used, a hardware snapshot is created on the storage array as soon as the VMware snapshot is completed, and the VMware snapshot is then removed immediately. This approach minimizes the size of the redo log and shortens the reconciliation process, which reduces the impact on the virtual machine being backed up and lessens the storage requirement for the temporary file. The secondary backup copy is then performed from the hardware snapshot, outside the production environment.

Large, critical, and high transaction virtual machines can be excluded from streaming backups and protected using SnapProtect snapshots. Snapshots support application consistent backups for applications such as Oracle, SQL Server, SharePoint, and Exchange. Hardware snapshots provide multiple persistent recovery points a day for critical VMs, including full VM recovery as well as granular file and folder level recovery, while minimizing the load on production VMs and infrastructure.

In the event of corruption of a full datastore where the underlying storage volume is intact, SnapProtect enables a full volume revert at the hardware level, so that all the VMs in the datastore can be recovered at the same time, reducing downtime.

Multiple readers can be configured to perform simultaneous quiescing of multiple VMs for faster creation or deletion of software snapshots, reducing the overall SnapProtect job time.

The redo log time for large VMs is minimized, to enable larger datastores and VMs to be protected.

A snapshot (Snap Copy) can be used by an ESX proxy as the source for secondary VADP backups, offloading the backup load for the largest VMs to minimize the impact on production systems.

Benefits

  • Low impact on production systems
  • Multiple recovery points per day
  • Fast recovery copy
  • Reduces the backup workload by using a proxy server to create the daily backup copy.

Considerations

  • An ESX proxy is needed to mount a hardware snapshot for Backup Copy operations.
  • Additional storage required for snapshot reserve.
  • Possible impact on other production systems using the same storage.
  • Array overhead during snapshot mount operations.

Traditional VADP Protection

Backups of virtual machines begin by quiescing the virtual machine and taking a VMware software snapshot. While the backup is in progress the virtual machine disk (VMDK) is frozen and changes are written to a temporary file (redo log). Once the backup completes, the VMware snapshot is removed and the virtual machine disk is updated with changes from the redo log (reconciled). For high transaction VMs the redo log can grow significantly if the backup takes a long time to complete, approaching the size of the entire VMDK for the source virtual machine. For other VMs, the changes recorded in the redo log are minimal.
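
The relationship between backup duration and redo log size can be approximated with simple arithmetic. The sketch below is a rough model, not a VMware formula: it assumes a constant change rate, assumes every write lands on a block not yet recorded in the redo log, and caps the result at the VMDK size, as the paragraph above suggests. The function and parameter names are hypothetical.

```python
def estimate_redo_log_gb(change_rate_gb_per_hour, backup_hours, vmdk_size_gb):
    """Rough upper estimate of redo log growth while a snapshot is open.

    Assumes a constant change rate and that the redo log cannot usefully
    exceed the size of the VMDK it tracks.
    """
    written = change_rate_gb_per_hour * backup_hours
    return min(written, vmdk_size_gb)


# A high transaction VM during a long streaming backup (illustrative numbers).
long_backup = estimate_redo_log_gb(25, 8, 500)
# The same VM when the VMware snapshot is held for only a few minutes.
short_snapshot = estimate_redo_log_gb(25, 0.1, 500)
```

With these illustrative numbers, the long backup accumulates about 200 GB of redo log and the short snapshot about 2.5 GB, which is the difference that removing the VMware snapshot immediately is meant to exploit.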

VADP provides protection through an initial full backup supplemented by daily incremental backups and a periodic synthetic full backup to provide a single daily recovery point. This approach can include the following elements:

  • VMware's Changed Block Tracking (CBT) feature identifies changed blocks for incremental backups.
  • A synthetic full backup merges incremental changes from backup media to create a complete point-in-time VM backup without accessing the production VM.
  • A secondary copy of a virtual machine can be updated regularly using the DASH Copy feature with deduplication. Only changed data is written to the secondary copy. DASH Copy creates a new full backup by updating reference counters on deduplicated blocks that exist on disk. DASH Copy can be run on a frequency dictated by your backup retention policy, ensuring that old data blocks that are no longer needed are deleted from the system (known as data aging).
  • Advanced transport modes for VMs (SAN or HotAdd) minimize the impact of backup operations on local area networks (LANs).

This approach meets short term and long term retention needs with minimal impact on VMs, and can be used when a limited number of ESX proxies (required for SnapProtect Backups) are available.
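
The idea behind a synthetic full backup can be shown with a toy model. The sketch below represents each backup as a mapping of block number to block contents and merges incrementals over the last full backup, oldest first. It is a conceptual illustration under those assumptions only; it does not reflect the actual on-media format or the way deduplication reference counters are handled.

```python
def synthesize_full(full, incrementals):
    """Merge incremental block maps (oldest first) onto a full backup.

    full: dict of block number -> contents from the last full backup.
    incrementals: list of dicts holding only the blocks changed since
    the previous backup.
    Returns a new dict; the inputs are left unmodified.
    """
    merged = dict(full)
    for changed_blocks in incrementals:
        # Later incrementals overwrite earlier versions of the same block.
        merged.update(changed_blocks)
    return merged


full_backup = {0: "A0", 1: "B0", 2: "C0"}
daily_incrementals = [{1: "B1"}, {2: "C2", 3: "D2"}]
synthetic_full = synthesize_full(full_backup, daily_incrementals)
```

The merge reads only from backup data, which mirrors the point made above that a synthetic full backup does not need to access the production VM.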

Benefits

  • It is possible to recover a full VM or any individual file from an incremental point in time backup (primary copy or any secondary copies). There is no need to consolidate daily backups into a synthesized full backup.

Considerations

  • For high transaction VMs, this approach may not be optimal, because high change rates in VMs require a long reconciliation process.
  • For high transaction VMs, incremental changes using CBT can take almost as long as a full backup because changes are written across the disk.
  • If hardware snapshots are required for some virtual machines, those can be configured in a separate backup set.

In-Guest Agents

In some cases, high transaction virtual machines can be protected by installing an application or file system agent in the guest (source VM). This approach is useful when storage is presented directly to a virtual machine including RDMs, direct iSCSI, or NFS, or when the VM has a large database or large number of files. This can also be used to address VMs that do not tolerate a VMware snapshot no matter how brief the duration.

Best Practices for Implementing Virtual Machine Protection

Backup sets provide an organizational and management point for each class of VM. Use a separate backup set for each protection method. For example, use one backup set for Tier 1 VMs with SnapProtect plus Backup Copy, and another for Tier 3 VMs with traditional VADP protection.

Use a small number of subclients per backup set, each including similar classes of virtual machines across datastores, with separate subclients for large, critical, and high transaction VMs. This approach provides greater scalability and simplifies management; it also requires less intervention as the environment grows and changes (for example, removing or adding datastores or moving a VM to a different datastore).

Organize high transaction VMs in separate datastores. This approach is recommended for application performance; it also enables the use of hardware snapshots for data protection without processing too many snapshots for each datastore. Do not include VMs that are backed up using traditional VADP on the same datastore as VMs that are backed up using SnapProtect plus Backup Copy.

Use naming patterns and filters to automatically include or exclude datastores, hosts, and virtual machines in a subclient.

All virtual machines in a single datastore should be protected as part of the same storage policy to minimize the number of snapshots.

Scheduling Considerations and Examples

Use schedule policies to manage the timing of each type of protection operation.

Schedule SnapProtect and streaming VADP protection operations at different times, to avoid contention issues at the hypervisor or storage level.

Schedule Virtual Server Agent (VSA) and guest-based application agent protection operations at different times. Both protection methods create application-consistent backups; they should run at different times to avoid any potential conflicts.

Examples

The following tables provide examples of how different classes of VMs and protection methods can be scheduled.

VM Class           Tier 1                               Tier 2                        Tier 3
Protection Method  SnapProtect plus Backup Copy         SnapProtect plus Backup Copy  Traditional Streaming VADP
Backup Frequency   Every 6 hours                        Daily                         Daily
Copies             3                                    2                             2
Copy Frequency     Every 6 hours                        Daily                         Daily
Schedule           Daily at 6 am, noon, 6 pm, midnight  Daily at 3 am, 3 pm           Daily at 8 pm
                   Backup Copy daily at 1 am            Backup Copy daily at 4 am     Synthetic Full daily at 10 pm
Retention          Copy 1 – 7 days                      Copy 1 – 30 days              Copy 1 – 7 days
                   Copy 2 – 30 days                     Copy 2 – 60 days              Copy 2 – 30 days
                   Copy 3 – 60 days
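
The recommendation to run different protection operations at different times can be sanity-checked against a schedule like the example above. The following sketch transcribes the start hours from the example into a dictionary and reports any two operations that share a start hour. The job labels and the function name are hypothetical, and the check compares start times only, so it does not detect jobs that overlap because one of them runs long.

```python
from itertools import combinations

# Start hours (24-hour clock) transcribed from the example schedule above.
SCHEDULE = {
    "Tier 1 SnapProtect": [6, 12, 18, 0],
    "Tier 1 Backup Copy": [1],
    "Tier 2 SnapProtect": [3, 15],
    "Tier 2 Backup Copy": [4],
    "Tier 3 Streaming VADP": [20],
    "Tier 3 Synthetic Full": [22],
}


def find_start_conflicts(schedule):
    """Return (job_a, job_b, hour) tuples for jobs that share a start hour."""
    conflicts = []
    for (job_a, hours_a), (job_b, hours_b) in combinations(schedule.items(), 2):
        for hour in sorted(set(hours_a) & set(hours_b)):
            conflicts.append((job_a, job_b, hour))
    return conflicts


conflicts = find_start_conflicts(SCHEDULE)
```

For the example schedule the list of conflicts is empty. Adding a job that also starts at 6 am would produce one entry naming both jobs and the shared hour.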