Streams - Best Practices
Review the following best practices before using streams.
Improving Data Movement Performance
Use the following guidelines to improve data movement performance:
- The number of device streams for Disk Library should be set to the total number of simultaneous data streams that you want to write to the available disks. An ideal configuration is about five writers per spindle.
- Make sure to enable stream randomization so that the data is distributed evenly between the available streams. Stream randomization is enabled by default for new storage policies. See Configuring Stream Randomization for more information.
- If you have a storage policy configured without deduplication, you can enable multiplexing for the copy. Multiplexing allows multiple data streams of a backup operation to be written concurrently to the same media, which optimizes performance of the Auxiliary Copy operation. Once multiplexing is enabled, the number of backups that run in parallel should equal Number of Device Streams * Multiplexing factor. See Combining Streams Using Data Multiplexing for more information.
- For Auxiliary Copy to Tape, enable the Combine Stream option in the secondary copy properties to avoid using one tape per stream. See Enabling Combine Stream for more information.
- The number of writers per disk can be configured in both the Mount Path properties and the Library properties. The disk library setting determines the total number of writers (5 by default). Set this value to Maximum Allowed Writers to control the number of writers per mount path instead. See Establish The Mount Path Allocation Policy for the Library and Establish The Mount Path Allocation Policy for more information.
- To enhance restore performance, when restoring all the contents of a multi-stream backup (multiplexed or not), use the Restore by Jobs option, if this feature is supported for your agent.
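The arithmetic behind the multiplexing and writer guidelines above can be sketched in a few lines of Python. This is an illustrative calculation only: the function names and sample values are hypothetical, and the default of five writers per spindle is the rule of thumb quoted above, not a value read from the software.

```python
def parallel_backups(device_streams: int, multiplexing_factor: int) -> int:
    """Backups that can run in parallel on a multiplexed copy.

    Implements: Number of Device Streams * Multiplexing factor.
    """
    if device_streams < 1 or multiplexing_factor < 1:
        raise ValueError("both values must be at least 1")
    return device_streams * multiplexing_factor


def suggested_disk_streams(spindles: int, writers_per_spindle: int = 5) -> int:
    """Starting point for disk library device streams.

    Assumes the guideline of about five writers per spindle.
    """
    if spindles < 1 or writers_per_spindle < 1:
        raise ValueError("both values must be at least 1")
    return spindles * writers_per_spindle


if __name__ == "__main__":
    # Hypothetical example: 4 device streams, multiplexing factor of 5.
    print(parallel_backups(4, 5))
    # Hypothetical example: a disk library backed by 8 spindles.
    print(suggested_disk_streams(8))
```

Treat the results as planning estimates; the values that actually apply are the ones set in the storage policy, library, and mount path properties.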
Allocating Device Streams
You can increase or decrease the maximum number of device streams in the Storage Policy Properties dialog box. However, each stream requires the use of one media drive. The following restrictions apply to the maximum number of device streams that you can use.
- If no alternate data paths are added, we recommend that the number of device streams equal the following:
- For tape libraries, the number of drives that are available in the library.
- For disk libraries, the number of writers that are established in the disk library (Library Properties (General)) and its mount paths (Mount Path Properties (General)).
- If alternate data paths are added, we recommend that the number of streams equal the sum of the drives that are available in all the libraries and/or the sum of all the writers in all of the disk libraries and mount paths that are associated with all of the data paths.
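The recommendation above amounts to summing the drives and writers across every data path. The following is a minimal sketch of that sum; the `DataPath` class and its field names are hypothetical and exist only for illustration, they are not part of the product.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DataPath:
    """Hypothetical description of one data path of a storage policy copy."""
    name: str
    tape_drives: int = 0   # drives available in the tape library
    disk_writers: int = 0  # writers established in the disk library / mount paths


def recommended_device_streams(paths: Iterable[DataPath]) -> int:
    """Sum of available drives and writers across all data paths."""
    return sum(p.tape_drives + p.disk_writers for p in paths)


if __name__ == "__main__":
    # With no alternate data paths, the list holds only the default path.
    default_only = [DataPath("default", tape_drives=4)]
    print(recommended_device_streams(default_only))

    # With alternate data paths, every associated library is counted.
    with_alternates = default_only + [DataPath("alt-disk", disk_writers=10)]
    print(recommended_device_streams(with_alternates))
```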
You can change the number of data streams if the storage policy does not have any data from backup operations associated with it. However, do not decrease the number of streams for a storage policy that contains data associated with a subclient that supports multiple streams. For example, in the SQL Server iDataAgent, after you run a backup using a storage policy with three streams, do not decrease the number of streams for the storage policy.
Allocating Data Streams
The maximum number of streams that can be created simultaneously must be the same for all copies within a given storage policy. For some databases, the number of streams through which they are restored/recovered must equal the number of streams through which they were backed up. Operations will fail if you use one copy to restore/recover data that was backed up through a different copy with a greater number of streams.
Consequently, the maximum number of streams that are available to each copy of a given storage policy is limited by the smallest number of streams that are available to any copy within the storage policy. If the limiting factor severely hampers the efficiency of one of the copies (for example, if a copy directed to disk media is limited by the restrictions placed on a copy directed to tape media), you should create separate storage policies for the different copies. For additional information, see Hardware-Specific Resource Issues.
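The limiting rule described above is a minimum over the copies of a storage policy. A small sketch, assuming the per-copy stream counts are already known; the copy names and numbers are made up for illustration.

```python
from typing import Mapping, Tuple


def effective_max_streams(copy_streams: Mapping[str, int]) -> Tuple[str, int]:
    """Return the limiting copy and the stream count it imposes on all copies."""
    if not copy_streams:
        raise ValueError("a storage policy needs at least one copy")
    # The copy with the fewest available streams caps every other copy.
    limiting_copy = min(copy_streams, key=copy_streams.get)
    return limiting_copy, copy_streams[limiting_copy]


if __name__ == "__main__":
    # Hypothetical policy: a disk copy limited by a tape copy with 2 drives.
    copies = {"primary-disk": 10, "secondary-tape": 2}
    name, limit = effective_max_streams(copies)
    print(f"All copies are limited to {limit} streams by {name}")
```

A large gap between the limit and what the other copies could otherwise use is the situation in which separate storage policies are worth considering.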
Silo-enabled storage policy copies can be configured with additional data streams that are dedicated to silo backup operations only. This is in addition to the data streams that are already configured for the copy. One silo stream is configured by default, but this value can be modified. See Modifying Silo Copy Settings for more details.
- If Data Multiplexing is used, restore operations will employ a single stream. If multiple restore streams are required, do not use Data Multiplexing.
- If Data Multiplexing is not used:
- Regular restore uses only one stream.
- Restore by Jobs can use the same number of streams that are used for backup. For best performance in the case of a tape library, the same number of tape drives that are used for backup should be available for restore.
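The restore rules in the list above can be summarized as a small decision function. This is a sketch of the documented rules only, with hypothetical parameter names; it does not query or configure the software.

```python
def restore_streams(backup_streams: int, multiplexed: bool, restore_by_jobs: bool) -> int:
    """Maximum number of streams a restore can use, per the rules above."""
    if backup_streams < 1:
        raise ValueError("backup_streams must be at least 1")
    if multiplexed:
        # Data Multiplexing: restore operations employ a single stream.
        return 1
    if restore_by_jobs:
        # Restore by Jobs can use as many streams as the backup used.
        return backup_streams
    # A regular restore uses only one stream.
    return 1


if __name__ == "__main__":
    # Hypothetical backup that ran with 4 streams, without multiplexing.
    print(restore_streams(4, multiplexed=False, restore_by_jobs=True))
    print(restore_streams(4, multiplexed=False, restore_by_jobs=False))
```

For a tape library, the Restore by Jobs result is also the number of drives that should be available for the restore to run at full speed.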
If any of the streams do not have resources available, the whole job will be placed in a pending state, and will eventually fail if the condition is not corrected. If you cannot determine the availability of resources for all the streams, run the backup with one stream. This can be done by specifying 1 in the Number of Data Backup Streams field in the Subclient Properties (Storage Device) dialog box. Keep in mind that single-stream backups will also fail if the required media resource is unavailable.
For the purposes of performance load balancing, which maximizes the speed of this feature, any mount point on a physical drive that points to another physical drive is not taken into account. For example, if drive D: has multiple mount points that each point to other physical drives that contain data, the performance-enhancing parallelism of this feature is not utilized. Drive D: and all of those mount points will be backed up sequentially, not in parallel. To maximize performance, exclude the mount points from the content of drive D:, instead treating each of the mount points as separate Subclient content.
Mount points, rather than physical drives, are treated as separate entities for the purposes of performance load balancing, and the software deploys a Data Reader for each mount point. If more than one mount point is pointing to the same physical drive, multiple simultaneous reads are performed on that drive, even if the Allow multiple data readers within a drive or mount point option has not been selected.