Replication Logs - Overview
ContinuousDataReplicator (CDR) performs replication by logging all activities in the source computer and replaying the log in the destination. On the source computer CDR logs all file write activities (new files and changes to existing files) involving the directories and volumes specified in the source paths of all the Replication Pair(s). These replication logs are transferred to the destination computer and replayed, ensuring that the destination remains a nearly real-time replica of the source.
Note the following differences in behavior on Windows and on UNIX:
- On Windows, log files are transferred periodically, not continuously. The period is based on the following:
  - The amount of change activity. Logs are transferred when they reach 5 MB in size.
  - A time interval, which applies when there is not enough change activity to fill the log. The default interval is 15 minutes, and you can adjust it if necessary. For step-by-step instructions, see Specify the CDR Log File Update Interval.
- On UNIX, logs are sent to the destination computer in real time, and replayed from the destination computer's memory.
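The two Windows transfer triggers described above can be pictured with a minimal sketch. It is only an illustration of the rule, not CDR's implementation: the function and constant names are hypothetical, and "5 MB" is assumed to mean 5 × 1024 × 1024 bytes.

```python
# Thresholds taken from the text; the names are illustrative, not CDR settings.
SIZE_THRESHOLD_BYTES = 5 * 1024 * 1024  # "5 MB", assumed here to mean 5 MiB
DEFAULT_INTERVAL_SECONDS = 15 * 60      # default update interval of 15 minutes


def should_transfer(log_size_bytes, seconds_since_last_transfer,
                    interval_seconds=DEFAULT_INTERVAL_SECONDS):
    """Return True when either of the two triggers has fired."""
    size_reached = log_size_bytes >= SIZE_THRESHOLD_BYTES
    interval_elapsed = seconds_since_last_transfer >= interval_seconds
    return size_reached or interval_elapsed
```

With the defaults, a nearly empty log is still sent once 15 minutes have passed, while a busy log is sent as soon as it reaches the size threshold.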
On Windows, replication logs are replayed serially on the destination computer - not in parallel. Hence, if you have many Replication Pairs configured to use the same destination computer, the destination computer should be able to receive and replay the replication logs at the same rate at which they arrive.
To avoid a backlog of replication log files that diminishes the allocated log space to the point that throttling occurs on the source computer(s), ensure that the destination computer has sufficient resources. This includes:
- processing power and memory
- I/O capacity
- disk space allocated for replication logs
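As a rough way to reason about serial replay on Windows, the sketch below estimates how much unreplayed log data accumulates when several Replication Pairs send logs to one destination. The function name, the rate parameters, and the constant-rate model are assumptions made for illustration; real arrival and replay rates vary over time.

```python
def backlog_after(arrival_rates_mb_per_s, replay_rate_mb_per_s, seconds):
    """Estimate the unreplayed log backlog (in MB) after `seconds`.

    Serial replay means a single replay stream must absorb the logs of
    every pair, so the arrival rates are summed before the comparison.
    """
    total_arrival = sum(arrival_rates_mb_per_s)
    growth_per_second = total_arrival - replay_rate_mb_per_s
    # A replay stream that keeps up never builds a backlog.
    return max(0.0, growth_per_second * seconds)
```

For example, two pairs arriving at 2 MB/s and 3 MB/s against a 4 MB/s replay rate leave a growing backlog, whereas two pairs at 1 MB/s each do not.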
On UNIX, replication logs for different Replication Sets are replayed in parallel. This is done using multiple replay threads on the destination computer. Ensure that the destination computer has sufficient resources. This includes:
- processing power and memory
- I/O capacity
Note that the logs are replayed from the destination computer's memory on UNIX computers. Hence disk space is not needed to store log files on the destination.
The location of the log files is specified during the installation of ContinuousDataReplicator Agent. If necessary, this can be changed subsequently from the CommCell Console. See Specify CDR Log File Location on Source and Destination Computers for step-by-step instructions.
Consider the following when you specify the location for the replication logs:
- Select a volume for the source replication logs that has sufficient space for the amount of log file activity and accumulation expected in your environment.
- Do not specify a removable drive as the replication log location.
- Replication logs cannot be located on a UNC path.
- On a cluster, replication logs must be located on a local volume, not a volume which is part of the cluster resource group.
- On UNIX, changing the log location on the source computer will cause the NetApp Replication Service (CVRepSvc) to be recycled. This will cause all Replication Pairs to stop replication briefly and then resume.
- On UNIX, the replication logs/cache must be on a mount point different from the ones monitored by the ContinuousDataReplicator driver (the source paths). Otherwise, replication may not run properly.
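Two of the location rules above lend themselves to a quick pre-check before a log location is chosen. The sketch below is a hypothetical helper, not part of CDR: it flags Windows UNC paths and, on UNIX, uses the file system device ID as a proxy for "same mount point" (an assumption that ignores special cases such as bind mounts).

```python
import os
from pathlib import PureWindowsPath


def is_unc_path(path):
    """True if `path` is a Windows UNC path (server and share form)."""
    # For UNC paths, pathlib reports the server and share as the drive.
    return PureWindowsPath(path).drive.startswith("\\\\")


def shares_mount_point(log_dir, source_paths,
                       device_of=lambda p: os.stat(p).st_dev):
    """True if `log_dir` is on the same device as any monitored source path.

    `device_of` is injectable so the rule can be exercised without
    touching real file systems.
    """
    log_device = device_of(log_dir)
    return any(device_of(src) == log_device for src in source_paths)
```

A candidate location would be rejected if either function returns True for it.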
Sufficient log file space is required on the source computer and, for Windows, on the destination computer as well. If a source computer runs out of log space (Windows) or attempts to create new entries in a log file before the old entries have been transferred (UNIX), logging will stop and all logs will be deleted. Thus, to avoid an interruption and restart, it is important to have sufficient space allocated for logs. For minimum log space requirements, see System Requirements - ContinuousDataReplicator. These minimums should be considered as a recommended starting point. Allow more space than recommended if it is available.
Consider the following when allocating space for logs:
- Log file sizes will reflect the actual size of the files added or the extent of changes made to files in the source path.
- The existing size of the data in the source path and the expected rate of additions and file changes, for all the Replication Sets and Pairs that will be configured on a given computer. Larger amounts of data, and high rates of change typically result in greater amounts of log space being required on the source computer.
- Capacity of, or throttling limits imposed upon the network used for replication. If network capacity is low, log space requirements will increase on the source, as data is not transferred quickly enough.
- Potential network outages or loss of connectivity. During such times, logs will continue to accumulate on the source, and sufficient space must be available to accommodate these circumstances. In the case of a source computer configured with multiple destinations, loss of connectivity with any one destination computer will prevent the logs from being deleted on the source computer in a timely manner. For additional information, see Fan-Out Considerations below.
- For a computer that serves both as a source computer and a destination computer, log space must be sufficient to accommodate the requirements of both of the capacities in which it serves.
- For a computer configured as a destination for multiple source computers (Fan-In), allocated log file space should be matched to the aggregate needs of all of its source computers. (On Windows, this is a disk space requirement; on UNIX, this is a memory requirement.)
- Use the Space Check feature (Configuring Alerts for Low Disk Space and, if appropriate, Setting the Time Interval for Space Check) for the source log volume, so that you are notified when free space is running too low; refer to Configuring Space Check for ContinuousDataReplicator Agents.
- On Windows, configure the free disk space threshold for the source log volume so that data replication will be aborted well before the free space on the source log volume becomes too low, which can cause unpredictable results.
To avoid this, set the Low Watermark for the source log volume on the source computer to 10% or higher. See Configure Throttling for CDR Replication Activities. In the event this threshold is reached, you will have to make sufficient space available on the source log volume, and manually start the Replication Pair with Full Sync.
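The sizing considerations above can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only: the function names, the constant-rate model, and the worst-case assumption that the outage falls at the end of the observation window are all assumptions, and the result does not replace the minimums in System Requirements. The second function expresses the 10% Low Watermark as a simple free-space check.

```python
def log_space_needed_mb(change_rate_mb_per_s, network_rate_mb_per_s,
                        outage_seconds, window_seconds):
    """Rough peak source log accumulation (MB) over `window_seconds`.

    While connected, logs grow only if changes outpace the network;
    during the outage nothing is transferred, so every change accumulates.
    """
    during_outage = change_rate_mb_per_s * outage_seconds
    connected_seconds = max(0, window_seconds - outage_seconds)
    shortfall = max(0.0, change_rate_mb_per_s - network_rate_mb_per_s)
    return during_outage + shortfall * connected_seconds


def low_watermark_reached(free_bytes, total_bytes, low_watermark=0.10):
    """True when the free fraction of the log volume is at or below the mark."""
    return free_bytes / total_bytes <= low_watermark
```

On a live system, the free and total values could be read with Python's `shutil.disk_usage(path)`, which returns `total`, `used`, and `free` in bytes.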
Each log will continue to be saved on the source computer until all destination computers signal that they have received that log and have finished replaying it. After this confirmation, the log will be marked for deletion on the source and the system will periodically delete such logs.
- On UNIX, the system reuses log files in a rotating manner once the allocated log file space becomes full, so the logs will never be deleted.
- On Windows, logs on the destination computer are marked for deletion after they have been replayed, and the system will periodically delete these files as well.
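The retention rule for source logs (keep each log until every destination has confirmed replay) can be modeled with a small bookkeeping structure. This is a hypothetical sketch for illustration; the class and method names are not CDR interfaces.

```python
class SourceLogTracker:
    """Tracks which destinations have confirmed replay of each source log."""

    def __init__(self, destinations):
        self.destinations = set(destinations)
        self.acks = {}  # log name -> destinations that confirmed replay

    def add_log(self, log_name):
        self.acks.setdefault(log_name, set())

    def confirm_replayed(self, log_name, destination):
        self.acks[log_name].add(destination)

    def deletable_logs(self):
        """Logs that every destination has confirmed, in name order."""
        return sorted(name for name, acked in self.acks.items()
                      if acked >= self.destinations)
```

The model also shows why a single unreachable destination in a fan-out configuration holds back deletion: until it confirms, no log qualifies.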
If you accidentally delete a log file on the source computer from the operating system, it cannot be transferred to the destination and replayed. As a result, the destination will no longer be completely in sync with the source. To resync the source and destination:
- On Windows, it will be necessary to abort activity for all affected Replication Pairs and restart them using Start Full Re-Sync. For instructions on aborting and restarting replication, see Start/Suspend/Resume/Abort Data Replication Activity.
- On UNIX, the Replication Pairs will automatically SmartSync before returning to Replication.