Job Controller - Advanced

Setting Up Alerts

An alert is sent when the job conditions meet the criteria selected for the alert. The threshold and notification criteria determine when and at what frequency alerts are generated. Alerts can be configured globally or they can be job-based. For information on alerts, see Alerts and Notifications - Overview.

Each job can have one alert. If an alert exists for the job, the Add Alert option is not available.

Adding an Alert

Use the following steps to add a job-based alert for an active job in the Job Controller:

  1. From the CommCell Console ribbon, on the Home tab, click Job Controller.

  2. Right-click any running job and click Add Alert.
  3. From the Add Alert Wizard dialog box, select the required threshold and notification criteria and click Next.

    For information on the alert criteria available for job types, see Alerts and Notifications - Available Alerts - Job Management.

  4. Select the way in which the alert is to be sent to its intended recipient and click Next. For example, you can click Select [Email] for notification to send the alert as an email.
  5. Select the CommCell users and/or user groups that will receive the alert and then click Next.
  6. Review the options that you selected in the Summary tab and click Finish.

    The alert will be created for the selected job.

Modifying an Alert

Use the following steps to modify an alert configured for an active job in the Job Controller:

  1. From the CommCell Console ribbon, on the Home tab, click Job Controller.

  2. Right-click the job associated with the alert and click Modify Alert.
  3. In the Modify Alert Wizard dialog box, make the necessary changes and click Finish.

Deleting an Alert

Use the following steps to delete an alert configured for an active job in the Job Controller:

  1. From the CommCell Console ribbon, on the Home tab, click Job Controller.

  2. Right-click the job associated with the alert and click Delete Alert.
  3. Click Yes in the confirmation window.

Preempting Jobs

Preemption is defined by the Job Manager at each phase of a job. Jobs that can be interrupted by the Job Manager or by the user and then restarted without having to start the phase from the beginning are called preemptible jobs. A non-preemptible job is one that cannot be interrupted by the Job Manager or suspended by the user.

If a running job is preemptible and a higher-priority job needs streams, drives, or media, the Job Manager can interrupt the running job and allocate those resources to the higher-priority job. The interrupted job enters a waiting state and resumes when the resources it needs become available. Backup and restore operations preempt auxiliary copy and other jobs (except backup).

The following table lists, for each job type, the Status of the job in the Job Controller window and the Reason for job delay displayed in the Job Details dialog box when the job is preempted. It also briefly explains what happens when the job is preempted.

Jobs | Status in the Job Controller | Reason for Job Delay | Additional Information
Backup Operation | Interrupt Pending; Waiting | No Job Delay; No resources available | Once interrupted, the job does not hold on to resources and returns to Waiting status. The job retries for resources. (The Status of the job in the Job Controller window and messages in the Reason for job delay are discussed in What Happens When There Are No Resources for a Job.)
Data Recovery Operations (for File System-like agents) | Interrupt Pending; Waiting | No Job Delay; No resources available | Once interrupted, the job does not hold on to resources and returns to Waiting status. The job retries for resources. (The Status of the job in the Job Controller window and messages in the Reason for job delay are discussed in What Happens When There Are No Resources for a Job.)
Data Recovery Operation (for Database-like agents) | Not Preemptible | Not Preemptible | Not Preemptible
Index Restore (Browse Backup Data) | Not Preemptible | Not Preemptible | Not Preemptible
Auxiliary Copy | Interrupt Pending; Waiting | No Job Delay; No resources available | Once interrupted, the job does not hold on to resources and returns to Waiting status. The job retries for resources. (The Status of the job in the Job Controller window and messages in the Reason for job delay are discussed in What Happens When There Are No Resources for a Job.)
Synthetic Full | Interrupt Pending; Waiting | No Job Delay; No resources available | Once interrupted, the job does not hold on to resources and returns to Waiting status. The job retries for resources. (The Status of the job in the Job Controller window and messages in the Reason for job delay are discussed in What Happens When There Are No Resources for a Job.)
Media Refresh | Waiting | No resources available | Once interrupted, the job does not hold on to resources and returns to Waiting status. The job retries for resources.

The higher-priority job that preempts the resources displays the following Reason for job delay:

Waiting for job[ ] to release the resources.
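The preemption table above can be condensed into a small lookup. The following Python sketch is illustrative only: the dictionary and function names are hypothetical and are not part of the product, and the values are copied from the table.

```python
# Hypothetical lookup built from the preemption table above; not a product API.
# None marks job types that the table lists as "Not Preemptible".
PREEMPTION_TABLE = {
    "Backup Operation": ("Interrupt Pending", "Waiting"),
    "Data Recovery Operations (for File System-like agents)": ("Interrupt Pending", "Waiting"),
    "Data Recovery Operation (for Database-like agents)": None,
    "Index Restore (Browse Backup Data)": None,
    "Auxiliary Copy": ("Interrupt Pending", "Waiting"),
    "Synthetic Full": ("Interrupt Pending", "Waiting"),
    "Media Refresh": ("Waiting",),
}


def statuses_when_preempted(job_type):
    """Return the Job Controller statuses listed for a preempted job type.

    An empty tuple means the table marks the job type as not preemptible.
    """
    statuses = PREEMPTION_TABLE[job_type]
    return statuses if statuses is not None else ()


def is_preemptible(job_type):
    """True if the table lists any status for the job type when preempted."""
    return bool(statuses_when_preempted(job_type))
```

For example, `is_preemptible("Auxiliary Copy")` is true, while `is_preemptible("Index Restore (Browse Backup Data)")` is false.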

Preemptible and Non-Preemptible Jobs

In a preemptible phase, the job can be interrupted by the Job Manager or suspended by the user and then restarted without having to start the phase over again from the beginning. Preemptible jobs are always restartable. A File System backup phase is one example of a preemptible phase; the Job Manager can interrupt this phase when resource contention occurs with a higher priority job.

A non-preemptible phase is one that cannot be interrupted by the Job Manager or suspended by the user. It can only run to completion, be killed by administrative action, or be failed by the system. For example, the data recovery operations of database agents are non-preemptible.

The following table lists the types of preemptible and non-preemptible jobs:

Preemptible and Restartable | Non-preemptible and Non-Restartable | Non-preemptible but Restartable
Data protection operations for most non-database agents. | Data recovery operations for database-like agents. | Data protection operations for database agents.
DataArchiver archive jobs during the Archive Index and Archive Content Index phases of the job. | Media export, erase media, and inventory jobs. | The system state phase of Windows File System data protection operations.
Data recovery operations for most File System-like (indexing-based) agents during the restore phase. | SAN volume data protection jobs (non-preemptible in its scan phase). | Offline Content Indexing jobs.
Data recovery operations from the Search Console. | Disk volume reconciliation jobs. |
Most administration jobs including Install Automatic Updates and Download Automatic Updates. | |
Silo backup and restore operations. | |
Media refresh operations. | |
Deduplication database reconstruction job. | |

What Happens When There Are No Resources for a Job

Each job requires certain resources to complete successfully. The absence of these resources affects different types of jobs differently. The following table lists the resources required by each job type, the status of the job in the Job Controller window when the resources are not available, and examples of the corresponding Reason for job delay displayed in the Job Details dialog box. It also briefly explains what happens when a job does not have the required resources.

By default, every 1440 minutes the NetApp Media & Library Manager service on the CommServe cleans up any media and drive reservations held by a job that failed to release them when it was abruptly terminated. You can modify the frequency using the nRESOURCERELEASEINTERVALMIN registry key.
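As a quick check of the default, 1440 minutes is exactly one day. The sketch below only does the unit arithmetic; the function names are hypothetical, and the value 360 is an arbitrary illustration rather than a recommended setting.

```python
# Unit arithmetic for the cleanup interval; the names here are illustrative only.
DEFAULT_RELEASE_INTERVAL_MIN = 1440  # default interval stated above


def interval_in_hours(minutes):
    """Convert a cleanup interval from minutes to hours."""
    return minutes / 60


def cleanups_per_day(minutes):
    """Number of cleanup passes in 24 hours for a given interval in minutes."""
    return (24 * 60) / minutes


print(interval_in_hours(DEFAULT_RELEASE_INTERVAL_MIN))  # 24.0
print(cleanups_per_day(360))  # 4.0 (360 is an arbitrary example value)
```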

 
Jobs | Resources | Status in the Job Controller | Reason for Job Delay | Additional Information
Backup Operation | Streams, Active Media, Drive | Waiting; Waiting | See Example 1; See Example 2 | The job checks for the necessary resources. If the resources are not available, the job retries to reserve them whenever they are freed. It does not hold on to any resource until all the necessary resources are available.
Data Recovery Operations (for File System-like agents) | Drive | Pending | The media is already reserved by some other job(s). | If the resources are not available, the job retries to reserve them whenever they are freed.
Data Recovery Operation (for Database-like agents) | Drive | Failed; Running | See Example 1; See Example 2 | The job checks for the necessary resources. If the resources are not available, it retries every 2 minutes to reserve them. It does not hold on to any resource until all the necessary resources are available.
Index Restore Operation (Browse Backup Data) | Destination Drives; Source Media | Pending; Waiting; Waiting; Running; Pending | See Example 1; See Example 2; See Example 2 | The job checks for the necessary resources. The job reserves 2 drives for source and destination media. If these resources are not available, it retries every 2 minutes to reserve them. It does not hold on to any resource until all the necessary resources are available. Once the 2 drives and destination media are obtained, the job reserves the source media. If the job encounters resource contention while reserving the source media (Example 2), it retries every 20 minutes, up to a maximum of 144 times, to obtain the source media. It holds on to the 2 drives and destination media as long as it is not interrupted and as long as the source media is available.
Synthetic Full | Streams, Destination Drives, Destination Media; Source Media | Waiting; Waiting; Waiting; Running; Pending | See Example 1; See Example 2; See Example 2 | The job checks for the necessary resources. The job reserves streams, marks active media full, and reserves 2 drives and destination media. If the resources are not available, the job retries to reserve them whenever they are freed. It does not hold on to any resource until all the necessary resources are available. Once the 2 drives and destination media are obtained, the job reserves the source media. If the job encounters resource contention while reserving the source media (Example 2), it retries every 20 minutes, up to a maximum of 144 times, to obtain the source media. It holds on to the 2 drives and destination media as long as it is not interrupted.
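The source-media retry limits quoted in the table (every 20 minutes, at most 144 times) imply an upper bound on how long a job keeps retrying. The following sketch works out that bound; it assumes the retries are evenly spaced, and the names are illustrative only.

```python
# Worked arithmetic for the source-media retry window described in the table.
RETRY_INTERVAL_MIN = 20  # minutes between retries, as stated in the table
MAX_RETRIES = 144        # maximum number of retries, as stated in the table


def max_retry_window_hours(interval_min=RETRY_INTERVAL_MIN, retries=MAX_RETRIES):
    """Approximate total time spent retrying, assuming evenly spaced retries."""
    return interval_min * retries / 60


print(max_retry_window_hours())  # 48.0 hours, i.e. 2 days
```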

Example 1

The operation could not be completed as the drive pool is offline.

The operation cannot be completed as the host is offline.

The operation could not be completed as the library controller is offline.

The operation cannot be completed as the library is offline.

The operation cannot be completed as the master drive pool is offline.

There are not enough drives in the drive pool.

There is no active controller for this library.

Mount path is offline.

Media agent is offline.

Host is offline.

Library controller service is down.

The Library controller is offline.

Example 2

All spares are outside the library.

The operation cannot be completed as the drive is reserved.

The operation cannot be completed as the media is outside the library.

The operation cannot be completed as the mountpath is reserved.

The operation could not be completed as there is not enough media.

The operation could not be completed as there are not enough drives that are not reserved, online and whose controller are alive.

There is no active controller for this drive.

There are no disk paths that are enabled and with the required space.

There are no more spares.

The operation cannot be completed as the opposite side for this media is already reserved.

The operation cannot be completed as the number of drive reservations exceeds the allotted drives.

The number of writers would exceed the maximum allowed limit.

The operation cannot be completed as the media group is reserved.

The operation cannot be completed as the copy is reserved.

The operation cannot be completed as the drive pool is used by storage policy copy.

Requested volume is being mounted/unmounted.

The media is not in library.

Media is being used and is reserved.

Job does not have reservation on the drive.

The requested media is not in any slot of the library.

The requested media is stuck in the drive.

The requested media is exported.

The media is already Reserved by some other Job[s].

The Media is not available. The Job currently using the media was interrupted.

No drives available for reservation.

Not enough good drives available for reservation.

Not enough drives available in Drive Pool.

The interrupted job has not released the drives yet.

The interrupted job has not released the Media yet.

The job has already been interrupted by another job.

Job[s] Interrupted by this Job have not released Resources yet.

Not Enough streams Available for Storage policy [^1%s]. Need ^2%d stream[s] and ^3%d stream[s] are available.

The Media is already reserved by some other Job[s].

Waiting for Jobs [ ] to release the resources.

No resources available.

Queuing Jobs

Setting jobs to be queued allows a job that would otherwise fail to remain in the Job Controller in a Queued state, i.e., waiting. Once the condition that caused the job to be queued clears, the Job Manager will automatically resume the job.

All Data Protection, Data Recovery, Data Collection, and Administration Operations jobs can be queued.

Queue Jobs When There is a Conflicting Active Job

Jobs can be queued if they conflict with other currently running jobs such as multiple backup operations for the same subclient.

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.

    The Job Management dialog box appears.

  3. On the General tab, select the Queue jobs if other conflicting jobs are active check box.
  4. Click OK.

Queue Scheduled Jobs

You can also set scheduled jobs to be queued. If jobs are scheduled and the Queue Scheduled Jobs option is enabled, these jobs will start in the Job Controller in a queued state at their scheduled time. These jobs can be manually resumed or, if the Queue Scheduled Jobs option is disabled, these jobs will resume automatically.

Selecting this option is especially useful during times of maintenance. Rather than suspend each job manually after it has started, you can enable the Queue Scheduled Jobs option, which will start all the scheduled jobs in the Job Controller in a Queued state. Once you have completed the maintenance, you can manually resume specific scheduled jobs, or simply deselect the Queue Scheduled Jobs option to automatically resume all the scheduled jobs.

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.

    The Job Management dialog box appears.

  3. On the General tab, select the Queue Scheduled Jobs check box.
  4. Click OK.

Queue Jobs When Activity Control is Disabled

Jobs can be queued if the activity control for the job type is disabled.

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.

    The Job Management dialog box appears.

  3. On the General tab, select the Queue Jobs if activity is disabled check box.
  4. Click OK.

Additional Job Management Settings

Setting the Maximum Number of Simultaneously Running Streams

The Job Controller window displays all the current jobs in the CommCell. A status bar at the bottom of the Job Controller shows the total number of jobs; the number of jobs that are running, pending, waiting, queued, and suspended; and the high and low watermarks. The watermarks indicate the maximum and minimum number of streams that the Job Manager can use simultaneously.

Use the following steps to set the job stream high watermark level:

  1. From the CommCell Console ribbon, click the Home tab and then click Control Panel.
  2. Under the Data section, click Job Management.
  3. In the Job Stream High Watermark Level box, type the job stream high watermark level for simultaneous running streams.
  4. Click OK.

Setting a Time Interval for Job Alive Check

You can specify the time interval at which the Job Manager checks active jobs to determine whether they are still running. By default, the time interval is set to 2 minutes.

Use the following steps to modify the job alive check interval:

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.
  3. Type or select the time interval in minutes in the Job Alive Check Interval (Mins) box.
  4. Click OK.

Setting Job Update Interval for Active Jobs

The Job Update Interval allows you to view or modify how often information is updated for backup and restore operations in the Job Details dialog box. The Job Updates Interval list displays all of the available agent types. Specify the protection and recovery time in minutes. You can also set the update interval time for the ContinuousDataReplicator.

Use the following steps to change the Job Update Interval for active jobs:

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.
  3. Click the Job Updates tab.
  4. Select an Agent from the Agent Type list and click the integer under the Protection (Mins) or the Recovery (Mins) column to change the time interval.
  5. Click OK.

Enabling Jobs to Complete Past the Operation Window Rule

In some cases, an operation launched prior to the time window of an operation window rule may require the ability to run uninterrupted until completion.

Use the following steps to allow running operations to ignore the operation window rule and continue until completion:

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.
  3. In the General tab of the Job Management dialog box, select the Allow running jobs to complete past the operation window check box.
  4. Click OK.

Preventing Backups on Disabled Clients

Use the following steps to prevent backup operations from starting on client computers that are disabled:

  1. From the CommCell Console ribbon, click the Home tab, and then click Control Panel.
  2. Under the Data section, click Job Management.
  3. From the General tab in the Job Management dialog box, select the Do not start backups on disabled clients check box.
  4. Click OK.

Command Line Operations

You can perform the following job configurations through the command line:

Viewing Job Summary (qlist jobsummary)

Description

This command lists the current state of all active jobs in the CommServe. Jobs are classified into the following states:

  • Running
  • Suspended
  • Pending
  • Queued
  • Waiting

You can also filter the jobs by client, agent, instance, backup set or subclient.

In case of an error, an error code and description are displayed as: "jobsummary: Error errorcode: errordescription"

Usage

qlist jobsummary [-c <client>] [-a <iDataAgent>] [-i <instance>] [-b <backupset>] [-s <subclient>] [-tf <tokenfile>] [-tk <token>] [-h]

Options

-c Client computer name
-a Agent type installed on client computer (see Argument Values - Agent Types)
-i Instance name, required for certain agents
-b Backup set name, required for certain agents
-s Name of the subclient
-tf Reads token from a file
-tk Token string
-h Displays help

Diagnostics

Possible exit status values are:

0 - Successful completion.

1 - CLI usage failures, due to the use of an unsupported option or missing argument.

2 - Any other failure. 
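A script that calls the command can branch on these exit status values. The following Python sketch is a minimal illustration: `describe_exit_status` and `run_jobsummary` are hypothetical helper names, and the wrapper assumes that `qlist` is on the PATH and that a `qlogin` session is already active.

```python
import subprocess

# Exit status values as listed under Diagnostics.
EXIT_STATUS = {
    0: "Successful completion",
    1: "CLI usage failure (unsupported option or missing argument)",
    2: "Any other failure",
}


def describe_exit_status(code):
    """Map a documented exit status to its description."""
    return EXIT_STATUS.get(code, "Undocumented exit status")


def run_jobsummary(client=None):
    """Hypothetical wrapper: run qlist jobsummary, optionally for one client.

    Returns (exit_status, stdout). Assumes qlist is on the PATH.
    """
    argv = ["qlist", "jobsummary"]
    if client:
        argv += ["-c", client]
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout
```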

Example

Display job summary of all jobs of client cl1.

qlist jobsummary -c cl1
RUNNING PENDING WAITING QUEUED SUSPENDED TOTAL
 1       10      0       4      1         16
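Output in this shape is easy to turn into a dictionary for scripting. The sketch below parses the sample shown above; `parse_jobsummary` is a hypothetical helper, and it assumes the output always consists of a header line starting with RUNNING followed by one line of integer counts.

```python
def parse_jobsummary(output):
    """Parse qlist jobsummary output into a {STATE: count} dictionary.

    Assumes a header line beginning with RUNNING followed by one line of counts.
    """
    lines = [ln for ln in output.splitlines() if ln.strip()]
    for i, ln in enumerate(lines):
        if ln.split()[0] == "RUNNING" and i + 1 < len(lines):
            header = ln.split()
            values = [int(v) for v in lines[i + 1].split()]
            return dict(zip(header, values))
    raise ValueError("no job summary header found")


sample = """RUNNING PENDING WAITING QUEUED SUSPENDED TOTAL
 1       10      0       4      1         16"""
summary = parse_jobsummary(sample)
print(summary["PENDING"], summary["TOTAL"])  # 10 16
```

In the sample, the per-state counts (1 + 10 + 0 + 4 + 1) add up to the TOTAL of 16.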

Setting Job Control (qoperation jobcontrol)

Description

This command allows you to kill, suspend, or resume a given job, change its priority, or set its progress (in percentage). To operate on a single job, specify the Job ID. To operate on more than one job in a specific area, specify the client or MediaAgent name. You can also specify additional levels such as the agent, instance, backup set, and/or subclient. The -all option can be used to suspend or resume all jobs.

If the operation is successful, no message is displayed on the command prompt. In case of an error, an error code and description are displayed as: "jobcontrol: Error errorcode: errordescription"

Usage

qoperation jobcontrol [-o <joboperation>] [-j <jobid>] [-all] [-m <mediaagent>] [-c <client>] [-a <dataagenttype>] [-i <instance>] [-b <backupset>] [-s <subclient>] [-p <priority>] [-tfx <total files to transfer>] [-fx <files transferred>] [-ifx <files transferred since last update>] [-tbx <total bytes to transfer>] [-tf <tokenfile>] [-tk <token>] [-h]

Options

-o Operation to be performed on the job. Valid values are:
  • kill
  • suspend
  • resume
  • changepriority
  • setpercentcomplete
-j Job ID
-all All jobs
-m MediaAgent name
-c Client computer name
-a iDataAgent type installed on client computer (see Argument Values - Agent Types)
-i Instance name
-b Backup set name
-s Subclient name
-p Job priority
-tfx Total files to transfer
-fx Files transferred
-ifx Files transferred since last update
-tbx Total bytes to transfer
-tf Reads token from a file
-tk Token string
-h Displays help

Diagnostics

Possible exit status values are:

0 - Successful completion.

1 - CLI usage failures, due to the use of an unsupported option or missing argument.

2 - Any other failure.

Example

  • Kill a job with job ID 175.

    qoperation jobcontrol -j 175 -o kill

  • Suspend all jobs under MediaAgent ma1.

    qoperation jobcontrol -m ma1 -o suspend

  • Suspend all jobs.

    qoperation jobcontrol -all -o suspend

  • Resume all jobs under client cl1 and dataagent "Q_WIN2K_FS".

    qoperation jobcontrol -c cl1 -a Q_WIN2K_FS -o resume

  • Change priority of a job with job ID 175 to 100.

    qoperation jobcontrol -j 175 -p 100 -o changepriority
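When these commands are generated from a script, it can help to validate the operation name before building the argument list. The following Python sketch is one way to do that; `jobcontrol_argv` is a hypothetical helper that covers only a subset of the options listed above, and it builds the command without running it.

```python
# Operation names as listed for the -o option.
VALID_OPERATIONS = {"kill", "suspend", "resume", "changepriority", "setpercentcomplete"}


def jobcontrol_argv(operation, job_id=None, all_jobs=False, media_agent=None,
                    client=None, agent=None, priority=None):
    """Build (but do not run) a qoperation jobcontrol argument list."""
    if operation not in VALID_OPERATIONS:
        raise ValueError("unsupported operation: %r" % operation)
    argv = ["qoperation", "jobcontrol"]
    if job_id is not None:
        argv += ["-j", str(job_id)]
    if all_jobs:
        argv.append("-all")
    if media_agent:
        argv += ["-m", media_agent]
    if client:
        argv += ["-c", client]
    if agent:
        argv += ["-a", agent]
    if priority is not None:
        argv += ["-p", str(priority)]
    argv += ["-o", operation]
    return argv


print(" ".join(jobcontrol_argv("changepriority", job_id=175, priority=100)))
```

The resulting list can be passed to a process launcher such as `subprocess.run`, which avoids shell quoting issues with client or agent names.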

Setting Job Retention (qoperation jobretention)

Description

This command allows you to set retention rules on a given job. The job ID, storage policy, and storage policy copy names must be specified when using this command.

Jobs that are not retained are subject to data aging. Jobs that are retained with a specified retention period become subject to data aging after the retention period ends. If you retain a job without specifying a retention period, however, the job data will never be subject to data aging.

In case of an error, an error code and description are displayed as: "jobretention: Error errorcode: errordescription"

Usage

qoperation jobretention -j <jobid> -sp <storagepolicy> -spc <copy> -rtn <true|false> [-rd <infinite or mm/dd/yyyy hh:mm:ss or yyyy/mm/dd hh:mm:ss>] [-tf <tokenfile>] [-tk <token>] [-h]

Options

-j Job ID
-sp Storage policy name
-spc Storage policy copy name
-rtn Job to be retained (true) or not (false)
-rd Date until the job is to be retained. Valid values are:
  • infinite
  • mm/dd/yyyy hh:mm:ss
  • yyyy/mm/dd hh:mm:ss
-tf Reads token from a file
-tk Token string
-h Displays help

Diagnostics

Possible exit status values are:

0 - Successful completion.

1 - CLI usage failures, due to the use of an unsupported option or missing argument.

2 - Any other failure.

Example

Retain the job with job ID 175, storage policy sp1, and storage policy copy copy1 indefinitely by setting the retention period to infinite.

qoperation jobretention -j 175 -rtn true -rd infinite -sp sp1 -spc copy1
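The -rd value accepts either the keyword infinite or a timestamp. The sketch below shows one way to format it from a script; `jobretention_argv` is a hypothetical helper, it uses the mm/dd/yyyy hh:mm:ss form, and it assumes that hh is a 24-hour value, which the usage line does not state.

```python
from datetime import datetime


def jobretention_argv(job_id, storage_policy, copy, retain=True, retain_until=None):
    """Build (but do not run) a qoperation jobretention argument list.

    retain_until may be None, the string "infinite", or a datetime.
    Assumption: hh in the documented date format is a 24-hour value.
    """
    argv = ["qoperation", "jobretention", "-j", str(job_id),
            "-sp", storage_policy, "-spc", copy,
            "-rtn", "true" if retain else "false"]
    if retain_until == "infinite":
        argv += ["-rd", "infinite"]
    elif isinstance(retain_until, datetime):
        argv += ["-rd", retain_until.strftime("%m/%d/%Y %H:%M:%S")]
    elif retain_until is not None:
        raise ValueError("retain_until must be None, 'infinite', or a datetime")
    return argv


print(jobretention_argv(175, "sp1", "copy1", retain_until="infinite"))
```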

Viewing the Last Backup Job (qlist lastjob)

Description

This command displays the summary details of all previous backup jobs completed by a specific client, instance, backup set, or subclient. When more than one job is found, each backup job is listed on its own line. The message "No job found" is displayed on the command prompt when no backup jobs are found.

In case of an error, an error code and description are displayed as: "lastjob: Error errorcode: errordescription"

Usage

qlist lastjob -c <client> -a <iDataAgent> -i <instance> -b <backupset> -s <subclient> [-js <jobstatus>] [-tf <tokenfile>] [-tk <token>] [-h]

Options

-c Client computer name
-a Agent type installed on client computer (see Argument Values - Agent Types)
-i Instance name, required for certain agents
-b Backup set name, required for certain agents
-s Name of the subclient
-js The completion status of the job. Use this option to see the last backup job that completed with a particular status. Valid values for this option are:
  • Completed
  • Failed
  • Killed
-tf Reads token from a file
-tk Token string
-h Displays help

Diagnostics

Possible exit status values are:

0 - Successful completion.

1 - CLI usage failures, due to the use of an unsupported option or missing argument.

2 - Any other failure. 

Example

Display the last backup job that ran under client client01.

qlist lastjob -c client01

JOBID STATUS    STORAGE POLICY APPTYPE    BACKUPSET SUBCLIENT INSTANCE  START TIME
----- ------    -------------- -------    --------- --------- --------  ----------
101   Completed SP_12          Filesystem set001    Sub01     <default> 01/01/2013 01:20:55
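Because column titles such as STORAGE POLICY and START TIME contain spaces, splitting this output on whitespace does not work. The following Python sketch instead uses the dashed ruler line to find where each column starts. `parse_lastjob` is a hypothetical helper, and it assumes that real output keeps the column alignment shown in the sample above.

```python
import re


def parse_lastjob(output):
    """Parse qlist lastjob output into a list of {column: value} dictionaries.

    Column boundaries are taken from the dashed ruler under the header line.
    """
    if "No job found" in output:
        return []
    lines = [ln for ln in output.splitlines() if ln.strip()]
    ruler_index = None
    for i, ln in enumerate(lines):
        if i > 0 and set(ln.strip()) <= {"-", " "}:
            ruler_index = i
            break
    if ruler_index is None:
        raise ValueError("no dashed ruler line found")
    # Each column runs from the start of its dashes to the start of the next column.
    starts = [m.start() for m in re.finditer(r"-+", lines[ruler_index])]
    bounds = list(zip(starts, starts[1:] + [None]))
    header = [lines[ruler_index - 1][a:b].strip() for a, b in bounds]
    return [dict(zip(header, (ln[a:b].strip() for a, b in bounds)))
            for ln in lines[ruler_index + 1:]]
```

Each returned dictionary is keyed by the column titles, so a script can read fields such as `row["STORAGE POLICY"]` directly.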

Viewing a List of Backed Up Files

Use the following steps to generate a list of the files that were backed up during a specific job:

  1. Open the Command Prompt and go to software_installation_directory/Base and run the following command:

    qlogin -cs <commserve_host_name> -u <user name>

  2. Execute the following command from the software_installation_directory/Base folder after substituting the parameters:

    ListFilesForJob.exe -job <JOBID> -ma <MAName> [-vm <Instance>] [-flag <ArchiveBitFlag>] [-tmpdir <TMPDIRPATH>] [-o <OUTFILENAME>]

  3. Go to the path specified in TMPDIRPATH and open the OUTFILENAME file to view the list of files.

The following table displays the parameters that need to be provided before running the command:

Parameter | Description of Parameter Values
JobID | The job ID of the job for which you are generating the list.
MAName | Name of the MediaAgent that was used to perform the backup job.
Instance | Name of the instance that you used to install the Windows File System iDataAgent. This argument is optional. If you do not specify a value, the job in Instance001 is used by default to generate the list of files.
ArchiveBitFlag | 1 to set the Archive Bit. 0 to reset the Archive Bit. This argument is optional. If you do not specify a value, the archive bit does not change and the file that contains the list of files cannot be deleted.
TMPDIRPATH | The directory in which you want to create the file. This argument is optional. If you do not specify a directory, the file is created in the default temporary directory. The default temporary directory for the software is set using the dGALAXYTEMPDIR registry key. When you install the Windows File System iDataAgent, the dGALAXYTEMPDIR registry key is created at the following location: HKEY_LOCAL_MACHINE\SOFTWARE\CommVault Systems\Galaxy\Instance<xxx>\Base
OUTFILENAME | The name of the file in which you want to store the list.
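The optional arguments can be assembled programmatically so that only the values you supply appear on the command line. The sketch below is illustrative: `list_files_argv` is a hypothetical helper that follows the usage line above, and it builds the argument list without running the tool.

```python
def list_files_argv(job_id, ma_name, instance=None, archive_bit_flag=None,
                    tmpdir=None, outfile=None):
    """Build (but do not run) a ListFilesForJob.exe argument list."""
    argv = ["ListFilesForJob.exe", "-job", str(job_id), "-ma", ma_name]
    if instance is not None:
        argv += ["-vm", instance]
    if archive_bit_flag is not None:
        # The parameter table documents only 1 (set) and 0 (reset).
        if archive_bit_flag not in (0, 1):
            raise ValueError("archive_bit_flag must be 0 or 1")
        argv += ["-flag", str(archive_bit_flag)]
    if tmpdir is not None:
        argv += ["-tmpdir", tmpdir]
    if outfile is not None:
        argv += ["-o", outfile]
    return argv


# Job ID, MediaAgent name, and file name below are placeholder values.
print(list_files_argv(175, "ma1", outfile="files.txt"))
```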

Updating Jobs

Use the following steps to suspend, kill, or resume one or more jobs or to update the reason for suspending, killing, or resuming the jobs.

  1. Download the update_job_template.xml file to the computer where you will run the command.
  2. Open the .xml file and update the XML parameters listed in the table below.
  3. Run the qlogin command to log on to the CommServe computer.
  4. From software_installation_directory/Base type the following command after substituting the XML parameters:

    qoperation execute -af update_job_template.xml

  5. Optional: Verify the jobs were updated by viewing the jobs in the CommCell Console Job Controller window.

The following table displays the XML parameters needed before running the qoperation command:

Parameter | Description of Parameter Values
message | The jobs that will be updated. Valid values are: ALL_JOBS (the operation in the operationType parameter affects all jobs); ALL_SELECTED_JOBS (the operation in the operationType parameter affects the jobs defined in the jobId parameter).
operationDescription | The reason for suspending, killing, or resuming the job.
operationType | The operation to perform on the job. Valid values are: JOB_SUSPEND; JOB_RESUME; JOB_KILL.
jobId | The job IDs for the jobs that will be suspended, killed, or resumed. Use the jobId parameter when the message parameter is set to ALL_SELECTED_JOBS. To add more than one job, add the following line for each job:

<jobs jobId="job_ID" />
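When many jobs are selected, the per-job lines can be generated rather than typed by hand. The following Python sketch produces one `<jobs jobId="..." />` line per job ID; `job_elements` is a hypothetical helper, and where these lines belong inside update_job_template.xml depends on the downloaded template, which is not reproduced here.

```python
from xml.sax.saxutils import quoteattr


def job_elements(job_ids):
    """Return one <jobs jobId="..." /> line per job ID, joined by newlines."""
    return "\n".join("<jobs jobId=%s />" % quoteattr(str(j)) for j in job_ids)


# 175 and 176 are placeholder job IDs.
print(job_elements([175, 176]))
```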