Data Analytics - Configuring Data Analytics

The Analytics Engine collects file and email metadata from the subclient backup data associated with a storage policy. This metadata is used to populate the File Analytics and Email Analytics reports, which you view from the Web Console.

Analytics Engine Requirements

The Analytics Engine can be configured on a MediaAgent with the following requirements:

System Component: Minimum Requirement

Operating System:
  • Microsoft Windows Server 2012 R2 x64 Editions
  • Microsoft Windows Server 2012 x64 Editions
  • Microsoft Windows Server 2008 R2 x64 Editions
CPU: Multi-Core Xeon Class 2
Memory: 16 GB RAM
Available local hard disk space:
  • 20 GB for the installation directory
  • 100 GB of 10K SCSI disk per 100 million files for the Analytics Engine Index Directory

    For example, if you plan to analyze 500 million files with Data Analytics, you should have at least 500 GB of space for the Analytics Engine Index Directory.

Configurations: Clustered MediaAgents are not supported.
Ports: If you have a firewall setup, make sure that port 20000 is open for connections within the network.
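
The sizing rule for the Index Directory can be worked through with a short calculation. The following is a minimal sketch, not part of the product: the function name is hypothetical, and rounding up to whole 100-million-file blocks is an assumption made here to stay on the safe side of the stated rule.

```python
# Sizing rule from the requirements: 100 GB of index space
# per 100 million files analyzed.
GB_PER_BLOCK = 100
FILES_PER_BLOCK = 100_000_000


def estimate_index_space_gb(file_count: int) -> int:
    """Estimate the Index Directory space (in GB) needed for file_count files.

    Assumption: a partial block of files is rounded up to a full block.
    """
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    # Integer ceiling division avoids floating-point rounding.
    blocks = (file_count + FILES_PER_BLOCK - 1) // FILES_PER_BLOCK
    return blocks * GB_PER_BLOCK


if __name__ == "__main__":
    # The example from the requirements: 500 million files.
    print(estimate_index_space_gb(500_000_000), "GB")
```

The 20 GB for the installation directory is separate and is not included in this estimate.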

If the MediaAgent associated with the storage policy on which you want to run analytics does not meet the above requirements, you can add a data path on the storage policy to another MediaAgent that meets the requirements. For instructions, refer to Configure the Analytics Engine.
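
To check the port requirement from another machine on the network, a simple TCP connection test can help. This is a rough sketch under stated assumptions: the function name and the host name in the usage example are hypothetical, and the test only succeeds when a service is actually listening on the port, so a failure can mean either a blocked port or an Analytics Engine that is not yet configured.

```python
import socket

ANALYTICS_PORT = 20000  # port named in the requirements


def port_is_reachable(host: str, port: int = ANALYTICS_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # "mediaagent-b.example.com" is a placeholder host name.
    print(port_is_reachable("mediaagent-b.example.com"))
```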

Installing the Analytics Engine

The Analytics Engine is an indexing engine that supports a variety of CommCell features, including Data Analytics. You must install the Analytics package in your CommCell environment and configure the Analytics Engine from the corresponding MediaAgent properties in the CommCell Console. For information about installing the Analytics package, see Installing the Analytics Engine.

Note: MediaAgents that were installed in your CommCell environment before SP6 already have an Analytics Engine. You can enable and configure the Analytics Engine on these MediaAgents without installing the Analytics package.

Configure the Analytics Engine

Configure the Analytics Engine on the MediaAgent associated with the storage policy on which you want to run analytics.

Before You Begin

  • If you are not the CommCell administrator, you must have a security association that includes a role with the Agent Management or Administrative Management permission and an association with the MediaAgent you select as the engine.

    For information on adding a security association to a user, see Administering the Security Associations of a User.

  • Warning: Do not put the index directory and the Web Server (DM2) cache directory in the same location, because data loss might occur. The location of the Web Server cache is stored on the Web Server client in the following registry key: HKEY_LOCAL_MACHINE\SOFTWARE\CommVault Systems\Galaxy\InstanceNumber\DM2WebSearchServer\szCacheDataFilesPath.
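
Before you choose an index directory, you can compare the candidate path with the Web Server cache path read from the registry location named in the warning. The following is a minimal sketch, assuming both paths are already known as strings; the function names and the example paths are hypothetical. Treating a nested directory as a conflict is a stricter assumption than the warning itself, which only mentions the same location.

```python
import ntpath  # Windows path rules, usable on any platform


def _normalize(path: str) -> str:
    # normpath collapses "." and ".." and converts "/" to "\";
    # normcase lowercases, because Windows paths are case-insensitive.
    return ntpath.normcase(ntpath.normpath(path)).rstrip("\\")


def paths_overlap(index_dir: str, cache_dir: str) -> bool:
    """Return True if the directories are identical or one contains the other."""
    a, b = _normalize(index_dir), _normalize(cache_dir)
    return a == b or a.startswith(b + "\\") or b.startswith(a + "\\")


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    print(paths_overlap(r"D:\AnalyticsIndex", r"E:\DM2Cache"))   # separate drives
    print(paths_overlap(r"D:\DM2Cache\Index", r"D:\DM2Cache"))   # nested
```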

Procedure

  1. Open the CommCell Console and locate the MediaAgent associated with the storage policy on which you want to run analytics as follows:
    1. In the CommCell Browser, click Policies > Storage Policies.
    2. Click the <Storage Policy> on which you want to run analytics.
    3. In the <Storage Policy> tab under the Content section, locate the name of the <MediaAgent> associated with the storage policy in the MediaAgent column.
  2. Proceed as follows:
    If the MediaAgent associated with the storage policy meets the Analytics Engine requirements:
    • Continue to configure the Analytics Engine on the MediaAgent as described in step 3.
    If the MediaAgent associated with the storage policy does not meet the Analytics Engine requirements:
    Add a data path and share a library mount path between a MediaAgent that satisfies the Analytics Engine requirements (MediaAgent-B) and the MediaAgent associated with the storage policy (MediaAgent-A), as follows:
    1. Add a data path from the <Storage Policy> to MediaAgent-B as follows:
      1. In the <Storage Policy> tab, right-click the primary copy and click Properties.
      2. In the Copy Properties dialog box, click the Data Paths tab.
      3. Click Add.
      4. In the Copy Data Path Candidates list, click MediaAgent-B and click OK.

        MediaAgent-B appears in the Data Paths tab.

      5. In the Copy Properties dialog box, click OK.
    2. Share the mount path for the library attached to MediaAgent-B with MediaAgent-A.

      See Share a Mount Path for instructions.

    3. Continue to configure the Analytics Engine on MediaAgent-B as described in step 3.
  3. Open the appropriate <MediaAgent> Properties dialog box as follows:
    1. In the CommCell Browser, click Storage Resources > MediaAgents.
    2. Right-click the <MediaAgent> where you want to configure the Analytics Engine, and then click Properties.
  4. In the MediaAgent Properties for <MediaAgent> dialog box, click the Analytics Engine tab.
  5. Select the Configure / Deconfigure Analytics Engine check box.
  6. In the Index Directory box, type the local path on the <MediaAgent> where the metadata collected during analytics jobs will be stored, or click Browse to select it.

    Select a folder for the Index Directory on the MediaAgent with the following requirements:

    • Do not put the index directory and the Web Server (DM2) cache directory in the same location, because data loss might occur. The location of the Web Server cache is stored on the Web Server client in the following registry key: HKEY_LOCAL_MACHINE\SOFTWARE\CommVault Systems\Galaxy\InstanceNumber\DM2WebSearchServer\szCacheDataFilesPath.
    • Do not select a root directory.
    • The path name must not contain any of the following characters:

      ~!@#$%^&*()_+|}{"?><,/.';[]=-`

    • The folder must have at least 1 TB of available local disk space on 10K SCSI disk.
    • The folder must not contain any data.

  7. In the Available Analytics Engine Type list, click Data Analytics and then click the right-arrow button (>).

    Data Analytics appears in the Configured Analytics Engine Type list.

  8. In the MediaAgent Properties for <MediaAgent> dialog box, click OK.
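
Some of the Index Directory requirements in step 6 can be checked before you type the path. The following is a minimal sketch, not a product feature: the function name and example paths are hypothetical, and the check assumes the character restriction applies to each folder name in the path rather than to the drive letter or the backslash separators. It covers only the root-directory and character rules; the empty-folder and free-space requirements need to be verified on the MediaAgent itself.

```python
import ntpath  # Windows path rules, usable on any platform

# Characters listed in step 6. Curly and straight quotes are both included,
# on the assumption that the curly quotes in the list stand for plain quotes.
FORBIDDEN_CHARS = set("~!@#$%^&*()_+|}{“?><,/.’;[]=-`") | {'"', "'"}


def index_path_problems(path: str) -> list:
    """Return a list of problems found in a candidate Index Directory path."""
    problems = []
    # Separate the drive ("D:") from the rest of the path.
    _drive, rest = ntpath.splitdrive(path)
    # Split on backslashes only, so that "/" is reported as forbidden.
    parts = [p for p in rest.split("\\") if p]
    if not parts:
        problems.append("path is a root directory")
    bad = sorted({ch for part in parts for ch in part if ch in FORBIDDEN_CHARS})
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    return problems


if __name__ == "__main__":
    for candidate in (r"D:\AnalyticsIndex", "D:\\", r"D:\analytics_index"):
        print(candidate, "->", index_path_problems(candidate) or "OK")
```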

Deconfigure the Analytics Engine

If you no longer want to run analytics jobs, deconfigure the Analytics Engine on a MediaAgent from the MediaAgent properties dialog box.

Deconfiguring the Analytics Engine will also delete all associated Data Analytics reports. These reports will no longer be available from the Web Console.

  1. Open the appropriate <MediaAgent> Properties dialog box as follows:
    1. In the CommCell Browser, click Storage Resources > MediaAgents.
    2. Right-click the <MediaAgent> where you want to deconfigure the Analytics Engine, and then click Properties.
  2. In the MediaAgent Properties for <MediaAgent> dialog box, click the Analytics Engine tab.
  3. Clear the Configure / Deconfigure Analytics Engine check box.

    The data in the Index Directory will be pruned according to the retention settings for the <MediaAgent>.

Configuring Analytics for File Access Time

By default, the metadata collected by Data Analytics for files includes modified date, file size, and file type. If you also want to display file access time information in the File Analytics Report, you must configure the file system agent and subclient to preserve this information during file system backup jobs.

Procedure

Perform this procedure on each subclient for which you want to display file access time information in the File Analytics Report.

  1. In the CommCell Browser under Client Computer > client name, click File System > backup set.
  2. In the backup set tab under Subclient Name, right-click the subclient that you want to use and then click Properties.
  3. Click Advanced.
  4. Click the Advanced Options tab and then select the Catalog additional file and system attributes check box.

After the next backup job, the subclient data will contain file access time information. This information will be displayed for the subclient in the File Analytics Report.

Run Analytics as a Job or Workflow

You can analyze the data in your environment in two ways: by running data analytics as a job or as a workflow.