Connecting to a File System with Data Cube

You can use Data Cube to collect, organize, and mine the data residing in various file system repositories across your enterprise.

Before You Begin

  • You must be able to log in to the Web Console to view Data Cube. See Accessing the Web Console.
  • Only users assigned a role with the Data Connectors permission at the MediaAgent level can access Data Cube in the Analytics section of the Web Console. The associated MediaAgent must have been configured with Analytics Engine for Data Cube.
  • By default, Data Cube attempts to access the file system using the credentials of the logged-in Web Console user. Therefore, you must either grant read access to the Web Console user performing this procedure or provide the credentials of a user who is authorized to access the file system.

Procedure

  1. In a Web browser, log in to the Web Console and then click Analytics.
  2. Click File System.
  3. On the Data Sources (File System) page, click Add File System.
  4. On the New Data Source (File System) page, configure the source as follows:
    1. Under Data Source Name:
      • Click the Analytics Engine list and select an Analytics Engine to store the analytics data.
      • In Data Source Name, enter a name for the data source. The name cannot contain spaces.
      • In Data Source Description, enter a description for the data source.
      • Click Next to proceed to the next section.
    2. Under Directory Details:
      • In Directory Paths, enter the paths to the file system directories or files that you want to crawl. You can add multiple paths on separate lines.
      • In User Name, enter a valid username with access to the directory paths.

        Note: If you do not provide a username, the data source will be accessed using the credentials you used to log in to the Web Console.

      • In Password, enter the password for the user name used to access the directory paths.
      • To filter the files and folders that are included in the crawl, in the Include or Exclude fields, enter wildcard patterns that specify the files and folders that you want to include or exclude. Enter multiple filters as a comma-separated list.

        For example, to exclude some common multimedia files, in Exclude enter: *avi, *mpg, *mp3, *mp4, *mov. To exclude a folder named Sample Files, in Exclude enter */Sample Files.

      • To collect only the data that has changed since the previous crawl, enable Incremental Crawl.

        Note: The description below the Incremental Crawl button displays the type of crawl that will be performed for the data source.

      • Click Next to proceed to the next section.
    3. Under Advanced Options:
      • To filter the data included in the crawl by size, select a minimum and maximum value for the File Size option. Only files that are within the bounds of these values will be included in the data source.
      • To index only the metadata of the files, enable Index Only Metadata. When this option is selected, users can search the data by metadata only, not by file contents.

        Note: The description below the Index Only Metadata button displays the type of indexing that will be performed for the data source.

      • To perform a full crawl of the selected data source after you finish creating the source, select Start Crawling Now.
  5. When finished, click Submit.
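
To preview how the Include, Exclude, and File Size settings from the procedure might combine before you start a crawl, a minimal Python sketch follows. It is an illustration only, not Data Cube's implementation: the function names (parse_filters, is_crawled) are hypothetical, matching uses Python's fnmatch rules, exclude filters are assumed to take precedence over include filters, and the size bounds are assumed to be inclusive. Data Cube's actual matching behavior may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def parse_filters(text):
    """Split a comma-separated filter string into trimmed wildcard patterns."""
    return [p.strip() for p in text.split(",") if p.strip()]


def _matches_any(path, patterns):
    """True if the path, or any of its parent folders, matches a pattern."""
    candidates = [path] + [str(p) for p in PurePosixPath(path).parents]
    return any(fnmatchcase(c, pat) for c in candidates for pat in patterns)


def is_crawled(path, size, include="", exclude="", min_size=0, max_size=None):
    """Hypothetical check: would this file pass the configured filters?

    Assumptions (not confirmed by the product documentation):
    exclude beats include, an empty include list admits everything,
    and the size bounds are inclusive.
    """
    if _matches_any(path, parse_filters(exclude)):
        return False
    include_patterns = parse_filters(include)
    if include_patterns and not _matches_any(path, include_patterns):
        return False
    if size < min_size:
        return False
    if max_size is not None and size > max_size:
        return False
    return True


if __name__ == "__main__":
    exclude = "*avi, *mpg, *mp3, *mp4, *mov, */Sample Files"
    for path, size in [
        ("/data/clip.avi", 5_000_000),
        ("/data/Sample Files/readme.txt", 1_200),
        ("/data/report.docx", 48_000),
    ]:
        print(path, is_crawled(path, size, exclude=exclude))
```

Running the script with the exclude filters from the example in the procedure reports that the .avi file and the file under Sample Files are skipped, while report.docx passes. A dry run like this can help you catch an overly broad pattern before you submit the data source.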