Data Load Test

Before you load the complete source data, always run a test load. You can use the configured data source for this; you only have to limit the number of documents to be loaded.

To run a data load test:

 

  • Prerequisite: The application, the index engine, and its meta engine are running.
  1. Open the desired workspace and select the Applications panel.
  2. Click the data source.
  3. From the Actions menu, select Configure.
  4. In the Index engine node, in the Index limit area, select the Stop crawl when number of index documents reached limit check box.
  5. Enter 100 in the Limit field.
  6. Click OK.
  7. Click the data source.
  8. From the Actions menu, select Start.

Check Data Load Test Results

Check all the items in this list:

  • Is the date format shown on the Explore tab of the Axcelerate Ingestion module correct? Look at the Document Date column.
  • Check the Custodian Smart Filter. Is there exactly one name per custodian, or do multiple names appear for the same custodian because of inconsistent spelling?
  • Check the MIME type Smart Filter. Are there any MIME types you wanted to exclude?
  • If you used a date filter: Sort the documents by the Document Date column or conduct a date search using the Date restrictions filter, to verify the dates of the loaded documents correspond to your expectations. Make sure to select the correct date field.
  • Check the Language Smart Filter: Have the expected languages been detected?
  • Check how embeddings and attachments have been split.

    In the MIME type Smart Filter, filter for a critical MIME type, for example, application/msword, and, in the Document Characteristics Smart Filter, filter for With Embeddings. Then click the paperclip icon to include families. This displays all embeddings and attachments below the parent document.

    Repeat this for other critical MIME types.

    Also filter for With Attachments in the Document Characteristics Smart Filter and click the paperclip icon to include families.

  • Check the Exception Type and Exception Class Smart Filters. There may be exceptions that you can avoid in the definite data load.
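
If the metadata of the test documents is available outside the application, for example as a list of records, some of the checks above can be scripted in addition to the Smart Filter review. The following is a minimal sketch under that assumption. It does not use any Axcelerate API; the field names (custodian, mime_type, document_date), the date format, and the sample records are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from datetime import date, datetime
import re


def normalize_name(name):
    """Lowercase the name, drop punctuation, and sort its parts so that
    'Smith, John' and 'John Smith' map to the same key."""
    parts = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(parts))


def custodian_variants(records):
    """Return {normalized name: set of spellings} for every custodian
    that appears with more than one spelling."""
    seen = defaultdict(set)
    for rec in records:
        seen[normalize_name(rec["custodian"])].add(rec["custodian"])
    return {key: names for key, names in seen.items() if len(names) > 1}


def unwanted_mime_types(records, excluded):
    """Return the MIME types that were loaded although they should be excluded."""
    loaded = {rec["mime_type"] for rec in records}
    return sorted(loaded & set(excluded))


def dates_outside_range(records, start, end, fmt="%Y-%m-%d"):
    """Return document dates that fall outside the expected date filter.
    Raises ValueError if a date does not match the assumed format."""
    outside = []
    for rec in records:
        doc_date = datetime.strptime(rec["document_date"], fmt).date()
        if not (start <= doc_date <= end):
            outside.append(rec["document_date"])
    return outside


# Hypothetical sample records standing in for the 100 test documents.
records = [
    {"custodian": "Smith, John", "mime_type": "application/msword",
     "document_date": "2019-03-14"},
    {"custodian": "John Smith", "mime_type": "message/rfc822",
     "document_date": "2019-04-02"},
    {"custodian": "Jane Doe", "mime_type": "image/gif",
     "document_date": "2021-01-05"},
]

variants = custodian_variants(records)
unwanted = unwanted_mime_types(records, excluded=["image/gif"])
late = dates_outside_range(records, date(2019, 1, 1), date(2019, 12, 31))
```

With these sample records, variants contains one entry for the two spellings of the same custodian, unwanted lists image/gif, and late lists the one date outside 2019. The name normalization is deliberately simple and does not catch abbreviations or typos, so it supplements the manual review rather than replacing it.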

Prepare Another Test Run

If you need to change the configuration and rerun the test, you first have to delete all loaded documents.

  1. In the Applications panel of CORE Administration, click on the index engine.
  2. Go to the Explore tab.
  3. From the Actions menu, select Mark documents as deleted.
  4. Go back to the Applications panel and click on the index engine again.
  5. From the Actions menu, select Save.

All documents are deleted now, and you can run a new test.

Prepare the Definite Data Load

Now you know what the data will look like after loading.

Tip: We recommend consulting Customer Support to make sure that all potential issues are taken into account.

Important: Before running the data load, make sure to remove the limit set for data load testing.

 

Disable Postprocessing for Data Load Tests