NSRL/NIST Filter

The NSRL/NIST filter excludes system files from crawling. For litigation purposes, application and operating system files must not be held. The service can be enabled for each crawler.

The National Software Reference Library (NSRL) of the National Institute of Standards and Technology (NIST) provides a regularly updated database of system files together with the hash values that uniquely identify them. This database is loaded as part of the NIST filter service. This service is typically installed within a CORE system.

The crawler computes the hash value of each file that it encounters and sends it to the Recommind NIST filter service. If the hash value matches a hash value in the database, the file is treated as a system file and is filtered out.
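The hash-and-lookup step described above can be illustrated with a minimal sketch. It is not the Recommind implementation: the function names are hypothetical, the filter service is replaced by a local in-memory set, and SHA-1 is assumed as the hash algorithm because the NSRL data sets publish SHA-1 values (the algorithm actually used by the crawler is not stated here).

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def is_filtered(path, known_hashes):
    """True if the file's hash is in the known-system-file set.

    known_hashes is a hypothetical stand-in for the NIST filter service:
    a set of uppercase hex SHA-1 strings loaded from the NSRL database.
    """
    return file_sha1(path) in known_hashes


def crawl(paths, known_hashes):
    """Return only the paths that are not known system files."""
    return [path for path in paths if not is_filtered(path, known_hashes)]
```

Because the decision depends only on the file content hash, a renamed or relocated system file is still filtered, while any file whose hash is missing from the database passes through. This is why an outdated database lets newer system files into the crawl.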

Important: If you use the NIST filter, make sure to regularly update the NIST database.

Note: When you load CSV data, each line of the CSV file must be transferred to the XML document. For this reason, NIST filtering is always disabled for CSV data sources, even if you have enabled it for the application.