PST and OST Archive Repair
By default, MSG files and PST or OST mail archives are parsed using the Microsoft Outlook MAPI.
If the data to be loaded contains PST archive files and Microsoft Outlook 2013 is installed on the crawler server, it is recommended to validate the PST files before loading them.
Microsoft Outlook 2013 aborts if the PST archive or any mail it contains is corrupt. In that case, the complete PST archive is excluded from data loading, and an exception document is created instead, with exception class Corrupt and exception type File corrupt.
Note: This only happens with Microsoft Outlook 2013. With Microsoft Outlook 2010, if a single mail in the PST archive is corrupt, only this mail is excluded from data loading and an exception document is created instead.
However, you can also use the repair tool if you use Microsoft Outlook 2010 for crawling.
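The difference between the two Outlook versions can be illustrated with a small model. This is only a sketch of the behavior described above, not crawler code: the function name, the `LoadResult` type, and the input format (pairs of mail ID and a corrupt flag) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)      # mails that reach data loading
    exceptions: list = field(default_factory=list)  # items replaced by exception documents


def load_pst(archive_name, mails, outlook_version):
    """Model which items are loaded; `mails` is a list of (mail_id, is_corrupt) pairs."""
    result = LoadResult()
    any_corrupt = any(is_corrupt for _, is_corrupt in mails)

    # Outlook 2013: one corrupt mail excludes the complete archive.
    if outlook_version == 2013 and any_corrupt:
        result.exceptions.append(archive_name)
        return result

    # Outlook 2010: only the corrupt mails are excluded.
    for mail_id, is_corrupt in mails:
        if is_corrupt:
            result.exceptions.append(mail_id)
        else:
            result.loaded.append(mail_id)
    return result
```

With one corrupt mail out of two, the 2013 model loads nothing and reports the archive, while the 2010 model loads the intact mail and reports only the corrupt one.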
For OST archive files, the Oracle OutsideIn PST parser is used. Therefore, Microsoft Outlook cannot abort on these files; the only risk is that the archive files themselves are corrupt.
If OST repair is enabled in the data source configuration, the crawler always executes the repair tool on an OST file. For files that are not corrupt, this might slightly reduce crawling performance.
Validation of PST and OST archives takes place with each data load. To resolve issues found by validation, you can enable automatic repair.
Automatic repair
If you enable automatic repair, the repair process is automatically started when you load data and validation detects an issue. The repair tool then gets archive files from the crawler and returns repaired files to the crawler, if a repair is possible.
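The validate-then-repair flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crawler's implementation: `validate` and `repair` are hypothetical callables standing in for the validation step and the configured repair tool, and the returned status strings are made up for this example.

```python
from typing import Callable, Optional, Tuple


def load_archive(path: str,
                 validate: Callable[[str], bool],
                 repair: Callable[[str], Optional[str]],
                 auto_repair: bool = True) -> Tuple[str, str]:
    """Return ("load", file_to_load) or ("exception", original_file)."""
    # Validation runs with each data load.
    if validate(path):
        return ("load", path)

    # Validation found an issue: hand the file to the repair tool if enabled.
    if auto_repair:
        repaired = repair(path)  # assumed to return None when no repair is possible
        if repaired is not None:
            return ("load", repaired)

    # No repair configured, or repair not possible: an exception document results.
    return ("exception", path)
```

Passing the two steps in as callables keeps the sketch independent of any particular repair tool or command line.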
Axcelerate Cloud comes with a custom version of the Outlook Recovery software produced by SysInfoTools. For Axcelerate Cloud, a preconfigured command line in the data source configuration specifies the repair tool.
On-premises installations do not come with an installed repair tool.
You can configure automatic repair for PST, OST, or both, in the data source configuration.
You can also validate and repair PST archives manually, but from an efficiency standpoint, this is not recommended.
If you want to identify corruptions in a PST archive file before starting the data load, run Recommind's validation tool PSTValidator.exe. If a corrupt file is detected, use Microsoft's Inbox Repair tool (scanpst.exe) to repair the respective PST file, or load the data and use automatic repair.
Note: PSTValidator.exe has a lower recognition rate than the default Axcelerate Cloud tool.
Manual repair takes longer than automatic repair through the crawler. When a data source is crawled, all PST and OST files are validated anyway.
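If you only want a quick plausibility check before a data load, a script can test whether each archive starts with the expected file signature. The sketch below assumes that PST and OST files begin with the 4-byte signature `!BDN`, as given in Microsoft's published PST file format. It detects only truncated or mislabeled files and is not a replacement for PSTValidator.exe or the repair tool. The function names are hypothetical.

```python
from pathlib import Path

PST_MAGIC = b"!BDN"  # assumed header signature of PST and OST files


def has_pst_signature(path):
    """Return True if the file starts with the expected 4-byte signature."""
    with open(path, "rb") as f:
        return f.read(4) == PST_MAGIC


def find_suspect_archives(root):
    """List .pst/.ost files below `root` whose header signature is missing."""
    suspects = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and p.suffix.lower() in {".pst", ".ost"}:
            if not has_pst_signature(p):
                suspects.append(p)
    return suspects
```

A file that passes this check can still be corrupt internally, so the result is useful only as a first filter.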