Property Post Processors
The CORE data source crawler features a framework for manipulating metadata. Modules can be added that are triggered under certain conditions and manipulate the content of item properties based on their type and configuration. These modules are called property post processors (PPPs).
The item properties are usually added to metadata fields in an indexed record.
Though the number of PPP types is limited, PPPs can fulfill many different tasks through different configurations. The data source crawler comes with a large number of pre-defined PPPs for diverse use cases. Administrators can also create additional PPPs, drawing from the existing types and configuring them for their specific use case.
The configuration of each PPP consists of two sections:
- a framework configuration section that defines when exactly the PPP is executed on an item
- a type-specific configuration section that defines what the PPP does to the item properties.
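The two-part configuration can be pictured with a small Python sketch. This is not the crawler's actual configuration format or API: the names `PostProcessor`, `applies_to`, `transform`, `enabled`, and `run_pipeline` are hypothetical and only illustrate the split between a framework section (when the PPP runs) and a type-specific section (what it does).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Item = Dict[str, str]  # assumption: an item modeled as a flat property map


@dataclass
class PostProcessor:
    """Illustrative model of a PPP; not the crawler's real data structure."""
    name: str
    # Framework section: decides whether the PPP runs on a given item.
    applies_to: Callable[[Item], bool]
    # Type-specific section: what the PPP does to the item's properties.
    transform: Callable[[Item], Item]
    # Hypothetical switch for disabling a single PPP.
    enabled: bool = True


def run_pipeline(item: Item, processors: List[PostProcessor]) -> Item:
    """Apply each enabled PPP, in order, to items that match its condition."""
    for ppp in processors:
        if ppp.enabled and ppp.applies_to(item):
            item = ppp.transform(item)
    return item


# Hypothetical PPP: lowercase the "author" property when it is present.
lowercase_author = PostProcessor(
    name="lowercase-author",
    applies_to=lambda item: "author" in item,
    transform=lambda item: {**item, "author": item["author"].lower()},
)

result = run_pipeline({"author": "Jane DOE", "title": "Report"}, [lowercase_author])
```

In this sketch an item without an `author` property passes through unchanged, because the framework condition does not match.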
You can disable individual property post processors or all post-processing features.
Note: PPPs change items before they are sent to the index engine. Any change to a PPP therefore requires a re-crawl of the data source.
Changes to PPPs can change the checksum (hash value) that is used for duplicate detection.
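The effect on duplicate detection can be illustrated with a minimal sketch. How the crawler actually computes its checksum is not described here; the example assumes, purely for illustration, a SHA-256 hash over the processed item properties, and `item_checksum` is a hypothetical helper.

```python
import hashlib
import json


def item_checksum(properties):
    """Hash a property map in stable key order (SHA-256 is an assumption)."""
    canonical = json.dumps(properties, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = {"title": "Quarterly Report ", "author": "Jane Doe"}

# Hypothetical PPP that trims whitespace from the title.
processed = {**original, "title": original["title"].strip()}

# The document is the same, but the hashed property values differ,
# so the two checksums no longer match.
checksum_changed = item_checksum(original) != item_checksum(processed)
```

Under this assumption, items indexed before and after the PPP change would carry different checksums for the same document, which is one reason a full re-crawl keeps duplicate detection consistent.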
Important: When you add a PPP, make sure that the Recipient counter PPP remains the last one in the crawler configuration. This PPP requires information that can only be retrieved correctly after all other PPPs have been executed.
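A simple ordering check of this kind could be sketched as follows. It assumes the configured PPPs are available as an ordered list of names; the function `recipient_counter_is_last` and the example PPP names other than "Recipient counter" are hypothetical and not part of the product.

```python
def recipient_counter_is_last(ppp_names, counter_name="Recipient counter"):
    """Return True if counter_name is absent, or appears once as the final entry."""
    if counter_name not in ppp_names:
        return True
    return ppp_names.count(counter_name) == 1 and ppp_names[-1] == counter_name


# Hypothetical configurations, listed in execution order.
good_order = ["Normalize sender", "Extract domain", "Recipient counter"]
bad_order = ["Normalize sender", "Recipient counter", "Extract domain"]

good_ok = recipient_counter_is_last(good_order)
bad_ok = recipient_counter_is_last(bad_order)
```

Running such a check after adding a PPP would flag `bad_order`, where a newly added PPP was placed after the Recipient counter.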