Single-instance Storage

Single-instance storage stores identical items only once per storage handler. It is available for SQL-managed storages.

By default, the complete storage system is set to Auto. With this setting, single-instance storage is enabled for items with the storage file type Native files in writable storages. These items often take up a lot of disk space, especially in Axcelerate Ingestion, where duplicates are common.

You can remove the Auto setting for individual storage handlers.

In addition to native files, single-instance storage may also be useful for:

storage file type Production exports

Single-instance storage does not make sense for:

storage file type Image files
storage file type Document view files
storage file type Redaction files
storage file type Production files

You can enable single-instance storage even after ingestion or publishing. The system then ignores items that were stored before single-instance storage was enabled.

How does single-instance storage work?

To identify duplicate items, the system uses a SHA-2 hash sum calculated from the binary data. The storage handler stores only the first of the duplicate items. This item is referenced in the metadata of all documents that use one of the duplicates.
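The following is a minimal sketch of hash-based duplicate identification, not the product's actual implementation. The class SingleInstanceStore, its in-memory dictionaries, and the document IDs are hypothetical, and SHA-256 is assumed as the SHA-2 variant because the exact variant is not specified here.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory stand-in for a storage handler (hypothetical, not the product API)."""

    def __init__(self):
        self.items = {}     # hash sum -> binary data, stored only once
        self.doc_refs = {}  # document ID -> hash sum of the item it uses

    def put(self, doc_id, data):
        # Assumption: SHA-256, one member of the SHA-2 family.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.items:
            # Only the first of the duplicate items is actually stored.
            self.items[digest] = data
        # Each document's metadata references the single stored item.
        self.doc_refs[doc_id] = digest
        return digest


store = SingleInstanceStore()
first = store.put("doc-1", b"identical binary data")
second = store.put("doc-2", b"identical binary data")
other = store.put("doc-3", b"different binary data")
stored_item_count = len(store.items)
```

In this sketch, doc-1 and doc-2 end up referencing the same stored item, so three documents occupy only two storage entries.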

The number of references to each stored item is kept in the database, too. Whenever a document in a project is deleted, or replaced in such a way that the hash sum changes, this number is reduced. The actual item is removed from the storage only when no document references it anymore.
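The reference handling described above can be illustrated with a small reference-counting sketch. All names (RefCountedStore, put, delete, _release) are hypothetical, the dictionaries stand in for the database and the storage, and SHA-256 is again an assumed SHA-2 variant.

```python
import hashlib


class RefCountedStore:
    """Toy model of reference handling (hypothetical, not the product API)."""

    def __init__(self):
        self.items = {}       # hash sum -> binary data (the storage)
        self.ref_counts = {}  # hash sum -> number of referencing documents
        self.doc_refs = {}    # document ID -> hash sum of its item

    def put(self, doc_id, data):
        digest = hashlib.sha256(data).hexdigest()  # assumed SHA-2 variant
        old = self.doc_refs.get(doc_id)
        if old == digest:
            return digest  # replaced with identical content: nothing changes
        if old is not None:
            self._release(old)  # replacement changed the hash sum
        if digest not in self.items:
            self.items[digest] = data
        self.ref_counts[digest] = self.ref_counts.get(digest, 0) + 1
        self.doc_refs[doc_id] = digest
        return digest

    def delete(self, doc_id):
        self._release(self.doc_refs.pop(doc_id))

    def _release(self, digest):
        self.ref_counts[digest] -= 1
        if self.ref_counts[digest] == 0:
            # No document references the item anymore: remove it from storage.
            del self.ref_counts[digest]
            del self.items[digest]


store = RefCountedStore()
h = store.put("doc-1", b"shared data")
store.put("doc-2", b"shared data")
store.delete("doc-1")
still_stored = h in store.items       # doc-2 still references the item
store.put("doc-2", b"new data")       # replacement changes the hash sum
removed = h not in store.items        # last reference is gone
```

Deleting doc-1 only lowers the count. The item disappears from the toy storage when doc-2, the last document referencing it, is replaced with different content.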

Note: Do not confuse single-instance storage with document de-duplication or duplicate detection during publishing. You can use single-instance storage and, at the same time, publish duplicates.