mergeCsv script

Synopsis – Usage information

mergeCsv.bat|mergeCsv.sh [-help]

Synopsis – mergeCsv

mergeCsv.bat|mergeCsv.sh

-project |-p <project name> [-host <host name>]
[-port <port number>] [-servicePort <port number>]
-session <session ID> or -user |-u <user name> -password|-pw <password>
[-dump]
-file <filename> [-expandFamily] [-dryRun]
[-noFlushBefore] [-lockChanges] [-noFlushAfter]
-append [-noDedup] [-forceChange] | -replace [-nullValue <null value keyword>]
[-dontChangeProtected]
[-noUniqueMatch]
[-maxCategoryLength <number of characters>]
[-allowUpdateMultipleDocs]
[-fieldsForDisplayNameAsId <CSV field> [<CSV field>] ...] [-displayNameMappingErrorFileName <exception log file name>]
[-csvFieldNative <csv field name for path to native file>] [-nativeBasePath <filepath>]
[-csvFieldImage <csv field name for path to image files>] [-imageBasePath <filepath>]
[-storeOriginal <undo file>] [-exceptions <log file>] [-showMatches <matches log file>]
-idCsv <csv field name> -idEngine <engine field name>
[-idPrefix <prefix>] [-idPostfix <postfix>]
-fields <csvFieldName> <engineFieldName> [<csvFieldName> <engineFieldName>] ...
-fieldSeparator <separator> [-lineSeparator <separator>]
[-multiValueSeparator <separator>]
[-textIndicator <text indicator>]
[-lineSeparatorForText <separator>]
[-fileIndicator <file reference indicator>]
[-charset <charset>]

Reference of basic parameters

These parameters can be used by every script. The synopsis shows which of them this script actually uses.
Parameter Value Description
-help   Displays information on command line utility.
-dump   Tests the command without executing any commands in the CORE system. The expected result is displayed in the command prompt so that you can check whether it meets your purpose. The -dump parameter must be added after all other parameters at the end of the command line.
-project <project name> The engine name.
Tip: The name has to be entered exactly as returned by servicecontrol -user admin -password <password> -getRunningProcesses.
-application <internal application identifier> Internal identifier as shown on the application’s Details tab in CORE Administration.
-host <host name> Host ID of the CORE project. If omitted, the local host is assumed.
-port <port number> RMI port of the CORE project. If omitted, port 1099 or the port specified as %MINDSERVER_PORT% is used.
-servicePort <port number> RMI port of the launcher service. If omitted, port 1099 or the port specified as %MINDSERVER_PORT% is used.
-session <session ID> Session ID of a previous command; mandatory if no user and password parameters are given. This only works if a previous script command did not close the session, i.e., if that command contained -noLogout. The session ID is displayed in the command line tool after a script has finished.
-user <username> Mandatory, if no session ID is provided.
-password <password> Mandatory, if no session ID is provided.
-noLogout   By default, a session is closed when a script is finished. With this parameter, the session is kept open. This allows the following script command to use the -session parameter. The session is closed if a new script command containing -user and -password parameters and not containing -noLogout is run.
-allowleadinglongdash   By default, the script execution is aborted if any option is introduced with a long dash instead of the plain hyphen-minus sign - (Unicode 002D). To allow such other dashes for introducing options, add this parameter to the command.

Reference of functions

One or more of the following function parameters may be specified.

Parameter Value Explanation/Remarks
-file <filename> This mandatory parameter specifies the CSV file containing the changes to import.
-expandFamily Activates family expansion, i.e., the defined changes are applied to the whole attachment family of the matching document.

Optional.

Note: Only the first match will be executed. If a document belongs to the attachment families of two documents that match different lines in the CSV, the first match wins.

-dryRun   Switches to the dry run test mode that does not apply any changes.

Optional.

-noFlushBefore   Deactivates the flush buffer operation that normally is executed before import starts.

Optional.

-lockChanges   Blocks all engine operations until the import has finished. Any API call is suspended until then, ensuring that no other process can change a document while the import is changing it.

Optional.

-noFlushAfter   Deactivates the flush buffer operation that normally is executed after import has finished.

Optional.

-append   If this switch is added, the values from the CSV file are added to existing values.

The command line must contain either -append or -replace. For native file or image updates, -append is ignored.

-noDedup   Only used in append mode: Ensures that duplicate entries are not removed.

Example: If you merge the values Yes and Maybe into a field with the values Yes and Uncertain, with -noDedup, the field will contain these values: Yes, Maybe, Yes, Uncertain. Without -noDedup, it will contain these values: Maybe, Yes, Uncertain.
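
The example above can be reproduced with a short Python sketch. It only illustrates the documented outcome and is not the engine's implementation. The helper name append_values and the ordering rule (CSV values first, then the existing values) are assumptions inferred from the example.

```python
def append_values(existing, incoming, no_dedup=False):
    """Model append mode as described in the example (hypothetical helper)."""
    if no_dedup:
        # -noDedup: keep every value, including duplicates.
        return list(incoming) + list(existing)
    # Default: drop CSV values that the field already contains.
    fresh = []
    for value in incoming:
        if value not in existing and value not in fresh:
            fresh.append(value)
    return fresh + list(existing)


existing = ["Yes", "Uncertain"]   # values already in the field
incoming = ["Yes", "Maybe"]       # values from the CSV file

print(append_values(existing, incoming, no_dedup=True))
print(append_values(existing, incoming))
```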

-forceChange  

Enforces changing a field content even if the old and the new value are the same.

This can be used if related values have to be fixed, e.g. the true value in the rm_hastext field or the Copied value in the rm_copy field, or if a field configuration was changed.

-replace   Replaces the existing values of the respective document fields with the values from the CSV file.

If -nullValue is not used, empty CSV fields delete the existing CORE field values.

Can be used for native file and image updates and removal.

-nullValue <null value keyword> If you use -replace in combination with -nullValue, you can control how empty CSV fields are treated. If the CSV field contains the null value keyword, existing CORE field values are deleted. If the CSV field is empty, existing CORE field values are not changed.
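
The interaction between -replace and -nullValue can be summarized as a small decision function. The following Python sketch is only a model of the rules stated above. The helper name replace_value, the keyword "NULL", and the use of None to represent a deleted value are assumptions made for illustration.

```python
def replace_value(existing, csv_cell, null_keyword=None):
    """Return the field value after a -replace merge (hypothetical helper).

    None stands for "value deleted"; null_keyword=None means that
    -nullValue was not given on the command line.
    """
    if null_keyword is None:
        # Without -nullValue, an empty CSV field deletes the existing value.
        return None if csv_cell == "" else csv_cell
    if csv_cell == null_keyword:
        # The keyword explicitly requests deletion.
        return None
    if csv_cell == "":
        # With -nullValue, an empty CSV field leaves the value unchanged.
        return existing
    return csv_cell


print(replace_value("old", ""))                       # deleted
print(replace_value("old", "", null_keyword="NULL"))  # unchanged
print(replace_value("old", "NULL", null_keyword="NULL"))
```
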
-dontChangeProtected   Specifies that the operation should not change documents that are protected in Axcelerate ECA & Collection because they have been published.
-noUniqueMatch   The content of the CSV key field does not have to match the complete content of the corresponding document field, only a part of it. That part must match exactly. If specified, a document field containing "ABC DEF" will match the CSV field content "ABC".

This option can be used for all CORE fields, except for the document ID.

-maxCategoryLength <number> Maximum number of characters per field value. If exceeded, the merge of the respective row will fail. The default value is 128. Use 0 or a negative value to disable the length limitation.
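
The length rule can be expressed as a simple check. This Python sketch is only an illustration. The helper name value_fits is hypothetical, and counting characters with len() is an assumption about how the script measures length.

```python
DEFAULT_MAX_CATEGORY_LENGTH = 128  # default stated in the description above


def value_fits(value, max_length=DEFAULT_MAX_CATEGORY_LENGTH):
    """Return True if a field value passes the length limit (hypothetical helper)."""
    if max_length <= 0:
        # 0 or a negative value disables the limitation.
        return True
    return len(value) <= max_length


print(value_fits("a" * 128))      # exactly at the default limit
print(value_fits("a" * 129))      # one character too long
print(value_fits("a" * 500, 0))   # limit disabled
```
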
-allowUpdateMultipleDocs   Allows fields to be updated even if multiple matching documents are found.

If this option is not used, only the first matching document will be updated, and there will be an exception for the matching documents that were not updated.

-fieldsForDisplayNameAsId <CSV field> [<CSV field>] ... List of all fields that contain value display names. These fields must be mapped to an index engine field in the -fields parameter.

If an unambiguous mapping of field value names is possible, values will be mapped. They can be in any level of a hierarchical field.

If a new field value is added with the script, it will always be placed at the root level of a hierarchical field.

Note: For hierarchical fields, it is recommended to create new values in the desired place before running the script.

-displayNameMappingErrorFileName <exception log file name> File for logging mapping exceptions in case of ambiguous display names. If value mapping exceptions occur, you can use this file to resolve them.
-csvFieldNative <csv field name for path to native file> Specifies which column in the csv file contains native file location. This setting also configures the system to use the correct native storage handler in Axcelerate Review & Analysis.
-nativeBasePath <filepath> Base path relative to which native file locations are given.
-csvFieldImage <csv field name for path to image files> Specifies which column in the csv file contains image file locations. This setting also configures the system to use the correct image storage handler in Axcelerate Review & Analysis.
-imageBasePath <filepath> Base path relative to which image file locations are given.
-storeOriginal <file name> Populates a CSV file with the original values of all fields that were replaced or appended to. This file can be used to undo changes.
-exceptions <file name> Creates an exception log file for changes. In particular, the log file lists all CSV file rows that have not led to any changes.
-showMatches <file name> This file logs matches between rows in the CSV file and IDs in the engine.

Reference of parameters

The following parameters define the content that is changed.

Parameter Value Explanation/Remarks
-idCsv <CSV ID field> Specifies the field in the CSV file that is used to find matching documents.

Mandatory.

-idEngine <engine ID field> Specifies the internal field name of the CORE field that is used to find matching documents.

Mandatory.

-idPrefix <prefix> String of characters that is added in front of the ID key from the CSV file before mapping it against the engine ID field.
-idPostfix <postfix> String of characters that is added at the end of the ID key from the CSV file before mapping it against the engine ID field.
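
How the prefix and postfix combine with the CSV key can be sketched in a few lines of Python. The helper name build_lookup_key and the sample values are hypothetical. The sketch only shows the concatenation described above, not the engine's matching logic.

```python
def build_lookup_key(csv_id, prefix="", postfix=""):
    """Build the key that is compared with the engine ID field (hypothetical helper)."""
    # -idPrefix goes in front of the CSV key, -idPostfix at the end.
    return f"{prefix}{csv_id}{postfix}"


# Hypothetical values: the CSV file holds "443", the engine field holds "BAB_443.txt".
print(build_lookup_key("443", prefix="BAB_", postfix=".txt"))
```
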
-fields <csvFieldName> <engineFieldName> [<csvFieldName> <engineFieldName>] ... Introduces a list of field mapping pairs, each defining a source CSV field and a target CORE field.
-fieldSeparator <separator> Character or character sequence that separates the fields in the CSV file.

Mandatory.

-lineSeparator <separator> Character or character sequence that separates lines in the CSV file, if this is not the default CR/LF used in Windows files.
-multiValueSeparator <separator> Character or character sequence that separates multiple values within the same field in the CSV file. If omitted, a semicolon will be assumed as delimiter for structured view field values, and blanks will be assumed as value delimiters for non-structured view fields, e.g., comment fields.
-textIndicator <text indicator> Character that is used to enclose CSV file text. Separator characters that are enclosed with the text indicator are not interpreted as separators. If omitted, the default text indicator is double quotes.
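
The effect of the field separator and the text indicator can be checked on a sample file before running the merge. The following Python sketch uses the standard csv module as a stand-in parser. The script's own parser may behave differently in edge cases, and the sample data is made up for illustration.

```python
import csv
import io

# Hypothetical sample: ";" as field separator, double quotes as text indicator.
sample = 'ID;Comment\nBAB_443;"first part; second part"\n'

reader = csv.reader(io.StringIO(sample), delimiter=";", quotechar='"')
rows = list(reader)

# The semicolon inside the quoted text is not treated as a separator.
for row in rows:
    print(row)
```
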
-lineSeparatorForText <separator> Line separator used in text fields. If omitted, the default \n is used.

This parameter allows the use of Unicode, which is suited for large Chinese data sets. Unicode requires less space than UTF-8.
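
The remark about space can be illustrated with a short Python sketch, under the assumption that "Unicode" here refers to a two-byte encoding such as UTF-16. The source does not name the encoding, so this is only one possible reading.

```python
text = "中文文档"  # four Chinese characters, chosen for illustration

utf8_size = len(text.encode("utf-8"))       # three bytes per character here
utf16_size = len(text.encode("utf-16-le"))  # two bytes per character, no byte order mark

print(utf8_size, utf16_size)
```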

-fileIndicator <file reference indicator> Prefix that indicates that a cell of a full text column references a text file. In the CSV file, the prefix must occur immediately before every referenced file path.
-charset <character set> If the default UTF-8 is not used, specify the character set.

Example:  

In this example, the Places CSV field is mapped to a custom CORE field with the internal field name rm_cities.

CSV file to be merged:

Places,"URI (Id)"
Paris,"file:/D:/data/some%20text%20files/more%20text.txt?ds=some_text_files"
London,"file:/D:/data/some%20text%20files/some%20text.txt?ds=some_text_files"
Lyon,"file:/D:/data/some%20text%20files/additional%20text.txt?ds=some_text_files"
Washington,"file:/D:/data/some%20text%20files/bilingual.txt?ds=some_text_files"

Script command line:

mergeCsv -fieldsForDisplayNameAsId Places -p singleMindServer.Minerva_North_Americ -u admin -pw adm1n -file "places_for_merging.csv" -idcsv "URI (Id)" -idengine uri -fields Places rm_cities -fieldseparator "," -append -exceptions myMergeExceptions.csv -showMatches myDocument.csv -displayNameMappingErrorFileName "D:\myValueMergeMappingErrors.csv"

Example:  

The CSV TextPath column is mapped to the CORE p field. This column contains references to text files. Each reference is preceded by FILEPATH as the file reference indicator. The text replaces any text existing in the CORE field.

CSV file to be merged:

ID;TextPath
BAB_443;FILEPATH\\Server\FULLTEXT\00\83\BAB_443.txt
BAB_565;FILEPATH\\Server\FULLTEXT\00\83\BAB_565.txt

Script command line:

mergecsv -user admin -password <PASSWORD> -project singleMindServer.12345 -file "\\Server\Fix.csv" -idCsv ID -idEngine ID -fileIndicator "FILEPATH" -fields "TextPath" p -fieldSeparator ";" -replace
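
How a file reference indicator separates file references from literal text can be sketched in Python. The helper name resolve_cell is hypothetical, and the sketch only models the prefix rule described for -fileIndicator. It does not read the referenced file.

```python
def resolve_cell(cell, file_indicator="FILEPATH"):
    """Return (is_file_reference, value) for a full text cell (hypothetical helper)."""
    if cell.startswith(file_indicator):
        # The path follows immediately after the indicator.
        return True, cell[len(file_indicator):]
    # No indicator: the cell content itself is the text.
    return False, cell


print(resolve_cell(r"FILEPATH\\Server\FULLTEXT\00\83\BAB_443.txt"))
print(resolve_cell("Plain text stored directly in the cell"))
```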

Example:  

CSV file to be merged:

ID;NativeFile
BAB_443;c:\documents\meeting notes.txt
BAB_565;c:\documents\agenda.doc

Script command line:

mergecsv -user admin -password <PASSWORD> -project singleMindServer.12345 -file csvmergefile.txt -idCsv "ID" -idEngine "rm_numeric_identifier" -csvFieldNative "NativeFile" -fieldSeparator ";" -replace -exceptions "csvmerge50.txt"

 


Copyright © 2018 Open Text. All Rights Reserved. Trademarks owned by Open Text.