Page Properties

Product Version
Report Note
Assignee: Antje

Resources & Remarks


Modification History

Name | Date | Product Version | Action
Antje Oelschlägel | 16 DEC 2022 | | created


...

Service Name: contentanalyzer
Port Range: 7430-7439
Profiles: prod, docker, kubernetes, metrics
Helm Chart: yuuvis

Function

In the default configuration, each binary content file imported to yuuvis® Momentum passes through the CONTENTANALYZER service. The service determines the mime type of the file and, for the most common file types, extracts the contained text.
>> Basic Use Case Flows

...

As of 2023 Summer, the length of the extracted full-text is limited by the configurable parameter extraction.maxTextLengthInKB (default: 2048 kB), as described below. This limit prevents overloads and downtimes of the CONTENTANALYZER service caused by very large content files that contain a lot of text.
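
The effect of such a limit can be illustrated with a minimal sketch. This is not the service's implementation: the function name extract_with_limit, the chunked input, and the interpretation of 1 kB as 1024 bytes of UTF-8 text are assumptions made for illustration only.

```python
from typing import Iterable


def extract_with_limit(chunks: Iterable[str], max_text_length_kb: int = 2048) -> str:
    """Collect extracted text chunks until the size limit is reached.

    Hypothetical helper: assumes the limit counts UTF-8 bytes and that
    1 kB equals 1024 bytes. The real service may measure differently.
    """
    limit = max_text_length_kb * 1024
    parts = []
    used = 0
    for chunk in chunks:
        data = chunk.encode("utf-8")
        if used + len(data) >= limit:
            # Keep only what still fits. errors="ignore" drops a
            # multi-byte character that was split at the cut.
            parts.append(data[: limit - used].decode("utf-8", errors="ignore"))
            break  # stop extraction; text collected so far is kept
        parts.append(chunk)
        used += len(data)
    return "".join(parts)
```

In this sketch, extraction stops as soon as the limit is reached, and the text collected up to that point is returned instead of being discarded.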

...

Configuration

The default behavior of the CONTENTANALYZER service can be changed via the serviceConfiguration.json configuration file. The file defines conditions under which the content and/or the mime type are analyzed. If a condition matches during an import process, the corresponding analysis is performed.
>> serviceConfiguration.json

Note: Within each import request body, this configuration can be overridden by specifying the options parameters accordingly. The analysis of the content and/or mime type can thus be requested or suppressed even if serviceConfiguration.json configures the opposite behavior.

In addition, the following parameters can be set in a service-specific configuration file for the CONTENTANALYZER service:

Parameter: extraction.exclusiveOfficeLock
Type: boolean
Default: false
Description: Text extraction from large binary content files of Microsoft Office file types might require the full memory of the CONTENTANALYZER service for each single file. If set to true, all other text extraction processes wait until the processing of the Microsoft Office file is completed.
Note: Sufficient RAM is still required for the CONTENTANALYZER service.

Parameter: extraction.maxTextLengthInKB
Type: integer
Default: 2048
Description: Available as of 2023 Summer. Limit for the length of the extracted full-text. If an extraction process reaches the limit, it is stopped and the full-text extracted up to that point is stored.

Parameter: mimetype.extension.redetection
Type: comma-separated list of mime types
Default: 'image/x-portable-greymap'
Description: By default, the mime type is determined by analyzing the binary content itself. If a mime type determined this way is wrong, the file can be reanalyzed taking the file extension into account. This parameter lists the mime types for which this second analysis step is triggered.
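
The two-step mime type detection can be sketched roughly as follows. This is a simplified illustration, not the service's implementation: the names REDETECTION_TYPES and resolve_mime_type are hypothetical, and the second step here looks only at the file extension by means of Python's standard mimetypes module, whereas the service may combine the extension with the content analysis.

```python
import mimetypes

# Hypothetical stand-in for the configured list of mime types that
# trigger the second, extension-aware analysis step.
REDETECTION_TYPES = {"image/x-portable-greymap"}


def resolve_mime_type(detected: str, filename: str) -> str:
    """Return the final mime type for a file.

    detected: result of the content-based analysis (first step).
    filename: original file name, used only if redetection is triggered.
    """
    if detected not in REDETECTION_TYPES:
        # First-step result is trusted for all unlisted mime types.
        return detected
    # Second step (simplified): guess from the file extension.
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to the first-step result if the extension is unknown.
    return guessed or detected
```

Only mime types contained in the list are ever reconsidered, so a content-based result for any other type is left untouched regardless of the file name.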

...