Page Properties

Product Version
Report Note
Assignee	Antje

Resources & Remarks

Modification History

Name	Date	Product Version	Action
Antje Oelschlägel	16 DEC 2022		created

...

Excerpt
This core service is responsible for the analysis of binary content files during their import into yuuvis® Momentum.

Characteristics

...

Service Name	`contentanalyzer`
Port Range	`7430-7439`
Profiles	`prod, docker, kubernetes, metrics`
Helm Chart	`yuuvis`

Function

In the default configuration, each binary content file imported into yuuvis® Momentum passes through the CONTENTANALYZER service. Its mime type is determined and the contained text is extracted for the most common file types.
>> Basic Use Case Flows

...

The CONTENTANALYZER can extract the text contained in an imported binary content file. The extracted plain text is stored as text rendition in the search index and is used for the full-text search together with all values of string properties within the metadata.

As of 2023 Summer, the length of the extracted full text is limited by the configurable parameter `extraction.maxTextLengthInKB` described below. This prevents overloads and downtime of the CONTENTANALYZER service caused by very large content files that contain a lot of text.
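The effect of such a length limit can be sketched in a few lines of Python. This is only an illustration of the documented behavior (stop at the limit, keep the text extracted so far), not the service's implementation. The function name `extract_with_limit` is hypothetical, and counting the limit in UTF-8 bytes with 1 KB = 1024 bytes is an assumption.

```python
from typing import Iterable

# Documented default of extraction.maxTextLengthInKB.
DEFAULT_MAX_TEXT_LENGTH_KB = 2048


def extract_with_limit(chunks: Iterable[str],
                       max_kb: int = DEFAULT_MAX_TEXT_LENGTH_KB) -> str:
    """Collect extracted text chunks until the size limit is reached.

    Assumption: the limit is measured in UTF-8 bytes (1 KB = 1024 bytes).
    """
    limit = max_kb * 1024
    collected = []
    used = 0
    for chunk in chunks:
        data = chunk.encode("utf-8")
        remaining = limit - used
        if len(data) >= remaining:
            # Limit reached: keep what still fits, then stop the extraction.
            # errors="ignore" drops a multi-byte character cut in half.
            collected.append(data[:remaining].decode("utf-8", errors="ignore"))
            break
        collected.append(chunk)
        used += len(data)
    return "".join(collected)
```

Because the loop stops as soon as the limit is reached, the remaining chunks of a very large file are never processed. That is the property that protects the service from overload.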

The file types for which text extraction is available include:

...

MS Office Word 97-2016

...

MS Office PowerPoint 97-2016

...

MS Office Excel 97-2016

...

The supported file types are listed here:
>> Renditions

The text rendition can be retrieved via the following endpoint:
>> GET /api/dms/objects/{objectId}/contents/renditions/text
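As a minimal sketch, a request for the text rendition could be prepared with the Python standard library as follows. The helper name `build_text_rendition_request`, the base URL, and any authentication headers are assumptions and depend on your installation. Only the endpoint path is taken from this page.

```python
from typing import Dict, Optional
from urllib.parse import quote
from urllib.request import Request


def build_text_rendition_request(base_url: str,
                                 object_id: str,
                                 headers: Optional[Dict[str, str]] = None) -> Request:
    """Prepare a GET request for the text rendition of a DMS object.

    base_url and headers are placeholders for installation-specific values.
    """
    # Percent-encode the object ID so it is safe as a single path segment.
    path = f"/api/dms/objects/{quote(object_id, safe='')}/contents/renditions/text"
    return Request(base_url.rstrip("/") + path, headers=headers or {}, method="GET")


# Usage against a running system (not executed here):
# from urllib.request import urlopen
# with urlopen(build_text_rendition_request("https://yuuvis.example.com", "1234")) as resp:
#     text = resp.read().decode("utf-8")
```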

Mime Type

...

Determination

The CONTENTANALYZER service is responsible for the determination of the mime type of binary content files that are imported into yuuvis® Momentum. The determined mime type is stored in the content stream properties section of the corresponding DMS object.

Configuration

The default behavior of the CONTENTANALYZER service can be changed via the serviceConfiguration.json configuration file. Whether the content and/or mime type is analyzed depends on the conditions defined there. If a condition matches during an import process, the content and/or mime type will be analyzed.
>> serviceConfiguration.json

Note: Within each import request body, this configuration can be overridden by specifying the options parameters accordingly. The analysis of content and/or mime type can be requested or suppressed even if the opposite behavior is configured in the file serviceConfiguration.json.

Furthermore, it is possible to set the following parameters in a service-specific configuration file for the CONTENTANALYZER:

Parameter	Type	Description	Default
`extraction.exclusiveOfficeLock`	boolean	Text extraction for large binary content files of Microsoft Office file types might require the full memory of the CONTENTANALYZER service for each single file to be processed. If `true`, all other text extraction processes wait until the processing of the Microsoft Office file is completed. Note: Nevertheless, sufficient RAM is required for the CONTENTANALYZER service.	`false`
`extraction.maxTextLengthInKB`	integer	Available as of 2023 Summer. Limit for the length of the extracted full text. If an extraction process reaches the limit, it is stopped and the full text created up to that point is stored.	`2048`
`mimetype.extension.redetection`	comma-separated list of mime types	The standard determination is based on the analysis of the binary content itself. If a determined mime type is wrong, it is possible to reanalyze the file considering the file extension. The mime types for which this second analysis step should be triggered are listed here.	`'image/x-portable-greymap'`
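The two-step determination controlled by `mimetype.extension.redetection` can be illustrated with the following Python sketch. It is not the service's implementation. The function name `determine_mime_type` and the small extension table are hypothetical, and falling back to the content-based result for unknown extensions is an assumption.

```python
import os

# Mime types that trigger the second, extension-based analysis step
# (mirrors the documented default of mimetype.extension.redetection).
REDETECTION_MIME_TYPES = {"image/x-portable-greymap"}

# Hypothetical extension table, for illustration only.
EXTENSION_TO_MIME = {
    ".pgm": "image/x-portable-greymap",
    ".txt": "text/plain",
    ".csv": "text/csv",
}


def determine_mime_type(content_based: str, file_name: str) -> str:
    """Return the final mime type for a file.

    content_based is the result of the first step (analysis of the binary
    content). The file extension is consulted only if that result is listed
    in REDETECTION_MIME_TYPES.
    """
    if content_based not in REDETECTION_MIME_TYPES:
        return content_based
    extension = os.path.splitext(file_name)[1].lower()
    # Assumption: keep the content-based result for unknown extensions.
    return EXTENSION_TO_MIME.get(extension, content_based)
```

In this sketch, a plain text file that the content analysis mistakes for a greymap image is corrected by its `.txt` extension, while results not listed for redetection pass through unchanged.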

Read on

SYSTEM Service

AUTHENTICATION Service

Basic Use Case Flows