...

Characteristics

port range: 7332-7335

service name: controller

profiles: cloud prod,docker,es,oauth2,lc,mq,prodkubernetes


Function

Asynchronous Full-Text Indexing

...

The queue from which the controller-service reads the messages is configured by the parameter textextraction.in-queue and has the default value lc.textextraction (lc - lifecycle) (2.).

application-lc.yml

    textextraction.in-queue: lc.textextraction
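
The default behaviour described above can be illustrated with a minimal sketch. The helper `resolve_in_queue` and the flat settings mapping are assumptions made for illustration only; the actual service presumably resolves the property through its own configuration mechanism.

```python
from typing import Mapping

# Default value as documented for textextraction.in-queue.
DEFAULT_IN_QUEUE = "lc.textextraction"


def resolve_in_queue(settings: Mapping[str, str]) -> str:
    """Return the configured input queue, or the documented default.

    `settings` is a hypothetical flat view of the loaded configuration.
    """
    value = settings.get("textextraction.in-queue", "").strip()
    return value or DEFAULT_IN_QUEUE
```

An empty or missing entry falls back to `lc.textextraction`; any other value overrides it.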

...

The Controller-Service then generates the links for the textextractor-service that correspond to the DmsApiObject contained in the message (3.). The aim is for the textextractor-service to remain unaware of the rest of the system: the links carry, as query parameters, all the information required to retrieve the content or to save the extracted text. By default, the Controller-Service generates links that the textextractor-service must resolve via the Discovery-Service (how this works can be read here). This allows the Controller-Service to be scaled sensibly, assuming that the textextractor-service is integrated into the services landscape.
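
As a rough illustration of links that carry their context as query parameters, the following sketch builds both link types. The base URL, the paths `/source` and `/target`, and the parameter names (`objectId`, `versionNr`, `tenant`, `contentStreamId`, `contentStreamRange`) are assumptions; only the kind of information carried is taken from this page.

```python
from urllib.parse import urlencode

# Hypothetical logical service name. With discovery-based links, the host part
# would be a name that the textextractor-service resolves at the Discovery-Service.
CONTROLLER_BASE = "http://controller"


def build_source_link(object_id, version_nr, tenant):
    """Link from which the content of the DMS object could be fetched."""
    query = urlencode(
        {"objectId": object_id, "versionNr": version_nr, "tenant": tenant}
    )
    return f"{CONTROLLER_BASE}/source?{query}"


def build_target_link(object_id, content_stream_id, content_stream_range):
    """Link to which the extracted text could be posted."""
    query = urlencode(
        {
            "objectId": object_id,
            "contentStreamId": content_stream_id,
            "contentStreamRange": content_stream_range,
        }
    )
    return f"{CONTROLLER_BASE}/target?{query}"
```

Because every value is URL-encoded into the link, the receiver needs no additional lookup to know which object and which content stream a link refers to.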

...

The Controller-Service generates job messages for the textextractor-service (4.). These messages contain two links and additional properties in a map.
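
A job message of this shape might look like the sketch below. The class name `ExtractionJob`, the JSON key `properties` and the JSON encoding itself are assumptions; the names `sourceLink` and `targetLink` are the ones used on this page.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ExtractionJob:
    """Hypothetical job message: two links plus a map of extra properties."""

    source_link: str
    target_link: str
    properties: Dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialise with the link names used in this documentation.
        return json.dumps(
            {
                "sourceLink": self.source_link,
                "targetLink": self.target_link,
                "properties": self.properties,
            },
            sort_keys=True,
        )
```

A consumer would only need the two links to do its work; the map leaves room for additional hints without changing the message structure.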

...

The content of a DMS object can be retrieved using a GET request to the sourceLink (6.). The controller-service receives the object ID, version number and tenant via the query parameters of the sourceLink and can use this information to retrieve the content of the object from the API gateway and return it to the caller (7.).
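
On the receiving side, reading the three values back out of a sourceLink could be sketched as follows. The parameter names are the same hypothetical ones as above and are not confirmed by this page.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameter names for object ID, version number and tenant.
REQUIRED_SOURCE_PARAMS = ("objectId", "versionNr", "tenant")


def parse_source_link(link: str) -> dict:
    """Extract object ID, version number and tenant from a sourceLink."""
    query = parse_qs(urlsplit(link).query)
    missing = [name for name in REQUIRED_SOURCE_PARAMS if name not in query]
    if missing:
        raise ValueError(f"sourceLink lacks parameters: {', '.join(missing)}")
    # parse_qs returns a list per key; the first value is used here.
    return {name: query[name][0] for name in REQUIRED_SOURCE_PARAMS}
```

With these values, the content could then be requested from the API gateway and streamed back to the caller.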

The extracted text of a DMS object can be saved using a POST request to the targetLink (9.). To do this, the text must be contained in the body of the request. From the query parameters of the targetLink, the controller-service receives the object ID, content stream ID, and content stream range of the corresponding DMS object. To ensure that the content of the object has not changed between the creation of the job message and the time of the request, the Controller-Service retrieves the current metadata for the object ID from Elasticsearch (10.) and compares the content stream ID and content stream range from the targetLink with those from the current metadata (11.). If at least one of the two properties does not match, the Controller-Service aborts the update and returns HTTP status 409 CONFLICT.
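
The comparison step can be sketched as below. The key names and the dictionary representation of the metadata are assumptions, and so is the use of 200 OK for the success case, which this page does not specify.

```python
from http import HTTPStatus

# Hypothetical key names for the two compared properties.
COMPARED_KEYS = ("contentStreamId", "contentStreamRange")


def content_changed(link_params: dict, current_metadata: dict) -> bool:
    """True if at least one compared property differs from the current metadata."""
    return any(
        link_params.get(key) != current_metadata.get(key) for key in COMPARED_KEYS
    )


def status_for_update(link_params: dict, current_metadata: dict) -> HTTPStatus:
    """409 CONFLICT on a mismatch; OK otherwise (the OK status is an assumption)."""
    if content_changed(link_params, current_metadata):
        return HTTPStatus.CONFLICT
    return HTTPStatus.OK
```

A single mismatch is enough to reject the update, which prevents text extracted from an outdated content stream from being stored against the newer content.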

...

If the comparison of the content stream ID and the content stream range shows that the content has not changed in the meantime, the text sent in the body is written to Elasticsearch, into the contentfile field of the object with the corresponding object ID (12.).
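
One way to express such a write is a partial document update. The sketch below only builds the request path and body in the style of the Elasticsearch update endpoint (version 7 or later assumed); the index name and the assumption that the object ID is the document ID are hypothetical, and how the Controller-Service actually performs the write is not described here.

```python
import json


def build_contentfile_update(index: str, object_id: str, extracted_text: str):
    """Return (path, body) for a partial update of the contentfile field.

    Assumes the object ID doubles as the Elasticsearch document ID.
    """
    path = f"/{index}/_update/{object_id}"
    # A partial update only touches the listed field and leaves the
    # remaining metadata of the document unchanged.
    body = json.dumps({"doc": {"contentfile": extracted_text}})
    return path, body
```

Sending this body with a POST request to the returned path would set only the `contentfile` field of the addressed document.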

Processing Error/Success Messages

The textextractor-service writes a success or error message for each full-text extraction it executes. These messages are read by the controller-service (14.), which logs the reports they contain.
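
A handler for these messages could be as simple as the following sketch. The message fields `status`, `objectId` and `report`, as well as the logger name, are assumptions; the actual message format is not shown on this page.

```python
import logging

# Hypothetical logger name.
logger = logging.getLogger("controller.textextraction")


def log_report(message: dict) -> int:
    """Log a result message and return the log level that was used."""
    succeeded = message.get("status") == "success"
    level = logging.INFO if succeeded else logging.ERROR
    logger.log(
        level,
        "text extraction %s for object %s: %s",
        "succeeded" if succeeded else "failed",
        message.get("objectId", "<unknown>"),
        message.get("report", ""),
    )
    return level
```

Anything that is not explicitly marked as a success is logged as an error, so malformed messages do not go unnoticed.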

...