TEXTEXTRACTOR Service
The textextractor service extracts text from the content of a DMS object. It is not concerned with the object itself and does its job without even knowing the ID of the corresponding object.
The textextractor service reads messages from a queue. These messages contain a sourceLink and a targetLink. With the sourceLink, the textextractor service retrieves the content and posts the extracted text to the targetLink after the work is done. Finally, a message is written to a success or error queue, depending on whether the job was successful or not.
Characteristics
port range: 7420-7429
service name: textextractor
profiles: cloud,lc,mq
Function
Reading Messages
The queue from which the textextractor-service reads messages (5.) is configured by the parameter textextraction.job-queue and has the default value lc.textextraction.job.
textextraction.job-queue: lc.textextraction.job
These messages contain sourceLink, targetLink and properties. The properties tell the textextractor-service whether to resolve the links at the Discovery-Service before calling them. In addition, the properties are written to the success or error message, where they are used to identify which message the textextractor-service has processed and whether the processing was successful. An example message can be seen in the Controller-Service description.
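To make the message structure concrete, the following is a minimal sketch of how such a job message could be parsed. It assumes the message body is a JSON object; the field names sourceLink, targetLink, properties and useDiscovery come from the description above, while the class TextExtractionJob, the function parse_job, the example links and the jobRef property are hypothetical and only serve as illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TextExtractionJob:
    """Hypothetical in-memory view of a job message."""
    source_link: str
    target_link: str
    properties: dict = field(default_factory=dict)

    @property
    def use_discovery(self) -> bool:
        # Assumption: the flag may arrive as a JSON boolean or as the string "true".
        return str(self.properties.get("useDiscovery", "false")).lower() == "true"


def parse_job(raw: str) -> TextExtractionJob:
    """Parse a raw message body, assuming it is a JSON object."""
    data = json.loads(raw)
    return TextExtractionJob(
        source_link=data["sourceLink"],
        target_link=data["targetLink"],
        properties=data.get("properties", {}),
    )


# Example body with made-up links and a made-up correlation property.
raw_message = json.dumps({
    "sourceLink": "http://dms.example/content/source",
    "targetLink": "http://controller.example/text/target",
    "properties": {"useDiscovery": True, "jobRef": "abc-123"},
})
job = parse_job(raw_message)
```

The actual message layout is the one shown in the Controller-Service description; this sketch only illustrates the roles of the three fields.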
Getting Content
The content is retrieved via the sourceLink (6.). If the property useDiscovery in the properties map is set to true, the textextractor-service must resolve the sourceLink at the discovery-service before it can call it. Otherwise, the link is called directly.
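The decision described above can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: resolution is modelled as replacing a logical service name in the link's host part with a concrete instance address, and the dictionary REGISTRY stands in for the discovery-service. The service name, the address and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for the discovery-service: logical name -> host:port.
REGISTRY = {"content-provider": "10.0.0.5:8080"}


def resolve_with_discovery(link: str, registry: dict) -> str:
    """Replace the logical service name in the link with a concrete instance."""
    parts = urlsplit(link)
    instance = registry[parts.netloc]  # raises KeyError for unknown services
    return urlunsplit(
        (parts.scheme, instance, parts.path, parts.query, parts.fragment)
    )


def effective_link(link: str, properties: dict, registry: dict) -> str:
    """Return the link to call, resolving it first only if useDiscovery is true."""
    if str(properties.get("useDiscovery", "false")).lower() == "true":
        return resolve_with_discovery(link, registry)
    return link  # called as-is


resolved = effective_link(
    "http://content-provider/content/1", {"useDiscovery": "true"}, REGISTRY
)
direct = effective_link(
    "http://content-provider/content/1", {"useDiscovery": "false"}, REGISTRY
)
```

The same decision applies to any link the service has to call, so keeping it in one helper avoids duplicating the useDiscovery check.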
Extracting Text
The text is extracted from the content (8.). The logic for extracting text is the same as that of the contentanalyzer-service.
Forwarding Extracted Text
The extracted text is passed on to the targetLink (9.). As with the sourceLink, depending on how the property useDiscovery is set, the textextractor-service either calls the targetLink directly or must first have it resolved at the Discovery-Service against a specific controller-service instance.
Writing Success Message
At the end, a success message is written (13.). The message is written to a queue whose name differs from the job queue only by the suffix .success. By default, the queue is called lc.textextraction.job.success. The message contains the initial properties from the message in the job queue.
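The naming rule can be illustrated with a short sketch. The default queue name and the .success suffix are taken from the text above; the function names, the jobRef property and the assumption that the success message consists solely of a copy of the properties are hypothetical.

```python
# Default value of the parameter textextraction.job-queue.
DEFAULT_JOB_QUEUE = "lc.textextraction.job"


def success_queue_name(job_queue: str = DEFAULT_JOB_QUEUE) -> str:
    """Derive the success queue name by appending the .success suffix."""
    return job_queue + ".success"


def build_success_message(properties: dict) -> dict:
    """Copy the initial properties so the original job message stays untouched."""
    return dict(properties)


queue = success_queue_name()
custom_queue = success_queue_name("my.custom.job")
message = build_success_message({"useDiscovery": "true", "jobRef": "abc-123"})
```

Because the name is derived from the configured job queue, changing textextraction.job-queue changes the success queue along with it.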
Writing Error Message
If a text extraction request cannot be executed without errors, a message is written to an error queue (13.). The name of this queue consists of the name of the job queue and the suffix .error, and is therefore lc.textextraction.job.error by default. In addition to the initial properties, the message contains a property reason whose value is an error message.
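How a job outcome might be routed to one of the two queues is sketched below. The suffixes and the reason property come from the description; the function finish_job, the flat message layout (reason placed next to the initial properties), the jobRef property and the use of the exception text as the reason are assumptions made for illustration.

```python
from typing import Callable, Tuple


def finish_job(
    job_queue: str, properties: dict, work: Callable[[], None]
) -> Tuple[str, dict]:
    """Run the job and return the queue name and message describing the outcome."""
    try:
        work()
    except Exception as exc:
        # Error case: initial properties plus a reason carrying the error text.
        message = dict(properties)
        message["reason"] = str(exc)
        return job_queue + ".error", message
    # Success case: only the initial properties are passed on.
    return job_queue + ".success", dict(properties)


def failing_work() -> None:
    # Made-up failure used to demonstrate the error path.
    raise ValueError("unsupported content type")


error_queue, error_message = finish_job(
    "lc.textextraction.job", {"jobRef": "abc-123"}, failing_work
)
ok_queue, ok_message = finish_job(
    "lc.textextraction.job", {"jobRef": "abc-123"}, lambda: None
)
```

A consumer of both queues can then match outcomes to jobs via the properties and read reason only for messages from the error queue.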