Tagging Objects for Processing

Tags let you set up complex stateful processing chains. This tutorial shows how to add, display, and update tags, using a typical application as an example: an import management system, a processing chain in which tags describe the current status of each object.

Introduction

Asynchronous operations are increasingly common in modern data processing chains. Administering such procedures requires a status indicator: the lifecycle of each object has to be documented so that the current state of progress is known and interrupted processes can be resumed. If the object status were stored in the metadata, each pass through a stage of the processing chain would create a complete new version of the object, which would have to be stored in addition to the previous one.

Tags, in contrast, are treated separately. Each tag has a state and a traceId, which provide a basis for external support and a way to control asynchronous operations. This information is stored together with the object but independently of the metadata. Tags can only be assigned to the current version of an object, and managing or modifying them does not create a new version. The processing status assigned in this way reflects a concrete state of the object and serves as the basis for further transactions.

Requirements

This tutorial is intended for experienced users. For the requirements and for the Maven and client configuration, see the "Importing Documents via Core API" tutorial, from which the variable names used here are taken. That tutorial also demonstrates how to handle an IOException.

Application Example – Import Management

Import management allows objects to be classified by type after, and independently of, the actual import process. The first step is a general search for recently created objects that still have to be analyzed. Each object found is flagged with an integer tag named analysis, whose state shows the current progress of the characterization process: waiting to be processed (1) → processing started (2) → processing finished (3) → metadata updated (4).
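
The four states of the analysis tag can be kept readable in client code with a small enum. The following is only a sketch: the enum name AnalysisState and its constant names are hypothetical and not part of the API, which only sees the integer values.

```java
import java.util.Optional;

/** Hypothetical client-side names for the integer states of the "analysis" tag. */
enum AnalysisState {
    WAITING(1),
    PROCESSING_STARTED(2),
    PROCESSING_FINISHED(3),
    METADATA_UPDATED(4);

    private final int code;

    AnalysisState(int code) {
        this.code = code;
    }

    /** Integer value that is sent as the tag state. */
    int code() {
        return code;
    }

    /** Looks up the constant for a tag state read from a response. */
    static Optional<AnalysisState> fromCode(int code) {
        for (AnalysisState state : values()) {
            if (state.code == code) {
                return Optional.of(state);
            }
        }
        return Optional.empty();
    }

    /** Next state in the chain, or empty if the chain is finished. */
    Optional<AnalysisState> next() {
        return fromCode(code + 1);
    }

    public static void main(String[] args) {
        System.out.println(WAITING.next()); // Optional[PROCESSING_STARTED]
    }
}
```

With such an enum, a snippet can pass AnalysisState.PROCESSING_STARTED.code() instead of the bare string "2" when it builds the request URL.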

Further tags will be assigned to the object in order to demonstrate the handling of multiple tags assigned to one object, to explain the tracing for tag operations and to illustrate the behavior of tags during metadata and content updates.

In this tutorial, tag handling is explained by following one object with the arbitrary objectId=1234567812345678 through the processing chain of the import management. The code snippets can be executed in the order in which they are shown if the requirements are met and at least one document has been imported. Replace the example objectId in the code snippets with the actual objectId of your imported document.

Adding Tags to Objects

A general search might deliver a long result list. In this example, one object is selected in order to demonstrate how to add a new tag. The selected object has the objectId=1234567812345678. An analysis tag is added with the current state value 1 (= waiting to be processed). For the import management, this step will be carried out for each individual result of the general search.

The string variables auth, tenant and baseUrl are defined in the client configuration.

String objectId = "1234567812345678";
String tagName = "analysis";
String tagValue = "1";

// All parameters are part of the URL, so the POST body is empty.
RequestBody addTagBody = RequestBody.create(null, new byte[]{});

Request addTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "/state/" + tagValue)
        .post(addTagBody)
        .build();

// Send the request and print the tag table returned by the endpoint.
Response addTagResponse = client.newCall(addTagRequest).execute();
String addTagResponseString = addTagResponse.body().string();
System.out.println(addTagResponseString);
Response
{"objects": [{
  "properties": {
    "system:tags": {
       "value": [
         ["analysis",1,"2021-06-30T15:19:25.370Z","0ff83b2d8c17b704"]
       ]
     }
   }
 }]
}

>> Endpoint details: POST /api/dms/objects/{objectId}/tags/{name}/state/{state}

We add a second tag to the example object in order to demonstrate the handling of multiple tags. The procedure is exactly the same.

Adding tag 'testtag' in addition to tag 'analysis'.
String objectId = "1234567812345678";
String tagName = "testtag";
String tagValue = "7";

RequestBody addTagBody = RequestBody.create(null, new byte[]{});

Request addTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "/state/" + tagValue)
        .post(addTagBody)
        .build();

Response addTagResponse = client.newCall(addTagRequest).execute();
String addTagResponseString = addTagResponse.body().string();
System.out.println(addTagResponseString);
Response
{"objects": [{
  "properties": {
    "system:tags": {
       "value": [
         ["analysis",1,"2021-06-30T15:19:25.370Z","0ff83b2d8c17b704"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"]
       ]
     }
   }
 }]
}

Showing Tags of Objects

To check the current state of the object processing, the tags of a specified object can be viewed. The response will contain the status information for all tags of the object with the given objectId. In particular, the example code below returns the analysis tag together with its state value 1 as well as the testtag tag together with its state value 7.

String objectId = "1234567812345678";

Request getTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags")
        .get()
        .build();

Response tagResponse = client.newCall(getTagRequest).execute();
String tagResponseString = tagResponse.body().string();
System.out.println(tagResponseString);
Response
{"objects": [{
  "properties": {
     "system:tags": {
       "columnNames": ["name","state","creationDate","traceId"],
       "value": [
         ["analysis",1,"2021-06-30T15:19:25.370Z","0ff83b2d8c17b704"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"]
       ]
     }
   }
 }]
}

>> Endpoint details: GET /api/dms/objects/{objectId}/tags
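
To act on the returned status, a client has to read the individual rows of the system:tags table. The following sketch extracts the rows with a regular expression so that it runs with the JDK alone. The class TagRows and its methods are hypothetical helpers, and the pattern assumes rows shaped exactly like the ones in the response above. In a real project, a JSON library is likely the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: extracts tag rows from a system:tags response without a JSON library. */
class TagRows {

    /** One row of the tag table, in the order given by "columnNames". */
    static final class Tag {
        final String name;
        final int state;
        final String creationDate;
        final String traceId;

        Tag(String name, int state, String creationDate, String traceId) {
            this.name = name;
            this.state = state;
            this.creationDate = creationDate;
            this.traceId = traceId;
        }
    }

    // Matches rows such as ["analysis",1,"2021-06-30T15:19:25.370Z","0ff83b2d8c17b704"].
    // The "columnNames" array is skipped because its second element is not a number.
    private static final Pattern ROW = Pattern.compile(
            "\\[\\s*\"([^\"]+)\"\\s*,\\s*(-?\\d+)\\s*,\\s*\"([^\"]+)\"\\s*,\\s*\"([^\"]+)\"\\s*\\]");

    /** Returns all tag rows found in the response body, in their original order. */
    static List<Tag> parse(String responseBody) {
        List<Tag> tags = new ArrayList<>();
        Matcher matcher = ROW.matcher(responseBody);
        while (matcher.find()) {
            tags.add(new Tag(matcher.group(1), Integer.parseInt(matcher.group(2)),
                    matcher.group(3), matcher.group(4)));
        }
        return tags;
    }

    /** Returns the state of the named tag, or empty if the object does not carry it. */
    static OptionalInt stateOf(List<Tag> tags, String name) {
        for (Tag tag : tags) {
            if (tag.name.equals(name)) {
                return OptionalInt.of(tag.state);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String body = "{\"value\": [[\"analysis\",1,\"2021-06-30T15:19:25.370Z\",\"0ff83b2d8c17b704\"]]}";
        System.out.println(stateOf(parse(body), "analysis")); // OptionalInt[1]
    }
}
```

Passing tagResponseString from the snippet above to TagRows.parse would yield two rows, one for analysis and one for testtag.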

Searching Objects and Updating Their Tags

At this point in the import management context, several objects in the system carry a newly added analysis tag with the state 1. They are all waiting for a time-consuming processing step, but only one of them can be handled at a time. To select one object for further processing, a search query is used that returns a limited result list of objects whose analysis tag has the state 1 and was created or modified today or yesterday. The first result in this list is updated to state 2 (= processing started). In the Adding Tags to Objects section, only one object was flagged with the tag, so the following code snippet continues with exactly this object.

String statement = "SELECT * FROM document WHERE system:tags[analysis].(state=1 AND (creationDate=YESTERDAY() OR creationDate=TODAY()))";
String tagName = "analysis";
String newTagValue = "2";

Request queryTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects/tags/" + tagName + "/state/" + newTagValue + "?query=" + statement)
        .post(RequestBody.create(null, new byte[0]))
        .build();

Response queryTagResponse = client.newCall(queryTagRequest).execute();
String queryTagResponseString = queryTagResponse.body().string();
System.out.println(queryTagResponseString);
Excerpt from the response body
{"objects": [{
  "properties": {
    "system:objectId": {
       "value": "1234567812345678"
     },
    ...
    "system:versionNumber": {
       "value": 1
    },
    ...
    ...
    "system:tags": {
       "value": [
         ["analysis",2,"2021-06-30T15:19:27.950Z","0ff83b2d8c17b704"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"]
       ]
     },
     ...
   },
   ...
 }]
}

>> Endpoint details: POST /api/dms/objects/tags/{name}/state/{state}?query=<SQL>
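
The query statement contains spaces, brackets, and equals signs. The snippet above passes it unencoded and leaves the encoding to the HTTP client. If the URL is assembled for a client that expects an already encoded URL, the statement can be percent-encoded explicitly. The following sketch uses only the JDK. The class TagQueryUrl and its methods are hypothetical helper names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch: builds the URL of the "update tag by query" endpoint with an encoded query. */
class TagQueryUrl {

    /** Percent-encodes a value for use in a URL query string. */
    static String encode(String value) {
        // URLEncoder produces form encoding, where a space becomes "+".
        // A literal "+" in the input is already encoded as %2B at this point,
        // so every remaining "+" stands for a space and is replaced by %20.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Assembles the endpoint URL for the given tag, new state, and query statement. */
    static String updateByQueryUrl(String baseUrl, String tagName, int newState, String statement) {
        return baseUrl + "/api/dms/objects/tags/" + tagName + "/state/" + newState
                + "?query=" + encode(statement);
    }

    public static void main(String[] args) {
        // "http://localhost" is a placeholder for the baseUrl of the client configuration.
        System.out.println(updateByQueryUrl("http://localhost", "analysis", 2,
                "SELECT * FROM document WHERE system:tags[analysis].(state=1)"));
    }
}
```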

Updating Tags of Objects

As soon as the processing of the object is finished, the tag should be updated as well. The code below will change the status value of the analysis tag to 3 (= processing finished).

String objectId = "1234567812345678";
String tagName = "analysis";
String newTagValue = "3";

Request updateTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "/state/" + newTagValue + "?overwrite=true")
        .post(RequestBody.create(null, new byte[0]))
        .build();

Response updateTagResponse = client.newCall(updateTagRequest).execute();
String updateTagResponseString = updateTagResponse.body().string();
System.out.println(updateTagResponseString);
Response
{"objects": [{
  "properties": {
     "system:tags": {
       "columnNames": ["name","state","creationDate","traceId"],
       "value": [
         ["analysis",3,"2021-06-30T15:19:28.530Z","faa0cdae5008d7e0"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"]
       ]
     }
   }
 }]
}

>> Endpoint details: POST /api/dms/objects/{objectId}/tags/{name}/state/{state}?overwrite=true

Adding Tags with Specified 'traceId'

To associate multiple tag operations with one overall process, you can specify the traceId. In the code block below, the tracingprocess tag with state value 1 is added to the example object identified by its objectId. In addition, a traceId string is defined and passed to the endpoint in the X-B3-TraceId request header. It is set as the traceId of the new tracingprocess tag.

String objectId = "1234567812345678";
String tagName = "tracingprocess";
String tagValue = "1";
String traceId = "1122334455667788";

Request addTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .header("X-B3-TraceId", traceId)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "/state/" + tagValue)
        .post(RequestBody.create(null, new byte[0]))
        .build();

Response addTagResponse = client.newCall(addTagRequest).execute();
String addTagResponseString = addTagResponse.body().string();
System.out.println(addTagResponseString);
Response
{"objects": [{
  "properties": {
     "system:tags": {
       "columnNames": ["name","state","creationDate","traceId"],
       "value": [
         ["analysis",3,"2021-06-30T15:19:28.530Z","faa0cdae5008d7e0"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"],
         ["tracingprocess",1,"2021-06-30T15:19:30.250Z","1122334455667788"]
       ]
     }
   }
 }]
}

>> Endpoint details: POST /api/dms/objects/{objectId}/tags/{name}/state/{state}
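
In practice, the traceId is usually generated once per overall process rather than hard-coded. The following sketch generates a random ID and builds the same request with the JDK's java.net.http API instead of the OkHttp client used in this tutorial. The class TracedTagRequest and its methods are hypothetical. The format of 16 lower-case hexadecimal characters is an assumption based on the traceId values shown in the responses, which corresponds to the 64-bit form of a B3 trace ID.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.concurrent.ThreadLocalRandom;

/** Sketch: creates a trace ID and a tag request that carries it in the header. */
class TracedTagRequest {

    /** Random 64-bit value rendered as 16 lower-case hexadecimal characters. */
    static String newTraceId() {
        // %016x prints a long as unsigned hex, padded with zeros to 16 digits.
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    /** Builds (but does not send) the request that adds a tag with the given traceId. */
    static HttpRequest addTag(String baseUrl, String auth, String tenant,
                              String objectId, String tagName, int state, String traceId) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/dms/objects/" + objectId
                        + "/tags/" + tagName + "/state/" + state))
                .header("Authorization", auth)
                .header("X-ID-TENANT-NAME", tenant)
                .header("X-B3-TraceId", traceId)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        System.out.println(newTraceId());
    }
}
```

Reusing the same generated value for all tag operations of one import run keeps them under a single trace.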

Updating Tags with Specified 'traceId'

As when adding tags with a specified traceId, a tag update request can carry a traceId in the header. If it does, the query parameter traceIdMustMatch=true can be set, and the update is performed only if the traceId in the header matches the current traceId of the tag. The tag can thus only be updated if its previous traceId is known to the caller. Furthermore, the update operation appears with the same traceId in the audit trail and can therefore be combined with the tag creation operation into one overall process trace.

The code block below shows an update request that sets the tracingprocess tag to the state value 2. Since the traceId variable passed in the X-B3-TraceId header matches the current traceId of the tracingprocess tag, the update succeeds. The traceId remains the same after the tag update.

String objectId = "1234567812345678";
String tagName = "tracingprocess";
String newTagValue = "2";
String traceId = "1122334455667788";

Request updateTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .header("X-B3-TraceId", traceId)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "/state/" + newTagValue + "?overwrite=true&traceIdMustMatch=true")
        .post(RequestBody.create(null, new byte[0]))
        .build();

Response updateTagResponse = client.newCall(updateTagRequest).execute();
String updateTagResponseString = updateTagResponse.body().string();
System.out.println(updateTagResponseString);
Response
{"objects": [{
  "properties": {
     "system:tags": {
       "columnNames": ["name","state","creationDate","traceId"],
       "value": [
         ["analysis",3,"2021-06-30T15:19:28.530Z","faa0cdae5008d7e0"],
         ["testtag",7,"2021-06-30T15:19:26.700Z","4b2448d66e4d02eb"],
         ["tracingprocess",2,"2021-06-30T15:19:30.780Z","1122334455667788"]
       ]
     }
   }
 }]
}

>> Endpoint details: POST /api/dms/objects/{objectId}/tags/{name}/state/{state}?overwrite=true
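
The add and update snippets differ only in the query parameters appended to the same endpoint path. A small helper can assemble the URL so that the flags are not concatenated by hand. This is a sketch. The class TagStateUrl and its method are hypothetical and not part of any client library.

```java
/** Sketch: assembles the URL of the tag state endpoint with optional query flags. */
class TagStateUrl {

    /**
     * Returns the endpoint URL for setting a tag state.
     * The flags are appended only when they are true, as in the snippets above.
     */
    static String of(String baseUrl, String objectId, String tagName, int state,
                     boolean overwrite, boolean traceIdMustMatch) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/api/dms/objects/").append(objectId)
                .append("/tags/").append(tagName)
                .append("/state/").append(state);
        char separator = '?'; // the first parameter starts the query, later ones use '&'
        if (overwrite) {
            url.append(separator).append("overwrite=true");
            separator = '&';
        }
        if (traceIdMustMatch) {
            url.append(separator).append("traceIdMustMatch=true");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // "http://localhost" is a placeholder for the baseUrl of the client configuration.
        System.out.println(of("http://localhost", "1234567812345678", "tracingprocess", 2, true, true));
    }
}
```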

Deleting Tags with Specified 'traceId'

A tag can be deleted with or without the query parameter traceIdMustMatch=true. If the parameter is not specified, the default traceIdMustMatch=false applies and the current traceId of the tag is not checked.

The code block shows a deletion call with traceIdMustMatch=true. The traceId in the header is compared with the current traceId of the tracingprocess tag. If the values match, the tag is deleted.

String objectId = "1234567812345678";
String tagName = "tracingprocess";
String traceId = "1122334455667788";

Request deleteTagRequest = new Request.Builder()
        .header("Authorization", auth)
        .header("X-ID-TENANT-NAME", tenant)
        .header("X-B3-TraceId", traceId)
        .url(baseUrl + "/api/dms/objects/" + objectId + "/tags/" + tagName + "?traceIdMustMatch=true")
        .delete()
        .build();

Response deleteTagResponse = client.newCall(deleteTagRequest).execute();

if (deleteTagResponse.code() == 200) {
    System.out.println("Successfully deleted.");
} else {
    System.out.println("Error while deleting: " + deleteTagResponse.code());
}

>> Endpoint details: DELETE /api/dms/objects/{objectId}/tags/{name}
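
The effect of traceIdMustMatch on update and delete calls can be summarized as a simple rule. The following sketch models that rule on the client side, for example to decide before a call whether it can succeed. The class TraceIdCheck is hypothetical. The actual check is performed by the server, and this model covers only the behavior described in this tutorial.

```java
/** Sketch: client-side model of the traceIdMustMatch rule for update and delete calls. */
class TraceIdCheck {

    /**
     * Returns true if the operation passes the trace ID check.
     * With traceIdMustMatch=false (the default) the trace ID is not compared.
     * With traceIdMustMatch=true the header value must equal the tag's current traceId.
     */
    static boolean passes(String currentTagTraceId, String headerTraceId, boolean traceIdMustMatch) {
        if (!traceIdMustMatch) {
            return true;
        }
        return headerTraceId != null && headerTraceId.equals(currentTagTraceId);
    }

    public static void main(String[] args) {
        System.out.println(passes("1122334455667788", "1122334455667788", true)); // true
        System.out.println(passes("1122334455667788", "faa0cdae5008d7e0", true)); // false
    }
}
```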

Tag Behavior in PATCH Metadata Update

In this section the behavior of tags during PATCH metadata update calls is demonstrated. To learn about requesting the call itself, please refer to the Updating Documents via Core API tutorial.

In a PATCH update, only the properties referenced in the request body are modified. Although tags do not belong to the metadata, they can be modified like metadata and together with the metadata in one call. The tag information is stored in table format. As with metadata table properties, the entire table is therefore replaced during a PATCH update, which means that tags can be added, updated, or removed in one call.

The code block below shows an example request body that updates the state of the analysis tag to 4 (= metadata updated) and adds the new tag contentprocessing:resistant. The suffix resistant indicates that the tag persists when the binary content file assigned to the object is updated. Only the name and state of each tag are read from the PATCH request body. The tag properties creationDate and traceId are determined automatically by the system, and the traceId of each tag is set to the value of the system:traceId metadata property. The example request body contains no row for the testtag tag, so this tag no longer appears in the new tag table.

{
  "objects": [{
    "properties": {
      "Name": {
        "value": "Test 3"
      },
      "system:tags": {
        "value": [
          ["analysis",4],
          ["contentprocessing:resistant",13]
        ]
      }
    }
  }]
}
Excerpt from the response body
{"objects": [{
  "properties": {
    "system:objectId": {
       "value": "1234567812345678"
     },
    ...
    "system:versionNumber": {
       "value": 2
    },
    ...
    "system:traceId": {
      "value": "118343a3fbc940e6"
    },
    "system:tags": {
       "value": [
         ["analysis",4,"2021-06-30T15:19:32.560Z","118343a3fbc940e6"],
         ["contentprocessing:resistant",13,"2021-06-30T15:19:32.560Z","118343a3fbc940e6"]
       ]
     },
     "Name": {
        "value": "Test 3"
      },
   },
   ...
 }]
}

>> Endpoint details: PATCH /api/dms/objects/{objectId}
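
Because the whole tag table is replaced, a client that wants to change one tag and keep the others has to send all of them. The following sketch builds the value array of system:tags from the current tags with one state changed. The class PatchTagTable and its methods are hypothetical, and the serializer assumes tag names that need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch: builds the "value" array of system:tags for a PATCH request body. */
class PatchTagTable {

    /** Copies the current tags and sets one state, so that no existing tag is dropped. */
    static Map<String, Integer> withState(Map<String, Integer> currentTags, String name, int state) {
        Map<String, Integer> next = new LinkedHashMap<>(currentTags);
        next.put(name, state); // adds the tag or replaces its state, keeping the row order
        return next;
    }

    /** Serializes name/state pairs as rows; creationDate and traceId are set by the system. */
    static String toJson(Map<String, Integer> tags) {
        StringJoiner rows = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, Integer> tag : tags.entrySet()) {
            rows.add("[\"" + tag.getKey() + "\"," + tag.getValue() + "]");
        }
        return rows.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> current = new LinkedHashMap<>();
        current.put("analysis", 3);
        current.put("testtag", 7);
        System.out.println(toJson(withState(current, "analysis", 4))); // [["analysis",4],["testtag",7]]
    }
}
```

The current tags can be read beforehand with the GET endpoint shown in the Showing Tags of Objects section.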

Content Update

See how to prepare a call for the update of the binary content file in the Updating Documents via Core API tutorial. The content update creates a new version of the object and deletes all the assigned tags that are not resistant tags.

From our example object, the tag analysis is removed, but the resistant tag contentprocessing:resistant is kept for the new object version. The values of the tag properties remain unchanged. The code block below shows an excerpt of the response including the system:tags tag table after the update of the binary content file.

Excerpt from the response body
{"objects": [{
  "properties": {
    "system:objectId": {
       "value": "1234567812345678"
     },
    ...
    "system:versionNumber": {
       "value": 3
    },
    ...
    ...
    "system:tags": {
       "value": [
         ["contentprocessing:resistant",13,"2021-06-30T15:19:32.560Z","118343a3fbc940e6"]
       ]
     },
     "Name": {
        "value": "Test 3"
      },
   },
   ...
 }]
}

>> Endpoint details: POST /api/dms/objects/{objectId}/contents/file
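
To predict which tags remain after a content update, a client can filter the tag names by the resistant suffix. The following sketch assumes that a resistant tag is recognized by a name ending in ":resistant", as in the example tag contentprocessing:resistant. The class ContentUpdateTags is hypothetical, and the exact matching rule applied by the system is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: predicts which tags are kept when the binary content file is updated. */
class ContentUpdateTags {

    // Assumption: resistant tags are identified by this name suffix.
    static final String RESISTANT_SUFFIX = ":resistant";

    /** Returns the names of the tags that persist through a content update. */
    static List<String> surviving(List<String> tagNames) {
        List<String> kept = new ArrayList<>();
        for (String name : tagNames) {
            if (name.endsWith(RESISTANT_SUFFIX)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(surviving(List.of("analysis", "contentprocessing:resistant")));
        // [contentprocessing:resistant]
    }
}
```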

POST Metadata Update

See the Updating Documents via Core API tutorial for how to prepare a POST update of the metadata. The update replaces all metadata of the object with the values specified in the request body. A property that is missing from the request body is removed from the object, except for system properties that are determined automatically. The entire tag table also has to be specified explicitly, as described in the Tag Behavior in PATCH Metadata Update section. If the system:tags table is not specified in the request body, all tags are removed from the object.

The code block below shows an excerpt of the response to a request whose body specifies no tags. Thus, the analysis tag and, notably, also the contentprocessing:resistant tag are deleted.

Excerpt from the response body
{"objects": [{
  "properties": {
    "system:objectId": {
       "value": "1234567812345678"
     },
    ...
    "system:versionNumber": {
       "value": 4
    },
    ...
    "system:tags": {
       "value": null
     },
     "Name": {
        "value": "Test 2"
      },
   },
   ...
 }]
}

Summary

This tutorial illustrated the tag handling supported by yuuvis® Momentum in the example context of a simplified import management system. For control and administration of the status of individual objects within a process chain, tags are the means of choice.

>> Code examples in GitHub

