Accessing Binary Content during an Import

Access binary content files via webhooks during the import process.

Introduction

When an object is imported into yuuvis® Momentum, a binary content file can be assigned in addition to the object's metadata. The REPOSITORY service manages the storage of the binary content file. Afterwards, the metadata are stored in a separate database and indexed for search.
>> Basic Use Case Flows

As soon as the object import process is finished, the binary content file can be accessed. In particular, it can be analyzed in order to set content-related metadata properties for the corresponding object. Such a metadata update is possible, for example, via POST or PATCH /api/dms/objects/{objectId}, and it triggers the creation of a new object version. In some use cases, however, metadata properties that depend on an analysis of the binary content file have to be set already in the first version of the imported object. For this purpose, the binary content file can be accessed from a webhook that is called after the content has been stored and before the metadata are stored. This tutorial describes an example solution that uses an internal endpoint of the REPOSITORY service.

Setting up the Webhook Service

In the process chain of an object import, the webhook has to be called after the binary content file has been stored and before the metadata are stored. The webhook type dms.request.objects.upsert.database-before is suitable for this purpose.

The Preprocessing Metadata using Webhooks tutorial provides a general example of how to configure and set up your own webhook service. The controller listing below provides a more specific example controller class for the webhook service of this use case. For each object imported with a binary content file, it calculates the MD5 digest of the corresponding content. Because an internal endpoint of the REPOSITORY service has to be called, the webhook service must run within the yuuvis® Momentum Kubernetes cluster.

As defined by the @RequestMapping and @PostMapping annotations (lines 2 and 5 of the controller listing), the webhook service is available via the URL /api/process/checkcontent. This endpoint accepts the import request body, which contains the metadata of all objects, together with the request headers (line 6).

In a batch import in particular, it is necessary to iterate over the individual objects (line 13) because the repository can retrieve the content of only one object at a time. For each individual object, the webhook checks whether a binary content file is assigned (lines 17 and 19). If so, the binary content file is retrieved from the repository (starting in line 28) and its MD5 digest is calculated (line 47). The value is printed to the console (line 55).
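The traversal of the import request body can be tried out in isolation, without Spring and without a running repository. The following is a minimal sketch using only JDK classes. It assumes the body structure that the controller listing relies on (an "objects" list whose entries carry "properties" and, optionally, "contentStreams"). The class name ImportBodyTraversalSketch and the method objectIdsWithContent are hypothetical, and getMap is only a guess at the helper that the controller listing calls but does not show.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the yuuvis Momentum API.
final class ImportBodyTraversalSketch
{
    // Returns the nested JSON object stored under the key, or an empty map if there is none.
    @SuppressWarnings("unchecked")
    static Map<String, Object> getMap(Map<String, Object> parent, String key)
    {
        Object child = parent.get(key);
        if (child instanceof Map)
        {
            return (Map<String, Object>) child;
        }
        return Collections.emptyMap();
    }

    // An object has content if its "contentStreams" entry is a non-empty list.
    static boolean hasContent(Map<String, Object> apiObject)
    {
        Object streams = apiObject.get("contentStreams");
        return streams instanceof List && !((List<?>) streams).isEmpty();
    }

    // Collects the system:objectId of every object in the body that has content assigned.
    @SuppressWarnings("unchecked")
    static List<String> objectIdsWithContent(Map<String, Object> importBody)
    {
        List<String> objectIds = new ArrayList<>();
        Object objects = importBody.get("objects");
        if (!(objects instanceof List))
        {
            return objectIds;
        }
        for (Map<String, Object> apiObject : (List<Map<String, Object>>) objects)
        {
            Object objectId = getMap(getMap(apiObject, "properties"), "system:objectId").get("value");
            if (hasContent(apiObject) && objectId instanceof String)
            {
                objectIds.add((String) objectId);
            }
        }
        return objectIds;
    }
}
```

Unlike the direct casts in the controller listing, this variant skips objects with a missing or malformed properties section instead of failing with an exception. Whether that leniency is desirable depends on the use case.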

Depending on the MD5 digest value, it would be possible to manipulate the individual object's metadata. However, in this example, the metadata remain unchanged. The objects from the request body are returned without any manipulation (line 66).
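If the digest should be written into the metadata instead of only being logged, the object map can be modified before it is returned. The following is a minimal sketch of such a manipulation, independent of the controller listing. The class name DigestPropertySketch and the property name contentMd5 are hypothetical. A real property would presumably have to be defined in the schema for the imported object type. The sketch assumes that properties are represented as { "value": ... } entries, as the access to system:objectId in the controller listing suggests, and that the maps are mutable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the yuuvis Momentum API.
final class DigestPropertySketch
{
    // Hypothetical property name, chosen for illustration only.
    static final String DIGEST_PROPERTY = "contentMd5";

    // Writes the digest into the "properties" section of a single object, creating the section if it is missing.
    @SuppressWarnings("unchecked")
    static void setDigestProperty(Map<String, Object> apiObject, String md5Digest)
    {
        Map<String, Object> properties = (Map<String, Object>) apiObject
            .computeIfAbsent("properties", key -> new LinkedHashMap<String, Object>());

        // Each property is assumed to be an object with a "value" entry.
        Map<String, Object> property = new LinkedHashMap<>();
        property.put("value", md5Digest);
        properties.put(DIGEST_PROPERTY, property);
    }
}
```

In the controller, such a call would fit right after the digest has been calculated for an object, so that the returned request body already contains the new value.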

Calculate MD5-Digest of binary content before import
@RestController
@RequestMapping("/api")
public class WebhookRestController
{    
    @PostMapping(value = "/process/checkcontent", produces = "application/json;charset=UTF-8", consumes = "application/json")
    public ResponseEntity<?> checkContent(@RequestBody final Map<String, Object> dmsApiObList, @RequestHeader HttpHeaders incomingHeaders) throws Exception
    {
        try
        {
            String authorization = incomingHeaders.getFirst(HttpHeaders.AUTHORIZATION);
            List<Map<String, Object>> apiObjectList = (List<Map<String, Object>>)dmsApiObList.get("objects");

            for (Map<String, Object> apiObject : apiObjectList)
            {
                String objectId = (String)getMap(getMap(apiObject, "properties"), "system:objectId").get("value");

                boolean hasContent = apiObject.get("contentStreams") != null && !((List<?>)apiObject.get("contentStreams")).isEmpty();

                if (hasContent)
                {
                    Map<String, Object> requestObject = new LinkedHashMap<>();
                    LinkedList<Map<String, Object>> requestList = new LinkedList<>();
                    requestList.add(apiObject);
                    requestObject.put("objects", requestList);

                    // @formatter:off
                    String md5Digest = 
                        restTemplate.execute(
                            "http://repository/api/dms/objects/" + objectId, 
                            HttpMethod.POST, 
                            (ClientHttpRequest requestCallback) -> {
                              if (StringUtils.hasLength(authorization))
                              {
                                requestCallback.getHeaders().add(HttpHeaders.AUTHORIZATION, authorization);
                              }
                              requestCallback.getHeaders().setContentType(MediaType.APPLICATION_JSON);
                              requestCallback.getBody().write(this.objectMapper.writeValueAsString(requestObject).getBytes(StandardCharsets.UTF_8));
                            }, 
                            new ResponseExtractor<String>()
                            {
                                @Override
                                public String extractData(ClientHttpResponse response) throws IOException
                                {
                                    if (response.getStatusCode().is2xxSuccessful())
                                    {
                                        // calculate MD5 Hash
                                        return DigestUtils.md5DigestAsHex(response.getBody());
                                    }
                                    throw new IllegalStateException("error in getting content: " + response.getRawStatusCode() + " " + response.getStatusText());
                                }
                            }   
                        );
                    // @formatter:on

                    System.out.println("for objectId[" + objectId + "] the md5-digest of content is[" + md5Digest + "]");
                }
            }
        }
        catch (Exception e)
        {

            return new ResponseEntity<>(e.getMessage(), HttpStatus.UNPROCESSABLE_ENTITY);
        }

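        // return the input objects unchanged
        return new ResponseEntity<Map<String, Object>>(dmsApiObList, HttpStatus.OK);
    }

    // The members below are not shown in the original listing. They are assumed here so that the class is complete.
    private final RestTemplate restTemplate = new RestTemplate();
    private final ObjectMapper objectMapper = new ObjectMapper();

    // Returns the nested JSON object stored under the given key.
    @SuppressWarnings("unchecked")
    private Map<String, Object> getMap(Map<String, Object> parent, String key)
    {
        return (Map<String, Object>)parent.get(key);
    }
}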
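The controller delegates the hashing to DigestUtils.md5DigestAsHex from Spring. To illustrate what happens in that step, and to test the digest logic without a repository, the calculation can be sketched with JDK classes only. The class name Md5StreamSketch and the method md5Hex are hypothetical. The sketch reads the stream in chunks so that large content files do not have to be held in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of the yuuvis Momentum API.
final class Md5StreamSketch
{
    // Reads the stream to its end and returns the MD5 digest as a lowercase hex string.
    // The caller remains responsible for closing the stream.
    static String md5Hex(InputStream content)
    {
        try
        {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int read;
            while ((read = content.read(buffer)) != -1)
            {
                digest.update(buffer, 0, read);
            }

            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest())
            {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException("MD5 is not available in this Java runtime", e);
        }
    }
}
```

A unit test could pass a ByteArrayInputStream with known content and compare the result against a digest calculated with an external tool. Note that MD5 is suitable for detecting duplicates or accidental corruption, but not for security-relevant integrity checks.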