Importing Documents via Core API

This tutorial shows how to import documents into a yuuvis® API system via the Core API. Over the course of the tutorial, we develop a short Java application that implements the required HTTP requests. A JavaScript version of this tutorial is also available.

Check out our graphical overview of the architecture which describes the basic use case flow for importing documents.

Requirements

To work through this tutorial, the following is required:

Maven Configuration

Our Java client will submit its requests to the Core API using OkHttp 3.12 by Square, Inc. Therefore, the following block must be added to the Maven dependencies in the pom.xml of the project:

pom.xml
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>3.12.0</version>
</dependency>

Client Configuration

To interact with the yuuvis® API system via the Core API, we use an OkHttp3 client to send HTTP requests and read their responses.

OkHttp3 Client and Variables
String baseUrl = "http://127.0.0.1"; //baseUrl of gateway: "http://<host>:<port>"
String username = "clouduser";
String userpassword = "secret";
String tenant = "default";
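// Encode "username:password" for HTTP Basic authentication; the charset is set explicitly so the result does not depend on the platform default
String auth = java.util.Base64.getEncoder().encodeToString((username + ":" + userpassword).getBytes(java.nio.charset.StandardCharsets.UTF_8));
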
OkHttpClient.Builder builder = new OkHttpClient.Builder();
OkHttpClient client = builder.build();

For more information on setting up the OkHttp3 client with cookie handling, please refer to this Login Tutorial.

Importing a Single Document

To import a document using the Java client, we need the metadata of the document and, optionally, its content. Whether content is optional, required, or not allowed depends on the document type as defined in the schema.

The metadata when importing a document has the following format:

metaData.json
{
    "objects": [{
        "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import"
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    }]
}

In our example, the schema contains an object type document with the Name property, for which content is either optional or required. The content is referenced in the contentStreams list by specifying a cid (multipart content ID). In the example, the cid references the multipart content part with the content ID cid_63apple.
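
If the metadata is to be generated in code rather than read from a file, a minimal sketch could look like the following. The class ImportMetadata and its methods are hypothetical names introduced for illustration, the structure simply mirrors metaData.json above, and the escaping covers only backslashes and double quotes. A real application would more likely use a JSON library.

```java
/** Hypothetical helper: builds import metadata for one document with one content stream. */
final class ImportMetadata {

    /** Escapes backslashes and double quotes only; control characters are not handled in this sketch. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Returns JSON with the same structure as metaData.json, using the given Name value and cid. */
    static String singleDocument(String name, String cid) {
        return "{\n"
             + "  \"objects\": [{\n"
             + "    \"properties\": {\n"
             + "      \"objectTypeId\": { \"value\": \"document\" },\n"
             + "      \"Name\": { \"value\": \"" + escape(name) + "\" }\n"
             + "    },\n"
             + "    \"contentStreams\": [{ \"cid\": \"" + escape(cid) + "\" }]\n"
             + "  }]\n"
             + "}";
    }

    public static void main(String[] args) {
        // Prints metadata equivalent to the metaData.json example
        System.out.println(singleDocument("test import", "cid_63apple"));
    }
}
```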

A content file can have any of a number of file formats. We recommend specifying the format correctly in the metadata and in the multipart request. If the content type is not specified, it is determined automatically during content analysis. If the content type cannot be determined unambiguously, or if content analysis is switched off, the content type application/octet-stream is used.

In our example we have chosen a text file (Content-Type: text/plain).
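
On the client side, the content type for the multipart request could be chosen along the following lines. This is only a sketch: it guesses from the file name using the JDK's built-in table and falls back to application/octet-stream, mirroring the fallback described above. It does not reproduce the server's content analysis, and the class name ContentTypes is hypothetical.

```java
import java.net.URLConnection;

/** Hypothetical helper: picks a Content-Type for a content file based on its name. */
final class ContentTypes {

    static final String FALLBACK = "application/octet-stream";

    /** Returns the JDK's guess for the file name, or the generic binary type if there is none. */
    static String forFileName(String fileName) {
        String guessed = URLConnection.guessContentTypeFromName(fileName);
        return guessed != null ? guessed : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(forFileName("test.txt"));
    }
}
```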

Request

For an import, a POST request must be sent to the endpoint /api/dms/objects with a multipart body consisting of metadata and, if applicable, content of the object to be imported. To construct such a request, we use a MultipartBody.Builder(), which allows us to build the request body from several form parts as follows.

Building the Multipart Body with OkHttp3
RequestBody requestBody = new MultipartBody.Builder()
        .setType(MultipartBody.FORM)
        .addFormDataPart("data", "metaData.json",
                RequestBody.create(MediaType.parse("application/json; charset=utf-8"),
                        new File("./src/main/resources/metaData.json")))
        .addFormDataPart("cid_63apple", "test.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test.txt")))
        .build();
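
To make the structure of such a multipart body more tangible, the following sketch assembles a multipart/form-data body by hand using only the JDK. It is an approximation for illustration, not what OkHttp emits byte for byte: OkHttp generates its own boundary and may add further part headers such as Content-Length. The class name MultipartSketch and the boundary value used in main are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: hand-built multipart/form-data body, for illustration only. */
final class MultipartSketch {

    private static final String CRLF = "\r\n";

    private final String boundary;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    MultipartSketch(String boundary) {
        this.boundary = boundary;
    }

    /** Appends one form-data part; for content parts, name is the cid from the metadata. */
    MultipartSketch addPart(String name, String fileName, String contentType, byte[] content) {
        write("--" + boundary + CRLF);
        write("Content-Disposition: form-data; name=\"" + name + "\"; filename=\"" + fileName + "\"" + CRLF);
        write("Content-Type: " + contentType + CRLF + CRLF);
        out.write(content, 0, content.length);
        write(CRLF);
        return this;
    }

    /** Returns all parts followed by the closing boundary line. */
    byte[] build() {
        byte[] closing = ("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8);
        byte[] parts = out.toByteArray();
        byte[] body = Arrays.copyOf(parts, parts.length + closing.length);
        System.arraycopy(closing, 0, body, parts.length, closing.length);
        return body;
    }

    private void write(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] body = new MultipartSketch("example-boundary")
                .addPart("data", "metaData.json", "application/json; charset=utf-8",
                        "{}".getBytes(StandardCharsets.UTF_8))
                .addPart("cid_63apple", "test.txt", "text/plain; charset=utf-8",
                        "hello".getBytes(StandardCharsets.UTF_8))
                .build();
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The part named data carries the metadata, and the part named cid_63apple carries the content that the metadata references through its cid.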

We use a Request.Builder() to create a request object from the multipart body, the headers, and the URL. Two headers are required for the import because they identify the user accessing the endpoint: an Authorization header that contains the Base64-encoded credentials of the user, and an X-ID-TENANT-NAME header that contains the tenant name of the user. If the OkHttp client in use supports cookie handling, the Authorization header can be omitted after the client's first request, since the login information is stored in a session cookie (see also Login Tutorial).

Building a POST Request for an Import
Request request = new Request.Builder()
        .header("Authorization", "Basic "+ auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects") //baseUrl: "http://<host>:<port>"
        .post(requestBody)
        .build();
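
The value of the Authorization header can be sketched in isolation as follows. The class name BasicAuth is hypothetical; the encoding itself is the standard HTTP Basic scheme, in which "username:password" is Base64-encoded and prefixed with "Basic ". Base64 is an encoding, not encryption, so the credentials can be recovered by anyone who can read the request.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: builds and decodes HTTP Basic Authorization header values. */
final class BasicAuth {

    /** Returns the header value "Basic <base64(username:password)>". */
    static String headerValue(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a header value back to "username:password", showing that the encoding is reversible. */
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = headerValue("clouduser", "secret");
        System.out.println(header);
        System.out.println(decode(header));
    }
}
```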

Response

To print the API response to the console, we obtain the associated response object by executing the request. Please note that the OkHttpClient can throw an IOException when the request is executed.

Handling any IOException
try (Response response = client.newCall(request).execute()) { // try-with-resources closes the response body
    System.out.println(response.body().string()); // print to console
} catch (IOException e) {
    e.printStackTrace();
}
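
The snippet above prints the body regardless of whether the import succeeded. A caller that wants to treat failures explicitly could check the status code first, as in the following sketch. It assumes that failed imports are reported with a non-2xx HTTP status, which is the usual HTTP convention but is not spelled out in this tutorial. The class name ResponseCheck is hypothetical.

```java
import java.io.IOException;

/** Hypothetical helper: turns a non-2xx status into an exception. */
final class ResponseCheck {

    /** Returns the body unchanged for 2xx status codes and throws an IOException otherwise. */
    static String requireSuccess(int statusCode, String body) throws IOException {
        if (statusCode < 200 || statusCode > 299) {
            throw new IOException("Import failed with HTTP status " + statusCode + ": " + body);
        }
        return body;
    }
}
```

With OkHttp, the two arguments would come from response.code() and response.body().string().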

Importing Multiple Documents in Batch Mode

If multiple documents are to be imported at the same time, this can be done using the same endpoint of the Core API. Instead of a single object, the objects list consists of several metadata records. The individual content files of the objects then each require a unique cid as the name of the form-data parts in the multipart request. This cid is referenced in the associated metadata record in the contentStreams list, which allows metadata to be uniquely assigned to content.

metaDataBatch.json
{
    "objects": [{
        "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import object 1"
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    },
    {
      "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import object 2"
            }
        },
        "contentStreams": [{
            "cid": "cid_64apple"
        }]
    }]
}
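
Because every content part needs a unique cid that is also referenced in the metadata, it can be worth checking this before sending a batch. The following sketch performs such a check on plain collections; the class name CidCheck and the wording of the messages are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: checks that metadata cids are unique and have a matching content part. */
final class CidCheck {

    /** Returns a list of problems; an empty list means the cids and part names are consistent. */
    static List<String> validate(List<String> metadataCids, Set<String> partNames) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String cid : metadataCids) {
            if (!seen.add(cid)) {
                problems.add("duplicate cid: " + cid);
            }
            if (!partNames.contains(cid)) {
                problems.add("no content part for cid: " + cid);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> cids = Arrays.asList("cid_63apple", "cid_64apple");
        Set<String> parts = new HashSet<>(Arrays.asList("cid_63apple", "cid_64apple"));
        System.out.println(validate(cids, parts)); // empty list: batch is consistent
    }
}
```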

Request

In the multipart body, we create a separate FormDataPart for the content of each object, whose first parameter is the content ID (cid).

Building a POST Request for a Batch Import
RequestBody batchImportRequestBody = new MultipartBody.Builder()
        .setType(MultipartBody.FORM)
        .addFormDataPart("data", "metaDataBatch.json",
                RequestBody.create(MediaType.parse("application/json; charset=utf-8"),
                        new File("./src/main/resources/metaDataBatch.json")))
        .addFormDataPart("cid_63apple", "test1.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test1.txt")))
        .addFormDataPart("cid_64apple", "test2.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test2.txt")))
        .build();

The request object is assembled in the same way as for the import of a single document.

Response

If successful, the response object contains a multi-element objects list that contains the metadata records of all objects imported in this batch import.

Summary

In this tutorial, an OkHttp3 client was used to import documents via the Core API, both individually and in batch mode.

A complete code example can be found in this git repository.

