Importing Documents via Core API
This tutorial shows how documents can be imported into a yuuvis® API system via the Core API. During this tutorial, a short Java application will be developed that implements the HTTP requests for importing documents. We additionally provide a JavaScript version of this tutorial.
Check out our graphical overview of the architecture which describes the basic use case flow for importing documents.
Requirements
To work through this tutorial, the following is required:
- An installed yuuvis® API system (see Installation Guide)
- A user with at least read permissions on a document type in the system (see tutorial for permissions)
- Simple Maven project
Maven Configuration
Our Java client will submit its requests to the Core API using OkHttp 3.12 by Square, Inc. Therefore, the following block must be added to the Maven dependencies in the pom.xml of the project:
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>3.12.0</version>
</dependency>
Client Configuration
To interact with the yuuvis® API system via the Core API, we use an OkHttp3 client to send HTTP requests and read their responses.
String baseUrl = "http://127.0.0.1"; // base URL of the gateway: "http://<host>:<port>"
String username = "clouduser";
String userpassword = "secret";
String tenant = "default";
// Base64-encoded "username:password" pair for HTTP Basic authentication
String auth = java.util.Base64.getEncoder().encodeToString(
        (username + ":" + userpassword).getBytes(java.nio.charset.StandardCharsets.UTF_8));

OkHttpClient.Builder builder = new OkHttpClient.Builder();
OkHttpClient client = builder.build();
For more information on setting up the OkHttp3 client with cookie handling, please refer to this Login Tutorial.
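The way the Authorization value is derived from the credentials can be sketched in isolation, using only the Java standard library. The class name BasicAuthSketch and the method basicAuthHeader are hypothetical names chosen for this illustration. The sketch assumes that the credentials are encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthSketch {

    // Builds the value of the Authorization header for HTTP Basic authentication.
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        // Use an explicit charset so the result does not depend on the platform default.
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Example credentials taken from this tutorial.
        System.out.println(basicAuthHeader("clouduser", "secret"));
    }
}
```

Base64 is an encoding, not encryption, so the header should only be sent over a connection you trust.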
Importing a Single Document
To import a document using the Java client, we need the metadata and, optionally, the content of the document. Depending on the schema definition, a document type may allow content, require it, or forbid it.
The metadata when importing a document has the following format:
{
    "objects": [{
        "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import"
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    }]
}
In our example, the schema contains an object type document with the Name property, which may or must have content. The content is referenced in the contentStreams list by specifying a cid (multipart content ID). In the example, the cid references the multipart part with the content ID cid_63apple.
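If the metadata is not read from a file, it can also be assembled in code. The following is a minimal sketch that builds the single-object structure shown above as a compact JSON string. The class MetadataSketch and its methods are hypothetical helpers, and the hand-written escaping is only meant for illustration; a real project would more likely use a JSON library.

```java
class MetadataSketch {

    // Escapes the characters that must not appear unescaped inside a JSON string.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters are written as unicode escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the metadata for one object with a Name property and one content reference.
    static String singleObjectMetadata(String objectTypeId, String name, String cid) {
        return "{\"objects\":[{"
                + "\"properties\":{"
                + "\"objectTypeId\":{\"value\":\"" + escapeJson(objectTypeId) + "\"},"
                + "\"Name\":{\"value\":\"" + escapeJson(name) + "\"}},"
                + "\"contentStreams\":[{\"cid\":\"" + escapeJson(cid) + "\"}]"
                + "}]}";
    }

    public static void main(String[] args) {
        System.out.println(singleObjectMetadata("document", "test import", "cid_63apple"));
    }
}
```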
A content file can be in different file formats. It is recommended to specify the format correctly in the metadata and in the multipart request. If the content type is not specified, it is determined automatically during content analysis. If the content type cannot be determined unambiguously or content analysis is switched off, the content type application/octet-stream is used.
In our example, we have chosen a text file (Content-Type: text/plain).
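On the client side, the content type for a multipart part can be guessed from the file name before the request is built. The following sketch uses the JDK's built-in file name map and falls back to application/octet-stream when the extension is unknown. It is only a client-side guess and is unrelated to the content analysis performed by the system; the class ContentTypeSketch and the method contentTypeFor are hypothetical names.

```java
import java.net.URLConnection;

class ContentTypeSketch {

    static final String FALLBACK = "application/octet-stream";

    // Guesses a MIME type from the file extension; returns the fallback if none is known.
    static String contentTypeFor(String fileName) {
        String guessed = URLConnection.guessContentTypeFromName(fileName);
        return guessed != null ? guessed : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("test.txt"));
        System.out.println(contentTypeFor("data.unknownext123"));
    }
}
```

The result can then be passed to MediaType.parse when the corresponding part is created.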
Request
For an import, a POST request must be sent to the endpoint /api/dms/objects with a multipart body consisting of metadata and, if applicable, content of the object to be imported. To construct such a request, we use a MultipartBody.Builder(), which allows us to build the request body from several form parts as follows.
RequestBody requestBody = new MultipartBody.Builder()
        .setType(MultipartBody.FORM)
        // metadata part, sent under the form name "data"
        .addFormDataPart("data", "metaData.json",
                RequestBody.create(MediaType.parse("application/json; charset=utf-8"),
                        new File("./src/main/resources/metaData.json")))
        // content part, named after the cid referenced in the metadata
        .addFormDataPart("cid_63apple", "test.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test.txt")))
        .build();
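To make the structure of such a multipart body more tangible, the following sketch assembles a comparable multipart/form-data body by hand, using only the Java standard library. It approximates the wire format for illustration and is not OkHttp's exact output: the boundary value, the class MultipartWireSketch, and its helper methods are hypothetical, and OkHttp generates its own boundary and may add further part headers.

```java
class MultipartWireSketch {

    static final String CRLF = "\r\n";

    // Renders one form-data part with a file name and a textual payload.
    static String part(String boundary, String name, String fileName,
                       String contentType, String content) {
        return "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + name
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: " + contentType + CRLF
                + CRLF
                + content + CRLF;
    }

    // Concatenates the parts and appends the closing boundary line.
    static String body(String boundary, String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.append("--").append(boundary).append("--").append(CRLF).toString();
    }

    public static void main(String[] args) {
        String boundary = "example-boundary"; // hypothetical value
        String metadata = "{\"objects\":[{\"properties\":{"
                + "\"objectTypeId\":{\"value\":\"document\"},"
                + "\"Name\":{\"value\":\"test import\"}},"
                + "\"contentStreams\":[{\"cid\":\"cid_63apple\"}]}]}";
        String rendered = body(boundary,
                part(boundary, "data", "metaData.json",
                        "application/json; charset=utf-8", metadata),
                part(boundary, "cid_63apple", "test.txt",
                        "text/plain; charset=utf-8", "hello"));
        System.out.println(rendered);
    }
}
```

The point to take away is that the name of the content part (cid_63apple) is the same string as the cid in the metadata part, which is how the two are linked.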
We use a Request.Builder() to create a request object with the multipart body, headers, and the URL. Two headers are necessary for the import because they identify the user accessing the endpoint: an Authorization header that contains the Base64-encoded credentials of the user, and an X-ID-TENANT-NAME header that contains the tenant name of the user. If the OkHttp client in use supports cookie handling, the Authorization header can be omitted after the client's first request, since the logon information is stored in a session cookie (see also Login Tutorial).
Request request = new Request.Builder()
        .header("Authorization", "Basic " + auth)
        .header("X-ID-TENANT-NAME", tenant)
        .url(baseUrl + "/api/dms/objects") // baseUrl: "http://<host>:<port>"
        .post(requestBody)
        .build();
Response
To print the response of the API to the console, we execute the request and read the body of the resulting response object. Please note that the OkHttpClient can throw an IOException when the request is executed.
// try-with-resources closes the response after the body has been read
try (Response response = client.newCall(request).execute()) {
    System.out.println(response.body().string()); // print to console
} catch (IOException e) {
    e.printStackTrace();
}
Importing Multiple Documents in Batch Mode
If multiple documents are to be imported at the same time, this can be done using the same endpoint of the Core API. Instead of a single object, the objects list consists of several metadata records. The individual content files of the objects then each require a unique cid as the name of the form-data parts in the multipart request. This cid is referenced in the associated metadata record in the contentStreams list, which allows metadata to be uniquely assigned to content.
{
    "objects": [{
        "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import object 1"
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    }, {
        "properties": {
            "objectTypeId": {
                "value": "document"
            },
            "Name": {
                "value": "test import object 2"
            }
        },
        "contentStreams": [{
            "cid": "cid_64apple"
        }]
    }]
}
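Since every cid in the metadata has to match the name of exactly one form-data part, a small consistency check before sending the request can catch mismatches early. The following sketch extracts the cid values from a metadata string and reports those without a matching part name. The class CidCheckSketch and its methods are hypothetical, and the regular expression is a shortcut for illustration; a JSON parser would be more robust.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CidCheckSketch {

    // Matches "cid": "<value>" pairs; a regex shortcut, not a full JSON parser.
    private static final Pattern CID =
            Pattern.compile("\"cid\"\\s*:\\s*\"([^\"]+)\"");

    // Returns all cid values in the order in which they appear in the metadata.
    static List<String> cidsIn(String metadataJson) {
        List<String> cids = new ArrayList<>();
        Matcher m = CID.matcher(metadataJson);
        while (m.find()) {
            cids.add(m.group(1));
        }
        return cids;
    }

    // Returns the cids that are referenced in the metadata but have no matching part name.
    static List<String> missingParts(String metadataJson, Set<String> partNames) {
        List<String> missing = new ArrayList<>();
        for (String cid : cidsIn(metadataJson)) {
            if (!partNames.contains(cid)) {
                missing.add(cid);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String metadata = "{\"objects\":["
                + "{\"contentStreams\":[{\"cid\":\"cid_63apple\"}]},"
                + "{\"contentStreams\":[{\"cid\":\"cid_64apple\"}]}]}";
        // The part for cid_64apple was forgotten in this example.
        Set<String> partNames = new HashSet<>(Arrays.asList("data", "cid_63apple"));
        System.out.println(missingParts(metadata, partNames)); // prints [cid_64apple]
    }
}
```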
Request
In the multipart body, we create a separate FormDataPart for the content of each object, whose first parameter is the content ID (cid).
RequestBody batchImportRequestBody = new MultipartBody.Builder()
        .setType(MultipartBody.FORM)
        // metadata part containing the records of all objects
        .addFormDataPart("data", "metaDataBatch.json",
                RequestBody.create(MediaType.parse("application/json; charset=utf-8"),
                        new File("./src/main/resources/metaDataBatch.json")))
        // one content part per object, each named after its cid
        .addFormDataPart("cid_63apple", "test1.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test1.txt")))
        .addFormDataPart("cid_64apple", "test2.txt",
                RequestBody.create(MediaType.parse("text/plain; charset=utf-8"),
                        new File("./src/main/resources/test2.txt")))
        .build();
The request object is assembled in the same way as for the import of a single document.
Response
If successful, the response object contains a multi-element objects list that contains the metadata records of all objects imported in this batch import.
Summary
In this tutorial, an OkHttpClient with cookie handling was used to import documents via the Core API, both individually and in batch mode.
A complete code example can be found in this git repository.