This tutorial shows how documents can be imported into a yuuvis® API system via the Core API. This tutorial is an extension of the Java import tutorial, applying its concepts to a different popular programming language, JavaScript.
Check out our graphical overview of the architecture which describes the basic use case flow for importing documents.
Requirements
To work through this tutorial, the following is required:
- A running yuuvis® API system (see minikube setup, for example)
- A user with at least read permissions on a document type in the system (see tutorial for permissions)
Project Requirements
In this tutorial, we are going to implement the yuuvis® API import HTTP POST request in JavaScript. Our goal is to import documents using the yuuvis® API. To access the yuuvis® API, make sure you provide credentials and a valid base URL to your script like this:
    var user = "clouduser";
    var password = "cloudsecret";
    var tenant = "default";
    var baseUrl = "http://127.0.0.1"; // base URL of the gateway: "http://<host>:<port>"
    var auth = "Basic Y2xvdWR1c2VyOmNsb3Vkc2VjcmV0"; // Base64 encoding of "clouduser:cloudsecret"
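The value of auth is the word "Basic" followed by the Base64 encoding of "user:password". Rather than pasting a precomputed string, it can be derived from the credentials. The following is a minimal sketch for node.js; buildBasicAuthHeader is a hypothetical helper name that is not part of the yuuvis® API.

```javascript
// Hypothetical helper (not part of the yuuvis® API): builds the value of the
// HTTP "Authorization" header for Basic authentication.
function buildBasicAuthHeader(user, password) {
  // Buffer is a node.js global; it encodes the "user:password" pair as Base64.
  const token = Buffer.from(user + ":" + password, "utf8").toString("base64");
  return "Basic " + token;
}

const authHeader = buildBasicAuthHeader("clouduser", "cloudsecret");
console.log(authHeader); // prints "Basic Y2xvdWR1c2VyOmNsb3Vkc2VjcmV0"
```

Take care not to encode a trailing line break along with the credentials, as that changes the resulting string.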
Requirements for node.js
We figure most JavaScript developers are familiar with some type of application development framework, like node.js, and use it to develop backend applications, which is why we want to provide examples specifically for that context. Using node.js allows us to use two libraries for the HTTP requests and the file stream reading: request and fs (file system), respectively. If you have node.js installed, you can install request using npm i request (fs is part of the node.js core installation) and then require both in your project.
const request = require('request'); const fs = require('fs');
Requirements for Browser Based Application
As an alternative, we're also providing a solution to be put within the <script></script> portion of an HTML page, acting as an example for a browser-driven application. This may not be the most elegant or effective solution, but it will work on most if not all systems running modern versions of browsers. It also circumvents the step of handpicking libraries in node.js, as we can use the standard Web APIs available to all browsers: FileReader, Blob, FormData and of course XMLHttpRequest. These should be available to all modern browsers except Opera Mini, with slight limitations in IE.
Importing a Single Document
To import a single document, we need to send an HTTP POST request containing the JSON metadata and the content of the document to the /api/dms/objects endpoint.
Form Data Assembly
Since documents are imported using a multipart HTTP request, the data transmitted to our API needs to be put into a multipart data form. This form data needs to be built to meet API expectations.
For our metadata, we create a JSON object for our document that conforms to the API metadata JSON schema. Every metadata JSON consists of a list of such objects, each comprising multiple properties and a content stream array. To find out more about the schema your metadata needs to conform to, read the schema management tutorial or visit <baseURI>/api/dms/schema/native to retrieve the tenant-specific schema definition that applies to you.
The following JSON serves as an example of what that metadata looks like for a simple text document.
{ "objects": [{ "properties": { "system:objectTypeId": { "value": "document" }, "Name": { "value": "test import" } }, "contentStreams": [{ "cid": "cid_63apple", "fileName": "test.txt", "mimeType": "text/plain" }] }] }
Creating this metadata using JavaScript is really simple. Given we have access to a document's title, filename, content stream ID and content type, we can write a function that produces the JSON representation for that single document. Note that even though this metadata describes only one document, the root node of the JSON is still the "objects" array that encloses our document's metadata.
    // Builds the metadata object for a single document.
    function createDocumentMetadata(doc_title, doc_fileName, doc_cid, doc_contentType) {
        return {
            "objects": [{
                "properties": {
                    "system:objectTypeId": { "value": "document" },
                    "Name": { "value": doc_title }
                },
                "contentStreams": [{
                    "mimeType": doc_contentType,
                    "fileName": doc_fileName,
                    "cid": doc_cid
                }]
            }]
        };
    }
Putting together metadata and content into a form data object is again a matter of meeting API specifications. Essentially, we create a JavaScript object that (for a single document import) has two keys: 'data', whose value stores the metadata JSON as a string, and '<doc_cid>' (the document's content stream ID), whose value stores a stream representation of the content.
The following shows an example of such a form data object.
    {
        data: {
            value: '{ "objects":[{ "properties": { "system:objectTypeId": { "value": "document" }, "Name": { "value": "test" } }, "contentStreams": [{ "mimeType": "text/plain", "fileName": "test.txt", "cid": "cid_63apple" }] }] }',
            options: { contentType: 'application/json' }
        },
        cid_63apple: {
            value: [Object], // the content stream goes here
            options: { contentType: "text/plain", filename: "test.txt" }
        }
    }
It's here where the reason for inserting the content stream ID of the document into its metadata becomes apparent: the contentStreams portion of the metadata designates which form data part contains the content associated with the object.
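Because the link between metadata and content is just a matching key, a mismatch is easy to introduce by accident. The following is a minimal sketch of a check that can be run before sending; findMissingContentParts is a hypothetical helper, and the sample objects only mimic the structure described above.

```javascript
// Hypothetical helper: lists every "cid" referenced in the metadata that has
// no form part of the same name in the form data object.
function findMissingContentParts(metadata, formData) {
  const missing = [];
  for (const object of metadata.objects) {
    // Objects without content have no contentStreams array at all.
    for (const stream of object.contentStreams || []) {
      if (!Object.prototype.hasOwnProperty.call(formData, stream.cid)) {
        missing.push(stream.cid);
      }
    }
  }
  return missing;
}

const sampleMetadata = {
  objects: [{
    properties: { "system:objectTypeId": { value: "document" } },
    contentStreams: [{ cid: "cid_63apple", fileName: "test.txt", mimeType: "text/plain" }]
  }]
};
// This form data lacks the content part, so the check reports its cid.
const incompleteFormData = { data: { value: JSON.stringify(sampleMetadata) } };
console.log(findMissingContentParts(sampleMetadata, incompleteFormData)); // prints [ 'cid_63apple' ]
```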
Creating the form data in JavaScript works something like this:
    // Builds the form data object for a single-document import.
    function createImportFormdata(doc_title, doc_fileName, doc_cid, doc_contentType) {
        var formData = {};
        // Metadata part: the stringified metadata JSON.
        formData['data'] = {
            value: JSON.stringify(createDocumentMetadata(doc_title, doc_fileName, doc_cid, doc_contentType)),
            options: { contentType: 'application/json' }
        };
        // Content part: keyed by the content stream ID used in the metadata.
        formData[doc_cid] = {
            value: fs.createReadStream(doc_fileName),
            options: { contentType: doc_contentType, filename: doc_fileName }
        };
        return formData;
    }
Take care that the value of the "data" form part is stringified and that the key of the content form part is the value of the parameter doc_cid.
Form Data in Browser-Based Web Applications
Arrays and JS objects don't always do the trick for multipart requests, as they tend to confuse content negotiation. In a browser application, use the Blob data type to define the form parts as binary files that each declare their own content type.
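    // singleMetadata: the metadata object, e.g. the result of createDocumentMetadata(...)
    // file: the content to upload, e.g. a File object taken from a file input element
    var singleFormData = new FormData();
    var metadataBlob = new Blob([JSON.stringify(singleMetadata)], { type: "application/json" });
    var contentBlob = new Blob([file], { type: "text/plain" });
    singleFormData.append('data', metadataBlob, "metadata");
    singleFormData.append('cid_63apple', contentBlob, "contentdata");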
Assembling the Request
Now that we have created JSON representations for our content and metadata, we can assemble our multipart HTTP request object. We are going to go over some simple methods to accomplish this and give examples for both browser-based front-end and node.js-based back-end applications.
Multipart HTTP Request using XMLHttpRequest or Ajax
First things first, let's create our request using a plain, old-fashioned XMLHttpRequest.
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function () {
        // readyState 4 means the request has completed.
        if (xhr.readyState === 4) {
            if (xhr.status === 200) {
                alert(xhr.responseText);
            }
        }
    };
    xhr.open("POST", baseUrl + "/api/dms/objects");
    xhr.setRequestHeader("X-ID-TENANT-NAME", tenant);
    xhr.setRequestHeader("Authorization", auth);
    xhr.send(singleFormData);
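Notice that we do not state a distinct content type for the multipart body, only for the form data parts as declared in the FormData object. The browser generates the multipart/form-data content type header, including the part boundary, on its own.
A jQuery Ajax call works in a similar fashion: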
    $.ajax({
        type: "POST",
        url: baseUrl + "/api/dms/objects",
        data: singleFormData,
        processData: false,
        contentType: false,
        cache: false,
        beforeSend: function (request) {
            request.setRequestHeader("Authorization", auth);
            request.setRequestHeader("X-ID-TENANT-NAME", tenant);
        },
        complete: function (result) {
            alert(result.responseText);
        }
    });
Again, we avoid setting a content type for the entire multipart body by setting "contentType" to false. We also set "processData" to false to keep jQuery from converting the request body into a query string.
Multipart HTTP Request in node.js
Taking advantage of the simple request structure of the request library, all we need to do is create a JavaScript object that describes the request, with the form data as its body.
The request object defines the HTTP method, target URI, authentication credentials, headers and of course the form data itself. For the payload of the request to be recognized as multipart, the 'Content-Type' header of the request has to be "multipart/form-data".
    // Builds the options object for a single-document import request.
    function createSingleDocumentMultipartRequest(doc_title, doc_fileName, doc_cid, doc_contentType) {
        return {
            method: 'POST',
            uri: baseUrl + '/api/dms/objects',
            headers: {
                'Accept': 'application/json',
                'Content-Type': 'multipart/form-data',
                'X-ID-TENANT-NAME': tenant
            },
            auth: {
                user: user,
                pass: password
            },
            formData: createImportFormdata(doc_title, doc_fileName, doc_cid, doc_contentType)
        };
    }
To send our HTTP request to its target URI, we invoke the post method of the request API. The callback function we provide enables us to work with the response.
    function executeRequest(request_object) {
        request.post(request_object, function callback(err, httpResponse, body) {
            if (err) throw err;
            else {
                console.log(httpResponse.statusCode);
                console.log(body);
            }
        });
    }
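The callback above only logs the response. If the calling code needs to react to the outcome of the import, the same call can be wrapped in a Promise. The following is a minimal sketch; executeRequestAsync is a hypothetical helper, and treating every status code outside the 2xx range as a failure is an assumption about how you want to handle errors, not documented API behavior.

```javascript
// Hypothetical wrapper: turns the callback-style post(options, callback) call
// into a Promise. The library object is passed in as "requestLib" so that the
// sketch does not depend on a particular installation of the request library.
function executeRequestAsync(requestLib, requestObject) {
  return new Promise(function (resolve, reject) {
    requestLib.post(requestObject, function (err, httpResponse, body) {
      if (err) {
        // Network-level failure, e.g. the gateway is not reachable.
        reject(err);
      } else if (httpResponse.statusCode < 200 || httpResponse.statusCode >= 300) {
        // Assumption: anything outside 2xx counts as a failed import.
        reject(new Error("Import failed with HTTP status " + httpResponse.statusCode));
      } else {
        resolve(body);
      }
    });
  });
}
```

With the request library required as shown earlier, a call could look like executeRequestAsync(request, createSingleDocumentMultipartRequest("test", "test.txt", "cid_63apple", "text/plain")), followed by .then() and .catch() handlers.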
Batch Import
If more than one document is to be imported, this can be done using the same endpoint with a similar multipart request containing multiple content form parts and a larger metadata file containing metadata for each document.
Form Data Assembly
Creating our new form data will be based around iterating over arrays containing the document data relevant to the construction of the metadata and content form parts.
    // Builds the metadata object for several documents from parallel arrays.
    function createMultiDocumentMetadata(doc_titles, doc_fileNames, doc_cids, doc_contentTypes) {
        var objects = [];
        for (var i = 0; i < doc_titles.length; i++) {
            objects[i] = {
                "properties": {
                    "system:objectTypeId": { "value": "document" },
                    "Name": { "value": doc_titles[i] }
                },
                "contentStreams": [{
                    "mimeType": doc_contentTypes[i],
                    "fileName": doc_fileNames[i],
                    "cid": doc_cids[i]
                }]
            };
        }
        return { "objects": objects };
    }

    // Builds the form data object: one metadata part plus one content part per document.
    function createMultiImportFormdata(doc_titles, doc_fileNames, doc_cids, doc_contentTypes) {
        var formData = {};
        formData['data'] = {
            value: JSON.stringify(createMultiDocumentMetadata(doc_titles, doc_fileNames, doc_cids, doc_contentTypes)),
            options: { contentType: 'application/json' }
        };
        for (var i = 0; i < doc_cids.length; i++) {
            formData[doc_cids[i]] = {
                value: fs.createReadStream(doc_fileNames[i]),
                options: { contentType: doc_contentTypes[i], filename: doc_fileNames[i] }
            };
        }
        return formData;
    }
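The functions above take four parallel arrays, which have to stay aligned index by index. One way to make that less error-prone is to describe each document as a single object and derive the arrays from it. The following is a minimal sketch; toParallelArrays and the descriptor field names (title, fileName, cid, contentType) are hypothetical choices made for this illustration.

```javascript
// Hypothetical helper: converts a list of document descriptors into the four
// parallel arrays used by the batch import functions.
function toParallelArrays(docs) {
  const result = { titles: [], fileNames: [], cids: [], contentTypes: [] };
  const seenCids = new Set();
  for (const doc of docs) {
    // Each cid becomes a key of the form data object, so a duplicate would
    // silently overwrite the content part of another document.
    if (seenCids.has(doc.cid)) {
      throw new Error("Duplicate content stream ID: " + doc.cid);
    }
    seenCids.add(doc.cid);
    result.titles.push(doc.title);
    result.fileNames.push(doc.fileName);
    result.cids.push(doc.cid);
    result.contentTypes.push(doc.contentType);
  }
  return result;
}

const arrays = toParallelArrays([
  { title: "test", fileName: "test.txt", cid: "cid_63apple", contentType: "text/plain" },
  { title: "test1", fileName: "test1.txt", cid: "cid_64apple", contentType: "text/plain" }
]);
console.log(arrays.cids); // prints [ 'cid_63apple', 'cid_64apple' ]
```

The resulting arrays can then be passed on as arrays.titles, arrays.fileNames, arrays.cids and arrays.contentTypes.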
Batch Import Form Data in Browser-Based Applications
In the context of our browser application, we continue using the FormData API, adding more form parts for each new content file.
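    // multiMetadata: the metadata object describing both documents,
    // e.g. the result of createMultiDocumentMetadata(...)
    // file, file2: the contents to upload, e.g. File objects taken from a file input element
    var multiFormData = new FormData();
    var metadataBlob = new Blob([JSON.stringify(multiMetadata)], { type: "application/json" });
    var contentBlob1 = new Blob([file], { type: "text/plain" });
    var contentBlob2 = new Blob([file2], { type: "text/plain" });
    multiFormData.append('data', metadataBlob, "metadata");
    multiFormData.append('cid_63apple', contentBlob1, "contentdata1");
    multiFormData.append('cid_64apple', contentBlob2, "contentdata2");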
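For more than a handful of documents, appending each content part by hand gets repetitive. The following is a minimal sketch that derives the form parts from the metadata itself; buildBatchFormData and its contents parameter (a plain object mapping each content stream ID to a File, Blob or string) are hypothetical. It relies only on the standard FormData and Blob Web APIs, which recent node.js versions also expose as globals.

```javascript
// Hypothetical helper: builds the multipart form data for a batch import.
// "metadata" is the object with the "objects" array; "contents" maps each
// content stream ID to the content (File, Blob or string) of that document.
function buildBatchFormData(metadata, contents) {
  const formData = new FormData();
  const metadataBlob = new Blob([JSON.stringify(metadata)], { type: "application/json" });
  formData.append("data", metadataBlob, "metadata");
  for (const object of metadata.objects) {
    for (const stream of object.contentStreams || []) {
      if (!(stream.cid in contents)) {
        throw new Error("No content provided for " + stream.cid);
      }
      // The form part name must equal the cid referenced in the metadata.
      const contentBlob = new Blob([contents[stream.cid]], { type: stream.mimeType });
      formData.append(stream.cid, contentBlob, stream.fileName);
    }
  }
  return formData;
}
```

The returned object can be sent with XMLHttpRequest or jQuery in the same way as a manually assembled FormData object.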
Assembling the Request
The assembly of the multipart request object is similar to the single-document import. The same function executeRequest(request_object) can be used to send the request.
    var doc_titles = ["test", "test1"];
    var doc_fileNames = ["test.txt", "test1.txt"];
    var doc_cids = ["cid_63apple", "cid_64apple"];
    var doc_contentTypes = ["text/plain", "text/plain"];

    // Builds the options object for a batch import request.
    function createBatchImportMultipartRequest(doc_titles, doc_fileNames, doc_cids, doc_contentTypes) {
        return {
            method: 'POST',
            uri: baseUrl + '/api/dms/objects',
            headers: {
                'Accept': 'application/json',
                'Content-Type': 'multipart/form-data',
                'X-ID-TENANT-NAME': tenant
            },
            auth: {
                user: user,
                pass: password
            },
            formData: createMultiImportFormdata(doc_titles, doc_fileNames, doc_cids, doc_contentTypes)
        };
    }

    // Send the batch import request.
    executeRequest(createBatchImportMultipartRequest(doc_titles, doc_fileNames, doc_cids, doc_contentTypes));
The same goes for the Ajax/XMLHttpRequest way of doing things:
    $.ajax({
        type: "POST",
        url: baseUrl + "/api/dms/objects",
        data: multiFormData,
        processData: false,
        contentType: false,
        cache: false,
        beforeSend: function (request) {
            request.setRequestHeader("Authorization", auth);
            request.setRequestHeader("X-ID-TENANT-NAME", tenant);
        },
        complete: function (result) {
            // Writes the response into the page element with the ID "container2".
            $('#container2').text(result.responseText);
        }
    });
Summary
In this tutorial, the import of documents was implemented in JavaScript, both for node.js and for browser-based applications. View the node.js script here.