Object Representation Formats

A description of the JSON structure in which yuuvis® Momentum expects and returns objects.

Introduction

The representation of objects in yuuvis® Momentum depends on their position within the processing chain, but it is always organized as a JSON structure. On the top level, this JSON structure always contains the list objects. Each entry of this list represents the set of metadata assigned to one object. The metadata are grouped into sections. The properties section is always present, whereas the contentStreams, renditions and options sections appear only in certain situations.

During an import process, the JSON objects list is integrated into a multipart body. Thus, it is possible to handle binary content together with the metadata in one call. Multiparts are not discussed in this article. Please refer to the import endpoint description POST /api/dms/objects.

Schema-Defined Property Sections

These sections contain properties that are defined in a schema and can be used in search queries containing SELECT and/or WHERE clauses. 

Section 'properties'

The section properties is always present. It contains user-defined properties from the applied schema as well as pre-defined general system properties.
>> Schema - Defining Object Types

Each object stored in yuuvis® Momentum has the full set of the pre-defined general system properties. However, most of them do not necessarily have to be specified during the import of the object. The only required property for an import is system:objectTypeId.

The code block below shows a small example of an objects list as it could appear in an import request body. Since there is no contentStreams section, no binary content file will be assigned to the object. The properties section contains the required system:objectTypeId, whose value has to be an object type available in the applied schema. Additionally, the property name is set, which does not have any prefix. The import body is valid only if this property is defined in the global schema.


Small Example: Import without Content
{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "smallDocument"
			},
			"name": {
				"value": "exampledocument-without-content"
			}
		}
	}]
}
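
The structure above can also be assembled programmatically. The following Python lines are a minimal sketch, not part of the yuuvis® Momentum API: the helper name build_import_object is hypothetical, and it only covers simple properties that consist of a single value entry.

```python
import json


def build_import_object(object_type_id, properties=None):
    """Build one entry of the objects list for an import body (hypothetical helper)."""
    # system:objectTypeId is the only property required for an import.
    props = {"system:objectTypeId": {"value": object_type_id}}
    for name, value in (properties or {}).items():
        # Each simple property is wrapped in an object with a "value" key.
        props[name] = {"value": value}
    return {"properties": props}


body = {
    "objects": [
        build_import_object("smallDocument", {"name": "exampledocument-without-content"})
    ]
}
print(json.dumps(body, indent=2))
```

The printed JSON corresponds to the small example above. Whether the import succeeds still depends on the object type and the property being defined in the applied schema.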

The metadata will be automatically enriched with the system properties that were not specified. If the same object is retrieved later, these properties will be listed in the response body as shown in the following example code block.

Small Example: Retrieving Metadata
{
    "objects": [{
        "properties": {
            "system:objectId": {
                "value": "cdc7095f-a5ce-486d-92a7-6d0955d969ee"
            },
            "system:baseTypeId": {
                "value": "system:document"
            },
            "system:objectTypeId": {
                "value": "smallDocument"
            },
            "system:createdBy": {
                "value": "0d7fd0be-6a0b-4d3b-933c-25e0c4c5d794"
            },
            "system:creationDate": {
                "value": "2018-01-26T15:21:170Z"
            },
            "system:lastModifiedBy": {
                "value": "0d7fd0be-6a0b-4d3b-933c-25e0c4c5d794"
            },
            "system:lastModificationDate": {
                "value": "2018-01-26T15:21:170Z"
            },
            "system:versionNumber": {
                "value": 1
            },
            "system:tenant": {
                "value": "tenant1"
            },
            "system:traceId": {
                "value": "97a35859dbb4c435"
            },
            "name": {
                "value": "exampledocument-without-content"
            }
        }
    }]
}
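
When such a response is processed by a client, it is often convenient to reduce each object to a flat mapping of property names to values. The following Python sketch illustrates one way to do this; the function name property_values and the shortened sample response are assumptions made for illustration.

```python
import json


def property_values(obj):
    """Map each property name of one objects entry to its value (hypothetical helper)."""
    return {
        name: entry.get("value")
        for name, entry in obj.get("properties", {}).items()
    }


# Shortened sample in the shape of the response shown above.
response_text = """
{
    "objects": [{
        "properties": {
            "system:objectTypeId": {"value": "smallDocument"},
            "system:versionNumber": {"value": 1},
            "name": {"value": "exampledocument-without-content"}
        }
    }]
}
"""

response = json.loads(response_text)
flat = [property_values(obj) for obj in response["objects"]]
print(flat[0]["system:versionNumber"])
```

Note that this sketch drops additional keys such as columnNames, so table properties would need separate handling.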

Specifying Table Properties

The values for table properties are specified in two lists as shown in the example below. The names of the table columns are listed in the correct order in columnNames. In the value array, a list of individual values is expected for each table row. The length of each of those lists has to be equal to the length of the columnNames list. The individual values need to have the type that is defined in the table property definition for the corresponding column.

{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "document"
			},
			"name": {
				"value": "exampledocument-without-content"
			},
			"tableproperty": {
                "columnNames": ["iColumn1", "iColumn2", "iColumn3"],
                "value": [["something", "to know", true],["more", "infos", false]]
			}
		}
	}]
}
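
The length rule for table rows can be checked on the client side before a request is sent. The following Python function is a minimal sketch with a hypothetical name; it only compares row lengths with the length of columnNames and does not check the column types, which would require the schema.

```python
def check_table_property(table):
    """Return a list of problems found in a table property (hypothetical helper)."""
    column_names = table["columnNames"]
    problems = []
    for index, row in enumerate(table["value"]):
        # Every row needs exactly one value per column.
        if len(row) != len(column_names):
            problems.append(
                f"row {index} has {len(row)} values, expected {len(column_names)}"
            )
    return problems


good = {
    "columnNames": ["iColumn1", "iColumn2", "iColumn3"],
    "value": [["something", "to know", True], ["more", "infos", False]],
}
bad = {
    "columnNames": ["iColumn1", "iColumn2", "iColumn3"],
    "value": [["something", "to know"]],
}

print(check_table_property(good))
print(check_table_property(bad))
```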

Specifying Tags

Tags can be managed independently, but also in one call together with the metadata. The tags are stored in the system:tags system property in the table format described before.

{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "document"
			},
			"name": {
				"value": "exampledocument-without-content"
			},
			"system:tags": {
          		"value": [ [ "tag1", 100, "2020-02-20T02:22:20.220Z", "1234567887654321" ] ]
			}
		}
	}]
}

Specifying Structured Data Properties

Even if the schema allows various structured data properties in an object, the instantiated object can contain a value for at most one structured data property. The value for a structured data property has to be a valid JSON structure. It is not allowed to pass a single string, boolean or other scalar value:

valid example value
				"appTable:customerdetails": {
					"value": {
						"id": 2982,
						"uid": "711e1858-eb24-4183-8743-0292c7b9b93b"
					}
				}
invalid example value - NOT allowed
				"appTable:customerdetails": {
					"value": "{\"id\":2982,\"uid\":\"711e1858-eb24-4183-8743-0292c7b9b93b\"}"
				}
invalid example value - NOT allowed
				"appTable:customerdetails": {
					"value": true
				}

One JSON value can contain at most 500 sub-properties in total. The keys have to be strings of at most 32 characters that follow the pattern [a-zA-Z][a-zA-Z0-9]*. The maximum depth of the JSON structure is 16. Empty maps are not allowed at any position in the JSON and are replaced by null.

The code block below shows a more complex example for a valid objects list containing a structured data property appTable:customerdetails. The specified value is a valid JSON structure with 16 sub-properties and a depth of 2.

{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "document"
			},
			"name": {
				"value": "exampledocument-without-content"
			},
			"appTable:customerdetails": {
				"value": {
					"id": 2982,
					"uid": "711e1858-eb24-4183-8743-0292c7b9b93b",
					"word": "beverages",
					"words": [
						"tee",
						"milk",
						"water"],
					"sentence": "The customer prefers hot chocolate.",
					"sentences": [
						"Unfortunately, hot chocolate is not offered.",
						"The customer decides for milk instead.",
						"The milk should be cool."
					],
					"food": {
						"uid": "7aa4a2f2-3dc0-420c-a0d7-edc6af3619de",
						"dish": "Bunny Chow",
						"description": "Fresh Norwegian salmon, lightly brushed with our herbed Dijon mustard sauce, with choice of two sides.",
						"ingredient": "Jelly",
						"measurement": "1/2 teaspoon",
						"lastcooked": "2018-03-13T00:00:00.000Z"
					}
				}
			}
		}
	}]
}
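
The limits for structured data values can be approximated with a client-side check. The following Python sketch rests on several assumptions that are not confirmed by this article: sub-properties are counted as leaf values (which reproduces the count of 16 for the example above), list elements do not add to the depth, and an empty map counts as one leaf. The function name inspect_structured_value is hypothetical.

```python
import re

KEY_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")
MAX_KEY_LENGTH = 32
MAX_SUB_PROPERTIES = 500
MAX_DEPTH = 16


def inspect_structured_value(value):
    """Return (leaf_count, depth, problems) for a structured data value."""
    problems = []
    if not isinstance(value, (dict, list)):
        return 0, 0, ["value has to be a JSON structure, not a single scalar"]

    def walk(node, depth):
        if isinstance(node, dict):
            if not node:
                problems.append("empty map (not allowed, replaced by null)")
                return 1, depth  # assumption: counted like a null leaf
            leaves, deepest = 0, depth
            for key, child in node.items():
                if (not isinstance(key, str) or len(key) > MAX_KEY_LENGTH
                        or not KEY_PATTERN.fullmatch(key)):
                    problems.append(f"invalid key: {key!r}")
                count, child_depth = walk(child, depth + 1)
                leaves += count
                deepest = max(deepest, child_depth)
            return leaves, deepest
        if isinstance(node, list):
            leaves, deepest = 0, depth
            for item in node:
                # Assumption: list elements stay on the depth of the list itself.
                count, item_depth = walk(item, depth)
                leaves += count
                deepest = max(deepest, item_depth)
            return leaves, deepest
        return 1, depth  # scalar leaf

    leaves, depth = walk(value, 0)
    if leaves > MAX_SUB_PROPERTIES:
        problems.append(f"{leaves} sub-properties exceed the limit of {MAX_SUB_PROPERTIES}")
    if depth > MAX_DEPTH:
        problems.append(f"depth {depth} exceeds the limit of {MAX_DEPTH}")
    return leaves, depth, problems


sample = {"id": 2982, "words": ["tee", "milk"], "food": {"dish": "Bunny Chow"}}
print(inspect_structured_value(sample))
```

Since the counting rules are assumptions, a value that passes this check can still be rejected during import.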

Section 'contentStreams'

The section contentStreams is always present for objects with a binary content file assigned to them. It is a list with one entry containing a set of content stream properties.
>> Content Stream Properties

Each object with an assigned content file stored in yuuvis® Momentum has the full set of those properties in addition to the previously described properties section.

During the import of an object with a new content file that is not yet stored via yuuvis® Momentum, the cid has to be specified in the contentStreams section. This value is a reference to the corresponding binary content passed in the same multipart import body together with the objects list. The code block below shows an example of an objects list as it could appear in an import request body for the creation of an object with a binary content file assigned to it.

Example with new Content
{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "largeDocument"
			},
			"name": {
				"value": "exampledocument-with-content"
			}
		},
		"contentStreams": [{
            "cid": "cid_63apple"		
        }]
	}]
}

For references to already existing content, the contentStreamId and repositoryId have to be specified in order to identify the binary content file in the storage.

Example with already existing Content
{
	"objects": [{
		"properties": {
			"system:objectTypeId": {
				"value": "largeDocument"
			},
			"name": {
				"value": "exampledocument-with-content"
			}
		},
		"contentStreams": [{
            "contentStreamId": "2B797243-A1F5-11EA-A814-9FABD98CE7A7",
            "repositoryId": "repo252"		
        }]
	}]
}
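
The two variants of the contentStreams section differ only in the keys of their single entry. The following Python sketch shows both variants side by side; all function names are hypothetical, and the construction of the surrounding multipart body is not covered.

```python
import json


def new_content_stream(cid):
    """Entry for new content: cid references a part of the same multipart body."""
    return {"cid": cid}


def existing_content_stream(content_stream_id, repository_id):
    """Entry for content that is already stored and identified in the storage."""
    return {"contentStreamId": content_stream_id, "repositoryId": repository_id}


def with_content(obj, content_stream):
    """Return a copy of the object with a one-entry contentStreams list."""
    return {**obj, "contentStreams": [content_stream]}


document = {
    "properties": {
        "system:objectTypeId": {"value": "largeDocument"},
        "name": {"value": "exampledocument-with-content"},
    }
}

new_body = {"objects": [with_content(document, new_content_stream("cid_63apple"))]}
existing_body = {
    "objects": [
        with_content(
            document,
            existing_content_stream("2B797243-A1F5-11EA-A814-9FABD98CE7A7", "repo252"),
        )
    ]
}
print(json.dumps(new_body, indent=2))
```

Serializing with json.dumps also ensures that the separators between the sections are written correctly.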

The metadata will be automatically enriched with the system properties that were not specified. If the same object is retrieved later, the properties will be listed in the response body as shown in the following example code block.

Example Metadata retrieved for Object with assigned Content.
{
    "objects": [{
        "properties": {
            "system:objectId": {
                "value": "cdc7095f-a5ce-486d-92a7-6d0955d969ee"
            },
            "system:baseTypeId": {
                "value": "system:document"
            },
            "system:objectTypeId": {
                "value": "appEmail:email"
            },
            "system:createdBy": {
                "value": "0d7fd0be-6a0b-4d3b-933c-25e0c4c5d794"
            },
            "system:creationDate": {
                "value": "2018-01-26T15:21:170Z"
            },
            "system:lastModifiedBy": {
                "value": "0d7fd0be-6a0b-4d3b-933c-25e0c4c5d794"
            },
            "system:lastModificationDate": {
                "value": "2018-01-29T13:13:113Z"
            },
            "system:versionNumber": {
                "value": 2
            },
            "system:tenant": {
                "value": "tenant1"
            },
            "system:traceId": {
                "value": "97a35859dbb4c435"
            },
            "appEmail:from": {
                "value": "Maria Schmidt <schmidt@example.de>"
            },
            "appEmail:to": {
                "value": ["Hans Meier <meier@example.de>"]
            },
            "appEmail:cc": {
                "value": ["Conrad Schulze <schulze@example.de>",
                "Emilia Lehmann <lehmann@example.de>"]
            },
            "appEmail:subject": {
                "value": "Updated Bewerbungsunterlagen"
            },
            "table": {
                "columnNames": ["iColumn1", "iColumn2", "iColumn3"],
                "value": [["something", "to know", true],["more", "infos", false]]
            }
        },
        "contentStreams": [{
            "contentStreamId": "2B797243-A1F5-11EA-A814-9FABD98CE7A7",
            "length": 173413,
            "mimeType": "message/rfc822",
            "fileName": "upload.eml",
            "digest": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
            "repositoryId": "repo252"
        }]
    }]
}

Process-Related Property Sections

The properties in these sections are not defined in a schema. They are not always present and appear only in specific situations. They cannot be used in SELECT and/or WHERE clauses of search queries. 

Section 'renditions'

The section renditions occurs during the import of objects with a binary content file. It contains a list of rendition specifications.

The list can contain only one entry of kind text. Its plain text is stored in the Elasticsearch index to allow full-text search queries.

The following properties are set:

| Property | Description | Required in an import body where the 'renditions' section is specified |
| --- | --- | --- |
| mimeType | mimeType of the rendition. Available values: "text/plain" | yes |
| kind | Kind of the rendition. Available values: "text" | yes |
| contentStream | Section of content stream properties describing the details of the plain text file that should be read to create the text rendition. It has to be included in the multipart request body. | yes |

Properties within the contentStream section:

| Property | Description | Required in an import body where the 'renditions' section is specified |
| --- | --- | --- |
| length | Length of the file to be read for the creation of the rendition. | no |
| mimeType | mimeType of the file to be read for the creation of the rendition. Available values: "text/plain" | no |
| fileName | Name of the rendition file. | no |
| cid | Reference within the multipart to the file to be read for the creation of the rendition. | yes |

If a content file of an appropriate format is imported into yuuvis® Momentum, the CONTENTANALYZER service can create a full-text rendition. The service will automatically add the section renditions to the corresponding object.

If the section renditions is already specified by the user in the initial request body, the CONTENTANALYZER will not extract the full text from the corresponding content file. Instead, the user-specified text rendition will be stored in the Elasticsearch index. This might be useful if the CONTENTANALYZER service is not used or the content file's format is not supported for full-text analysis.

The example objects list below is taken from an import request body in which an e-mail file is assigned as binary content file. Since the section renditions is also specified, the CONTENTANALYZER will not analyze the email file. The plain text read from the file referenced by cid will be stored in Elasticsearch for full-text search.

Example from an import request body.
{
	"objects": [{
		"properties": {
			"enaio:objectTypeId": {
				"value": "E13C7EBF4B974B3A9FF296C01F90D0EE"
			},
			"sysfrom": {
				"value": "Garco Meissler <garco@example.de>"
			},
			"systo": {
				"value": "Dudreas Annkel <dudreas@example.de>"
			},
			"syscc": {
				"value": "Kruedeas Anger <kruedeas@example.de>"
			},
			"syssubject": {
				"value": "Wachsmalstift rückwärts kontrollieren"
			},
			"redline:baseTypeId": {
				"value": "DOCUMENT"
			},
			"redline:mandant": {
				"value": "default"
			}
		},
		"contentStreams": [{
			"mimeType": "message/rfc822",
			"fileName": "upload.eml",
			"cid": "cid_63apple"
		}],
		"renditions": [{
			"mimeType": "text/plain",
			"kind": "text",
			"contentStream": {
				"length": 39939,
				"mimeType": "text/plain",
				"fileName": "content.txt",
				"cid": "rendition_0"
			}
		}]
	}]
}
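
The requirements for the renditions section listed above can be checked before the import request is sent. The following Python function is a minimal sketch with a hypothetical name; it covers only the rules stated in this article and does not verify that the referenced cid actually exists in the multipart body.

```python
def check_renditions(renditions):
    """Return a list of problems found in a renditions list (hypothetical helper)."""
    problems = []
    # Only one entry of kind "text" is allowed in the list.
    if sum(1 for r in renditions if r.get("kind") == "text") > 1:
        problems.append("more than one rendition of kind 'text'")
    for index, rendition in enumerate(renditions):
        if rendition.get("mimeType") != "text/plain":
            problems.append(f"rendition {index}: mimeType has to be 'text/plain'")
        if rendition.get("kind") != "text":
            problems.append(f"rendition {index}: kind has to be 'text'")
        stream = rendition.get("contentStream")
        if not isinstance(stream, dict):
            problems.append(f"rendition {index}: contentStream is required")
            continue
        if not stream.get("cid"):
            problems.append(f"rendition {index}: contentStream.cid is required")
        # mimeType is optional here, but only "text/plain" is available.
        if stream.get("mimeType", "text/plain") != "text/plain":
            problems.append(f"rendition {index}: contentStream.mimeType has to be 'text/plain'")
    return problems


valid = [{
    "mimeType": "text/plain",
    "kind": "text",
    "contentStream": {"length": 39939, "fileName": "content.txt", "cid": "rendition_0"},
}]
invalid = [{"mimeType": "text/plain", "kind": "text", "contentStream": {}}]

print(check_renditions(valid))
print(check_renditions(invalid))
```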


Read on

Schema - Defining Object Types

Detailing the available schema, object type definitions as well as property definitions.

Importing Documents via Core API

This tutorial shows how documents can be imported into a yuuvis® API system via the Core API. During this tutorial, a short Java application will be developed that implements the HTTP requests for importing documents. We additionally provide a JavaScript version of this tutorial.

Retrieving Documents via Core API

In this tutorial, we will discuss various ways to retrieve objects via the Core API from the yuuvis® API system using an OkHttp3 Java client.