This article details the available schema, the object type definitions, and the property definitions.
Introduction
In yuuvis® Momentum, documents are stored as document objects. The business integrator defines one or more object types and their properties according to specific needs. Any document imported will need to be classified as exactly one of the object types defined in the schema of the system.
The schema defines a set of object types and a set of properties. The object type classifies the object and defines the properties that the object must have or is allowed to have (properties may be optional). In addition, there are metadata properties such as 'system:versionNumber' or 'system:lastModificationDate' whose values are provided by the system.
By default, the maximum number of property definitions in a tenant-specific schema is 20. A system integrator can change this limit via the schema.tenant.properties.limit parameter in the system service. If you want to increase this limit, it is recommended to increase the maximum number of fields in your Elasticsearch index as well.
Type IDs
In every property definition and every object type definition the id attribute is required. It is used to identify the object type or property. An ID is a string with a maximum of 63 characters and it must match the regular expression ([a-zA-Z][a-zA-Z0-9]*:)?[a-zA-Z][a-zA-Z0-9]*.
Type IDs are also used as the name of the type, e.g., in query operations. Hence, it is recommended to choose meaningful values for type IDs.
The part before the : character is the prefix. In tenant-specific types, the prefix is always "ten"+<tenant name>; in app-specific types, it is "app"+<app name>. In most cases, you can omit the prefix when designing a schema, importing objects, or creating search queries. The prefix is added automatically.
As of 2021 Winter, the - character can be used as an additional separator within prefixes of tenant-specific IDs, provided the ID matches the following regular expression:
(([a-zA-Z][a-zA-Z0-9]*-)?([a-zA-Z][a-zA-Z0-9]*:))?[a-zA-Z][a-zA-Z0-9]*
Column names in Table Property Definitions are an exception: prefixes are prohibited there as of version 2020 Winter.
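The ID rules above can be checked before a schema is posted. The following is a minimal sketch, not part of yuuvis® Momentum itself: it assumes the character ranges in the two expressions read a-z, A-Z and 0-9, and the function name is_valid_type_id is hypothetical.

```python
import re

# Patterns transcribed from the text above (assumed ranges: a-z, A-Z, 0-9).
BASIC_ID = re.compile(r"([a-zA-Z][a-zA-Z0-9]*:)?[a-zA-Z][a-zA-Z0-9]*")
HYPHEN_ID = re.compile(
    r"(([a-zA-Z][a-zA-Z0-9]*-)?([a-zA-Z][a-zA-Z0-9]*:))?[a-zA-Z][a-zA-Z0-9]*"
)
MAX_ID_LENGTH = 63  # maximum number of characters stated for an ID


def is_valid_type_id(type_id: str, allow_hyphen_prefix: bool = False) -> bool:
    """Return True if type_id respects the length limit and the chosen pattern."""
    pattern = HYPHEN_ID if allow_hyphen_prefix else BASIC_ID
    return len(type_id) <= MAX_ID_LENGTH and pattern.fullmatch(type_id) is not None


if __name__ == "__main__":
    for candidate in ("invoice", "tenAcme:invoice", "1invoice", "ten-acme:invoice"):
        print(candidate, is_valid_type_id(candidate),
              is_valid_type_id(candidate, allow_hyphen_prefix=True))
```

fullmatch is used so that the whole ID has to satisfy the expression, not just a leading part of it.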
Property Definitions
General Attributes
All property definitions have the following attributes:
Attribute | Type | Required | Description |
---|---|---|---|
id | String | yes | The type ID of the property. It uniquely identifies the property in the schema. |
localNamespace | URI | no | By using namespaces, it is possible to form groups of properties and object types. |
description | String | no | Describes the property. |
propertyType | Enum | yes | Specifies the type of this property. The following types are supported: |
cardinality | Enum | yes | Defines whether the property can have a maximum of one or an arbitrary number of values. Possible values are single and multi. |
required | Boolean | yes | If true, the object must have at least one value of this property. If a property is required and has no This attribute can be overwritten in the property references of object type definitions. Hence, the same property can be required in one object type and not required in another object type. |
queryable | Boolean | no | Specifies whether or not the property may appear in the WHERE clause of a query statement. Default is true. false is only allowed for table properties. |
classification | String | no | Declares the classifications this property belongs to. There is no validation or use in the system itself. For example, string properties can be classified as 'email' or 'url' and a client can use this classification to present the property's content in an appropriate manner. This tag can be used several times and the corresponding values are delivered in an array. Note: Make sure to validate the strings you set for the classification tags, so that your application will not fail if the string does not match the expected syntax. |
defaultValue | depending on the propertyType | no | The value that the system sets for the property if no value is provided during object creation. If the |
Specific Attributes
Depending on the property type a property can have specific attributes.
Integer Property Definitions
Attribute | Type | Required | Description | Technical Limit used as Default |
---|---|---|---|---|
maxValue | Integer | no | The maximum value allowed for this property | 9223372036854775807 |
minValue | Integer | no | The minimum value allowed for this property. | -9223372036854775808 |
DateTime Property Definitions
Attribute | Type | Required | Description |
---|---|---|---|
resolution | Enum | no | The only supported value is date. If the resolution is set to date, the property can only store values without a time part and these values have the format yyyy-MM-dd. |
Decimal Property Definitions
Decimal properties support values of 64-bit precision (IEEE 754). The values have to be specified in decimal notation. However, in the table below, the technical limits are provided in base 2 scientific notation in order to display the values in a condensed and comfortable format. This format cannot be used to specify the value for a maxValue or minValue attribute in a decimal property definition.
Attribute | Type | Required | Description | Technical Limit used as Default |
---|---|---|---|---|
maxValue | Decimal | no | The maximum value allowed for this property. | $(2 - 2^{-52}) \cdot 2^{1023}$ |
minValue | Decimal | no | The minimum value allowed for this property. | $-(2 - 2^{-52}) \cdot 2^{1023}$ |
Please note the additional limit of precision: values of magnitude smaller than $2^{-1074}$ will be rounded to 0.
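To relate the technical limits of the integer and decimal tables to concrete numbers, the short check below recomputes them. It is only an illustration and assumes a platform on which Python floats are IEEE 754 doubles, which is the usual case.

```python
import math
import sys

# Integer defaults: the signed 64-bit range.
INT_MAX = 2**63 - 1
INT_MIN = -(2**63)

# Decimal defaults: the largest finite IEEE 754 double, (2 - 2^-52) * 2^1023.
DECIMAL_MAX = (2 - 2**-52) * 2.0**1023
DECIMAL_MIN = -DECIMAL_MAX

# Smallest positive magnitude that is still representable: 2^-1074.
SMALLEST_MAGNITUDE = math.ldexp(1.0, -1074)

if __name__ == "__main__":
    print(INT_MAX, INT_MIN)
    print(DECIMAL_MAX == sys.float_info.max)  # compares against the platform maximum
    print(SMALLEST_MAGNITUDE > 0.0)           # 2^-1074 is still distinguishable from 0
    print(math.ldexp(1.0, -1076) == 0.0)      # a smaller magnitude collapses to 0
```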
String Property Definitions
Attribute | Type | Required | Description |
---|---|---|---|
maxLength | Integer | no | The maximum length (in characters) allowed for a value of this property. |
minLength | Integer | no | The minimum length (in characters) allowed for a value of this property. |
In general, the length of a string is limited to 8192 characters. Hence, the maximum value for maxLength and minLength is 8192.
Table Property Definitions
The column types of a table property are defined by a list of property definitions inside the table property definition. Each column property definition has its own attributes, such as required or default value. The values are applied to each row entry. The cardinality of a column property definition must be single.
As of version 2020 Winter, the column names specified via <id>examplecolumn</id> must not contain a prefix. They have to follow the convention [a-zA-Z][a-zA-Z0-9]*. Otherwise, the schema containing the corresponding property definition will not pass the validation.
Table properties differ depending on the value of the property queryable. If queryable is false, the table must not appear in WHERE clauses in search queries. However, you can still find objects using full-text conditions on values stored in a table (query keyword 'CONTAINS'). If queryable is true, you can apply more precise search queries to a table, but you will need more disk space to store objects.
The number of rows and columns of a table property definition is limited to a maximum of 512 columns and 1024 rows.
Structured Data Property Definition
As of version 2021 Summer, yuuvis® Momentum offers a property type for the storage of structured data in JSON format. Thus, it is possible to store interleaved data structures in a queryable way without defining each single sub-property in the schema. An example definition is shown in the code block below. The schema validation checks if the ID follows the convention. Only the value single is allowed as cardinality.
Note: The structured data properties should NOT be considered to replace the concept of a well-defined schema. They should be used only if the handling of objects' metadata via the conventional property definitions is not reasonable.
Even if the schema allows various structured data properties in an object, the instantiated object can contain a value for at most one structured data property.
There are strict rules for the values that can be specified for structured data properties assigned to an object. Find all details in the linked chapter.
>> Structured Data in Request Bodies
Structured data properties are queryable similar to table properties.
>> Queries on Structured Data
This example tutorial provides explanations and code examples to get an idea on how to define and specify structured data properties and on how to query them.
>> Setting and Querying Structured Data
Object Type Definitions
There are different groups of object type definitions:
In a schema, all object type definitions must appear in this order. First all document object type definitions, then all folder object type definitions and so on.
All object type definitions have the following attributes:
Attribute | Type | Required | Description |
---|---|---|---|
id | String | yes | The type ID of the object type. It uniquely identifies the object type in the schema. |
localNamespace | URI | no | By using namespaces, it is possible to form groups of properties and object types. |
description | String | no | Describes the object type. |
baseId | Enum | yes | Specifies the base type of this object type. The following object types are supported: |
propertyReference | String | no | Reference by ID to a property. An object type definition can have an arbitrary number of property references. |
A tenant-specific object type can have references to both tenant-specific and global properties.
Document Object Type Definitions
Document object types are the elementary object types. To store objects with content, the objects' type must be a document type.
Document object type definitions have the following specific attributes:
Attribute | Type | Required | Description |
---|---|---|---|
contentStreamAllowed | Enum | yes | Specifies whether objects of this type must, must not, or may have content. Possible values are required, notallowed, and allowed. Note: The attribute is also available for secondary object type definitions. If a secondary object type with a specified |
secondaryObjectTypeId | String | no | References to secondary object types (if there are several secondary object types, they are listed one below the other). Determines which secondary object types an instance of this object type receives upon creation. In contrast to the CMIS specification (Content Management Interoperability Services), in which the secondary object types can be determined freely for each object instance, the schema specifies which secondary object types an object instance must have. |
Folder Object Type Definitions
You define folder object types as structuring elements in your global, tenant-specific, or application schema. In contrast to document object types, they do not have their own content files. In yuuvis® Momentum versions 2020 Autumn and older, folders cannot be set up in a hierarchical structure: a folder inside a folder is not allowed. As of version 2020 Winter, a folder hierarchy is possible. Folders allow the grouping of multiple objects and have their own metadata, which is not inherited by the assigned objects. A folder acts as the parent object: child objects refer to it by storing the folder's system:objectId value in their own metadata as system:parentId. Folders are thus also taken into account during the search. Similar to a document object type, a folder can reference a secondary object type's property group. The properties can be used as regular properties on folder level. Folders with objects assigned to them cannot be deleted.
The following code block shows the example folder object type definition "dossier" that can be integrated into any given schema. Note that folder object type definitions need to be defined after all document type definitions, yet before any secondary object type definitions.
The document object types are the elementary object types of your schema in which the "real" content is stored. During import or update of an object, the object ID of the target folder is provided as the object's parent ID (system:parentId). Actions such as assigning objects to a folder, moving them to a folder, or removing them from a folder can easily be carried out in this way. A validation checks whether the given ID exists and belongs to a folder. Furthermore, as of version 2020 Winter, folders inside a folder are allowed. The folder hierarchy represents a tree structure in which a folder cannot appear as a parent at different levels of the structure. Hierarchies that do not adhere to this structure will not pass validation.
By assigning documents to a configured folder you set an exact filing location and ensure a "tidy" storage of documents within your repository – if needed for your use case.
After schema modification, a folder object can be created by importing the following metadata:
After extracting the object ID of the folder from the import response, you can start populating the folder with documents:
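The two steps described above (create the folder, then file documents into it) can be sketched as metadata bodies. This is a minimal illustration under stated assumptions, not the article's original example: it reuses the "objects"/"properties"/"value" layout that appears in the metadata.json excerpts of this article, the folder type "dossier" comes from the text, and the document type "document", the property "name" and the folder ID are hypothetical placeholders. Content-related parts of an import are left out.

```python
import json


def folder_metadata(folder_type_id: str, properties: dict) -> dict:
    """Build import metadata for a single folder object."""
    props = {"system:objectTypeId": {"value": folder_type_id}}
    props.update({name: {"value": value} for name, value in properties.items()})
    return {"objects": [{"properties": props}]}


def child_document_metadata(document_type_id: str, parent_id: str) -> dict:
    """Build import metadata for a document filed into an existing folder."""
    props = {
        "system:objectTypeId": {"value": document_type_id},
        # The folder's system:objectId is passed as the child's system:parentId.
        "system:parentId": {"value": parent_id},
    }
    return {"objects": [{"properties": props}]}


if __name__ == "__main__":
    folder = folder_metadata("dossier", {"name": "Contracts 2021"})  # "name" is hypothetical
    print(json.dumps(folder, indent=2))

    folder_id = "<object ID taken from the import response>"  # placeholder
    print(json.dumps(child_document_metadata("document", folder_id), indent=2))
```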
Folder object type definitions have the following specific attributes:
Attribute | Type | Required | Description |
---|---|---|---|
secondaryObjectTypeId | String | no | References to secondary object types (if there are several secondary object types, they are listed one below the other). Determines which secondary object types an instance of this object type receives upon creation. In contrast to the CMIS specification (Content Management Interoperability Services), in which the secondary object types can be determined freely for each object instance, the schema specifies which secondary object types an object instance must have. |
Secondary Object Type Definitions
Secondary object types are abstract. This means that they cannot be instantiated. They allow you to design a more complex schema. In a way the concept of secondary object types is similar to the concept of inheritance.
Secondary object types can be used to group properties and then assign these property groups to object types (e.g., documents or folders). Like other object types, a secondary object type can have references to properties. They can also have no properties at all, which can be understood as a way of categorizing document types (tagging). Document or folder object types can in turn reference secondary object types, which give them their properties. There are two ways for secondary object types to be referenced by an object in the schema definition: as static or as floating.
- Static: <secondaryObjectTypeId static="true">INV</secondaryObjectTypeId>
- Floating: <secondaryObjectTypeId static="false">INV</secondaryObjectTypeId>
The property groups of statically referenced secondary object types are automatically available in all instances of the object type. Floating secondary object types can be handled flexibly during the import (POST /api/dms/objects endpoint) or at runtime for already existing instances of an object type with an update (POST /api/dms/objects/{objectId} / PATCH /api/dms/objects/{objectId}). The keywords "add":, "value": and "remove": can be used in the "system:secondaryObjectTypeIds": property area of the metadata.json file.
Keyword | Type | Description | metadata.json |
---|---|---|---|
"value": | array, comma-separated list | During import or update, adds one or multiple secondary object types to an object. Note that the list of "floating" secondary object types transferred will replace all existing ones. This includes the related properties and their metadata values. If you want to keep an existing "floating" secondary object type, you have to list it as well. | { "objects": [{ "properties": { "system:objectTypeId": { "value": "appSot:document" }, "system:secondaryObjectTypeIds": { "value": ["INV","SUP"] } ... |
"add": | string | During update, adds a single secondary object type to an object. | { "objects": [{ "properties": { "system:secondaryObjectTypeIds": { "add": "INV" }, ... |
"remove": | string | During update, removes a single secondary object type from an object. Note: The metadata assigned to the object by referencing the secondary object type will be deleted for the current object version! | { "objects": [{ "properties": { "system:secondaryObjectTypeIds": { "remove": "INV" }, ... |
Document object types have a system:secondaryObjectTypeIds property that contains the secondary object types associated with the document object type in the schema. This allows secondary object types to be taken into account during the search. The secondary object type must be explicitly specified in the FROM clause: select * from appSot:INV.
The system:secondaryObjectTypeIds property is set by the repository using the schema.
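The three keywords from the table above differ only in the key and in the shape of the value they carry. The helper below is a hedged sketch of how such bodies could be assembled. The function name is hypothetical, the property name is spelled system:secondaryObjectTypeIds as in the surrounding text, and sending the body to the update endpoints is left out.

```python
import json

SOT_PROPERTY = "system:secondaryObjectTypeIds"


def sot_update(keyword: str, ids) -> dict:
    """Build a metadata body for one of the keywords "value", "add" or "remove"."""
    if keyword == "value":
        # "value" takes the full list and replaces all floating types already set.
        if isinstance(ids, str):
            raise TypeError('"value" expects a list of IDs, not a single string')
        payload = list(ids)
    elif keyword in ("add", "remove"):
        # "add" and "remove" each take exactly one secondary object type ID.
        if not isinstance(ids, str):
            raise TypeError(f'"{keyword}" expects a single ID string')
        payload = ids
    else:
        raise ValueError(f"unknown keyword: {keyword}")
    return {"objects": [{"properties": {SOT_PROPERTY: {keyword: payload}}}]}


if __name__ == "__main__":
    print(json.dumps(sot_update("value", ["INV", "SUP"])))
    print(json.dumps(sot_update("add", "INV")))
    print(json.dumps(sot_update("remove", "INV")))
```

Because "value" replaces the existing list, a caller who wants to keep a floating type has to include it again, which is why the sketch refuses a bare string there instead of silently wrapping it.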
Consider the example schema. If a document of the type appSot:document is created, it may have values for the properties appSot:dateOfReceipt and appSot:comment. For the appSot:dateOfReceipt property definition this is obvious, because there is a direct reference in the document type definition of appSot:document.
The appSot:comment property definition is not directly referenced but the definition of appSot:document has a reference to the secondary object type appSot:basicInfo which references the appSot:comment property definition. Thus, both properties are available for creating documents.
Furthermore, properties passed on by secondary object types are treated like "regular" properties in a document. The attributes of a property are taken into account. For example, a document of the type appSot:document not only may have a value for appSot:comment, it must have a value, because it is a required property. It makes no difference, whether a document type references the property definitions directly or indirectly via a secondary object type reference.
You cannot tell from the metadata of a document any longer if a property is referenced directly or indirectly in the schema. All properties are plain in the properties list. For example, the metadata of a document based on the appSot:document document type definition may look like this:
Secondary object type definitions have the following attributes:
Attribute | Type | Required | Description |
---|---|---|---|
contentStreamAllowed | Enum | no | Can substantiate the For the final document, content will be For the final document, content will be Conflict situation leading to invalid documents: any combination of at least once If |
System Properties
General Metadata Properties
In addition to the properties assigned to the object types in the schema, each instantiated object has a set of general system properties. Some system properties are set by the system and some system properties can be set by the user.
Property | Type | Description | Set by |
---|---|---|---|
system:objectId | string | Identifies the object in the database. | system |
system:baseTypeId | string | Identifies the object type the object instantiates. Secondary object types are not allowed. | system |
system:objectTypeId | string | Required during an import and cannot be changed later on. Identifies the object's type. | user |
system:secondaryObjectTypeIds | JSON list of strings | Contains the secondaryObjectTypeId for each secondary object type associated with the document object type in the schema. | user |
system:createdBy | string | userId of the user that has initially created the object. | system |
system:creationDate | string | Date of the object's creation in format yyyy-MM-ddTHH:mm:ss.fffZ. | system |
system:lastModifiedBy | string | userId of the user that sent the last successful POST request on the object. | system |
system:lastModificationDate | string | Date of the last successful POST request on the object as a string in format yyyy-MM-ddTHH:mm:ss.fffZ. | system |
system:parentId | string | During an import or update operation, the object can be assigned to an existing folder by referencing its system:objectId. | user |
system:parentObjectTypeId | string | Identifies the folder object type of the parent folder that is specified by system:parentId. Available for document object types and, as of version 2020 Winter, also for folder object types. | system |
system:versionNumber | integer | Integer object version number. Corresponds to the number of POST requests on the object starting with the initial creation. | system |
system:tenant | string | Identifies the tenant the object belongs to. | system |
system:traceId | string | The trace ID of the import operation or last update operation. Unique process number of any operation. If not specified in the request, a random string value will be set. | system |
| JSON table of strings and integers | Contains the properties of the tags assigned to the object. >> Tagging | user |
| string | Only available for documents. Secondary object type The binary content of the object cannot be changed or deleted before this date, but metadata updates are allowed. In an update request, the | user |
| string | Only available for documents. Secondary object type This date defines the start of the retention time. It has to be earlier in time than the | user |
| string | Only available for documents. Secondary object type If | user |
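The date properties in the table above use the format yyyy-MM-ddTHH:mm:ss.fffZ. The following sketch shows one way to read and write such strings on the client side. It assumes that fff stands for milliseconds and that the trailing Z marks UTC, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_system_timestamp(text: str) -> datetime:
    """Parse a string such as 2021-03-04T12:30:45.123Z into an aware datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: Z means UTC


def format_system_timestamp(moment: datetime) -> str:
    """Format an aware datetime with millisecond precision and a trailing Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


if __name__ == "__main__":
    created = parse_system_timestamp("2021-03-04T12:30:45.123Z")
    print(created.isoformat())
    print(format_system_timestamp(created))
```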
Content Stream Properties
The contentStreams group of properties comprises system properties of document objects that contain binary content. Thus, their contentStreamAllowed attribute has to be required or allowed. The properties in contentStreams contain all information necessary to handle binary content. Depending on the type of operation, different properties are required for import/update requests or are displayed in a response.
Property | Type | Description | In an Import/Update Request | In a Response |
---|---|---|---|---|
contentStreamId | String | Points to existing content within a repository. | Required only for pointing to existing content. If not specified, the system generates a UID. | displayed |
length | Integer | Length of the binary content, determined by the system. | - | displayed |
mimeType | String | Mime type of the content file. | Determined by the content analysis, but can be overwritten by user specification in the import body. | displayed |
fileName | String | Name of the content file. | Can be set in the request body. If not specified, During an import, In case of a pure content update, | displayed |
digest | String | SHA-256, automatically determined from the binary content. | - | displayed |
repositoryId | String | ID of the repository that will be used for storage of the binary data. | Required only for pointing to existing content. If not specified, the default repository defined in the repository service configuration will be set. | displayed |
archivePath | String | Additional and optional path structure of the stored object. | Required only for pointing to existing content if reconstruction is not possible with metadata information | displayed only if it was set |
range | String | Applies to Compound Documents only. Defines a certain segment from compound documents that should be provided for content retrievals. | Optional in the request body and only available for compound documents. | displayed only if it was set |
cid | String | Assigns the corresponding multipart content. | Required in the import request body. Not needed later on and therefore not stored in the system. | - |
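Since length and digest are determined by the system from the binary content, a client can recompute both to check a retrieved file against the values displayed in a response. The sketch below assumes that the digest is delivered as a hexadecimal string and that length counts bytes. Neither detail is stated in the table, so the comparison ignores letter case and the helper names are hypothetical.

```python
import hashlib


def describe_content(content: bytes) -> dict:
    """Compute the values a client can derive locally from binary content."""
    return {"length": len(content), "digest": hashlib.sha256(content).hexdigest()}


def matches_content_stream(content: bytes, content_stream: dict) -> bool:
    """Compare local content with the length and digest of a content stream entry."""
    local = describe_content(content)
    return (
        local["length"] == int(content_stream["length"])
        and local["digest"].lower() == str(content_stream["digest"]).lower()
    )


if __name__ == "__main__":
    data = b"example content"
    entry = describe_content(data)  # stands in for a contentStreams entry of a response
    print(matches_content_stream(data, entry))
    print(matches_content_stream(b"tampered", entry))
```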
Prefixes
The IDs of many object type and property definitions have prefixes. System types have the prefix system. Types defined in a tenant schema have the prefix ten followed by the name of the tenant. Types defined in an app schema have the prefix app followed by the name of the app. If you post a tenant schema or an app schema, all IDs in the schema must either match these rules or be missing altogether. If they are missing, the prefix is added to the IDs for the applied schema.
IDs of types defined in the global schema can have any prefix, as long as they do not start with ten or app or equal system. Alternatively, they can also have no prefix.
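The prefix rules above can be summarized in a few lines. This is an illustrative sketch only: the helper names are hypothetical, the comparison is case-sensitive as a simplifying assumption, and the actual validation is performed by the system when a schema is posted.

```python
def tenant_prefix(tenant_name: str) -> str:
    """Prefix of types defined in a tenant schema."""
    return "ten" + tenant_name


def app_prefix(app_name: str) -> str:
    """Prefix of types defined in an app schema."""
    return "app" + app_name


def qualify_id(type_id: str, expected_prefix: str) -> str:
    """Add the expected prefix if it is missing; reject a different prefix."""
    prefix, separator, _local_name = type_id.rpartition(":")
    if not separator:
        return f"{expected_prefix}:{type_id}"
    if prefix != expected_prefix:
        raise ValueError(f"prefix {prefix!r} does not match {expected_prefix!r}")
    return type_id


def is_allowed_global_prefix(prefix: str) -> bool:
    """Global schema: any prefix that neither starts with ten or app nor equals system."""
    return not (prefix.startswith(("ten", "app")) or prefix == "system")


if __name__ == "__main__":
    print(qualify_id("invoice", tenant_prefix("Acme")))
    print(qualify_id("appSot:document", app_prefix("Sot")))
    print(is_allowed_global_prefix("email"), is_allowed_global_prefix("tenAcme"))
```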
App Schemata
In the multi-tenant landscape of yuuvis® Momentum, any object types or properties that need to be available for multiple or all tenants, need to be introduced to the system schema using the system schema endpoints. To prevent cluttering the system schema, avoid dependencies and allow for duplicate names, the system schema can be structured into applications, which provide a namespace for properties and object types pertaining to a particular use case.
Applications are defined as smaller schema files that are integrated into the global schema to be available for every tenant. This allows modular usage of application schemata across multiple yuuvis® systems.
Within the schema, applications are defined by a prefix followed by the application's name: app<app name>. The app name is case insensitive when used as a path parameter to interact with the application schema or within the search engine.
Depending on the situation, the prefix can be omitted, for example to broaden a search query across multiple application schemata.
The app schema endpoints are:
- POST /api/system/apps/{app}/schema - Introduces provided schema as app within tenant schema, overwrites previous app schema
- GET /api/system/apps/{app}/schema - Retrieves the specified app portion of the tenant schema
- POST /api/system/apps/{app}/schema/validate - Validates a schema based on app schema rules
When uploading an app schema, all properties that do not specify a prefix will have that prefix generated as app<app name> where <app name> is equal to the path parameter {app}.
It is allowed to specify this prefix ahead of time, meaning that both "<property name>" and "app<app name>:<property name>" are acceptable names for a property within a schema posted to {base URL}/api/system/apps/<App name>/schema. Uploading a schema using prefixes that do not match the {app} path parameter or any other known apps or tenants will result in a validation error message.
Conflicts that result from multiple similar app/property combinations will also lead to schema validation errors.
Tenant Schemata
Each tenant can define tenant-specific object types in a separate tenant schema. The object types defined in a tenant schema are only available for the corresponding tenant.
Each tenant can define exactly one tenant schema. It can be customized via the following endpoints:
- GET /api/admin/schema - Retrieve the tenant's schema
- POST /api/admin/schema - Update the tenant's schema
- POST /api/admin/schema/validate - Validate the tenant's schema
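As an illustration of the validate endpoint listed above, the sketch below prepares a request without sending it. Only the path /api/admin/schema/validate comes from this article. The base URL, the Content-Type value and the absence of authentication headers are assumptions that have to be adapted to the actual installation.

```python
import urllib.request

BASE_URL = "https://yuuvis.example.com"  # hypothetical base URL


def validate_tenant_schema_request(
    schema_bytes: bytes, content_type: str = "application/xml"
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the tenant schema validation endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/admin/schema/validate",
        data=schema_bytes,
        headers={"Content-Type": content_type},  # assumed media type of the schema body
        method="POST",
    )


if __name__ == "__main__":
    request = validate_tenant_schema_request(b"<schema/>")  # placeholder body, not a valid schema
    print(request.get_method(), request.full_url)
```

Validating before updating keeps a rejected schema from being applied. The request would be sent with urllib.request.urlopen once the required credentials have been added.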
Each object type ID and property type ID has the prefix ten + <tenant name>. Thus, the same object type name can occur in multiple tenant schemata.
If any prefix other than ten + <tenant name> is used, the tenant schema will not pass the validation.
If the prefix is missing in the request body for a tenant schema update, it will be added automatically in the applied schema that is used in use cases like search, import or update of objects.
Summary
In this article, you have reviewed the tools available for creating yuuvis® Momentum schemata. Now you can get started implementing your own custom schema to solve your information management problems.