Search Query Language

How to query in an SQL-like manner.

Table of Contents

The query language based on CMIS serves to standardize DMS requests to the yuuvis® API system.

The CMIS query language consists of a subset of the SQL-92 grammar, augmented by a syntax for requesting full text. Thus, the CMIS query language provides a relational view of the data. The virtual tables requested in the FROM clause correspond to DMS object types and the virtual columns requested in the SELECT correspond to the CMIS properties (metadata fields).

Query Language Definition

A query is structured as follows

QuerySELECT[virtual column list]FROM[virtual table names]WHERE[conditions]ORDER BY[sorting specification]
Description
property list (metadata fields)
object type list
limitations
sorting criteria
  • use * as virtual column list to query all metadata fields
  • it is possible to use the SCORE() function in the virtual columns list
  • currently, only the query of a single object type is supported.

Note: The search on metadata fields is case sensitive. A case insensitive search is possible only by means of a CONTAINS statement.

Simple SELECT Queries

Queries on Object Types

Examples for simple SELECT queries on a specified object type
-- Select all objects of type 'email'.
SELECT * FROM email;

-- Select the customer number and the date of all invoices.
SELECT custno,date FROM invoice;

Queries on Secondary Object Types (as of 2019 Winter)

The search result delivers the entire object and not only the properties of the specified secondary object type.

Example SELECT query on a secondary object type
-- Select all objects of secondary object type system:rmDestructionRetention
SELECT * FROM system:rmDestructionRetention;

Queries with Aliases

Properties can be queried with an alias name. The values of these properties are returned in the result with the alias name. The alias can be specified by a space separated behind the property name or with the key word AS between property name and alias name.

Example queries
-- Alias without keyword: Select the customer number with alias c and the date with alias d of all invoices.
SELECT custno c,date d FROM invoice;

-- Alias with keyword: Select the customer number with alias c and the date with alias d of all invoices.
SELECT custno AS c,date AS d FROM invoice;

Queries with Conditions

In the WHERE clause, conditions can be formulated and combined with logical operators.

The following operators are supported:

  • comparison operators: <, >, <=, >=, <>, =
  • join operators: AND, OR, NOT
  • additional predicates: LIKE, IN, NULL, CONTAINS

Examples

Some introductory example queries with conditions
-- Select all invoices where the invoice amount is greater than 100.
SELECT * FROM invoice WHERE amount>100;

-- Select all invoices where the invoice amount is greater than 100 and the supplier is 'ACME'.
SELECT * FROM invoice WHERE amount>100 AND supplier='ACME';

-- Select all invoices where the invoice amount is greater than 100 or less than 10 and the supplier is either 'ACME', 'OS' or 'ICH'.
SELECT * FROM invoice WHERE (amount>100 OR amount<10) AND supplier IN ('ACME','OS','ICH');

Wildcard Example

Wildcard characters are used with the LIKE operator. There are two wildcards used in conjunction with the LIKE operator:

  • % – The percent sign represents zero, one, or multiple characters
  • _ – The underscore represents a single character
Example query
-- Select all invoices where the supplier starts with 'ACM' and the index data field 'amount' is not NULL
SELECT * FROM invoice WHERE supplier LIKE 'ACM%' AND amount IS NOT NULL;

Full-Text Search with CONTAINS

CONTAINS – Basics

To perform full-text search, use CONTAINS in the WHERE clause. The string values specified in the CONTAINS statements are handled case insensitive.

If a field name is specified, only this field is checked for the search term.

-- Select all types where the term 'QUERY' was found in field 'name'.
SELECT * FROM type WHERE name CONTAINS('QUERY');

If no field name is specified, a full-text search will be performed. All string metadata fields will be included in the search and if available, also the text rendition that was estimated by the CONTENTANALYZER service or set manually.

-- Select all types where in full-text 'QUERY' was found.
SELECT * FROM type WHERE CONTAINS('QUERY');

Logical Operators

The following logical operators are supported:

OperatorDescriptionSyntax
+The full-text must contain the term behind the operator. There is no blank space between operator and term.+<term>
-The full-text must not contain the term behind the operator. There is no blank space between operator and term.-<term>
AND

The full-text must contain both terms connected with the operator. The operator is only identified in capital letters and separated by exactly one blank space from the two terms it is connecting.

Note: We recommend to use the + operator instead of AND.

<term1> AND <term2>
OR

The full-text must contain at least one of the two terms connected with the operator. The operator is only identified in capital letters and separated by exactly one blank space from the two terms it is connecting.

Note: Consecutive terms without any operator in between are interpreted as they would be connected with OR. Thus, we recommend to avoid the use of the OR operator.

<term1> OR <term2>

The examples will clarify the use of the individual operators.

Operator +

If the + operator is applied to a term, only objects containing this term in their full-text will match the query. If further terms are specified without an additional + (or -) operator, they are allowed in the full-text but not required. Thus, they influence the ranking (higher score) of the found objects only. They do not have an impact on the number of found objects in the result itself.

-- Select all documents where 'tree' was found.
SELECT * FROM type where CONTAINS('apple +tree')

Operator -

If the - operator is applied to a term, only objects not containing this term in their full-text will match the query. If further terms are specified without an additional - (or  +) operator, they are allowed in the full-text but not required. Thus, they influence the ranking (higher score) of the found objects only. They do not have an impact on the number of found objects in the result itself.

-- Select all objects where 'tree' was found.
SELECT * FROM type where CONTAINS('apple -tree')

-- Select all objects containing 'tree' and not containing 'pie' in their full-text. The optional words 'apple' and 'exceptional' influences the scoring only.
SELECT * FROM type where CONTAINS('exceptional apple +tree -pie')

Operator OR

All objects containing at least one of the connected terms in their full-text are matching the query. A space between several search term combinations is interpreted as logical OR operator (as long as no include or exclude search term condition is used).

-- Select all objects containing at least one of the words 'apple' or 'tree'.
SELECT * FROM type where CONTAINS('apple tree')

-- The query is equivalent to the previous one without 'OR' operator.
SELECT * FROM type where CONTAINS('apple OR tree')

Some more examples:

-- Select all invoices where in full-text either 'reminder', 'discount' or 'delay' was found.
SELECT * FROM invoice WHERE CONTAINS('reminder discount delay');

-- Select all invoices where the invoice amount is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT * FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');

Operator AND

All objects containing both terms connected with the AND operator in their full-text will match the query. However, we recommend to avoid the operator and apply the + operator to each individual term that should be contained in the object's full-text.

-- Select all objects containing both words 'apple' and 'tree'.
SELECT * FROM type where CONTAINS('apple AND tree')

-- The query is equivalent to the previous one and recommended.
SELECT * FROM type where CONTAINS('+apple +tree')

-- The query is equivalent to the previous one and not recommended.
SELECT * FROM type where CONTAINS('+apple AND +tree')

Phrases

To search for a string, consisting of several words in the exact same order (e.g. a sentence), use double quotation marks within the CONTAINS. Blank spaces are not interpreted as OR operator within these quotation marks.

-- Select all objects where the phrase 'exceptional apple tree pie' was found.
SELECT * FROM type where CONTAINS('"exceptional apple tree pie"')

-- Select all invoices where the amount is greater than 1000 and in full-text the sentence 'color picture 20x30' was found.
SELECT * FROM invoice WHERE amount>1000 AND CONTAINS('"color picture 20x30"');

Phrases are interpreted as terms. Thus, logical operators are applied to the entire phrase.

-- Select all documents where 'exceptional pie' was found and 'apple tree' was not found.
SELECT * FROM type where CONTAINS('+"exceptional pie" -"apple tree"')

Escaping in CONTAINS Statements

In a CONTAINS expression, the system interprets specific symbols as operators. If you want to use those symbols as characters NOT beeing an operator, you have to escape them with a backslash ('\').

The following rules apply:

  • The symbols < and > can not be escaped. If they are used in terms, they will be ignored.
  • All types of quotation marks used within a term have to be escaped.
  • The ^ symbol has to be escaped if not used as boosting operator.
  • The ~ symbol has to be escaped if not used as fuzzy operator.
  • Further special characters that have to be escaped: + - = & | ! ( ) { } [ ] * ? : \ /
  • If an operator AND/OR is followed by a second operator AND/OR, the second operator will be automatically escaped and interpreted as a term and/or.

Weighted Score

The SCORE function returns the score values produced by the CONTAINS function in the SEARCH_SCORE field. The score value is a relative ranking, with values in the range of 0 to 1 and increases with the relevance. The SCORE function can also be queried with an alias name.

-- Select all invoices and their score value where the invoice amount is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT *,SCORE() FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');

-- With alias: Select all invoices and their score value where the invoice amount is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT *,SCORE() score FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');

Full-text search only in certain fields:

-- Select all invoices and their score value where the field 'status' either contains 'reminder', 'discount' or 'delay'.
SELECT *,SCORE() FROM invoice WHERE status CONTAINS('reminder discount delay');

Sorting with ORDER BY

To sort the result list, use the ORDER BY clause.

-- Select all invoices where the invoice amount is greater than 1000, with the result list sorted in descending order of the invoice amount.
SELECT * FROM invoice WHERE amount>1000 ORDER BY amount DESC;

-- Select all invoices where the invoice amount is greater than 1000,
-- with the result list sorted in descending order of the invoice amount and in ascending order of the date.
SELECT * FROM invoice WHERE amount>1000 ORDER BY amount DESC, date ASC;

In ORDER BY clauses, you can use the alias names given in the column definitions.

-- Select the invoice amount with alias 'b' of all invoices where the invoice amount is greater than 1000,
-- with the result list sorted in descending order of the alias b (invoice amount).
SELECT amount b FROM invoice WHERE amount>1000 ORDER BY b DESC;

To sort by the score value, you must assign an alias name to the SCORE () function.

-- Select all invoices and their score value where the field 'status' either contains 'reminder',
-- 'discount' or 'delay', with the result list sorted in ascending order of alias s (score value).
SELECT *,SCORE() s FROM invoice WHERE status CONTAINS('reminder discount delay') ORDER BY s ASC;

Aggregations (COUNT(*), GROUP BY)

To determine the number of objects with special properties, queries can be aggregated. All objects with the same entry in the field specified after the GROUP BY clause are summarized. The number of objects is returned to the field 'OBJECT_COUNT' without the assignment of aliases.

-- Select the number of objects (enaio:object) with type specification, aggregated by the type of the object.
SELECT COUNT(*),type FROM system:object GROUP BY type;

It is also possible to aggregate several fields. In this case, all combinations of the fields mentioned are returned.

-- Select the number of objects (enaio:object) with type and title specification, aggregated by the type and title of the object.
SELECT COUNT(*),type,title FROM system:object GROUP BY type,title;

The GROUP BY function can be used without the COUNT(*) function. Then all combinations of the mentioned fields are returned without their numbers.

-- Select 'type' of all objects (enaio:object), aggregated by the type of the object.
SELECT type FROM system:object GROUP BY type;

To sort the result list of an aggregation, use the ORDER BY clause, with or without alias.

-- Select the number of objects (enaio:object) with type specification,
-- aggregated by the type of the object, with the result list sorted in ascending order of 'type'.
SELECT COUNT(*),type FROM enaio:object GROUP BY type ORDER BY type ASC;

-- With alias: Select the number and 'type' with alias 't' of all objects (enaio:object),
-- aggregated by 'type', with the result list sorted in descending order of alias t (type).
SELECT COUNT(*),type t FROM system:object GROUP BY type ORDER BY t DESC;

In order to be able to sort by the score value, you must assign an alias name to the SCORE () function.

-- Select the number with alias 'c' and 'type' of all objects (enaio:object), aggregated by 'type',
-- with the result list sorted in descending order of alias c (number of objects).
SELECT COUNT(*) c,type FROM system:object GROUP BY type ORDER BY c DESC;

Provided Date Functions in WHERE Clauses

The query language provides some frequently used time and time span functions for WHERE clauses.

  • today() - a date field contains today's date
  • yesterday() - a date field contains yesterday's date
  • thisweek() - a date field contains this week's date, a week starts on Monday and ends on Sunday
  • lastweek() - a date field contains last week's date, a week starts on Monday and ends on Sunday
  • thismonth() - a date field contains this month's date
  • lastmonth() - a date field contains last month's date
  • thisyear() - a date field contains this year's date
  • lastyear() - a date field contains last year's date
  • now() - the current date 2.0
  • dateadd(interval, number, date) - adds a date interval to a date and returns the date. [from yuuvis 2019 Winter]
    • valid interval values: year, yyyy, yy = Year; quarter, qq, q = Quarter; month, mm, m = month; dayofyear = Day of the year; day, dy, y = Day; week, ww, wk = Week; weekday, dw, w = Weekday; hour, hh = hour; minute, mi, n = Minute; second, ss, s = Second; millisecond, ms = Millisecond
    • number can be a positiv or negative integer value
    • date is a valid ISO8601 date value or a function which delivers a valid ISO8601 date value
Usage
SELECT ... FROM ... WHERE system:creationDate IN today();
SELECT ... FROM ... WHERE rmDestructionDate > today();
Examples
SELECT contentStreamId, fileName, repositoryId FROM system:object WHERE system:creationDate IN lastmonth();
SELECT * FROM document WHERE system:lastModificationDate IN today() AND system:creationDate IN lastyear() ORDER BY system:creationDate DESC;
SELECT COUNT(*) c, system:creationDate FROM enaio:object WHERE system:creationDate IN thisyear() GROUP BY system:creationDate ORDER BY c DESC;
dateadd() and now() Examples [[from yuuvis 2.0]]
SELECT * FROM system:object WHERE system:rmDestructionDate < now();
SELECT * FROM system:object WHERE system:creationDate < dateadd('day',35,'2018-09-20T08:37:26.280Z');
SELECT * FROM document WHERE system:lastModificationDate > dateadd('year',-1,now());

Queries on contentstream Fields

Additional to property fields of a DMS object, all contentstream fields can be queried in SELECT statements as described above. The following queriable contentstream fields exist:

  • contentStreamId- the ID of the contentstream
  • repositoryId- the ID of the repository the content is stored in
  • mimeType - the mime type of the content
  • fileName - the file name of the content
  • length - the length in bytes of the file
  • range - for compound documents the byte range which represents the object content in the file 
  • digest - the sha256 hash of the content
  • archivePath - the file path within an archive
Examples
SELECT contentStreamId, fileName, repositoryId FROM system:object;
SELECT * FROM document WHERE mimeType'message/rfc822' ORDER BY title DESC;
SELECT COUNT(*) c, fileName FROM system:object GROUP BY fileName ORDER BY c DESC;

Queries on Queriable Tables

As of version 2020 Winter, table cells can be accessed using the format in the following example:

Select on queriable table
-- syntax of table cell access
SELECT * FROM system:object WHERE queriable_table[row].column = value;          

-- examples
SELECT * FROM system:object WHERE queriable_table[row].columnx = value;
SELECT queriable_table.nickname, queriable_table.name FROM system:object WHERE queriable_table[row].columnx LIKE 'steven';
SELECT * FROM system:object WHERE (queriable_table[5].age <= 55 AND queriable_table[6].age >= 22) OR queriable_table[6].age > 6;
SELECT * FROM system:object WHERE queriable_table[2].name IN ('Erwin', 'Max');
SELECT * FROM system:object WHERE queriable_table[3].name CONTAINS('Arno');
SELECT * FROM system:object WHERE queriable_table[4].city IS NULL;
SELECT * FROM system:object WHERE queriable_table[4].city IS NOT NULL;
SELECT * FROM system:object WHERE queriable_table[5].date > lastyear();

Between the SELECT and FROM keywords, one or more table columns can be referenced with the syntax tablename.columnname as shown in the second example query above. The query will then return only the selected columns, in the example the columns nickname and name of the table queriable_table. Otherwise, like in the other examples above where the * symbol is placed between the SELECT and FROM keywords, the entire table will be returned.

The conditions are specified in the same way like in single query calls. Date functions can be used, too.

ORDER BY and GROUP BY are not supported on tables.

To reference any column or any row in the condition, the * symbol can be used as wildcard, as shown in the following examples:

Select on queriable table using wildcards representing any column and/or any row
SELECT * FROM system:object WHERE queriable_table[*].columnx = value;  -- columnx should have the given value in any row
SELECT * FROM system:object WHERE queriable_table[rowx].* = value;     -- any column should have the given value in rowx
SELECT * FROM system:object WHERE queriable_table[*].* = value;        -- any table cell should have the given value

For combined queries within one column of a table, there is a special syntax as shown in the code block below. Examples are provided as well.

Select on queriable table using wildcards representing any column and/or any row
-- syntax of table cell access
SELECT * FROM system:object WHERE queriable_table[row].(column_condition_1 AND/OR ... AND/OR column_condition_x);
SELECT * FROM system:object WHERE queriable_table[*].(column_condition_1 AND/OR ... AND/OR column_condition_x);

-- examples          
SELECT * FROM system:object WHERE queriable_table[*].(name IN ('Erwin', 'Max') AND age >= 22);  
SELECT * FROM system:object WHERE queriable_table[5].((name CONTAINS('Arno') AND age >= 22) OR city IS NULL);
SELECT * FROM system:object WHERE queriable_table[*].((name CONTAINS('Arno') AND age >= 22) OR (city IS NULL and date <= lastyear())) AND system:creationdate IN thisyear();

Within the brackets, any complex condition statement created from the set of conditions on properties described in the sections above can be queried, where each single condition will be applied on the specified row. If you use the wildcard * as row number, each table row will be tested on the complex condition. For a query match, a row has to be found for which the whole condition set is true.

As shown in the last two lines of the code block above, Combined queries on a row can also be used in combination with further conditions.
 

Queries on Tags

As of version 2020 Summer, tags can be assigned to objects. The tag properties are searchable by means of queries. Since the tag properties are stored in table format, the queries are defined like queries on tables. Each tag corresponds to one row of the table property system:tags that is assigned to every object. The row is indicated by the value of the corresponding tag's name. The four available columns are the same for all tags: namestate, creationDate and traceId. The following code block shows examples:

Examples for queries on tags
SELECT * FROM system:object WHERE system:tags[*].creationDate=TODAY();
SELECT * FROM system:object WHERE system:tags[*].name='ren:ocr';
SELECT * FROM system:object WHERE system:tags[ren:ocr].state=2;
SELECT * FROM system:object WHERE system:tags[*].(state=2 AND creationDate=TODAY());
SELECT * FROM system:object WHERE system:tags IS NULL;
SELECT * FROM system:object WHERE system:tags[ren:ocr1] IS NULL;

Of course, the queries can be combined with any other valid query as can be seen in the following examples:

Combine queries on tags with further conditions
-- search for objects have a tag created today and with the state 2
SELECT * FROM system:object WHERE system:tags[*].(state=2 AND creationDate=TODAY());    

-- search for objects having the tag 'analysis' with the state 1 that were created yesterday
SELECT * FROM system:object WHERE system:tags[analysis].state=1 AND where system:creationDate=YESTERDAY();

Queries on Audit Entries

Audit entries can be queried by querying the object type enaio:audit.

-- Select all audit entries
SELECT * FROM system:audit;

-- Select all audit entries for a specific traceId
SELECT * from system:audit WHERE traceid = '543221'

-- Select all audit entries created yesterday
SELECT * from system:audit WHERE creationDate IN YESTERDAY()

Audit entries are stored within a relational database. Thus, to query audit entries full-text queries and the return of the total amount of matching items totalNumItems within the result cannot be supported:

  • no queries with CONTAINS()
  • no queries using scoring SCORE()
  • no totalNumItems within the result set

All other features will be provided as as described above.

Query – JSON Format

A search can be called using the /api/dms/objects/search HTTP POST endpoint by giving the following JSON structure:

Input JSON using parameters
{
  "query" : {
    "statement" : "SELECT * from sysemail WHERE email = @emailAddress AND sysfrom IN @from AND CONTAINS(@text)",
    "skipCount" : 0,         										// optional for Paging
    "maxItems" : 50,         										// optional for Paging
	"useCache" : true,       										// optional
	"handleDeletedDocuments" : "DELETED_DOCUMENTS_EXCLUDE",			// optional DELETED_DOCUMENTS_INCLUDE | DELETED_DOCUMENTS_ONLY | DELETED_DOCUMENTS_EXCLUDE default: DELETED_DOCUMENTS_EXCLUDE 
 	"parameters": {													// optional, only if @ parameter occurs in statement 
		"emailAddress": "'info@optimal-systems.de'", 	
		"text": "'Meeting'", 		
		"from": "('hallo@huhu.de','info@huhu.de')"
	}
 }
}

The field statement contains the SQL query, skipCount (default: 0) defines the number of skipped objects, from with the maxItems (default: 50) number of found objects will be delivered. This defines the paging of the search call. The field useCache (default: true) defines if caching will be used. Within a default search, all objects marked as deleted will be excluded (DELETED_DOCUMENTS_EXCLUDE). By using the optional field handleDeletedDocuments this behaviour can be changed in including this objects  (DELETED_DOCUMENTS_INCLUDE) or delivering only deleted objects (DELETED_DOCUMENTS_ONLY).

Using Parameters

Within a SQL statement two predefined parameters can be referenced. By @userId the ID of the actual user (taken from the user session) and by @userRoles the roles of the actual user (also taken from the user session) can be referenced for replacement at runtime.

Input JSON with use of @userId and @userRoles parameters
{
  "query" : {
    "statement" : "SELECT * from system:object WHERE system:createdBy = @userId AND system:roles IN @userRoles"
 }
}

The JSON field parameters can be used to define parameter (reserved names: userRoles, userId) values which can be referenced by @<parameter> within a preformed SQL statement for replacement at runtime. The reserved parameter names userRoles, userId cannot be used for own definition of a parameter. A given definition under this names affects an exception.

Input JSON with use of parameters
{
  "query" : {
    "statement" : "SELECT * from system:object WHERE email = @emailAddress AND email:from IN @from AND CONTAINS(@text)",
 	"parameters": {													// optional, only if @ Parameter occurs in statement 
		"emailAddress": "'info@optimal-systems.de'",
		"text": "'Meeting'",
		"from": "('hallo@huhu.de','info@huhu.de')"
	}
 }
}