Page Properties

hidden	true
id	DONE

Product Version	2021 Autumn
Report Note
Assignee	Antje

Resources & Remarks

Modification History

Name	Date	Product Version	Action
Antje	21 JUL 2021	2021 Autumn	created
Goran	18 OCT 2021	2021 Autumn	updated
Antje	25 NOV 2022	2022 Winter	remove beta label, update content

Excerpt
Responsible for preparing training data, training machine learning models, evaluating trained models, and preparing them for deployment to production.

...

Section

border	true

Column

Characteristics

Note

title	Beta Version

The Machine Learning (ML) Training Pipeline is a component of the Artificial Intelligence Platform. This platform is not included in yuuvis® Momentum installations and is available as a beta version only on request.

Function

The ML Training Pipeline is part of the Artificial Intelligence Platform responsible for data ingestion, data validation, transformation, machine learning training, and model evaluation. The pipeline is based on MLflow – an open-source platform for managing ML lifecycles.

The ML Training Pipeline is operated via the command line application Kairos CLI.

Data Export

The source of data for machine learning is a document management system, e.g., yuuvis® Momentum. The data exported from yuuvis® Momentum are stored on local storage devices, S3, or Azure Blob Storage in a predefined format suitable for data ingestion, and are made available to the provided training pipelines.

Machine Learning Pipelines

The machine learning pipelines are components developed and shipped by OPTIMAL SYSTEMS GmbH. They contain all necessary procedures and algorithms to train machine learning models for different purposes (e.g., document classification and metadata extraction).

At the moment, pipelines can be used for document classification (for instance, determining whether a document is an invoice, a contract, a sick-leave note, or something else) and for metadata extraction (for instance, extracting the issuing date, total amount, and invoice number from an invoice).

Document Classification

In the context of the AI platform, classification means determining the typification classes that fit an object based on its full-text rendition. For each object, one prediction is provided that contains a mapping of classes to their relevance probabilities as well as a reference to the object in yuuvis® Momentum via objectId.
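The structure of such a prediction can be illustrated with a minimal Python sketch. The class name, field names, threshold, and sample values below are assumptions made for illustration only; they are not the actual response format of the AI platform.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ClassificationPrediction:
    """Hypothetical container for one prediction (field names are illustrative)."""
    object_id: str                    # reference to the object in yuuvis Momentum
    probabilities: Dict[str, float]   # class name -> relevance probability


def best_class(prediction: ClassificationPrediction,
               min_probability: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return the most probable class, or None if nothing reaches the threshold."""
    if not prediction.probabilities:
        return None
    name, probability = max(prediction.probabilities.items(), key=lambda kv: kv[1])
    return (name, probability) if probability >= min_probability else None


# Made-up sample data, not real platform output.
example = ClassificationPrediction(
    object_id="hypothetical-object-id",
    probabilities={"invoice": 0.86, "contract": 0.09, "sick-leave": 0.05},
)
print(best_class(example))
```

A consumer of the prediction would typically apply some threshold like `min_probability` so that low-confidence results can be routed to manual typification instead of being applied automatically.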

...

The ML Pipeline can analyze the PDF rendition of binary content files assigned to objects in yuuvis® Momentum in order to extract specific metadata. Based on the trained models, predictions for the values of specific object properties can be determined. These object properties have to be listed in the Inference Schema, which also specifies conditions for the values and settings for the prediction responses.
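The idea of listing properties together with value conditions and response settings can be sketched as follows. This is a purely hypothetical illustration: the dictionary layout, the keys `pattern` and `max_predictions`, and the property names are assumptions and do not reproduce the actual Inference Schema format.

```python
import re
from typing import Dict, List, Tuple

Candidate = Tuple[str, float]  # (predicted value, probability)

# Hypothetical schema, NOT the real Inference Schema format:
# each listed property has a value condition and a response setting.
SCHEMA = {
    "invoiceNumber": {"pattern": r"[A-Z]{2}-\d{4,}", "max_predictions": 1},
    "totalAmount": {"pattern": r"\d+\.\d{2}", "max_predictions": 2},
}


def apply_schema(candidates: Dict[str, List[Candidate]],
                 schema: Dict[str, dict]) -> Dict[str, List[Candidate]]:
    """Keep only listed properties, drop values violating the condition,
    and return the most probable values up to the configured limit."""
    result: Dict[str, List[Candidate]] = {}
    for prop, rules in schema.items():
        valid = [(value, prob) for value, prob in candidates.get(prop, [])
                 if re.fullmatch(rules["pattern"], value)]
        valid.sort(key=lambda vp: vp[1], reverse=True)
        result[prop] = valid[: rules["max_predictions"]]
    return result


# Made-up model output for one document.
candidates = {
    "invoiceNumber": [("RE-20211", 0.91), ("20211", 0.40)],
    "totalAmount": [("119.00", 0.75), ("100.00", 0.60), ("19", 0.30)],
    "issuer": [("ACME", 0.99)],  # not listed in the schema, so it is ignored
}
print(apply_schema(candidates, SCHEMA))
```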

Machine Learning Training

The training of machine learning models is run using the Kairos CLI App, which is used to define what data to use and which ML Pipeline to run in order to get a model for the desired purpose, for example, invoice metadata extraction.

Model Evaluation

After the machine learning training is done, the model is evaluated. By examining the training results, the user decides whether the model is suitable for use or needs further work, such as hyperparameter tuning, longer training, or a larger data set.
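As an illustration of what examining training results can involve for a classification model, the following sketch computes accuracy and per-class precision and recall from true and predicted labels. The function name and the sample labels are assumptions; the metrics actually reported by the pipeline are not specified here.

```python
from collections import Counter
from typing import Dict, List


def evaluate(true_labels: List[str], predicted_labels: List[str]) -> Dict[str, object]:
    """Compute overall accuracy plus precision and recall for every class."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == predicted:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[predicted] += 1      # predicted class was wrong
            fn[truth] += 1          # true class was missed

    per_class = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        predicted_count = tp[label] + fp[label]
        actual_count = tp[label] + fn[label]
        per_class[label] = {
            "precision": tp[label] / predicted_count if predicted_count else 0.0,
            "recall": tp[label] / actual_count if actual_count else 0.0,
        }
    return {"accuracy": sum(tp.values()) / len(true_labels), "per_class": per_class}


# Made-up evaluation data.
truth = ["invoice", "invoice", "contract", "sick-leave"]
predicted = ["invoice", "contract", "contract", "sick-leave"]
report = evaluate(truth, predicted)
print(report["accuracy"])
print(report["per_class"]["invoice"])
```

Low recall for a class (here, one invoice was misclassified as a contract) is the kind of signal that may suggest adding more training examples for that class.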

Model Registry

Models that are suitable for further use are stored in the Model Registry component. From the Model Registry component, models can be dockerized and deployed to the serving infrastructure (typically, to the same Kubernetes cluster where yuuvis® Momentum is running).

Info

icon	false

Read on

Section

Column

width	25%

Inference Schema


Keep reading

Column

width	25%

Kairos CLI App

KAIROS-API Service


Keep reading

Column

width	25%

PREDICT-API Service


Keep reading

