If you use Elasticsearch as the search engine for yuuvis® Momentum, you can find an example reindex procedure here.



Introduction

If new requirements, either in terms of load capacity or functionality (e.g., newly supported languages), are introduced to a yuuvis® Momentum system during production, it may become necessary to overhaul the backend Elasticsearch cluster by performing a Reindex operation. The operation itself is slow and resource-intensive, as it essentially creates a copy of the original Elasticsearch Index within the same Elasticsearch cluster. Make sure enough storage space is available, and optionally add more data nodes, which can be shut down after the operation.
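To judge whether a full copy of the index fits on the cluster, it can help to look at the free disk space per data node and at the size of the existing index first. The following is a minimal sketch using the Elasticsearch _cat APIs; the ES_URL variable, the function names, and the default index pattern yuuvis* are assumptions made for illustration.

```shell
# Base URL of the Elasticsearch API (assumed to be reachable on localhost).
ES_URL="${ES_URL:-http://localhost:9200}"

# Show used and available disk space for each data node.
show_disk_usage() {
  curl -s "${ES_URL}/_cat/allocation?v"
}

# Show shard counts, document count, and on-disk size of matching indices.
# The default pattern "yuuvis*" is an assumption; pass your own as $1.
show_index_sizes() {
  curl -s "${ES_URL}/_cat/indices/${1:-yuuvis*}?v&h=index,pri,rep,docs.count,store.size"
}
```

Calling show_disk_usage and then show_index_sizes gives a rough picture of how much headroom is left before the copy is created.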

...

Below you can find a detailed overview of the CURL commands needed to successfully perform a Reindex in yuuvis® Momentum. Note that all commands assume you have port-forwarded the Elasticsearch API to http://localhost:9200.
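How the port-forward is set up depends on your environment. As a sketch only, assuming a Kubernetes deployment managed with kubectl, it could look like the following; the namespace my-namespace, the service name elasticsearch, and the function names are hypothetical placeholders.

```shell
# Base URL used by the health check below.
ES_URL="${ES_URL:-http://localhost:9200}"

# Forward local port 9200 to the Elasticsearch service.
# $1 = namespace, $2 = service name (both defaults are hypothetical).
forward_es() {
  kubectl port-forward -n "${1:-my-namespace}" "svc/${2:-elasticsearch}" 9200:9200
}

# Print the cluster health to confirm that the API is reachable.
check_es() {
  curl -s "${ES_URL}/_cluster/health?pretty"
}
```

Run forward_es in one terminal and check_es in a second one; a JSON response containing a "status" field indicates that the forwarded API answers.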

...

Creating a Momentum-Capable Elasticsearch Index

Two steps are required to create an Elasticsearch Index that can interact with yuuvis® Momentum:

Creating a new Elasticsearch Index

A new index needs to be created to fit the specifications of the new requirements. 


Create a new Index with a unique name.

Creating a new Elasticsearch Index allows you to optimize the working Index for the current storage requirements. Shards work best when storing around 10 GB each, and should contain no more than 50 GB of data.

For a cluster that contains 50 GB of Elasticsearch data, a high-performance Index might look something like this:

CURL command:
curl -X PUT "localhost:9200/yuuvis_2?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 5, 
      "number_of_replicas": 1 
    }
  }
}
'

Make sure to change the Index parameters to suit your storage and reliability requirements.
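The example above corresponds to roughly 10 GB per primary shard (50 GB spread over 5 shards). Under that assumption, a shard count for other data volumes can be estimated with a small helper like the one below. The function name and the default target size are illustrative and not part of yuuvis® Momentum.

```shell
# Suggest a primary shard count: total data size divided by the target
# size per shard, rounded up, with a minimum of one shard.
# $1 = total data size in GB, $2 = target size per shard in GB (default 10)
suggest_shards() {
  total_gb=$1
  target_gb=${2:-10}
  shards=$(( (total_gb + target_gb - 1) / target_gb ))
  if [ "$shards" -lt 1 ]; then
    shards=1
  fi
  echo "$shards"
}

suggest_shards 50 10   # prints 5
```

The result is only a starting point for "number_of_shards"; expected growth and the number of data nodes should also be taken into account.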

...

Apply yuuvis® Momentum Elasticsearch Index Mapping and Settings

The yuuvis® Momentum services require Elasticsearch to use a custom mapping. Using Elasticsearch's automatic mapping algorithm for reindexing renders the new index unusable for the yuuvis® Momentum system.


To get a compatible mapping, you can retrieve the original Elasticsearch Index's mapping using the Get mapping API:

CURL command:
curl -X GET "localhost:9200/yuuvis/_mapping" 

Then apply the extracted mapping to the new Index:

CURL command (apply mapping):
curl -X PUT "localhost:9200/yuuvis_2/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
   
    "dynamic_templates" : [
        {
            "keyword" : {
            "match" : "key_*",
            "mapping" : {
                "type" : "keyword"
            }
            }
        },
        {
            "text" : {
            "match" : "txt_*",
            "mapping" : {
                "type" : "text"
            }
            }
        },
        {
            "string" : {
            "match" : "str_*",
            "mapping" : {
                "fields" : {
                "raw" : {
                    "type" : "keyword"
                }
                },
                "type" : "text"
            }
            }
        },
        {
            "number" : {
            "match" : "num_*",
            "mapping" : {
                "type" : "long"
            }
            }
        },
        {
            "double" : {
            "match" : "dbl_*",
            "match_mapping_type" : "double",
            "mapping" : {
                "type" : "double"
            }
            }
        },
        {
            "object" : {
            "match" : "obj_*",
            "match_mapping_type" : "object",
            "mapping" : {
                "type" : "object"
            }
            }
        },
        {
            "date" : {
            "match" : "dte_*",
            "match_mapping_type" : "date",
            "mapping" : {
                "format" : "date_optional_time",
                "type" : "date"
            }
            }
        },
        {
            "boolean" : {
            "match" : "bol_*",
            "match_mapping_type" : "boolean",
            "mapping" : {
                "type" : "boolean"
            }
            }
        },
        {
            "table" : {
            "match" : "tab_*",
            "mapping" : {
                "type" : "nested"
            }
            }
        },
        {
            "rawtable" : {
            "match" : "rtb_*",
            "mapping" : {
                "fields" : {
                "raw" : {
                    "type" : "keyword"
                }
                },
                "type" : "text"
            }
            }
        },
        {
            "locationpath" : {
            "match" : "locationpath",
            "match_mapping_type" : "string",
            "match_pattern" : "regex",
            "mapping" : {
                "analyzer" : "paths",
                "type" : "string"
            }
            }
        },
        {
            "typepath" : {
            "match" : "typepath",
            "match_mapping_type" : "string",
            "match_pattern" : "regex",
            "mapping" : {
                "analyzer" : "paths",
                "type" : "string"
            }
            }
        },
        {
            "contentidx" : {
            "match" : "contentidx",
            "match_mapping_type" : "string",
            "mapping" : {
                "term_vector" : "no",
                "type" : "text"
            }
            }
        },
        {
            "contentfile" : {
            "match" : "contentfile",
            "match_mapping_type" : "string",
            "mapping" : {
                "term_vector" : "no",
                "type" : "text"
            }
            }
        }
        ],
    "properties" : {
        "contentfile" : {
            "type" : "text"
        },
        "contentidx" : {
            "type" : "text"
        },
        "dte_date" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "dte_system:creationdate" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "dte_system:lastmodificationdate" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "num_appbillion:index" : {
            "type" : "long"
        },
        "num_system:contentstreamlength" : {
            "type" : "long"
        },
        "num_system:versionnumber" : {
            "type" : "long"
        },
        "str_appbillion:bmstring1" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_appbillion:bmstring2" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_name" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:basetypeid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamfilename" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreammimetype" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreammimetypegroup" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamrepositoryid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:createdby" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:digest" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:lastmodifiedby" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:objecttype" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:objecttypeid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:secondaryobjecttypeids" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:tenant" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:traceid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        }
    }

}
'
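Instead of copying the mapping by hand, the response of the Get mapping API can be piped directly into the request that applies it. The sketch below assumes that jq is installed and that the cluster returns typeless mappings (Elasticsearch 7 or later), where the response has the form {"<index name>": {"mappings": {...}}}. ES_URL and the function names are illustrative.

```shell
# Base URL of the Elasticsearch API (assumed to be port-forwarded).
ES_URL="${ES_URL:-http://localhost:9200}"

# Read a Get mapping response on stdin and print only the mapping body,
# i.e. the object stored under "<index name>" -> "mappings".
extract_mappings() {
  jq -c 'to_entries[0].value.mappings'
}

# Copy the mapping of the source index ($1) to the target index ($2).
copy_mapping() {
  curl -s "${ES_URL}/$1/_mapping" \
    | extract_mappings \
    | curl -s -X PUT "${ES_URL}/$2/_mapping?pretty" \
        -H 'Content-Type: application/json' --data-binary @-
}
```

For the index names used on this page, the call would be copy_mapping yuuvis_1 yuuvis_2. Reviewing the output of extract_mappings before applying it is a sensible precaution.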

Optionally, you can also base the new Index's settings on the original configuration:

CURL command (apply settings):
curl -X PUT "localhost:9200/yuuvis_2/_settings?pretty" -H 'Content-Type: application/json' -d'
{
    "index": {
        "codec": "best_compression",
        "number_of_shards": "80",
        "max_result_window": "2147483647",
        "analysis": {
            "filter": [],
            "analyzer": {
                "default_search": {
                    "useExactTerms": "false",
                    "prefix_length": "0",
                    "languages": ["de","en"],
                    "type": "intrafind_search",
                    "excessiveSplitting": "false",
                    "stopwords": ["",""]
                },
                "default": {
                    "useExactTerms": "false",
                    "prefix_length": "0",
                    "languages": ["de","en"],
                    "type": "intrafind_index",
                    "excessiveSplitting": "false",
                    "stopwords": ["",""]
                },
                "paths": {
                    "prefix_length": "0",
                    "tokenizer": "path_hierarchy"
                }
                
            }
        },
        "number_of_replicas": "1"
        
    }

}
'

...


Migrating the Data to the new Index

Once a compatible Index has been created, the Reindex operation can be triggered through the Elasticsearch API.


The Reindex operation itself provides a few configuration options. Especially for large data volumes, we employ parameters that increase the stability and performance of the operation, such as the 'slices' parameter for automatic parallelization of the Reindex.

CURL command (Reindex operation):
curl -X POST "localhost:9200/_reindex?pretty&slices=20&wait_for_completion=false&refresh" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "yuuvis_1"
  },
  "dest": {
    "index": "yuuvis_2"
  },
  "conflicts": "proceed"
}
'
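Because the request is sent with wait_for_completion=false, Elasticsearch returns a task ID immediately and continues the Reindex in the background. The following sketch shows one way to follow the progress through the Tasks API; ES_URL and the function names are assumptions for illustration.

```shell
# Base URL of the Elasticsearch API (assumed to be port-forwarded).
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the task ID from a reindex response read on stdin,
# e.g. { "task" : "nodeId:12345" }.
extract_task_id() {
  sed -n 's/.*"task"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Show the status of a single task ($1 = task ID).
task_status() {
  curl -s "${ES_URL}/_tasks/$1?pretty"
}

# List all currently running reindex tasks with details.
list_reindex_tasks() {
  curl -s "${ES_URL}/_tasks?detailed=true&actions=*reindex&pretty"
}
```

Piping the output of the reindex request into extract_task_id yields the ID that task_status expects. In the response of task_status, the "completed" field indicates whether the Reindex has finished.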


After the Reindex has completed, the new Index must be activated by reassigning the 'yuuvis' alias.


The new index needs to inherit the 'yuuvis' alias from the original index, meaning that the 'yuuvis' alias must be deleted from the original index beforehand.

CURL commands (alias reassignment):
curl -X DELETE "localhost:9200/yuuvis_1/_alias/yuuvis?pretty"

curl -X POST "localhost:9200/_aliases?pretty" -H 'Content-Type: application/json' -d'
{
  "actions": [
    {
      "add": {
        "index": "yuuvis_2",
        "alias": "yuuvis"
      }
    }
  ]
}
'
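As an alternative sketch, the _aliases API accepts several actions in one request and applies them atomically, so the alias can be removed from the original index and added to the new one without a moment in which it points to no index. The function names and ES_URL below are illustrative assumptions.

```shell
# Base URL of the Elasticsearch API (assumed to be port-forwarded).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build an _aliases request body that removes the alias from the old index
# and adds it to the new index in a single request.
# $1 = old index, $2 = new index, $3 = alias
alias_swap_body() {
  printf '{"actions":[{"remove":{"index":"%s","alias":"%s"}},{"add":{"index":"%s","alias":"%s"}}]}\n' \
    "$1" "$3" "$2" "$3"
}

# Send the combined request to Elasticsearch.
swap_alias() {
  alias_swap_body "$1" "$2" "$3" \
    | curl -s -X POST "${ES_URL}/_aliases?pretty" \
        -H 'Content-Type: application/json' --data-binary @-
}
```

With the names used on this page, the call would be swap_alias yuuvis_1 yuuvis_2 yuuvis.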

...


Deleting the original Index

To free up space in the Elasticsearch Cluster, it's sensible to remove the original Index after verifying the Momentum system has accepted the new Index.


Make sure to verify the Momentum system still works correctly and contains all expected data before proceeding with the deletion of the original index.
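One simple check, in addition to testing the Momentum system itself, is to compare the document counts of both indices using the Count API. This is a minimal sketch; ES_URL and the function names are assumptions, and the counts can legitimately differ if documents were written or deleted while the Reindex was running.

```shell
# Base URL of the Elasticsearch API (assumed to be port-forwarded).
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the numeric "count" value from a Count API response on stdin.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Print the number of documents in an index ($1 = index name).
doc_count() {
  curl -s "${ES_URL}/$1/_count" | extract_count
}

# Succeed only if both indices report the same, non-empty document count.
counts_match() {
  old=$(doc_count "$1")
  new=$(doc_count "$2")
  [ -n "$old" ] && [ "$old" = "$new" ]
}
```

For example, counts_match yuuvis_1 yuuvis_2 && echo "counts are equal" prints the message only if both requests succeeded and returned the same number.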

CURL command:
curl -X DELETE "localhost:9200/yuuvis_1"



Read on

  • COMMANDER Service for System Maintenance
  • yuuvis® Postman Collections
  • Service Monitoring and Maintenance