If new requirements, in terms of either load capacity or functionality (e.g., newly supported languages), are introduced to a yuuvis® Momentum system during production, it may become necessary to overhaul the backend Elasticsearch cluster by performing a reindex operation. The operation itself is slow and resource-intensive, as it essentially creates a copy of the original Elasticsearch index within the same Elasticsearch cluster. Make sure that enough storage space is available, and consider adding data nodes that can be shut down again after the operation.

An Elasticsearch reindex entails the creation of a new Momentum-capable index, the migration of Elasticsearch data from the original index into the new index, and finally the removal of the old index. These steps are performed through the Elasticsearch API, which Elasticsearch exposes through port 9200 on selected master nodes. It is highly recommended to create an Elasticsearch snapshot using the same API before attempting the reindex.

Below, you can find a detailed overview of the CURL commands needed to successfully perform a reindex in yuuvis® Momentum. Note that all commands assume you have port-forwarded the Elasticsearch API to http://localhost:9200.
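
The snapshot recommended above can be requested through the Elasticsearch snapshot API. The following is a minimal sketch, assuming that a snapshot repository has already been registered in the cluster. The repository name `backup_repo`, the snapshot name `pre_reindex`, and the helper function names are hypothetical placeholders, not part of yuuvis® Momentum.

```shell
# Base URL of the port-forwarded Elasticsearch API (default taken from this guide).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the snapshot API URL for a repository ($1) and a snapshot name ($2).
snapshot_url() {
  printf '%s/_snapshot/%s/%s?wait_for_completion=true' "$ES_URL" "$1" "$2"
}

# Create the snapshot; the call blocks until Elasticsearch reports completion.
create_snapshot() {
  curl -sS -X PUT "$(snapshot_url "$1" "$2")"
}

# Example with hypothetical names:
# create_snapshot backup_repo pre_reindex
```

For large clusters it may be preferable to drop `wait_for_completion=true` and poll the snapshot status instead.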

...

Two steps are required to create an Elasticsearch index that can interact with yuuvis® Momentum:

  1. Creating a new Elasticsearch index
  2. Applying the yuuvis® Momentum Elasticsearch index mapping and settings

Creating a new Elasticsearch Index

A new index needs to be created to match the new requirements.

Create a new index with a unique name.

Creating a new Elasticsearch index is an opportunity to optimize its layout for the current storage requirements. As a rule of thumb, each shard of an index works best when storing around 10 GB, and should contain no more than 50 GB of data.

For a cluster that contains 50 GB of Elasticsearch data, a high-performance index might look something like this:

CURL command:
curl -X PUT "localhost:9200/yuuvis_2?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 5, 
      "number_of_replicas": 1 
    }
  }
}
'

Make sure to change the index parameters to suit your storage and reliability requirements.
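
The example above gives a 50 GB data set five primary shards, which works out to roughly 10 GB per shard. A small helper can apply the same arithmetic to other data volumes. The function name `suggest_shards` and the default target of 10 GB per shard are assumptions made for illustration; treat the result as a starting point rather than a fixed rule.

```shell
# Suggest a primary shard count: ceil(data_gb / target_gb), but at least 1.
# $1 = total data volume in whole GB, $2 = target size per shard in GB (default 10).
suggest_shards() {
  local data_gb="$1" target_gb="${2:-10}" shards
  shards=$(( (data_gb + target_gb - 1) / target_gb ))
  if [ "$shards" -lt 1 ]; then
    shards=1
  fi
  echo "$shards"
}

suggest_shards 50      # prints 5
suggest_shards 120 10  # prints 12
```

The resulting number can be used for `number_of_shards` in the index creation request.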

...

Applying yuuvis® Momentum Elasticsearch Index Mapping and Settings

The yuuvis® Momentum services require Elasticsearch to use a custom mapping. Using Elasticsearch's automatic mapping algorithm for reindexing renders the new index unusable for the yuuvis® Momentum system.

To get a compatible mapping, retrieve the original Elasticsearch index's mapping by using the Get mapping API:

CURL command:
curl -X GET "localhost:9200/yuuvis/_mapping" 

Then apply the extracted mapping to the new index:

CURL command (apply mapping):
curl -X PUT "localhost:9200/yuuvis_2/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
   
    "dynamic_templates" : [
        {
            "keyword" : {
            "match" : "key_*",
            "mapping" : {
                "type" : "keyword"
            }
            }
        },
        {
            "text" : {
            "match" : "txt_*",
            "mapping" : {
                "type" : "text"
            }
            }
        },
        {
            "string" : {
            "match" : "str_*",
            "mapping" : {
                "fields" : {
                "raw" : {
                    "type" : "keyword"
                }
                },
                "type" : "text"
            }
            }
        },
        {
            "number" : {
            "match" : "num_*",
            "mapping" : {
                "type" : "long"
            }
            }
        },
        {
            "double" : {
            "match" : "dbl_*",
            "match_mapping_type" : "double",
            "mapping" : {
                "type" : "double"
            }
            }
        },
        {
            "object" : {
            "match" : "obj_*",
            "match_mapping_type" : "object",
            "mapping" : {
                "type" : "object"
            }
            }
        },
        {
            "date" : {
            "match" : "dte_*",
            "match_mapping_type" : "date",
            "mapping" : {
                "format" : "date_optional_time",
                "type" : "date"
            }
            }
        },
        {
            "boolean" : {
            "match" : "bol_*",
            "match_mapping_type" : "boolean",
            "mapping" : {
                "type" : "boolean"
            }
            }
        },
        {
            "table" : {
            "match" : "tab_*",
            "mapping" : {
                "type" : "nested"
            }
            }
        },
        {
            "rawtable" : {
            "match" : "rtb_*",
            "mapping" : {
                "fields" : {
                "raw" : {
                    "type" : "keyword"
                }
                },
                "type" : "text"
            }
            }
        },
        {
            "locationpath" : {
            "match" : "locationpath",
            "match_mapping_type" : "string",
            "match_pattern" : "regex",
            "mapping" : {
                "analyzer" : "paths",
                "type" : "string"
            }
            }
        },
        {
            "typepath" : {
            "match" : "typepath",
            "match_mapping_type" : "string",
            "match_pattern" : "regex",
            "mapping" : {
                "analyzer" : "paths",
                "type" : "string"
            }
            }
        },
        {
            "contentidx" : {
            "match" : "contentidx",
            "match_mapping_type" : "string",
            "mapping" : {
                "term_vector" : "no",
                "type" : "text"
            }
            }
        },
        {
            "contentfile" : {
            "match" : "contentfile",
            "match_mapping_type" : "string",
            "mapping" : {
                "term_vector" : "no",
                "type" : "text"
            }
            }
        }
        ],
    "properties" : {
        "contentfile" : {
            "type" : "text"
        },
        "contentidx" : {
            "type" : "text"
        },
        "dte_date" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "dte_system:creationdate" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "dte_system:lastmodificationdate" : {
            "type" : "date",
            "format" : "date_optional_time"
        },
        "num_appbillion:index" : {
            "type" : "long"
        },
        "num_system:contentstreamlength" : {
            "type" : "long"
        },
        "num_system:versionnumber" : {
            "type" : "long"
        },
        "str_appbillion:bmstring1" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_appbillion:bmstring2" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_name" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:basetypeid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamfilename" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreammimetype" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreammimetypegroup" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:contentstreamrepositoryid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:createdby" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:digest" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:lastmodifiedby" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:objecttype" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:objecttypeid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:secondaryobjecttypeids" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:tenant" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        },
        "str_system:traceid" : {
            "type" : "text",
            "fields" : {
            "raw" : {
                "type" : "keyword"
            }
            }
        }
    }

}
'

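Optionally, you can also base the new index's settings on the original configuration. Note that static settings such as number_of_shards can only be set when an index is created, and analysis settings can only be changed while the index is closed, so adapt the request to your cluster.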

CURL command (apply settings):
curl -X PUT "localhost:9200/yuuvis_2/_settings?pretty" -H 'Content-Type: application/json' -d'
{
    "index": {
        "codec": "best_compression",
        "number_of_shards": "80",
        "max_result_window": "2147483647",
        "analysis": {
            "filter": [],
            "analyzer": {
                "default_search": {
                    "useExactTerms": "false",
                    "prefix_length": "0",
                    "languages": ["de","en"],
                    "type": "intrafind_search",
                    "excessiveSplitting": "false",
                    "stopwords": ["",""]
                },
                "default": {
                    "useExactTerms": "false",
                    "prefix_length": "0",
                    "languages": ["de","en"],
                    "type": "intrafind_index",
                    "excessiveSplitting": "false",
                    "stopwords": ["",""]
                },
                "paths": {
                    "prefix_length": "0",
                    "tokenizer": "path_hierarchy"
                }
                
            }
        },
        "number_of_replicas": "1"
        
    }

}
'


Migrating the Data to the new Index

Once a compatible index has been created, the reindex operation can be triggered through the Elasticsearch API.

The reindex operation itself provides a few configuration options. Especially for large data volumes, we employ parameters that increase the stability and performance of the operation, such as the 'slices' parameter for automatic parallelization of the reindex.

CURL command (reindex operation):
curl -X POST "localhost:9200/_reindex?pretty&slices=20&wait_for_completion=false&refresh" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "yuuvis_1"
  },
  "dest": {
    "index": "yuuvis_2"
  },
  "conflicts": "proceed"
}
'
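
Because the command above sets `wait_for_completion=false`, Elasticsearch returns immediately with a task ID instead of waiting for the reindex to finish. The following sketch, with hypothetical helper names, shows one way to extract that ID from the response and to query the task API for progress. It assumes the response contains a `"task"` field, as the reindex API returns for asynchronous requests.

```shell
# Base URL of the port-forwarded Elasticsearch API (default taken from this guide).
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the task ID from a reindex response read on stdin.
parse_task_id() {
  sed -n 's/.*"task"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Query the task API for the current status of a task ID ($1).
task_status() {
  curl -sS -X GET "$ES_URL/_tasks/$1?pretty"
}

# Example: save the reindex response to a file, then poll the task.
# task_id="$(parse_task_id < reindex_response.json)"
# task_status "$task_id"
```

The task response reports whether the task is completed, which indicates when it is safe to continue with the alias reassignment.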


After the reindex is completed, the new index must be activated by reassigning the yuuvis alias.

The new index needs to inherit the 'yuuvis' alias from the original index, meaning that the 'yuuvis' alias must be deleted from the original index beforehand.

CURL command (reassign alias):
curl -X DELETE "localhost:9200/yuuvis_1/_alias/yuuvis?pretty"

curl -X POST "localhost:9200/_aliases?pretty" -H 'Content-Type: application/json' -d'
{
  "actions": [
    {
      "add": {
        "index": "yuuvis_2",
        "alias": "yuuvis"
      }
    }
  ]
}
'
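
As an alternative to the two separate requests above, the Elasticsearch aliases API accepts several actions in one request and applies them atomically, so there is no moment at which the 'yuuvis' alias points to no index. The following is a minimal sketch of that variant; the helper names `alias_swap_body` and `swap_alias` are hypothetical.

```shell
# Base URL of the port-forwarded Elasticsearch API (default taken from this guide).
ES_URL="${ES_URL:-http://localhost:9200}"

# Print an _aliases request body that moves alias $1 from index $2 to index $3.
alias_swap_body() {
  cat <<EOF
{
  "actions": [
    { "remove": { "index": "$2", "alias": "$1" } },
    { "add": { "index": "$3", "alias": "$1" } }
  ]
}
EOF
}

# Send both actions in a single request so the alias is switched in one step.
swap_alias() {
  curl -sS -X POST "$ES_URL/_aliases?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(alias_swap_body "$1" "$2" "$3")"
}

# Example with the index names used in this guide:
# swap_alias yuuvis yuuvis_1 yuuvis_2
```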


Deleting the Original Index

To free up space in the Elasticsearch cluster, it is sensible to remove the original index after verifying that the yuuvis® Momentum system has accepted the new index.

Make sure to verify that the yuuvis® Momentum system still functions properly and contains all expected data before proceeding with the deletion of the original index.
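
One rough way to check that all expected data has arrived is to compare the document counts of the two indices using the Elasticsearch count API. The sketch below uses hypothetical helper names and assumes the index names from this guide. It is only a sanity check and does not replace functional testing of the yuuvis® Momentum system.

```shell
# Base URL of the port-forwarded Elasticsearch API (default taken from this guide).
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the numeric "count" field from a _count API response read on stdin.
parse_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Fetch the document count of an index ($1).
doc_count() {
  curl -sS -X GET "$ES_URL/$1/_count" | parse_count
}

# Succeed only if both indices report the same, non-empty document count.
counts_match() {
  local old new
  old="$(doc_count "$1")"
  new="$(doc_count "$2")"
  [ -n "$old" ] && [ "$old" = "$new" ]
}

# Example:
# counts_match yuuvis_1 yuuvis_2 && echo "document counts match"
```

Documents written to the system during or after the reindex can cause the counts to differ, so a mismatch calls for a closer look rather than proving data loss.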

CURL command:
curl -X DELETE "localhost:9200/yuuvis_1"


...