Migrating from Elasticsearch to OpenSearch
From DataMiner 10.4.0 [CU2]/10.4.4 onwards, a tool is available that allows you to migrate from Elasticsearch 6.8.22 to OpenSearch 2.11.1.
To use this tool, follow the instructions below:
1. Stop all Agents in the DMS.
2. Take a snapshot of the Elasticsearch 6.8.22 cluster.
3. Copy the snapshot to an Elasticsearch 7.10.0 cluster and restore it.
4. Run the re-indexing tool and take a snapshot of the re-indexed data.
5. Copy the snapshot with the re-indexed data to an OpenSearch 2.11.1 cluster and restore it.
6. Reconfigure DB.xml on every Agent if the OpenSearch nodes are hosted on other machines than the Elasticsearch nodes.
7. Restart all DataMiner Agents in the DMS.
Each of these steps is detailed in the sections below.
Caution
To prevent data loss, all Agents of the DMS must be stopped during this procedure. They must not be started up again until the migration is completed.
Note
As of DataMiner 10.3.0 [CU16], 10.4.0 [CU4], and 10.4.7 [CU0], an improved version of the tool is available, which among other things features better logging, as well as the possibility to retry failed indexes (once the cause of the failure has been resolved) using the following command-line option:
ReIndexElasticSearchIndexes.exe [-R <path to failed indexes file>]
Take a snapshot of the Elasticsearch 6.8.22 cluster
Using Kibana, you can take a snapshot in the following way:
Check the path.repo configuration in elasticsearch.yml.
This configuration should point to a shared file system location to which each node has access.
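For example, in elasticsearch.yml this could look as follows (the folder shown here is illustrative; use a shared location that every node can reach):
path.repo: ["/mnt/es_snapshots"]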
Check the existing repositories by sending the following request:
GET /_snapshot/_all
This will return a response containing information about all the repositories configured in Elasticsearch. If the desired repository already exists, you can skip the next step.
Example response when a repository already exists:
{ "test_tepo_1" : { "type" : "fs", "settings" : { "location" : "/usr/share/elasticsearch/repo", "maxRestoreBytesPerSec" : "40mb", "readonly" : "false", "compress" : "true", "maxSnapshotBytesPerSec" : "40mb" } } }
Create the repository by sending the following request:
PUT /_snapshot/<repo_name>
{
  "type": "fs",
  "settings": {
    "location": "<shared_repo_path>",
    "max_restore_bytes_per_sec": "40mb",
    "readonly": "false",
    "compress": "true",
    "max_snapshot_bytes_per_sec": "40mb"
  }
}
Variables:
- repo_name: A repository name of your choice.
- shared_repo_path: The path to the shared folder where the snapshots will be stored.
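For example, to create a repository named migration_repo (a hypothetical name) at the shared path /mnt/es_snapshots used earlier:
PUT /_snapshot/migration_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_snapshots",
    "max_restore_bytes_per_sec": "40mb",
    "readonly": "false",
    "compress": "true",
    "max_snapshot_bytes_per_sec": "40mb"
  }
}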
Take the snapshot by sending the following request:
PUT /_snapshot/<repo_name>/<snapshot_name>
{
  "indices": "dms*"
}
Variable:
- snapshot_name: A snapshot name of your choice.
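For example, using the hypothetical repository name from above:
PUT /_snapshot/migration_repo/dms_migration_snapshot
{
  "indices": "dms*"
}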
Check the snapshot by sending the following request:
GET /_snapshot/<repo_name>/<snapshot_name>/_status
This request returns status information for a specific snapshot within a given repository. The state should be SUCCESS and all shards should be successful.
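An abbreviated example of a successful status response (illustrative names and shard counts):
{
  "snapshots" : [
    {
      "snapshot" : "dms_migration_snapshot",
      "repository" : "migration_repo",
      "state" : "SUCCESS",
      "shards_stats" : {
        "done" : 5,
        "failed" : 0,
        "total" : 5
      }
    }
  ]
}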
Restore the snapshot on an Elasticsearch 7.10.0 cluster
Using Kibana, you can restore the snapshot in the following way:
Check the path.repo configuration in elasticsearch.yml.
This configuration should point to a shared file system location to which each node has access.
Check the existing repositories by sending the following request:
GET /_snapshot/_all
This will return a response containing information about all the repositories configured in Elasticsearch. If the desired repository already exists, you can skip the next step.
Example response when a repository already exists:
{ "test_tepo_1" : { "type" : "fs", "settings" : { "location" : "/usr/share/elasticsearch/repo", "maxRestoreBytesPerSec" : "40mb", "readonly" : "false", "compress" : "true", "maxSnapshotBytesPerSec" : "40mb" } } }
Create the repository by sending the following request:
PUT /_snapshot/<repo_name>
{
  "type": "fs",
  "settings": {
    "location": "<shared_repo_path>",
    "max_restore_bytes_per_sec": "40mb",
    "readonly": "false",
    "compress": "true",
    "max_snapshot_bytes_per_sec": "40mb"
  }
}
Restore the snapshot by sending the following request:
POST /_snapshot/<repo_name>/<snapshot_name>/_restore
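If you prefer the request to only return once the restore has finished, you can add the standard wait_for_completion query parameter:
POST /_snapshot/<repo_name>/<snapshot_name>/_restore?wait_for_completion=true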
Check the snapshot by sending the following request:
GET /_snapshot/<repo_name>/<snapshot_name>/_status
This request will return information about the status of the specified snapshot. The status should be "SUCCESS".
Check the cluster health by sending the following request:
GET /_cluster/health
The status of the cluster should turn green after the restore.
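If you want the health request itself to block until the cluster is green, you can use the standard wait_for_status parameter (the timeout shown here is arbitrary):
GET /_cluster/health?wait_for_status=green&timeout=60s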
Run the re-indexing tool and take a snapshot
Open a terminal, and go to the folder containing the tool. By default, this is the folder C:\Skyline DataMiner\Tools\ReIndexElasticSearchIndexes:
cd C:\Skyline DataMiner\Tools\ReIndexElasticSearchIndexes
Run the tool with the following arguments:
- -Node or -N: The name of the node to be used for re-indexing (mandatory). Format: http(s)://127.0.0.1:9200 or http(s)://fqdn:9200
- -User or -U: The user name, to be provided in case Elasticsearch has been hardened. See Securing the Elasticsearch database.
- -Password or -P: The user password.
- -DBPrefix or -D: The database prefix, to be provided in case a custom database prefix is used instead of the default dms- prefix. If you do not provide a prefix, the default dms- prefix will be used.
- -TLSEnabled or -T: Whether or not TLS is enabled for this Elasticsearch database. Values: true or false. Default: false.
- -RetryFile or -R <path to failed indexes file>: File path for a file containing failed indexes, to be provided in case the re-indexer should retry re-indexing previously failed indexes (supported from 10.3.0 [CU16]/10.4.0 [CU4]/10.4.7 [CU0] onwards).
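For example (hypothetical node URL, user name, and password; the -R retry option is explained in the note below), a first run against a TLS-enabled node, followed by a retry of the indexes listed in a failed-indexes file:
ReIndexElasticSearchIndexes.exe -N https://dma1.example.com:9200 -U esadmin -P mypassword -T true
ReIndexElasticSearchIndexes.exe -N https://dma1.example.com:9200 -U esadmin -P mypassword -T true -R "<runId>_failed.json"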
Note
In case the re-indexing fails, a file <runId>_failed.json will be created in the folder where the tool is located, listing all failed indexes. This file can be used with the -R <path to failed indexes file> option to retry the failed indexes. To find out why these indexes failed, check the log file created in the logging folder, located in the folder where the tool is available. Before retrying the re-indexing, make sure the failures are resolved.
Take a snapshot of the re-indexed data by sending the following request:
PUT /_snapshot/<repo_name>/<snapshot_name_reindexed>
{
  "indices": "dms*"
}
Check the snapshot by sending the following request:
GET /_snapshot/<repo_name>/<snapshot_name_reindexed>/_status
This request will return information about the status of the specified snapshot. The status should be "SUCCESS".
Restore the snapshot with the re-indexed data to an OpenSearch 2.11.1 cluster
Check the path.repo configuration in opensearch.yml.
This configuration should point to a shared file system location to which each node has access.
Check the existing repositories by sending the following request:
GET /_snapshot/_all
This will return a response containing information about all the repositories configured in OpenSearch. If the desired repository already exists, you can skip the next step.
Example response when a repository already exists:
{ "test_tepo_1" : { "type" : "fs", "settings" : { "location" : "/usr/share/elasticsearch/repo", "maxRestoreBytesPerSec" : "40mb", "readonly" : "false", "compress" : "true", "maxSnapshotBytesPerSec" : "40mb" } } }
Create the repository by sending the following request:
PUT /_snapshot/<repo_name>
{
  "type": "fs",
  "settings": {
    "location": "<shared_repo_path>",
    "max_restore_bytes_per_sec": "40mb",
    "readonly": "false",
    "compress": "true",
    "max_snapshot_bytes_per_sec": "40mb"
  }
}
Restore the snapshot by sending the following request:
POST /_snapshot/<repo_name>/<snapshot_name_reindexed>/_restore
Check the snapshot by sending the following request:
GET /_snapshot/<repo_name>/<snapshot_name_reindexed>/_status
This request will return information about the status of the specified snapshot. The status should be "SUCCESS".
Check the cluster health by sending the following request:
GET /_cluster/health
The status of the cluster should turn green after the restore.
Reconfigure DB.xml on every Agent
If the OpenSearch nodes have been installed on other machines than the Elasticsearch nodes you are migrating from, do the following on every Agent in your DMS:
Navigate to the folder C:\Skyline DataMiner and open DB.xml.
Locate the IP addresses or hostnames previously used by Elasticsearch.
Tip
For information on where these are located in DB.xml, see Indexing database settings.
Replace these with the IP addresses or hostnames of the OpenSearch nodes (see the sketch below for an illustration).
Save the DB.xml file and exit.
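As an illustration, the relevant part of DB.xml could look like the sketch below after the update (hypothetical IP addresses; the exact structure of your DB.xml may differ, so rely on Indexing database settings for the authoritative layout):
<DataBases xmlns="http://www.skyline.be/config/db">
  <!-- Other DataBase entries (e.g. for Cassandra) remain unchanged. -->
  <DataBase active="true" search="true" type="Elasticsearch">
    <!-- Comma-separated list of the OpenSearch node addresses. -->
    <DBServer>10.40.0.11,10.40.0.12,10.40.0.13</DBServer>
  </DataBase>
</DataBases>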
When all DB.xml files have been reconfigured, you can restart all DataMiner Agents to finish the procedure.