Rebalancing the Cluster
A cluster is rebalanced with the POST /controller/rebalance
HTTP method and URI.
Description
Clusters can be rebalanced. When one or more nodes have been brought into a cluster, or have been taken out of a cluster, rebalance redistributes data, indexes, event processing, and query processing among the available nodes. The cluster map is correspondingly updated and distributed to clients. The process occurs while the cluster continues to service requests for data.
A conceptual overview of nodes, and how they are combined into clusters, is provided in Nodes.
Curl Syntax
curl -v -X POST -u [admin]:[password] http://[localhost]:8091/controller/rebalance [-d knownNodes | -d ejectedNodes]
The parameters are:
-
knownNodes
. Must always be specified. The value is a list containing all nodes that are in the cluster at the start of the rebalance process. -
ejectedNodes
. Specified when one or more nodes are to be ejected from the cluster by the rebalance process. The value is a list containing all nodes to be ejected.The node-list specified by
ejectedNodes
is thus a subset of that specified byknownNodes
.
For information on rebalance, see Rebalance.
Responses
Success gives the status 200 OK
, with no object returned.
If either parameter specifies an unknown node, or if a node currently in the cluster is omitted from the value of knownNodes
, the status 400 Bad Request
is given, and the following object is returned:
{"mismatch":1}
If knownNodes
is not specified and ejectedNodes
is specified, the status 400 Bad Request
is given, and the following object is returned:
{"empty_known_nodes":1}
Examples
The following examples demonstrate how to determine which nodes are currently in the cluster; how to rebalance after a hard failover; how to rebalance in order to eject a node from the cluster; and how to re-add an ejected node to a cluster.
For information on how to retrieve status on an ongoing rebalance, see Getting Rebalance Progress and Getting Cluster Tasks.
Return Cluster Nodes
To determine which nodes are currently in the cluster, use the GET /pools/default
HTTP method and URI.
In this example, the (extensive) output is piped to the jq
tool to be formatted, and the lines featuring the hostname
are then filtered by grep
, to ensure readability.
curl -u Administrator:password -X GET \ http://10.143.190.101:8091/pools/default | jq '.' | grep hostname
The output contains the following lines, which specify all nodes currently in the cluster:
"hostname": "10.143.190.101:8091", "hostname": "10.143.190.102:8091", "hostname": "10.143.190.103:8091",
For more information on this method and URI, see Viewing Cluster Details.
Perform Rebalance Following a Hard Failover
The hard failover of a node results in that node continuing to be a member of the cluster, but being unable to serve data. Since its active vBuckets are unavailable, corresponding replica vBuckets are promoted to active status on the still-available nodes. This means that data continues to be served, but is in an imbalanced state across the cluster. This requires rebalance.
For detailed conceptual information, see Hard Failover. For information on performing a hard failover, see Failing over Nodes.
To perform the rebalance, specify all cluster-nodes in the value for the knownNodes
parameter, including the one or ones to which hard failover is known to have been applied:
curl -u Administrator:password -v -X POST \ http://10.143.190.101:8091/controller/rebalance \ -d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'
On success, the status 200 OK
is given, and no object is returned.
The failed-over node has been removed, and the data is now distributed evenly across the surviving nodes.
The successful node-removal is confirmed by again examining the current cluster nodes:
curl -u Administrator:password -X GET \ http://10.143.190.101:8091/pools/default | jq '.' | grep hostname
The output indicates which nodes are still in the cluster:
"hostname": "10.143.190.101:8091", "hostname": "10.143.190.102:8091",
This confirms that 10.143.190.103
has been removed.
Add a Node to a Cluster, then Rebalance
Adding a node to a cluster is a two-step process.
-
The
POST /controller/addNode
HTTP method and URI are used to add the node. This allows service-deployment for the node to be specified. A placeholder username and password can be specified, when adding an unprovisioned node. -
The
POST /controller/rebalance
HTTP method and URI are used to rebalance the added node into the cluster. Include the new node in theknownNodes
node-list.
For example, the following command adds node 10.143.1990.103
to the cluster from which it was removed, and assigns it the Data Service:
curl -u Administrator:password -X POST \ 10.142.181.101:8091/controller/addNode \ -d 'hostname=10.143.190.103&user=someName&password=somePassword&services=kv'
If successful, this returns the following object, indicating that the node is now recognized as a member of the cluster:
{"otpNode":"ns_1@10.143.190.103"}
For more information on this method and URI, see Adding Nodes to Clusters.
Next, the added node is rebalanced into the cluster. This allows it to take its share of the data-distribution.
curl -u Administrator:password -v -X POST \ http://10.143.190.101:8091/controller/rebalance \ -d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'
On success, the response code 200 OK
is given, and no object is returned.
The cluster is now rebalanced.
At the conclusion, the cluster can again be checked for its current membership:
curl -u Administrator:password -X GET \ http://10.143.190.101:8091/pools/default | jq '.' | grep hostname
The output now includes the following:
"hostname": "10.143.190.101:8091", "hostname": "10.143.190.102:8091", "hostname": "10.143.190.103:8091",
This confirms that 10.143.190.103
has been rebalanced into the cluster.
Eject a Node
To eject a node, use the POST /controller/rebalance
HTTP method and URI.
Specify the entire current node-list for the cluster as the value of the knownNodes
parameter.
Specify the list of nodes to be ejected as the value of the ejectedNodes
parameter.
For example, the following command ejects node 10.143.190.103
from the cluster:
curl -u Administrator:password -v -X POST \ http://10.143.190.101:8091/controller/rebalance \ -d 'ejectedNodes=ns_1%4010.143.190.103' \ -d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'
On success, the response code 200 OK
is given, and no object is returned.
At the conclusion, the cluster can again be checked for its current membership:
curl -u Administrator:password -X GET \ http://10.143.190.101:8091/pools/default | jq '.' | grep hostname
The output now includes the following:
"hostname": "10.143.190.101:8091", "hostname": "10.143.190.102:8091",
Adjusting Rebalance During Compaction
Description
If a rebalance is performed while a node is undergoing index compaction, rebalance delays may be experienced.
The parameter, rebalanceMovesBeforeCompaction
, is used to improve rebalance performance: potentially, this results in a larger index.
This setting can be modified with the POST /internalSettings
endpoint.
By default, it is 64.
This specifies that 64 vBuckets are to be moved per node; at which point all vBucket movement is paused, and index compaction is triggered.
Since index compaction is therefore not performed while vBuckets are being moved, a large rebalanceMovesBeforeCompaction
value results in the server spending less time compacting indexes; potentially resulting in larger index files, which take up more disk space.
For example:
curl -X POST -u Administrator:password 'http://10.5.2.54:8091/internalSettings' \ -d 'rebalanceMovesBeforeCompaction=256'
See Also
For conceptual information on rebalance, see Rebalance. For information on how to retrieve status on an ongoing rebalance, see Getting Rebalance Progress and Getting Cluster Tasks.
For conceptual information on hard failover, see Hard Failover. For information on performing a hard failover with the REST API, see Failing over Nodes. For information on retrieving details of a cluster, including its current nodes, see Viewing Cluster Details. For information on obtaining and reading rebalance reports, see the Rebalance Reference.