Failover
Failover is a process whereby a node can be taken out of a Couchbase cluster with speed.
Failover Types
There are two basic types of failover: graceful and hard.
-
Graceful: The ability to remove a Data Service node from the cluster proactively, in an orderly and controlled fashion. This involves no downtime, and allows continued application-access to data. The process promotes replica vBuckets on the remaining cluster-nodes to active status, and the active vBuckets on the affected node to dead. Throughout the process, the cluster maintains all 1024 active vBuckets for each bucket.
Graceful failover can only be used on nodes that run the Data Service. If controlled removal of a non-Data Service node is required, Removal should be used.
-
Hard: The ability to drop a node from the cluster reactively, because the node has become unavailable. If the lost node was running the Data Service, active vBuckets have been lost: therefore the hard failover process promotes replica vBuckets on the remaining cluster-nodes to active status, until 1024 active vBuckets again exist for each bucket.
Hard failover should not be used on a responsive node, since this may disrupt ongoing operations (such as the writes and replications that occur on a Data Service node). Instead, available nodes should be taken out of the cluster by means of either graceful failover (if they are Data Service nodes) or removal (if they are nodes of any kind).
Graceful failover must be manually initiated. Hard failover can be manually initiated. Hard failover can also be initiated automatically by Couchbase Server: this is known as automatic failover. The Cluster Manager detects the unavailability of a node, and duly initiates a hard failover, without administrator intervention.
Note that when a node is failed over (as opposed to removed), some replica vBuckets are lost from the surviving nodes; since some are promoted to active status, and are not replaced with new replica-copies. By contrast, removal creates new copies of those replica vBuckets that would otherwise be lost. This maintains the cluster’s previous level of data-availability; but results in greater competition for memory resources, across the surviving nodes.
Ideally, after any failover, rebalance should be performed. This is especially important when a Data Service node has been failed over, since the rebalance will ensure an optimal ratio of active to replica vBuckets across all the remaining Data Service nodes.
Detecting Node-Failure
Hard failover is performed after a node has failed. It can be initiated either by administrative intervention, or through automatic failover.
When automatic failover is used, the Cluster Manager handles both the detection of failure, and the initiation of hard failover, without administrative intervention: however, the Cluster Manager does not identify the cause of failure. Following failover, administrator-intervention is required, to identify and fix problems, and to initiate rebalance, whereby the cluster is returned to a healthy state.
If manual failover is to be used, administrative intervention is required to detect that a failure has occurred. This can be achieved either by assigning an administrator to monitor the cluster; or by creating an externally based monitoring system that uses the Couchbase REST API to monitor the cluster, detect problems, and either provide notifications, or itself trigger failover. Such a system might be designed to take into account system or network components beyond the scope of Couchbase Server.
Failover and Replica Promotion
When failover has occurred, and active vBuckets have thereby been replaced through the promotion of replicas, the resulting imbalance in the ratio of active to replica vBuckets on the surviving nodes should be corrected, by means of rebalance.
The variations in vBucket numbers and ratios that occur due to outage, failover, and rebalance are demonstrated by the following tables.
Table 1 shows the disposition of data across a cluster of four nodes. One bucket, of 31,591 items, has been configured with three replicas. The ratio of active to replica items is therefore 1:3. (Note that some numbers in the table are approximated, and are therefore shown in italics.)
Host | Active Items | Replica Items |
---|---|---|
Node 1 |
7,932 |
23,600 |
Node 2 |
7,895 |
23,600 |
Node 3 |
7,876 |
23,700 |
Node 4 |
7,888 |
23,700 |
Total |
31,591 |
63,000 |
Table 2 shows the result of failing over node 4.
Host | Active Items | Replica Items |
---|---|---|
Node 1 |
11,000 |
20,500 |
Node 2 |
10,200 |
21,100 |
Node 3 |
10,200 |
21,300 |
Node 4 |
7,888* |
23,700* |
Total |
39,288 |
86,600 |
Active and replica vBuckets for the failed over node, 4, are still counted, but are not available (and so are marked here with asterisks). To compensate, replicas on nodes 1 to 3 have been promoted to active status; this being evident from their modified numbers and ratios. For example, on node 1, the number of active items is now raised (from its former total of 7,932) to 11,000; while the number of replica items is now lowered (from its former total of 23,600) to 20,500.
Subsequent rebalance corrects the ratios, as shown by Table 3, below.
Host | Active Items | Replica Items |
---|---|---|
Node 1 |
10,500 |
21,000 |
Node 2 |
10,500 |
21,000 |
Node 3 |
10,500 |
21,000 |
Node 4 |
NA |
NA |
Total |
31,500 |
63,000 |
Ratios on nodes 1 to 3 are now 1:2, indicating that rebalance has reduced the number of replicas for the bucket from 3 to 2, in correspondence with the reduced node-count.
For further examples of rebalance in the context of node removal, see Removal.
Failover Efficiency
The efficiency with which failover occurs is related to the cluster’s Master Services. These manage operations with cluster-wide impact; such as failover, rebalance, and adding and deleting buckets. The Master Services are sometimes referred to as the Orchestrator.
At any given time, only one instance of the Master Services is in charge: the instances having negotiated among themselves, to identify and elect the instance. Should the elected instance subsequently become unavailable, another takes over.
When the node that is currently hosting the active Master Services must itself be failed over, this requires greater latency than usual; since time must be taken to ensure that negotiation and election of a replacement occur. In case where the Master Services are co-located with a highly subscribed services (in particular, the Data Service), this can result in unnecessary latency in failing over the node and ensuring that data can be served.
In consequence, the Master Services should ideally not be co-located with any service. Such a configuration is possible in Couchbase Enterprise Server 7.6+: one or more serviceless nodes can be added to the cluster, with the intention of ensuring that the Master Services will occupy such a node, and thereby not be co-located with any service.
Node Removal
Node removal uses the rebalance process to remove a node from a cluster in a controlled fashion. It creates on the remaining nodes new copies of replica vBuckets that would otherwise be lost when the selected node is taken offline. See Removal for a conceptual description of node-removal. For practical steps, see Remove a Node and Rebalance.