crm_no_quorum_policy(7) | ClusterTools2 | crm_no_quorum_policy(7)
NAME¶
crm_no_quorum_policy - how the cluster should act in case of quorum loss.
DESCRIPTION¶
The crm basic parameter no_quorum_policy defines how the cluster should act in case of quorum loss.
A cluster is said to have quorum when more than half of the known or expected nodes are online, or, for the mathematically inclined, whenever the following inequality holds: total_nodes < 2 * active_nodes.
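For example, a five-node cluster has quorum with three active nodes (5 < 2 * 3), but not with two (5 > 2 * 2).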
A two-node cluster only has quorum when both nodes are running, which is no longer the case once one node fails. This would normally make the creation of a two-node cluster pointless. However, it is possible to control how Pacemaker behaves when quorum is lost. In particular, the cluster can be told to simply ignore quorum loss altogether. This is the recommended setup for two-node clusters. Of course, exactly one node should survive such a scenario. Therefore, STONITH has to be set up accordingly.
With more than two nodes, the cluster must not ignore quorum loss. For such multi-node clusters, an odd number of nodes is recommended. Different policies are available to handle quorum loss. Nevertheless, STONITH has to be set up as well.
The following policies are available:
no_quorum_policy = stop
The node won't shoot anyone when it does not have quorum. The cluster in the
affected partition will stop resources and will not be able to start its
existing resources again. This is the default if nothing is configured.
no_quorum_policy = ignore
The node will shoot anyone it can't see. The cluster will still be able to add
and start resources. This is the recommended setting for two-node clusters.
Note: for SLE-HA 12 SP2 and later, the corosync parameter two_node: 1 is
recommended instead (see the corosync sketch after this list).
no_quorum_policy = freeze
The cluster in the affected partition won't shoot anyone when it does not have
quorum. It will not be able to add or start new resources, but resources that
are already running keep running.
no_quorum_policy = suicide
The cluster will shoot only itself and the other nodes in its current partition.
It won't try to shoot the nodes it can't see.
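The policy is set as a cluster property. A minimal sketch using the crm shell:

crm configure property no-quorum-policy=ignore

For the two_node alternative mentioned above, the corresponding quorum stanza in corosync.conf might look as follows; this is a sketch showing only the quorum section, not a complete corosync.conf:

quorum {
    # votequorum provider is required for two_node
    provider: corosync_votequorum
    # treat the two-node cluster as quorate with one node remaining
    two_node: 1
}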
EXAMPLES¶
* Two-node cluster
A common configuration for the widely used Simple Stack, as well as for SAP Enqueue Replication, SAP HANA System Replication scale-up, and IBM DB2 HADR. The cluster can be stretched across two sites. The stonith-timeout depends on the storage stack. For stretched clusters, usually two SAN storages are used; thus, at least two SBD devices are used (a sketch of such an SBD setup follows the property block).
property $id="cib-bootstrap-options" \
expected-quorum-votes="2" \
no-quorum-policy="ignore" \
dc-deadtime="20s" \
default-resource-stickiness="1000" \
stonith-enabled="true" \
stonith-timeout="150s"
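For reference, a minimal sketch of a matching SBD setup in /etc/sysconfig/sbd; the device paths are placeholders, not taken from this example:

# two SBD devices, one per SAN storage (placeholder paths)
SBD_DEVICE="/dev/disk/by-id/SAN-A-sbd;/dev/disk/by-id/SAN-B-sbd"
# hardware watchdog device to use
SBD_WATCHDOG_DEV="/dev/watchdog"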
* Multi-node cluster across two sites, with auxiliary node on third site
Configuration for SAP HANA System Replication scale-out. Since this setup is symmetrical across two sites, a third site with an auxiliary cluster node is needed. This makes the cluster survive even if one complete site fails at once. This so-called majority maker node does not need to run any application resources. Nevertheless, it needs access to all STONITH resources. The expected-quorum-votes number is two times the number of nodes at one HANA site, plus one (e.g. 31 for 15 nodes per site). The stonith-timeout depends on the storage stack. Usually two SAN storages, and thus at least two SBD devices, are used. A third SBD device at the auxiliary site makes sense. The resulting vote count and quorum state can be checked at runtime, as sketched after the property block.
property $id="cib-bootstrap-options" \
expected-quorum-votes="31" \
no-quorum-policy="stop" \
dc-deadtime="20s" \
default-resource-stickiness="1000" \
stonith-enabled="true" \
stonith-timeout="150s" \
concurrent-fencing="true"
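A sketch of such a runtime check, assuming corosync with the votequorum provider:

# show expected votes, total votes, and quorum state
corosync-quorumtool -s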
* Three-node cluster at single site
Common configuration for a local cluster with the OCFS2 cluster filesystem, e.g. some SAP BW Accelerator installations. In case of quorum loss, the cluster node commits suicide to clear the OCFS2 resources. For this setup with three nodes, it is assumed that no more than one node fails at once. The stonith-timeout depends on the storage stack. Usually one SAN storage, and thus one SBD device, is used.
property $id="cib-bootstrap-options" \
expected-quorum-votes="3" \
no-quorum-policy="suicide" \
dc-deadtime="20s" \
stonith-enabled="true" \
stonith-timeout="150s"
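In each of these examples, the committed bootstrap properties can be reviewed afterwards; a sketch, assuming the crm shell from crmsh:

crm configure show cib-bootstrap-options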
FILES¶
- /usr/share/ClusterTools2/samples/
  samples of config files and scripts.
BUGS¶
Feedback is welcome; please mail to feedback@suse.com.
SEE ALSO¶
crm(8), cs_prepare_crm_basics(8), stonith(8),
stonith_sbd(7), sbd(8), crm_simulate(8),
corosync-cfgtool(8), corosync.conf(7),
ClusterTools2(7),
http://www.suse.com/documentation/sle_ha/ ,
https://www.suse.com/support/kb/doc.php?id=7012110 ,
http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg00106.html ,
http://www.gossamer-threads.com/lists/linuxha/pacemaker/72609?do=post_view_threaded
AUTHORS¶
The content of this manual page was mostly derived from online documentation mentioned above.
COPYRIGHT¶
(c) 2011-2018 SUSE Linux GmbH, Germany.
(c) 2019 SUSE LLC. ClusterTools2 comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at
http://www.gnu.org/licenses/gpl.html
01 Nov 2019