
SAPCMControlZone_basic_cluster(7) SAPCMControlZone SAPCMControlZone_basic_cluster(7)

NAME

SAPCMControlZone_basic_cluster - basic settings to make SAPCMControlZone work.

DESCRIPTION

The Convergent Mediation (CM) ControlZone needs a certain basic cluster configuration. Besides the necessary settings, additional configuration might be needed to match specific requirements.

* Operating System Basics

Users and groups

Technical users and groups, such as "mzadmin", are defined locally in the Linux system. See manual pages passwd(5) and usermod(8). The mzadmin user needs certain environment variables set in ~/.bashrc. See manual page ocf_suse_SAPCMControlZone(7) for details.
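
As a sketch, the user and group could be created locally as follows. The UID, GID and home directory are examples only and must be adapted to the actual installation:

# example only: UID/GID 1001 and home /home/mzadmin are assumptions
groupadd -g 1001 mzadmin
useradd -u 1001 -g mzadmin -m -d /home/mzadmin -s /bin/bash mzadmin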

Hostnames

Name resolution of the cluster nodes and the virtual IP address must be done locally on all cluster nodes. See manual page hosts(5).
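
An /etc/hosts excerpt might look like the sketch below. Node names, the virtual hostname and all IP addresses are examples only and must be adapted to the actual network:

# example only: adapt names and addresses
192.168.1.101  node01
192.168.1.102  node02
192.168.1.110  sapcm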

Time synchronization

Strict time synchronization between the cluster nodes is mandatory, e.g. via NTP. See manual page chrony.conf(5). Furthermore, all nodes should be configured with the same timezone.
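
Time synchronization and the configured timezone can be checked on each node, e.g.:

chronyc sources
timedatectl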

mzadmin's ~/.bashrc

Certain values for environment variables JAVA_HOME, MZ_HOME and MZ_PLATFORM are needed, depending on the specific setup. See manual page ocf_suse_SAPCMControlZone(7) for details.
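
A sketch of the respective ~/.bashrc entries is shown below. The paths and URL are examples only and depend on the specific installation, see ocf_suse_SAPCMControlZone(7):

# example only: adapt paths and URL to the actual installation
export JAVA_HOME=/usr/lib64/jvm/jre-17-openjdk
export MZ_HOME=/opt/cm/c11
export MZ_PLATFORM=http://localhost:9000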

NFS mounted filesystem

A shared filesystem for ControlZone data can be statically mounted on both cluster nodes. This filesystem holds work directories, e.g. for batch processing. It must not be confused with the ControlZone application itself. The application is copied from NFS to both cluster nodes into local filesystems. Client-side write caching has to be disabled for the NFS shares containing customer data. See manual page fstab(5) and example below.

* CRM Basics

stonith-enabled = true

The cib bootstrap option stonith-enabled is crucial for any reliable Pacemaker cluster.
Setting it to 'true' is one prerequisite for a supported cluster.
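
For example, the option can be set with the crm shell:

crm configure property stonith-enabled=true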

migration-threshold = 3

The crm rsc_default parameter migration-threshold defines how many failures may be detected on a resource before the resource is moved to another node. A value greater than 1 is needed for the resource monitor option on-fail=restart. See also failure-timeout.
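
For example, the resource default can be set with the crm shell:

crm configure rsc_defaults migration-threshold=3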

record-pending = true

The crm op_default record-pending defines whether the intention of an action on a resource is recorded in the Cluster Information Base (CIB). Setting this parameter to 'true' allows the user to see pending actions like 'starting' and 'stopping' in crm_mon.
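
For example, the operation default can be set with the crm shell:

crm configure op_defaults record-pending=true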

failure-timeout = 86400

The crm rsc_default failure-timeout defines how long failed actions are kept in the CIB. After that time the failure record is deleted. The time unit is seconds. See also migration-threshold.
The value '86400' means failure records are cleaned automatically after one day.
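
For example, the resource default can be set with the crm shell:

crm configure rsc_defaults failure-timeout=86400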

priority-fencing-delay = 30

The optional crm property priority-fencing-delay specifies a delay for fencing actions that target the lost nodes with the highest total resource priority, in case the local cluster partition does not hold the majority of nodes. This gives the more significant nodes a chance to win the fencing match, which is especially meaningful under split-brain in a two-node cluster. A promoted resource instance takes the base priority plus 1 into the calculation if the base priority is not 0. Any delay introduced by pcmk_delay_max configured for the corresponding fencing resources will be added to this delay. A meta attribute priority=100 or alike for the ControlZone resource is needed to make this work. See ocf_suse_SAPCMControlZone(7).
The delay should be significantly greater than, or safely twice, pcmk_delay_max.
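
A sketch for setting the property and the related resource priority with the crm shell. The resource group name grp_C11 is an example taken from the examples below:

crm configure property priority-fencing-delay=30
# example only: grp_C11 is the ControlZone resource group from the examples below
crm resource meta grp_C11 set priority 100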

EXAMPLES

* CRM basic configuration.

This example has been taken from a two-node SLE-HA 15 SP4 cluster with disk-based SBD. Priority fencing is configured and the SBD pcmk_delay_max has been reduced accordingly. The stonith-timeout is adjusted to the SBD on-disk msgwait. The migration-threshold is set to 3. The failure-timeout is 86400 seconds.

primitive rsc_stonith_sbd stonith:external/sbd \
params pcmk_delay_max=15

property cib-bootstrap-options: \
cluster-infrastructure=corosync \
placement-strategy=balanced \
dc-deadtime=20 \
stonith-enabled=true \
stonith-timeout=150 \
stonith-action=reboot \
have-watchdog=true \
priority-fencing-delay=30

rsc_defaults rsc-options: \
resource-stickiness=1 \
migration-threshold=3 \
failure-timeout=86400

op_defaults op-options: \
timeout=120 \
record-pending=true

* Statically mounted NFS share for ControlZone platform data.

Below is an fstab example for a shared filesystem holding application data. The filesystem is statically mounted on all nodes of the cluster. The correct mount options depend on the NFS server. However, client-side write caching has to be disabled in any case.

nfs1:/s/c11/platform /mnt/platform nfs4 rw,noac,sync 0 0

Note: The NFS share might be monitored, but not mounted/umounted by the HA cluster. See ocf_suse_SAPCMControlZone(7) for details.
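
The effective mount options can be verified on each node, e.g.:

findmnt /mnt/platform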

* Ping cluster resource for checking connectivity.

Below is an example of an optional ping resource for checking connectivity to the outer world. If the nodes have only one network interface, shared between HA cluster and application, this measure may not improve availability.
ControlZone should run on a node from which more ping targets can be reached than from others. If all nodes are equal, the ControlZone application resource group (e.g. grp_C11) stays where it is. Three vital infrastructure servers outside the datacenter are chosen as ping targets. If at least two targets are reachable, the current node is preferred for running ControlZone. The maximum time for detecting connectivity changes is ca. 180 seconds.

primitive rsc_ping ocf:pacemaker:ping \
params name=ping-okay host_list="proxy1 proxy2 proxy3" \
op monitor interval=120 timeout=60 start-delay=10 on-fail=ignore

clone cln_ping rsc_ping

location loc_connect_C11 grp_C11 \
rule 90000: ping-okay gt 1
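
The resulting node attribute ping-okay can be inspected on the running cluster, e.g.:

crm_mon -1 -A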

FILES

/etc/passwd
the local user database
/etc/group
the local group database
/etc/hosts
the local hostname resolution database
/etc/chrony.conf
the basic configuration for time synchronisation
/etc/sysctl.d/*.conf
the OS kernel parameters, e.g. TCP tunables
/etc/fstab
the filesystem table, for statically mounted NFS shares
~/.bashrc
the mzadmin's ~/.bashrc, defining JAVA_HOME, MZ_HOME and MZ_PLATFORM
/usr/lib64/jvm/jre-17-openjdk/
the Java 17 runtime environment shipped with the OS, potentially $JAVA_HOME

BUGS

In case of any problem, please use your favourite SAP support process to open a request for the component BC-OP-LNX-SUSE.
Please report feedback and suggestions to feedback@suse.com.

SEE ALSO

ocf_suse_SAPCMControlZone(7), ocf_heartbeat_ping(7), crm(8), passwd(5), usermod(8), hosts(5), fstab(5), nfs(5), mount(8), chrony.conf(5), ha_related_suse_tids(7), ha_related_sap_notes(7),
https://documentation.suse.com/sbp/sap/ ,
https://documentation.suse.com/#sle-ha ,
https://www.suse.com/support/kb/doc/?id=000019514 ,
https://www.suse.com/support/kb/doc/?id=000019722 ,
https://launchpad.support.sap.com/#/notes/1552925 ,
https://launchpad.support.sap.com/#/notes/3079845

AUTHORS

F.Herschel, L.Pinne

COPYRIGHT

(c) 2023-2024 SUSE LLC
SAPCMControlZone comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at http://www.gnu.org/licenses/gpl.html

18 Mar 2024