
SAPStartSrv_basic_cluster(7) SAPStartSrv SAPStartSrv_basic_cluster(7)

NAME

SAPStartSrv_basic_cluster - basic settings to make SAPStartSrv work

DESCRIPTION

The SAP Enqueue Standalone 2 (ENSA2) scenario needs a certain basic cluster configuration. Besides these necessary settings, some additional configurations might match specific needs.

* Operating System Basics

systemd services

The services sapinit, sapping and sappong are needed for this cluster.

For SystemV style saphostagent and sapstartsrv, the sapinit script needs to be enabled. For systemd style saphostagent and sapstartsrv, the service saphostagent needs to be enabled and running, instance services SAP${SID}_${INO} need to be disabled. See also REQUIREMENTS in man page ocf_suse_SAPStartSrv(7).

tcp_retries2 = 9

The OS network parameter tcp_retries2 influences how long an established TCP connection is kept alive while retransmissions remain unacknowledged. The SAP application servers (PAS/AAS) and central services (ASCS/ERS) rely on TCP timeouts for detecting lost connections. On the other hand, SAP session timeouts and enqueue replication timeouts are defined on application level. Tuning tcp_retries2 helps SAP sessions and enqueue replication survive cluster actions.
A value of 9 (for HZ=250) should let Linux TCP connections time out fast enough for the default SAP application server and central services configuration.
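
To get a feeling for the effect of tcp_retries2, the following sketch adds up the retransmission timeouts for a given value. It assumes, as a simplification, that the retransmission timeout starts at 200 ms, doubles with each retry and is capped at 120 seconds. Real connections derive the timeout from the measured round-trip time, so the results are rough upper bounds only. The function name and the sysctl file name are made up for this illustration.

```shell
# Hypothetical time in milliseconds until an established TCP connection
# is given up, for a given tcp_retries2 value.
# Assumption: RTO starts at 200 ms, doubles per retry, capped at 120 s.
tcp_retries2_timeout_ms() {
    retries=$1
    rto=200
    total=0
    i=0
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        if [ "$rto" -gt 120000 ]; then
            rto=120000
        fi
        i=$((i + 1))
    done
    echo "$total"
}

tcp_retries2_timeout_ms 9     # value proposed here: 204600
tcp_retries2_timeout_ms 15    # Linux default: 924600

# To set the value persistently, as root (file name is only an example):
#   echo "net.ipv4.tcp_retries2 = 9" > /etc/sysctl.d/90-sap-tcp.conf
#   sysctl --system
```

Under these assumptions, the value 9 gives roughly three and a half minutes compared to about fifteen minutes with the default of 15.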

Users and groups

Technical users and groups, such as <sid>adm, are defined locally in the Linux system. Furthermore, the user <sid>adm needs to be in the group haclient. See man page passwd(5) and usermod(8).

Hostnames

Name resolution of the cluster nodes and the virtual IP address must be done locally on all cluster nodes. See man page hosts(5).
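
A quick way to verify local name resolution is to look for the required names in the hosts file itself. The sketch below prints every name that has no entry. The function name and the host names en2node1, en2node2 and en2ascs are placeholders, not names taken from a real setup.

```shell
# Print every given name that has no entry in the given hosts file.
# Comments in the file are ignored; the first field (the address) is skipped.
missing_in_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$name"
    done
}

# Placeholder names for two cluster nodes and the virtual ASCS host.
if [ -r /etc/hosts ]; then
    missing_in_hosts /etc/hosts en2node1 en2node2 en2ascs
fi
```

No output means all names are resolvable from the local file. The check needs to be run on every cluster node.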

Time synchronization

Strict time synchronization between the cluster nodes is mandatory, e.g. via NTP. See man page chrony.conf(5).

NFS mounted filesystems

The shared filesystems /sapmnt/$SID/ and /usr/sap/$SID/ can be statically mounted on all cluster nodes. See man page fstab(5) and example below.

* CRM Basics

stonith-enabled = true

The cib bootstrap option stonith-enabled is crucial for any reliable pacemaker cluster.
The value 'true' is one pre-requisite for having a cluster supported.

resource-stickiness = 1

The crm rsc_default resource-stickiness defines the 'stickiness' score a resource gets on the node where it is currently running. This prevents the cluster from moving resources around without an urgent need during a cluster transition. The correct value depends on the number of resources, colocation rules and resource groups. In particular, additional resources colocated with the ASCS resource can affect cluster decisions. A value that is too high might prevent not only unwanted but also useful actions.

migration-threshold = 1

The crm rsc_default parameter migration-threshold defines how many errors on a resource can be detected before this resource will be moved to another node. For ENSA1 the migration-threshold always needs to be 1. For ENSA2 the value could be higher. See also failure-timeout.

record-pending = true

The crm op_default record-pending defines whether the intention of an action upon the resource is recorded in the Cluster Information Base (CIB). Setting this parameter to 'true' allows the user to see pending actions like 'starting' and 'stopping' in crm_mon. The sap_suse_cluster_connector interface also uses this information.

failure-timeout = 86400

The crm rsc_default failure-timeout defines how long failed actions will be kept in the CIB. After that time the failure record will be deleted. Time unit is seconds. See also migration-threshold.
The value '86400' means failure records will be cleaned automatically after one day.

priority-fencing-delay = 30

The optional crm property priority-fencing-delay specifies a delay for fencing actions that target the lost nodes with the highest total resource priority, in case the local cluster partition does not have the majority of the nodes. This way the more significant nodes potentially win any fencing match, which is especially meaningful under split-brain of a 2-node cluster. A promoted resource instance takes the base priority + 1 on calculation if the base priority is not 0. Any delays introduced by pcmk_delay_max configured for the corresponding fencing resources are added to this delay. A meta attribute priority=100 or similar for the ASCS resource is needed to make this work. See ocf_suse_SAPStartSrv(7).

The delay should be significantly greater than, or safely twice, pcmk_delay_max.
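
The rule of thumb above can be checked with simple arithmetic. The following sketch takes both values in seconds and uses the factor two mentioned above as threshold. The function name is made up for this illustration; the values 30 and 15 are the ones used in the two-node example in section EXAMPLES.

```shell
# Return success if priority-fencing-delay is at least twice pcmk_delay_max.
# Both values are given in seconds.
fencing_delay_ok() {
    priority_fencing_delay=$1
    pcmk_delay_max=$2
    [ "$priority_fencing_delay" -ge $((2 * pcmk_delay_max)) ]
}

if fencing_delay_ok 30 15; then
    echo "priority-fencing-delay is large enough"
else
    echo "priority-fencing-delay is too small"
fi
```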

EXAMPLES

* crm basic configuration

Below are examples of crm basic configuration for ENSA2 clusters. Shown are specific parameters which are needed. Some general parameters are left out.
This example has been taken from an ENSA2 three-node cluster SLE-HA 15 GA with diskless SBD:

property cib-bootstrap-options: \
expected-quorum-votes=3 \
no-quorum-policy=suicide \
dc-deadtime=20 \
have-watchdog=true \
cluster-infrastructure=corosync \
cluster-name=hacluster \
stonith-enabled=true \
stonith-watchdog-timeout=10 \
placement-strategy=balanced

rsc_defaults rsc-options: \
resource-stickiness=1 \
migration-threshold=3 \
failure-timeout=86400

op_defaults op-options: \
timeout=600 \
record-pending=true

This example has been taken from an ENSA2 two-node cluster SLE-HA 15 GA with disk-based SBD. An optional priority fencing is configured and the SBD pcmk_delay_max has been reduced:

primitive rsc_stonith_sbd stonith:external/sbd \
params pcmk_delay_max=15

property cib-bootstrap-options: \
dc-deadtime=20 \
cluster-infrastructure=corosync \
cluster-name=hacluster \
stonith-enabled=true \
stonith-timeout=150 \
placement-strategy=balanced \
priority-fencing-delay=30

rsc_defaults rsc-options: \
resource-stickiness=1 \
migration-threshold=3 \
failure-timeout=86400

op_defaults op-options: \
timeout=600 \
record-pending=true

* NFS shares for SAP instance filesystems

Below is an fstab example for filesystems needed by the ASCS/ERS pair. The filesystems are statically mounted on all nodes of the cluster for SAP system EN2. The SAP instance name is used consistently to prepare for optional multi-SID setups. The parent directory /usr/sap/ resides on each node locally. The file sapservices must not be shared between nodes. The correct mount options depend on the NFS server.

nfs1:/s/EN2/sapmnt /sapmnt/EN2 nfs rw,hard,intr,nolock,actimeo=1,proto=tcp 0 0
nfs1:/s/EN2/usrsap /usr/sap/EN2 nfs rw,hard,intr,nolock,actimeo=1,proto=tcp 0 0
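
Whether both entries are present on a node can be checked by scanning the fstab file. The sketch below prints each expected mount point of one SID that is not listed as an NFS filesystem. The function name is made up for this illustration; it only inspects the file and does not tell whether the filesystems are currently mounted.

```shell
# Print the SAP mount points of one SID that are not listed as NFS
# filesystems in an fstab-style file.
missing_sap_mounts() {
    fstab_file=$1
    sid=$2
    for mount_point in "/sapmnt/$sid" "/usr/sap/$sid"; do
        awk -v mp="$mount_point" '
            $1 !~ /^#/ && $2 == mp && $3 ~ /^nfs/ { found = 1 }
            END { exit !found }
        ' "$fstab_file" || echo "$mount_point"
    done
}

if [ -r /etc/fstab ]; then
    missing_sap_mounts /etc/fstab EN2
fi
```

No output means both entries exist. The check needs to be run on every cluster node.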

* ping resource for checking connectivity

Below is an example of an optional ping resource for checking connectivity to the outer world. If the nodes have only one network interface, shared between HA cluster and application, this measure does not improve availability.
ASCS should run on a node from which more ping targets can be reached than from the others. If all nodes are equal, ASCS stays where it is. Three vital infrastructure servers outside the datacenter are chosen as ping targets. If at least two targets are reachable, the current node is preferred for running the ASCS. The maximum time for detecting connectivity changes is ca. 180 seconds.

primitive rsc_ping ocf:pacemaker:ping \
op monitor interval=120 timeout=60 start-delay=10 on-fail=ignore \
params name=ping_ok host_list="proxy1 proxy2 proxy3"

clone cln_ping rsc_ping

location ASCS00_connected grp_EN2_ASCS00 \
rule 90000: ping_ok gt 1
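
The effect of the location rule can be illustrated with a small sketch that mirrors the expression "ping_ok gt 1". It assumes that the multiplier of the ping resource agent is left at its default of 1, so that the node attribute ping_ok equals the number of reachable ping targets. The function name is made up for this illustration.

```shell
# Score a node gets from the location rule "rule 90000: ping_ok gt 1"
# for a given number of reachable ping targets.
# Assumption: ping multiplier is 1, so ping_ok = reachable targets.
ascs_location_score() {
    reachable=$1
    if [ "$reachable" -gt 1 ]; then
        echo 90000
    else
        echo 0
    fi
}

for reachable in 0 1 2 3; do
    echo "$reachable reachable: score $(ascs_location_score "$reachable")"
done
```

Since nodes with two or three reachable targets get the same score, the resource-stickiness keeps the ASCS on its current node in that case.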


* systemd services for the SAP instance

In case systemd style init is used for the SAP instance: saphostagent needs to be enabled and running, instance services need to be disabled. Example SID is HA1, instance number is 10.


# systemctl list-unit-files | grep -i sap
# systemctl status SAPHA1_10.service
# systemd-cgls -u SAP.slice
# systemd-cgls -u SAPHA1_10.service
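
The instance unit name follows the pattern SAP<SID>_<NR>.service. The following sketch builds the name from SID and instance number; the function name is made up for this illustration. The systemctl calls in the comments show one possible way to bring the services into the state described above. They assume that the host agent unit is named saphostagent.service and they need to be run as root on each cluster node.

```shell
# Build the systemd unit name of an SAP instance from SID and
# instance number.
sap_instance_unit() {
    printf 'SAP%s_%s.service\n' "$1" "$2"
}

unit=$(sap_instance_unit HA1 10)
echo "$unit"    # SAPHA1_10.service

# On a cluster node, as root:
#   systemctl enable --now saphostagent.service
#   systemctl disable "$unit"
#   systemctl is-enabled saphostagent.service "$unit"
```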

* check saphostagent and show SAP instances

Basic check for the saphostagent.

# /usr/sap/hostctrl/exe/saphostctrl -function Ping
# /usr/sap/hostctrl/exe/saphostctrl -function ListInstances

* SAP instance profile

Check the instance profile for HA specific settings. Example SID is EN2, instance number is 10.


# su - en2adm
~> sapcontrol -nr 10 -function GetStartProfile |\
grep -e art_Program_ -e Autostart -e halib
~> exit

* sidadm group membership

Check if the sidadm user is a member of the HA specific haclient group. Example SID is EN2.


# groups en2adm
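
For use in scripts, the membership can also be tested without reading the output manually. The sketch below checks a list of group names, as printed by "id -nG", for a given group. The function name is made up for this illustration and the group sapsys only serves as sample data.

```shell
# Return success if the wanted group (first argument) is among the
# group names given as remaining arguments.
has_group() {
    wanted=$1
    shift
    for group in "$@"; do
        if [ "$group" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Sample data instead of a real user lookup.
if has_group haclient sapsys haclient; then
    echo "member of haclient"
fi

# On a cluster node:
#   has_group haclient $(id -nG en2adm) || echo "en2adm is not in haclient"
# To add the user to the group, as root:
#   usermod -a -G haclient en2adm
```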

FILES

/etc/passwd
the local user database
/etc/group
the local group database
/etc/hosts
the local hostname resolution database
/etc/chrony.conf
basic config for time synchronisation
/etc/sysctl.d/*.conf
OS kernel parameters, e.g. TCP tunables
/etc/fstab
filesystem table, for statically mounted NFS shares
/etc/systemd/system/SAP<SID>_<NR>.service
systemd unit file for SAP instance

BUGS

In case of any problem, please use your favourite SAP support process to open a request for the component BC-OP-LNX-SUSE. Please report feedback and suggestions to feedback@suse.com.

SEE ALSO

ocf_suse_SAPStartSrv(7), sap_suse_cluster_connector(8), ocf_pacemaker_ping(7), ocf_heartbeat_ethmonitor(7), attrd_updater(8), sbd(8), stonith_sbd(8), crm(8), corosync.conf(5), votequorum(5), hosts(5), fstab(5), passwd(5), groups(1), usermod(8), chrony.conf(5), systemctl(1), systemd-cgls(1), systemd-analyze(1), systemd-delta(1), ha_related_suse_tids(7), ha_related_sap_notes(7),
https://documentation.suse.com/sbp/all/?context=sles-sap,
https://documentation.suse.com/sles-sap/,
https://www.suse.com/support/kb/,
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

AUTHORS

F.Herschel, L.Pinne

COPYRIGHT

(c) 2020-2023 SUSE LLC
SAPStartSrv comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at http://www.gnu.org/licenses/gpl.html

30 Jan 2023