
SAPHanaSR-alert-fencing(7) SAPHanaSR SAPHanaSR-alert-fencing(7)

NAME

SAPHanaSR-alert-fencing - Alert agent for cluster fencing alerts.

DESCRIPTION

SAPHanaSR-alert-fencing can be used to react to Linux cluster fencing alerts.

The Linux cluster provides an interface to initiate external actions when a cluster event (an alert) occurs. The cluster then calls an external program (an alert agent) to handle that alert.

When the Linux cluster has fenced a node, it can call SAPHanaSR-alert-fencing on each active cluster node. The agent checks whether the local node belongs to the same HANA site as the fenced node. If so, it asks the cluster to fence the local node as well.
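
The site comparison described above can be sketched in shell as follows. This is an illustration only, not the code of the shipped agent. It assumes that Pacemaker passes the name of the fenced node in the environment variable CRM_alert_node, and it uses a hypothetical plain-text map with one "host site" pair per line. The helper names site_of and same_site, the map format, and the host and site names are made up for this sketch.

```shell
#!/bin/sh
# Illustrative sketch only, not the code of SAPHanaSR-alert-fencing.
# Assumption: a plain-text map with one "<host> <site>" pair per line.
# The file format and the helper names are hypothetical.

# Print the site of a host, or nothing if the host is unknown.
site_of() {
    # $1 = map file, $2 = host name
    awk -v h="$2" '$1 == h { print $2; exit }' "$1"
}

# Succeed (return 0) if both hosts are known and share a site.
same_site() {
    # $1 = map file, $2 = local host, $3 = fenced host
    local_site=$(site_of "$1" "$2")
    fenced_site=$(site_of "$1" "$3")
    [ -n "$local_site" ] && [ "$local_site" = "$fenced_site" ]
}

# Demo with a throwaway map; host and site names are made up.
map=$(mktemp)
printf '%s\n' 'node11 SITE_A' 'node12 SITE_A' 'node21 SITE_B' > "$map"

# Pacemaker passes the affected node in CRM_alert_node; default for the demo.
fenced="${CRM_alert_node:-node11}"

if same_site "$map" node12 "$fenced"; then
    echo "node12: same site as $fenced, would request fencing of itself"
else
    echo "node12: different site than $fenced, nothing to do"
fi
rm -f "$map"
```

In this sketch an unknown host never matches any site, so a node that cannot be resolved does not trigger a fencing request.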

This improves four use cases for HANA scale-out:
- HA/DR provider hook script susChkSrv.py action_on_lost=fence
- resource agent SAPHanaController ON_FAIL_ACTION=fence
- resource agent SAPHanaFilesystem ON_FAIL_ACTION=fence
- pacemaker service PCMK_fail_fast=yes
See also manual pages ocf_suse_SAPHanaController(7), ocf_suse_SAPHanaFilesystem(7), SAPHanaSR-ScaleOut_basic_cluster(7) and susChkSrv.py(7).

SUPPORTED PARAMETERS

timeout
If the alert agent does not complete within this amount of time, it will be terminated. Optional, default "30s". Example "meta timeout=30s".
alert_uptime_threshold
How long a node must be up and running (uptime) before fencing alerts will be processed. This avoids fencing loops. Optional, default "300". Example "attributes alert_uptime_threshold=300".
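
The uptime check can be sketched as follows. This is an illustration only; the agent's real implementation may differ. It assumes that the threshold is given in seconds and is compared against the first field of /proc/uptime. The function name uptime_reached is hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; the agent's real implementation may differ.
# Assumption: the threshold is compared against the uptime in seconds,
# as reported in the first field of /proc/uptime on Linux.

# Succeed (return 0) if the uptime is at least the threshold.
uptime_reached() {
    # $1 = threshold in seconds, $2 = uptime file (default /proc/uptime)
    threshold="$1"
    file="${2:-/proc/uptime}"
    # Keep the integer part of the first field, e.g. "512.34 ..." -> 512.
    up=$(awk '{ print int($1); exit }' "$file")
    [ "${up:-0}" -ge "$threshold" ]
}

# Demo: only runs where /proc/uptime is available.
if [ -r /proc/uptime ]; then
    if uptime_reached 300; then
        echo "uptime at or above threshold, alerts would be processed"
    else
        echo "uptime below threshold, alerts would be ignored"
    fi
fi
```

Ignoring alerts shortly after boot is what breaks a fencing loop: a node that was just fenced and restarted does not immediately request another fencing.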

RETURN CODES

0 Successful program execution.
>0 Usage, syntax or execution errors.
In addition, log entries are written. They can be found by searching the system log for a pattern like "SAPHanaSR-alert-fencing".

EXAMPLES

* Example configuration for the fencing alert handler.

The following lines need to be added to the cluster's CIB:

alert fencing-1 "/usr/bin/SAPHanaSR-alert-fencing" \
select fencing \
attributes alert_uptime_threshold=300

* Example for configuring the alert agent by using crm.

An alternative way to configure the alert agent.

# crm configure alert fencing-1 "/usr/bin/SAPHanaSR-alert-fencing" select fencing

* Showing all configured alert agents.

# crm configure show type:alert

* Showing agent messages.

# grep SAPHanaSR-alert-fencing /var/log/messages

* Showing history of fence actions and cleaning it up.

The example node with a failed fencing action is node22.

# crm_mon -1 --include=none,fencing
# stonith_admin --cleanup --history node22

* Example for manually fencing a node.

This can be done to test the integration of the SAPHanaSR-alert-fencing agent. Do not run this test on production systems. See manual page crm(8) for details. The fenced node is node1.
Note: Understand the impact before trying.

# crm node fence node1

* Example for sudo permissions in /etc/sudoers.d/SAPHanaSR.

See also manual page sudoers(5).

# SAPHanaSR-alert-fencing needs
hacluster ALL=(ALL) NOPASSWD: /usr/sbin/crm --force node fence *
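
Whether a sudoers file contains the rule shown above can be checked with a small helper. This is a sketch under assumptions: the function name has_fence_rule is hypothetical, and the check only looks for the exact rule text from this manual page. It does not validate sudoers syntax; as root, "visudo -c -f /etc/sudoers.d/SAPHanaSR" can be used for that.

```shell
#!/bin/sh
# Illustrative sketch only. The helper name is hypothetical.
# It looks for the exact rule from this manual page and does not
# replace a syntax check with visudo.

# Succeed (return 0) if the file contains the fencing rule for hacluster.
has_fence_rule() {
    # $1 = sudoers file to inspect
    grep -Eq '^hacluster[[:space:]]+ALL=\(ALL\)[[:space:]]+NOPASSWD:[[:space:]]+/usr/sbin/crm --force node fence \*[[:space:]]*$' "$1"
}

# Demo: only runs if the file is readable (usually as root only).
rules=/etc/sudoers.d/SAPHanaSR
if [ -r "$rules" ]; then
    if has_fence_rule "$rules"; then
        echo "$rules: fencing rule for hacluster found"
    else
        echo "$rules: fencing rule for hacluster missing"
    fi
fi
```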

FILES

/usr/bin/SAPHanaSR-alert-fencing
the alert agent
/run/crm/SAPHanaSR_site_cache
the internal cache for host to site relation - do not touch this file
/etc/sudoers.d/
directory for sudoers config files
/etc/sysconfig/sbd
config file for SBD daemon
/etc/sysconfig/pacemaker
config file for pacemaker daemon

REQUIREMENTS

1. Pacemaker 2.1.2 or newer.

2. SAP HANA scale-out performance-optimized scenario. No HANA host auto-failover, thus no standby nodes.

3. Only one SID is controlled by the Linux cluster.

4. Site names and host names should not be changed.

5. No other alert agent should be configured for the fencing alert.

6. User hacluster is a member of group haclient. Both are defined locally on each cluster node.

7. User hacluster needs password-less sudo permission on "/usr/sbin/crm --force node fence *".

8. Concurrent fencing is configured, see manual page SAPHanaSR-ScaleOut_basic_cluster(7).

9. SAPHanaFilesystem RA with monitor operations is active.

10. Automatic restart of just fenced nodes should be disabled by adapting SBD_STARTMODE. In case of automatic restart of just fenced nodes, it might be necessary to adapt SBD_DELAY_START in order to avoid fencing loops. See manual page sbd(8).

11. The alert agent unconditionally executes fencing. The alert agent relies on the preceding fencing decision. Neither site role nor SR state is checked.

12. The alert agent runtime almost completely depends on call-outs to OS and Linux cluster.

BUGS

In case of any problem, please use your favourite SAP support process to open a request for the component BC-OP-LNX-SUSE. Please report any other feedback and suggestions to feedback@suse.com.

SEE ALSO

SAPHanaSR-angi(7), SAPHanaSR-ScaleOut(7), ocf_suse_SAPHanaController(7), ocf_suse_SAPHanaFilesystem(7), susChkSrv.py(7), crm(8), sbd(8), sudoers(5),
https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Administration/singlehtml/#alert-agents

AUTHORS

F.Herschel, L.Pinne.

COPYRIGHT

(c) 2024 SUSE LLC
SAPHanaSR-alert-fencing comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at http://www.gnu.org/licenses/gpl.html

02 Dec 2024