SAPHanaSR-alert-fencing(7) | SAPHanaSR | SAPHanaSR-alert-fencing(7) |
NAME
SAPHanaSR-alert-fencing - Alert agent for cluster fencing alerts.
DESCRIPTION
SAPHanaSR-alert-fencing can be used to react to Linux cluster fencing alerts.
The Linux cluster provides an interface to initiate external actions when a cluster event (alert) occurs. The cluster then calls an external program (an alert agent) to handle that alert.
When the Linux cluster has fenced a node, it can call SAPHanaSR-alert-fencing on each active cluster node. The agent checks whether the local node belongs to the same HANA site as the fenced node. If so, it asks the cluster to fence the local node as well.
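The site check described above can be sketched in shell. This is a minimal illustration under stated assumptions, not the agent's actual code: the host-to-site mapping is passed as plain `host=site` words (a hypothetical format, unrelated to the agent's internal cache file), and the function names, host names and site names are invented for the example.

```shell
#!/bin/sh
# Sketch only: decide whether the local node shares a HANA site with the
# fenced node. The "host=site" mapping format is a hypothetical stand-in.

# site_of HOST MAPPING... -> prints the site of HOST, fails if HOST is unknown
site_of() {
    host="$1"; shift
    for entry in "$@"; do
        case "$entry" in
            "$host="*) printf '%s\n' "${entry#*=}"; return 0 ;;
        esac
    done
    return 1
}

# same_site LOCAL FENCED MAPPING... -> exit 0 if both hosts are on one site
same_site() {
    local_host="$1"; fenced_host="$2"; shift 2
    local_site=$(site_of "$local_host" "$@") || return 1
    fenced_site=$(site_of "$fenced_host" "$@") || return 1
    [ "$local_site" = "$fenced_site" ]
}

# Hypothetical two-site scale-out layout.
MAP="node11=SITE_A node12=SITE_A node21=SITE_B node22=SITE_B"

# Word splitting of $MAP is intended here.
if same_site node21 node22 $MAP; then
    echo "node21: same site as fenced node22, would request own fencing"
fi
```

A node on the other site (for example node11 in this layout) fails the check and would take no action.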
This improves four use cases for HANA scale-out:
- HA/DR provider hook script susChkSrv.py action_on_lost=fence
- resource agent SAPHanaController ON_FAIL_ACTION=fence
- resource agent SAPHanaFilesystem ON_FAIL_ACTION=fence
- pacemaker service PCMK_fail_fast=yes
See also manual pages ocf_suse_SAPHanaController(7),
ocf_suse_SAPHanaFilesystem(7), SAPHanaSR-ScaleOut_basic_cluster(7) and
susChkSrv.py(7).
SUPPORTED PARAMETERS
- timeout
- If the alert agent does not complete within this amount of time, it will be terminated. Optional, default "30s". Example "meta timeout=30s".
- alert_uptime_threshold
- How long a node must be up and running (uptime) before fencing alerts will be processed. This avoids fencing loops. Optional, default "300". Example "attributes alert_uptime_threshold=300".
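The effect of alert_uptime_threshold can be illustrated with a small shell sketch. It is an assumption-laden illustration, not the agent's implementation: the function name is hypothetical, the uptime is read from /proc/uptime, and whether the real comparison is inclusive is not stated in this page.

```shell
#!/bin/sh
# Sketch only: ignore fencing alerts while the node's uptime is below a
# threshold, so that a freshly rebooted node does not fence itself again.

# uptime_reached UPTIME_SECONDS THRESHOLD_SECONDS -> exit 0 if the alert
# may be processed (inclusive comparison is an assumption)
uptime_reached() {
    [ "$1" -ge "$2" ]
}

threshold=300   # the documented default of alert_uptime_threshold

# On Linux, the first field of /proc/uptime is the uptime in seconds.
if [ -r /proc/uptime ]; then
    read -r up_raw rest < /proc/uptime
    up_seconds=${up_raw%%.*}   # drop the fractional part
    if uptime_reached "$up_seconds" "$threshold"; then
        echo "uptime ${up_seconds}s >= ${threshold}s: alert would be processed"
    else
        echo "uptime ${up_seconds}s < ${threshold}s: alert would be ignored"
    fi
fi
```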
RETURN CODES
0 Successful program execution.
>0 Usage, syntax or execution errors.
In addition, log entries are written. They can be searched with a pattern
like "SAPHanaSR-alert-fencing".
EXAMPLES
* Example configuration for the fencing alert handler.
The following lines need to be added to the cluster's CIB:
select fencing \
attributes alert_uptime_threshold=300
* Example for configuring the alert agent by using crm.
Alternate way for configuring the alert agent.
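A possible crm call for this can be assembled as shown in the following sketch. The alert ID "fencing-1" is a hypothetical name chosen for illustration; the agent path comes from the FILES section and the select and attributes clauses from the configuration example above. The sketch only prints the command for review (dry run) and does not change the cluster.

```shell
#!/bin/sh
# Sketch only: build the crm command line and print it for review.
alert_id="fencing-1"                        # hypothetical alert ID
agent="/usr/bin/SAPHanaSR-alert-fencing"    # path as listed under FILES
threshold=300                               # documented default, in seconds

alert_cmd="crm configure alert ${alert_id} \"${agent}\" select fencing attributes alert_uptime_threshold=${threshold}"

# Dry run: show the command instead of executing it.
echo "$alert_cmd"
```

After reviewing the printed line, it could be run as root on one cluster node.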
* Showing all configured alert agents.
* Showing agent messages.
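One way to list the agent's log entries is to search the system log for the agent name. The sketch below assumes that messages end up in /var/log/messages; that location and the function name are assumptions and may differ depending on the logging setup.

```shell
#!/bin/sh
# Sketch only: list log lines written by the alert agent.

# show_agent_messages LOGFILE -> prints lines containing the agent name
show_agent_messages() {
    grep -F -e "SAPHanaSR-alert-fencing" -- "$1"
}

logfile="/var/log/messages"   # assumed location; adjust to your logging setup
if [ -r "$logfile" ]; then
    show_agent_messages "$logfile" || echo "no agent messages in $logfile"
fi
```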
* Showing history of fence actions and cleaning it up.
Example node with failed fencing action is node22.
# stonith_admin --cleanup --history node22
* Example for manually fencing a node.
This could be done for testing the SAPHanaSR-alert-fencing agent
integration. This test should not be done on production systems. See manual
page crm(8) for details. Fenced node is node1.
Note: Understand the impact before trying.
* Example for sudo permissions in /etc/sudoers.d/SAPHanaSR.
See also manual page sudoers(5).
hacluster ALL=(ALL) NOPASSWD: /usr/sbin/crm --force node fence *
FILES
- /usr/bin/SAPHanaSR-alert-fencing
- the alert agent
- /run/crm/SAPHanaSR_site_cache
- the internal cache for host to site relation - do not touch this file
- /etc/sudoers.d/
- directory for sudoers config files
- /etc/sysconfig/sbd
- config file for SBD daemon
- /etc/sysconfig/pacemaker
- config file for pacemaker daemon
REQUIREMENTS
1. Pacemaker 2.1.2 or newer.
2. SAP HANA scale-out performance-optimized scenario. No HANA host auto-failover, thus no standby nodes.
3. Only one SID is controlled by the Linux cluster.
4. Site names and host names should not be changed.
5. No other alert agent should be configured for the fencing alert.
6. User hacluster is a member of group haclient. Both are defined locally on each cluster node.
7. User hacluster needs password-less sudo permission on "/usr/sbin/crm --force node fence *".
8. Concurrent fencing is configured, see manual page SAPHanaSR-ScaleOut_basic_cluster(7).
9. SAPHanaFilesystem RA with monitor operations is active.
10. Automatic restart of just-fenced nodes should be disabled by adapting SBD_STARTMODE. If just-fenced nodes are restarted automatically, it might be necessary to adapt SBD_DELAY_START in order to avoid fencing loops. See manual page sbd(8).
11. The alert agent unconditionally executes fencing. The alert agent relies on the preceding fencing decision. Neither site role nor SR state is checked.
12. The alert agent runtime almost completely depends on call-outs to OS and Linux cluster.
BUGS
In case of any problem, please use your favourite SAP support process to open a request for the component BC-OP-LNX-SUSE. Please report any other feedback and suggestions to feedback@suse.com.
SEE ALSO
SAPHanaSR-angi(7), SAPHanaSR-ScaleOut(7),
ocf_suse_SAPHanaController(7), ocf_suse_SAPHanaFilesystem(7),
susChkSrv.py(7), crm(8), sbd(8), sudoers(5),
https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Administration/singlehtml/#alert-agents
AUTHORS
F.Herschel, L.Pinne.
COPYRIGHT
(c) 2024 SUSE LLC
SAPHanaSR-alert-fencing comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at
http://www.gnu.org/licenses/gpl.html
02 Dec 2024 |