SAPCMControlZone_maintenance_examples(7)

SAPCMControlZone

SAPCMControlZone_maintenance_examples(7)

NAME¶

SAPCMControlZone_maintenance_examples - maintenance examples for SAPCMControlZone.

DESCRIPTION¶

Maintenance examples for ControlZone resources in Linux HA clusters. Please see ocf_suse_SAPCMControlZone(7) for more examples and read the REQUIREMENTS section there.

EXAMPLES¶

* Example for checking clean and idle state.

This steps should be performed before doing anything with the cluster, and after something has been done. Example user is mzadmin.

# su - mzadmin -c "mzsh status"
# crm_mon -1r
# crm configure show | grep cli-
# cibadmin -Q | grep fail-count
# cs_clusterstate -i

* Example for manual takeover using Linux cluster.

ControlZone application and Linux cluster are checked for clean and idle state. The ControlZone resources are moved to the other node. The related location rule is removed after the takeover took place. ControlZone application and Linux cluster are checked for clean and idle state. Resource group is grp_C11, user is mzadmin.

# su - mzadmin -c "mzsh status"
# crm_mon -1r
# crm configure show | grep cli-
# cibadmin -Q | grep fail-count
# cs_clusterstate -i

# crm resource move grp_C11 force
# cs_wait_for_idle -s 9; crm_mon -1r
# crm resource clear grp_C11

# cs_wait_for_idle -s 6; crm_mon -1r
# crm configure show | grep cli-
# su - mzadmin -c "mzsh status"

* Example for generic maintenance procedure.

ControlZone application and Linux cluster are checked for clean and idle state. The ControlZone resource group is set into maintenance mode. This is needed to allow manual actions on the resources. After the manual actions are done, the resource group is put back under cluster control. It is neccessary to wait for each step to complete and to check the result. ControlZone application and Linux cluster are checked for clean and idle state. Resource group is grp_C11, user is mzadmin. See also example above.

# su - mzadmin -c "mzsh status"
# crm_mon -1r
# crm configure show | grep cli-
# cibadmin -Q | grep fail-count
# cs_clusterstate -i
# crm resource maintenance grp_C11

# crm resource refresh grp_C11
# cs_wait_for_idle -s 6; crm_mon -1r
# crm resource maintenance grp_C11 off
# cs_wait_for_idle -s 6; crm_mon -1r
# su - mzadmin -c "mzsh status"

* Set whole Linux cluster into maintenance.

This disables all resource management as well as node fencing. However, fence requests from outside will be queued and executed once the cluster leaves maintenance mode. See manual page crm(8) and stonith_admin(8).

# crm maintenance on
# crm_attribute --query -t crm_config -n maintenance-mode

* Remove left-over maintenance attribute from overall Linux cluster.

This could be done to avoid confusion caused by different maintenance procedures. Before doing so, check for cluster attribute maintenance-mode="false".

# crm_attribute --query -t crm_config -n maintenance-mode
# crm_attribute --delete -t crm_config -n maintenance-mode

* Remove left-over standby attribute from Linux cluster nodes.

This could be done to avoid confusion caused by different maintenance procedures. Before doing so for all nodes, check for node attribute standby="off" on all nodes. Example node is node1.

# crm_attribute --query -t nodes -N node1 -n standby
# crm_attribute --delete -t nodes -N node1 -n standby

* Remove left-over maintenance attribute from resource.

This should usually not be needed. Resource group is grp_C11.

# crm_resource --resource grp_C11 --delete-parameter maintenance --meta

* Disable Linux cluster on all cluster nodes.

On any cluster node the cluster will not start automatically on boot anymore. Nevertheless a currently running cluster will keep running. Needs password-less ssh between cluster nodes.

# crm cluster run "crm cluster disable"
# crm cluster run "systemctl status pacemaker" | grep pacemaker.service

* Start Linux cluster on all cluster nodes.

Needs password-less ssh between cluster nodes.

# crm cluster start --all
# crm cluster run "systemctl status pacemaker" | grep pacemaker.service

* Start Linux cluster after node has been fenced.

It is recommended to not configure the Linux cluster for always starting autmatically on boot. Better is to start automatically only, if cluster and/or node have been stopped cleanly. If the node has been rebooted by STONITH, the cluster should not start automatically. If the cluster is configure that way, some steps are needed to start the cluster after a node has been rebooted by STONITH. STONITH via SBD is used in this example.

# cs_clear_sbd_devices --all
# cs_show_sbd_devices
# crm cluster start
# crm_mon -r

* Overview on maintenance procedure for Linux cluster or OS.

ControlZone instance remains running. But it will not automatically be moved or restarted while the Linux cluster is inactive. It is neccessary to wait for each step to complete and to check the result. See examples above for details.

1. Check status of Linux cluster and ControlZone instance, see above.
2. Set the Linux cluster into maintenance mode.
3. Stop Linux Cluster on all nodes.
4. Perform maintenance on Linux cluster or OS.
5. Start Linux cluster on all nodes.
6. Let Linux cluster detect status of ControlZone resources.
7. Set cluster ready for operations.
8. Check status of Linux cluster and ControlZone instance, see above.

BUGS¶

Please report feedback and suggestions to feedback@suse.com.

AUTHORS¶

F.Herschel, L.Pinne

COPYRIGHT¶

(c) 2024 SUSE LLC
SAPCMControlZone comes with ABSOLUTELY NO WARRANTY.
For details see the GNU General Public License at http://www.gnu.org/licenses/gpl.html

15 Apr 2024

Source file:	SAPCMControlZone_maintenance_examples.7.en.gz (from sap-convergent-resource-agents 1.0.0-160000.2.2)
Source last updated:	2024-05-13T09:10:52Z
Converted to HTML:	2025-10-30T21:13:05Z