md_monitor(8) System Manager's Manual md_monitor(8)

NAME

md_monitor - MD device monitor

SYNOPSIS

md_monitor [-d|--daemonize] [-f file|--logfile file] [-s|--syslog] [-e num|--expires=num] [-P num|--process-limit=num] [-O num|--open-file-limit=num] [-m|--fail-mirror] [-o|--fail-disk] [-r num|--retries=num] [-p prio|--log-priority=prio] [-v|--verbose] [-y|--check-in-sync] [-c cmd|--command=cmd] [-V|--version] [-h|--help]

DESCRIPTION

md_monitor monitors the component devices of each MD array for I/O issues. It updates the monitored MD arrays on each status change, setting devices to 'faulty' or re-integrating working devices.

OPTIONS

md_monitor recognizes the following command-line options:

-c cmd, --command=cmd
Send command cmd to the daemon.
-d, --daemonize
Start md_monitor in the background.
-e num, --expires=num
Set failfast_expires to num.
-f file, --logfile file
Write logging information to file instead of stdout.
-h, --help
Display md_monitor usage information.
-m, --fail-mirror
Fail and reset the entire mirror half when one device has failed. This is the default.
-O num, --open-file-limit=num
Set the maximum number of open files (RLIMIT_NOFILE, see getrlimit(2)) to num. Default is 4096.
-o, --fail-disk
Only fail the affected disk when one device has failed. This is the opposite of --fail-mirror.
-P num, --process-limit=num
Set the maximum number of processes (RLIMIT_NPROC, see getrlimit(2)) to num.
-p prio, --log-priority=prio
Set the logging priority to prio.
-r num, --retries=num
Set failfast_retries to num.
-s, --syslog
Write logging information to syslog.
-v, --verbose
Increase the logging priority.
-y, --check-in-sync
Run path checkers for 'in_sync' devices. Without this option, path checkers are stopped whenever a device is detected to be 'in_sync'. They are restarted once a device has been marked as 'faulty' or 'timeout'.
-V, --version
Display md_monitor version information.

The path checker runs every check-time seconds; the default is 1.

MD_MONITOR COMMAND MODE

When --command is specified, md_monitor connects to an already running md_monitor daemon and sends a pre-defined command. The command has the following syntax:

cmd:md@dev

The following values for cmd are recognised. If not specified otherwise, md needs to be the device node of an existing MD array.

Shut down md_monitor; the md argument should be /dev/console.
Rebuild has started on array md.
Rebuild has finished on array md.
Array md has been stopped; md_monitor will stop monitoring the component devices for array md.
Array md has been started. This event is ignored by md_monitor; new arrays will be detected via uevents.
MD detected a failure on the component device dev of array md. md_monitor will re-check the device every failfast_expires seconds.
MD detected a failure on the spare device dev of array md. md_monitor will re-check the device every failfast_expires seconds.
The component device dev has been removed from the MD array md. md_monitor will stop monitoring this device.
MD has integrated the device dev into array md. md_monitor will re-start monitoring of this device every failfast_expires seconds. The check interval will be increased for each successful check up to a maximum of failfast_expires * failfast_retries seconds.
Return the current internal status of the monitored devices.
Return the status of the MD component devices in abbreviated form. Each character represents the status of the MD component device at that position. For the possible states see DEVICE STATUS DISPLAY.
Return the current I/O status of the monitored devices in abbreviated form. Each character represents the I/O status of the monitored device at that position.
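
The command string can be assembled the same way the notify script under MDADM INTEGRATION does it. The following is a minimal sketch, not part of md_monitor: the helper name md_monitor_cmd is hypothetical, and the event name Fail and the device names in the usage comment are assumptions based on the events mdadm reports in monitor mode.

```shell
#!/bin/bash
# Build the argument for --command in the form "cmd:md@dev",
# mirroring the "${EVENT}:${MD}@${DEV}" string used by the notify script.
# md_monitor_cmd is a hypothetical helper name.
md_monitor_cmd() {
    printf '%s:%s@%s\n' "$1" "$2" "$3"
}

# Usage (assumed event and device names):
#   /sbin/md_monitor -c "$(md_monitor_cmd Fail /dev/md0 /dev/sda1)"
```

When no device is passed, the helper leaves the part after "@" empty, which is also what the notify script produces for events that carry no device argument.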

DEVICE STATUS DISPLAY

md_monitor displays state information about the monitored devices when the command MirrorStatus or MonitorStatus is sent. Each character of the returned string represents the state of the device at that position.

The possible states for MirrorStatus are:

. Unknown
In_Sync
Faulty
Timeout
Spare
- Removed
R Recovery pending
P Removal pending
B Blocked

R and P are intermediate states, which md_monitor sets whenever a command has been sent to mdadm but no notification has been received yet. B is set when MD attempts to fail the second half of the mirror while the first half has already failed. MD will hold off I/O to the entire mirror until the second half is usable again.

The possible states for MonitorStatus are:

. Unknown
MD will be stopped
I/O ok
I/O failed
P I/O pending
T I/O timeout
- Removed
S Spare
R Recovery

P and T describe the same condition, i.e. I/O has been stalled. The state switches from P to T when the timeout of failfast_expires * failfast_retries seconds has expired. -, S, and R are the steps MD takes to recover a device: first the device is removed, then it is re-added as a 'spare' device, and then recovery starts to re-integrate the spare device into the MD array.
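
A returned MonitorStatus string can be read position by position. The sketch below is an illustration only: the function names are hypothetical, and it maps just the characters spelled out in this section (., P, T, -, S, R), reporting anything else as undocumented.

```shell
#!/bin/bash
# Decode one MonitorStatus character. Only the characters named in this
# section are mapped; any other character is reported as undocumented.
decode_monitor_char() {
    case "$1" in
        .) echo "Unknown" ;;
        P) echo "I/O pending" ;;
        T) echo "I/O timeout" ;;
        -) echo "Removed" ;;
        S) echo "Spare" ;;
        R) echo "Recovery" ;;
        *) echo "undocumented state '$1'" ;;
    esac
}

# Decode a whole status string, printing one line per device position.
decode_monitor_status() {
    status=$1
    pos=0
    while [ -n "$status" ]; do
        rest=${status#?}          # everything after the first character
        char=${status%"$rest"}    # the first character itself
        printf '%d: %s\n' "$pos" "$(decode_monitor_char "$char")"
        status=$rest
        pos=$((pos + 1))
    done
}
```

For example, decode_monitor_status 'P-' reports position 0 as I/O pending and position 1 as Removed.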

THEORY OF OPERATION

md_monitor sets up a path checker thread for each MD component device. Every check-time seconds, this path checker issues an asynchronous I/O request to the device. It then waits up to failfast_expires * failfast_retries seconds for this I/O to complete. If no response is received during that time, the monitor status for this path is set to I/O timeout. If the I/O completes, the monitor status for this path is set to I/O ok or I/O failed, depending on whether the I/O completed without error. If the path checker is interrupted while waiting, the monitor status for this path is set to I/O pending.

After the monitor status has been updated, the path checker thread updates the MD status for this device and invokes an action, depending on these two states.

If check-in-sync has been specified, the path checker continues to run even for in_sync paths. Otherwise the path checker is stopped when a path is marked as in_sync. Path checkers are restarted whenever a device is marked as faulty or timeout.
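
The timeout arithmetic can be illustrated with a small sketch. This is not md_monitor's implementation: the function names are hypothetical and the numbers in the usage note are example values only. It follows the reading from DEVICE STATUS DISPLAY that a stalled I/O counts as pending until failfast_expires * failfast_retries seconds have passed, and as a timeout afterwards.

```shell
#!/bin/bash
# Upper bound, in seconds, that a path checker waits for its I/O:
# failfast_expires ($1) multiplied by failfast_retries ($2).
io_timeout_secs() {
    echo $(( $1 * $2 ))
}

# Map one check to a monitor status (hypothetical helper).
#   $1  result: ok, failed, or none (no completion seen yet)
#   $2  seconds waited so far
#   $3  failfast_expires
#   $4  failfast_retries
classify_io() {
    case "$1" in
        ok)     echo "I/O ok" ;;
        failed) echo "I/O failed" ;;
        *)
            if [ "$2" -ge "$(io_timeout_secs "$3" "$4")" ]; then
                echo "I/O timeout"
            else
                echo "I/O pending"
            fi
            ;;
    esac
}
```

With example values failfast_expires=5 and failfast_retries=3, the wait is bounded at 15 seconds, so an I/O still outstanding after 10 seconds is classified as pending and one outstanding after 20 seconds as a timeout.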

MDADM INTEGRATION

md_monitor listens to udev events for any device changes. It is designed to integrate into MD via the --monitor functionality of mdadm.

To use this function, mdadm needs to be started in monitor mode with md_script as its event handler program, where md_script is a bash script containing, e.g.:

#!/bin/bash
# MD monitor script
#
EVENT=$1
MD=$2
DEV=$3
/sbin/md_monitor -c "${EVENT}:${MD}@${DEV}"

A default md_script is installed at /usr/share/misc/md_notify_device.sh.
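
One possible way to start mdadm with the notify script is sketched below. It assumes mdadm's monitor-mode options --scan, --daemonise and --program; the helper name mdadm_monitor_cmdline is hypothetical, and the function only prints the command line so that it can be reviewed before it is run as root.

```shell
#!/bin/bash
# Print an mdadm command line that hands monitor events to md_script.
# Assumptions: mdadm's --monitor, --scan, --daemonise and --program options.
# $1 is the script path; it defaults to the installed notify script.
mdadm_monitor_cmdline() {
    script=${1:-/usr/share/misc/md_notify_device.sh}
    printf 'mdadm --monitor --scan --daemonise --program=%s\n' "$script"
}

# Usage: print the command, check it, then run it as root.
#   mdadm_monitor_cmdline
```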

It is recommended to use an /etc/mdadm.conf configuration file when using md_monitor to monitor MD arrays. To enable automatic device assembly into MD arrays, the configuration file should include the lines:

POLICY action=re-add
AUTO -all

It is recommended to include these lines when using md_monitor.

VERSIONS

This manual page documents md_monitor version 4.26.

FILES

/usr/share/misc/md_notify_device.sh
Default md_monitor script.
/etc/mdadm.conf
MD configuration file

SEE ALSO

mdadm(8), mdadm.conf(5)

Thu Nov 5 2015 md_monitor 6.6