5.2. Checking DRBD status

5.2.1. Retrieving status with drbd-overview

One convenient way to look at DRBD’s status is the drbd-overview utility (lines broken here for readability).

nina# drbd-overview
  0:r0/0  Connected(*) Seco(*)/Prim(nina) UpTo(*)/Disk(nono)
            /mnt ext3 1008M 18M 940M 2%
  1:r1/0  Connected(*) Secondary(*)       UpTo(*)/Disk(nono)
  5:r2/0  Connected(*) Seco(*)/Prim(nini) UpTo(*)/Disk(nono)
  6:r2/1  Connected(*) Seco(*)/Prim(nini) UpTo(*)/Disk(nono)

This output says that

  1. r0 volume 0 is Primary on host nina (the local host), and is mounted as an ext3 filesystem on /mnt.
  2. Host nono is Diskless for all resources (a DRBD client).
  3. All other hosts are UpToDate for all resources.
  4. r1 is Secondary on all hosts, i.e. not in use.
  5. r2 is in use on host nini.

drbd-overview tries to be both exact and concise. These goals conflict, so for more complicated status output it may be easier to read drbdsetup status directly.

5.2.2. Status information in /proc/drbd

Note

/proc/drbd is deprecated. While it won’t be removed in the 8.4 series, we recommend switching to other means, such as Section 5.2.3, “Status information via drbdadm”; or, even more convenient for monitoring, Section 5.2.4, “One-shot or realtime monitoring via drbdsetup events2”.

/proc/drbd is a virtual file displaying basic information about the DRBD module. It was used extensively up to DRBD 8.4, but couldn’t keep up with the amount of information provided by DRBD 9.

$ cat /proc/drbd
version: 9.0.0 (api:1/proto:86-110) FIXME
GIT-hash: XXX build by linbit@buildsystem.linbit, 2011-10-12 09:07:35

The first line, prefixed with version:, shows the DRBD version used on your system. The second line contains information about this specific build.
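
A script that still reads /proc/drbd could extract the version fields roughly as sketched below. The function name parse_proc_drbd_version and the dictionary keys are illustrative assumptions, and the pattern covers only the version line format shown in the sample above.

```python
import re

# Matches the first line of /proc/drbd as shown in the sample above, e.g.
# "version: 9.0.0 (api:1/proto:86-110)". Any trailing text is ignored.
VERSION_RE = re.compile(
    r"^version:\s*(?P<version>\S+)\s*"
    r"\(api:(?P<api>\d+)/proto:(?P<proto_min>\d+)(?:-(?P<proto_max>\d+))?\)"
)


def parse_proc_drbd_version(text):
    """Return version details from /proc/drbd content, or None if absent."""
    for line in text.splitlines():
        match = VERSION_RE.match(line.strip())
        if match:
            proto_min = int(match.group("proto_min"))
            # If only one protocol number is given, use it for both bounds.
            proto_max = int(match.group("proto_max") or proto_min)
            return {
                "version": match.group("version"),
                "api": int(match.group("api")),
                "proto_min": proto_min,
                "proto_max": proto_max,
            }
    return None


# Sample copied from the text above; on a real system, read /proc/drbd.
SAMPLE = (
    "version: 9.0.0 (api:1/proto:86-110) FIXME\n"
    "GIT-hash: XXX build by linbit@buildsystem.linbit, 2011-10-12 09:07:35\n"
)
info = parse_proc_drbd_version(SAMPLE)
```

Since /proc/drbd is deprecated, a parser like this is mainly useful for legacy tooling.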

5.2.3. Status information via drbdadm

In its simplest invocation, we just ask for the status of a single resource.

# drbdadm status home
home role:Secondary
  disk:UpToDate
  nina role:Secondary
    disk:UpToDate
  nino role:Secondary
    disk:UpToDate
  nono connection:WFConnection

This indicates that the resource home is UpToDate and Secondary on the local node, on nina, and on nino; the three nodes therefore have the same data on their storage devices, and nobody is currently using the device.

The node nono is not connected; its state is reported as WFConnection. Please see Section 5.2.5, “Connection states” below for more details.
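
For scripting, the simple status output can be turned into a data structure. The following is a minimal sketch that assumes the layout of the sample above (one resource, one volume, key:value items, and a bare name starting each block); parse_status is a hypothetical helper, not part of the DRBD tools.

```python
def parse_status(text):
    """Parse single-resource, single-volume `drbdadm status` output.

    Returns (resource_name, local_fields, peers), where peers maps each
    peer name to its own key/value fields.
    """
    resource = None
    local = {}
    peers = {}
    current = None  # dictionary that receives the next key:value items
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if ":" not in tokens[0]:
            # A bare name starts a new block: first the resource, then peers.
            name = tokens.pop(0)
            if resource is None:
                resource = name
                current = local
            else:
                current = peers.setdefault(name, {})
        if current is None:
            continue  # items before any block header are skipped
        for token in tokens:
            key, _, value = token.partition(":")
            current[key] = value
    return resource, local, peers


# Sample copied from the text above.
SAMPLE = """\
home role:Secondary
  disk:UpToDate
  nina role:Secondary
    disk:UpToDate
  nino role:Secondary
    disk:UpToDate
  nono connection:WFConnection
"""

resource, local, peers = parse_status(SAMPLE)
# In the sample, the connection item appears only for the unconnected peer.
not_connected = sorted(name for name, f in peers.items() if "connection" in f)
```

With several volumes per resource, later items would overwrite earlier ones in this sketch, so a real tool would need to track volumes separately.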

You can get more information by passing the --verbose and/or --statistics arguments to drbdsetup (lines broken for readability):

# drbdsetup status home --verbose --statistics
home node-id:1 role:Secondary suspended:no
    write-ordering:none
  volume:0 minor:0 disk:UpToDate
      size:1048412 read:0 written:1048412 al-writes:0 bm-writes:48 upper-pending:0
                                        lower-pending:0 al-suspended:no blocked:no
  nina local:ipv4:10.9.9.111:7001 peer:ipv4:10.9.9.103:7010 node-id:0
                                               connection:Connected role:Secondary
      congested:no
    volume:0 replication:Connected disk:UpToDate resync-suspended:no
        received:1048412 sent:0 out-of-sync:0 pending:0 unacked:0
  nino local:ipv4:10.9.9.111:7021 peer:ipv4:10.9.9.129:7012 node-id:2
                                               connection:Connected role:Secondary
      congested:no
    volume:0 replication:Connected disk:UpToDate resync-suspended:no
        received:0 sent:0 out-of-sync:0 pending:0 unacked:0
  nono local:ipv4:10.9.9.111:7013 peer:ipv4:10.9.9.138:7031 node-id:3
                                                           connection:WFConnection

Every few lines in this example form a block that is repeated for every node used in this resource, with small format exceptions for the local node - see below for more details.

The first line in each block shows the node-id (for the current resource; a host can have different node-ids in different resources). Furthermore the role (see Section 5.2.6, “Resource roles”) is shown.

The next important line begins with the volume specification; normally volumes are numbered starting at zero, but the configuration may specify other IDs as well. This line shows the connection state in the replication item (see Section 5.2.5, “Connection states” for details) and the remote disk state in disk (see Section 5.2.7, “Disk states”). It is followed by a line of statistics for this volume: data received, sent, out-of-sync, and so on. Please see Section 5.2.9, “Performance indicators” and Section 5.2.8, “Connection information data” for more information.

For the local node, the first line shows the resource name (home, in our example). As the first block always describes the local node, there is no connection or address information.

Please see the drbd.conf manual page for more information.

The other four lines in this example form a block that is repeated for every DRBD device configured, prefixed by the device minor number. In this case, this is 0, corresponding to the device /dev/drbd0.

The resource-specific output contains various pieces of information about the resource, which are described in Section 5.2.5, “Connection states” through Section 5.2.9, “Performance indicators”.

5.2.4. One-shot or realtime monitoring via drbdsetup events2

Note

This is available only with userspace versions 8.9.3 and up.

This is a low-level mechanism to get information out of DRBD, suitable for use in automated tools, like monitoring.

In its simplest invocation, showing only the current status, the output looks like this (but, when running on a terminal, will include colors):

 
# drbdsetup events2 --now r0
exists resource name:r0 role:Secondary suspended:no
exists connection name:r0 peer-node-id:1 conn-name:remote-host connection:Connected role:Secondary
exists device name:r0 volume:0 minor:7 disk:UpToDate
exists device name:r0 volume:1 minor:8 disk:UpToDate
exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:0
    replication:Established peer-disk:UpToDate resync-suspended:no
exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:1
    replication:Established peer-disk:UpToDate resync-suspended:no
exists -
 
drbdsetup example output (lines broken for readability)
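
Because every event is a sequence of key:value items, automated tools can split it mechanically. The sketch below assumes each event arrives on a single line (the sample above is broken across lines only for readability); parse_event is a hypothetical name.

```python
def parse_event(line):
    """Split one events2 line into (action, object_type, fields)."""
    tokens = line.split()
    action = tokens[0]
    # The closing "exists -" line seen in the sample carries no fields.
    if len(tokens) < 2 or tokens[1] == "-":
        return action, None, {}
    fields = {}
    for token in tokens[2:]:
        # Split at the first colon only, so values may contain colons.
        key, _, value = token.partition(":")
        fields[key] = value
    return action, tokens[1], fields


# Sample copied from the text above, with the broken lines rejoined.
SAMPLE = [
    "exists resource name:r0 role:Secondary suspended:no",
    "exists connection name:r0 peer-node-id:1 conn-name:remote-host "
    "connection:Connected role:Secondary",
    "exists device name:r0 volume:0 minor:7 disk:UpToDate",
    "exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:0 "
    "replication:Established peer-disk:UpToDate resync-suspended:no",
    "exists -",
]

events = [parse_event(line) for line in SAMPLE if line.strip()]
```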

Without the '--now', the process will keep running, and send continuous updates like this:

# drbdsetup events2 r0
...
change connection name:r0 peer-node-id:1 conn-name:remote-host connection:StandAlone
change connection name:r0 peer-node-id:1 conn-name:remote-host connection:Unconnected
change connection name:r0 peer-node-id:1 conn-name:remote-host connection:Connecting
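
A monitoring tool can fold this stream into a picture of the current state. The following sketch handles only the two actions shown in this section (exists and change) and ignores everything else; apply_event, follow_events and the key layout are assumptions made for illustration.

```python
import subprocess


def apply_event(state, line):
    """Update `state` in place from one single-line events2 event."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[0] not in ("exists", "change"):
        return  # "exists -", "..." and unhandled actions are skipped
    obj = tokens[1]
    fields = dict(t.partition(":")[::2] for t in tokens[2:])
    # Identify the object by type, resource name, peer, and volume.
    key = (obj, fields.get("name"), fields.get("peer-node-id"), fields.get("volume"))
    state.setdefault(key, {}).update(fields)


def follow_events(resource):
    """Yield the (shared, mutable) state after each event; needs drbdsetup."""
    state = {}
    cmd = ["drbdsetup", "events2", resource]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            apply_event(state, line)
            yield state


# Offline demonstration using lines copied from the samples in this section.
state = {}
for line in [
    "exists connection name:r0 peer-node-id:1 conn-name:remote-host "
    "connection:Connected role:Secondary",
    "change connection name:r0 peer-node-id:1 conn-name:remote-host connection:StandAlone",
    "change connection name:r0 peer-node-id:1 conn-name:remote-host connection:Unconnected",
    "change connection name:r0 peer-node-id:1 conn-name:remote-host connection:Connecting",
]:
    apply_event(state, line)

connection = state[("connection", "r0", "1", None)]
```

Note that a change event lists only the items that changed, which is why the sketch merges fields instead of replacing the whole entry.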

For monitoring purposes, there is another argument, '--statistics', which produces some performance counters and other facts:

drbdsetup verbose output (lines broken for readability):

# drbdsetup events2 --statistics --now r0
exists resource name:r0 role:Secondary suspended:no write-ordering:drain
exists connection name:r0 peer-node-id:1 conn-name:remote-host connection:Connected
                                                        role:Secondary congested:no
exists device name:r0 volume:0 minor:7 disk:UpToDate size:6291228 read:6397188
            written:131844 al-writes:34 bm-writes:0 upper-pending:0 lower-pending:0
                                                         al-suspended:no blocked:no
exists device name:r0 volume:1 minor:8 disk:UpToDate size:104854364 read:5910680
          written:6634548 al-writes:417 bm-writes:0 upper-pending:0 lower-pending:0
                                                         al-suspended:no blocked:no
exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:0
          replication:Established peer-disk:UpToDate resync-suspended:no received:0
                                      sent:131844 out-of-sync:0 pending:0 unacked:0
exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:1
          replication:Established peer-disk:UpToDate resync-suspended:no received:0
                                     sent:6634548 out-of-sync:0 pending:0 unacked:0
exists -
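
As an illustration of how these counters might feed a monitoring check, the sketch below extracts the numeric items of a peer-device event and applies a deliberately simple rule. The names peer_device_counters and needs_attention, as well as the rule itself, are assumptions rather than anything DRBD prescribes; the input is expected on a single line.

```python
# Counter names as they appear in the peer-device lines above.
COUNTERS = ("received", "sent", "out-of-sync", "pending", "unacked")


def peer_device_counters(line):
    """Return the counters of a peer-device event, or None for other lines."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[1] != "peer-device":
        return None
    fields = dict(t.partition(":")[::2] for t in tokens[2:])
    return {name: int(fields[name]) for name in COUNTERS if name in fields}


def needs_attention(counters):
    """Hypothetical policy: any out-of-sync data is worth a closer look."""
    return counters.get("out-of-sync", 0) > 0


# Sample copied from the text above, with the broken lines rejoined.
SAMPLE = (
    "exists peer-device name:r0 peer-node-id:1 conn-name:remote-host volume:1 "
    "replication:Established peer-disk:UpToDate resync-suspended:no received:0 "
    "sent:6634548 out-of-sync:0 pending:0 unacked:0"
)

counters = peer_device_counters(SAMPLE)
alert = needs_attention(counters)
```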

You might also find the '--timestamp' parameter useful; it prefixes each output line with a timestamp.

5.2.5. Connection states

A resource’s connection state can be observed by issuing the drbdadm cstate command:

# drbdadm cstate <resource>
Connected
Connected
StandAlone

If you are interested in only a single connection of a resource, specify the connection name, too; by default, this is the peer’s hostname as given in the configuration file:

# drbdadm cstate <peer>:<resource>
Connected

A resource may have one of the following connection states:

StandAlone No network configuration available. The resource has not yet been connected, or has been administratively disconnected (using drbdadm disconnect), or has dropped its connection due to failed authentication or split brain.

Disconnecting Temporary state during disconnection. The next state is StandAlone.

Unconnected Temporary state, prior to a connection attempt. Possible next states: WFConnection and WFReportParams.

Timeout Temporary state following a timeout in the communication with the peer. Next state: Unconnected.

BrokenPipe Temporary state after the connection to the peer was lost. Next state: Unconnected.

NetworkFailure Temporary state after the connection to the partner was lost. Next state: Unconnected.

ProtocolError Temporary state after the connection to the partner was lost. Next state: Unconnected.

TearDown Temporary state. The peer is closing the connection. Next state: Unconnected.

WFConnection This node is waiting until the peer node becomes visible on the network.

WFReportParams The TCP connection has been established; this node is waiting for the first network packet from the peer.

Connected A DRBD connection has been established; data mirroring is now active. This is the normal state.

StartingSyncS Full synchronization, initiated by the administrator, is just starting. The next possible states are: SyncSource or PausedSyncS.

StartingSyncT Full synchronization, initiated by the administrator, is just starting. Next state: WFSyncUUID.

WFBitMapS Partial synchronization is just starting. Next possible states: SyncSource or PausedSyncS.

WFBitMapT Partial synchronization is just starting. Next possible state: WFSyncUUID.

WFSyncUUID Synchronization is about to begin. Next possible states: SyncTarget or PausedSyncT.

SyncSource Synchronization is currently running, with the local node being the source of synchronization.

SyncTarget Synchronization is currently running, with the local node being the target of synchronization.

PausedSyncS The local node is the source of an ongoing synchronization, but synchronization is currently paused. This may be due to a dependency on the completion of another synchronization process, or due to synchronization having been manually interrupted by drbdadm pause-sync.

PausedSyncT The local node is the target of an ongoing synchronization, but synchronization is currently paused. This may be due to a dependency on the completion of another synchronization process, or due to synchronization having been manually interrupted by drbdadm pause-sync.

VerifyS On-line device verification is currently running, with the local node being the source of verification.

VerifyT On-line device verification is currently running, with the local node being the target of verification.
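
The "next state" notes in the list above can be collected into a lookup table, for example to sanity-check a sequence of observed states. This is a sketch that encodes only the successors named in the list; states without a stated successor are simply absent, and is_documented_transition is a hypothetical helper.

```python
# Successor states exactly as named in the list above.
NEXT_STATES = {
    "Disconnecting": {"StandAlone"},
    "Unconnected": {"WFConnection", "WFReportParams"},
    "Timeout": {"Unconnected"},
    "BrokenPipe": {"Unconnected"},
    "NetworkFailure": {"Unconnected"},
    "ProtocolError": {"Unconnected"},
    "TearDown": {"Unconnected"},
    "StartingSyncS": {"SyncSource", "PausedSyncS"},
    "StartingSyncT": {"WFSyncUUID"},
    "WFBitMapS": {"SyncSource", "PausedSyncS"},
    "WFBitMapT": {"WFSyncUUID"},
    "WFSyncUUID": {"SyncTarget", "PausedSyncT"},
}


def is_documented_transition(old, new):
    """Return True or False, or None if the list names no successor for `old`."""
    successors = NEXT_STATES.get(old)
    if successors is None:
        return None
    return new in successors


checks = [
    is_documented_transition("Timeout", "Unconnected"),
    is_documented_transition("StartingSyncT", "SyncTarget"),
    is_documented_transition("Connected", "SyncSource"),
]
```

A result of None means only that the list is silent about that state, not that the transition is invalid.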

5.2.6. Resource roles

A resource’s role can be observed by issuing the drbdadm role command:

# drbdadm role <resource>
Primary

You may see one of the following resource roles:

Primary The resource is currently in the primary role, and may be read from and written to. This role only occurs on one of the two nodes, unless dual-primary mode is enabled.

Secondary The resource is currently in the secondary role. It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. This role may occur on one or both nodes.

Unknown The resource’s role is currently unknown. The local resource role never has this status. It is only displayed for the peer’s resource role, and only in disconnected mode.

5.2.7. Disk states

A resource’s disk state can be observed by issuing the drbdadm dstate command:

# drbdadm dstate <resource>
UpToDate

The disk state may be one of the following:

Diskless No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.

Attaching Transient state while reading meta data.

Failed Transient state following an I/O failure report by the local block device. Next state: Diskless.

Negotiating Transient state when an Attach is carried out on an already-Connected DRBD device.

Inconsistent The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). It is also found on one node (the synchronization target) during synchronization.

Outdated Resource data is consistent, but outdated.

DUnknown This state is used for the peer disk if no network connection is available.

Consistent Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.

UpToDate Consistent, up-to-date state of the data. This is the normal state.
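
A health check might group these disk states into a few coarse categories. The grouping below is an assumption made for illustration: the "transient" group follows the wording of the list above, while the labels "ok", "degraded" and "unknown" are invented here. Diskless in particular can be intentional, for example on a DRBD client.

```python
# States that the list above describes as transient.
TRANSIENT_STATES = {"Attaching", "Failed", "Negotiating"}
# States with no usable or no current local data (hypothetical grouping).
DEGRADED_STATES = {"Diskless", "Inconsistent", "Outdated", "Consistent"}


def describe_disk_state(state):
    """Map a disk state name to a coarse, illustrative category."""
    if state == "UpToDate":
        return "ok"
    if state in TRANSIENT_STATES:
        return "transient"
    if state == "DUnknown":
        return "unknown"  # peer disk state without a network connection
    if state in DEGRADED_STATES:
        return "degraded"
    raise ValueError(f"unrecognized disk state: {state!r}")


summary = {s: describe_disk_state(s) for s in ("UpToDate", "Failed", "Outdated", "DUnknown")}
```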

5.2.8. Connection information data

local. Shows the network family, the local address and port that is used to accept connections from the peer.

peer. Shows the network family, the peer address and port that is used to connect.

congested. This flag tells whether the TCP send buffer of the data connection is more than 80% filled.

5.2.9. Performance indicators

With --statistics, the status output includes the following counters and gauges:

sent (network send). Volume of net data sent to the partner via the network connection; in Kibibytes.

received (network receive). Volume of net data received from the partner via the network connection; in Kibibytes.

read (disk read). Net data read from local hard disk; in Kibibytes.

written (disk write). Net data written on local hard disk; in Kibibytes.

al-writes (activity log). Number of updates of the activity log area of the meta data.

bm-writes (bit map). Number of updates of the bitmap area of the meta data.

lower-pending (local count). Number of open requests to the local I/O sub-system issued by DRBD.

pending. Number of requests sent to the partner, but that have not yet been answered by the latter.

unacked (unacknowledged). Number of requests received from the partner via the network connection, but that have not yet been answered.

upper-pending (application pending). Number of block I/O requests forwarded to DRBD, but not yet answered by DRBD.

write-ordering (write order). Currently used write ordering method: barrier, flush, drain or none.

out-of-sync. Amount of storage currently out of sync; in Kibibytes.

resync-suspended. Whether the resynchronization is currently suspended or not. Possible values are no, user, peer, dependency.

blocked. Shows local I/O congestion.

  • no: No congestion.
  • upper: I/O above the DRBD device is blocked, i.e. to the filesystem. Typical causes are

  • lower: Backing device is congested.

It’s possible to see a value of upper,lower, too.
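
Since the blocked item can carry more than one value at once, a tool reading it should treat it as a set. A minimal sketch, assuming the comma-separated form mentioned above and the three values listed; parse_blocked is a hypothetical name.

```python
KNOWN_LAYERS = {"upper", "lower"}


def parse_blocked(value):
    """Return the set of congested layers for a `blocked` value."""
    if value == "no":
        return set()
    layers = set(value.split(","))
    unknown = layers - KNOWN_LAYERS
    if unknown:
        # Values not described in this section are reported, not guessed at.
        raise ValueError(f"unexpected blocked value(s): {sorted(unknown)}")
    return layers


examples = {v: parse_blocked(v) for v in ("no", "upper", "lower", "upper,lower")}
```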