DRBD and the sync rate controller (8.3.9 and above)

Posted by Flip

The sync-rate controller controls the bandwidth used during resynchronization (not normal replication); it runs in the SyncTarget state, i.e. on the (inconsistent) receiving side.

It’s configured as follows:

  • Set c-plan-ahead to approximately 10 times the RTT; so if ping from one node to the other reports 200 msec, configure 2 seconds (i.e. a value of 20, as the unit is tenths of a second).
    Please note that the controller polls only every 100 msec, so c-plan-ahead values below 5 don’t make sense: the controller won’t have collected enough information to decide whether to request more data. We recommend using at least 1 second (a configured value of 10).
    This value specifies the “thinking ahead” time of the controller, i.e. the time period the controller has to achieve the target sync rate.
  • Configure minimum and maximum values via c-min-rate and c-max-rate; these depend mostly on the available bandwidth per resource.
    c-min-rate is the least bandwidth that will be used during a resync; c-max-rate is the most bandwidth a resync may use.
  • Now decide whether to use c-fill-target or c-delay-target – you can choose only one.
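Put together, the controller settings above go into a resource’s disk section. The following is only a sketch, assuming a resource named r0, a 200 msec RTT, and illustrative rates; tune them to your link:

```
resource r0 {
  disk {
    c-plan-ahead  20;    # 10 x 200 msec RTT, in tenths of a second
    c-min-rate    10M;   # resync never drops below this
    c-max-rate    100M;  # resync never exceeds this
    c-fill-target 100k;  # fill-based control; set to 0 to use c-delay-target instead
  }
}
```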

Difference between delay and fill based control

If you set c-fill-target to a non-zero value, DRBD will try to keep that much data on the wire; if application IO comes in, it temporarily displaces the synchronization traffic. This means that application data will have only a limited amount of synchronization data queued ahead of it, which helps latency a bit.
The data still has to fit into the socket buffers, along with the application IO, so using multi-MB sizes here doesn’t make sense; 100kByte is a good starting value.

With a proxy you should use c-delay-target, so set c-fill-target to zero. In this mode DRBD measures how long the synchronization data stays on the wire; if application IO comes in, this triggers the controller, which throttles the synchronization speed to keep the communication latency at the specified value. Use 5 times the RTT as a starting point.
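As a sketch, the two control modes differ only in which target is non-zero. The values are examples only, assuming a 200 msec RTT and that c-delay-target, like c-plan-ahead, is given in tenths of a second:

```
# direct connection: fill-based control
disk {
  c-fill-target  100k;
  c-delay-target 0;
}

# via a proxy: delay-based control
disk {
  c-fill-target  0;
  c-delay-target 10;   # 5 x 200 msec RTT
}
```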

Posted in drbd | Tagged drbd, performance, sync

DRBD causes too much CPU-load

Posted by Flip

The TL;DR version: don’t use data-integrity-alg in production.

Posted in drbd | Tagged drbd, kernel, linux, performance, slow

Editing the Pacemaker configuration with VIM

Posted by Flip

For people using the VIM editor I’ve got two small tips when editing Pacemaker configurations:

Use syntax highlighting. This makes unmatched quote characters easy to spot. Whether it’s too colorful is debatable, though.
A current version can be found here, and the mailing list post is here.

For correlating resource names I recommend the Mark plugin.

Posted in pacemaker | Tagged crm, edit, pacemaker, vi, vim

“al-extents” explained

Posted by Flip

There is quite a bit of confusion about the DRBD configuration value al-extents (activity log extents), so here’s another shot at explaining it.
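For illustration: each activity-log extent covers 4 MByte of the backing device, so the setting determines how much of the device counts as “hot” at any time. A sketch with an example value only, not a recommendation:

```
resource r0 {
  disk {
    al-extents 1237;   # example: 1237 x 4 MByte, roughly 4.8 GByte of hot area
  }
}
```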

Posted in drbd | Tagged drbd, performance, slow, write

Make the kernel start write-out earlier

Posted by Flip

Similar to the recent post about setting the vm.min_free_kbytes value, there’s another sysctl that might improve the behaviour: the dirty ratio.
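For illustration, the knobs involved are vm.dirty_background_ratio and vm.dirty_ratio; the values below are examples only, not recommendations:

```
# /etc/sysctl.conf (example values; tune to your workload)
vm.dirty_background_ratio = 5    # background write-out starts at 5% dirty memory
vm.dirty_ratio = 10              # writers are forced to do write-out at 10%
```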

Posted in kernel | Tagged drbd, memory, performance, write

DRBD resources need different monitor intervals

Posted by Flip

As briefly mentioned in Pacemaker Explained, DRBD devices need two different values set for their monitor intervals:

primitive pacemaker-resource-name ocf:linbit:drbd         \
        params drbd_resource="drbd-resource"              \
        op monitor interval="61s" role="Slave"            \
        op monitor interval="59s" role="Master"

The reason is that Pacemaker distinguishes monitor operations by their resource and their interval, but not by their role. So, if this distinction is not made “manually”, Pacemaker will monitor only one of the two (or, with DRBD 9, more) nodes, which is usually not what you want.

Posted in pacemaker | Tagged drbd, interval, monitor, pacemaker

Increase vm.min_free_kbytes for better OOM resistance

Posted by Flip

Depending on your setup and your workload (e.g. within a virtual machine with little memory and much I/O) you can get into the situation where the kernel has little memory left and wants to write some dirty pages to disk, but cannot, because doing so would itself require some free memory!

Now, while that cannot happen with DRBD alone (it keeps a small, reserved memory pool to guarantee progress without extra allocations), you can get there with e.g. iSCSI and md, especially with too low a value for the sysctl vm.min_free_kbytes (which is set to 128 on some installations, i.e. only 128 kByte reserved!).

To make the system much more resistant to these OOM scenarios you can increase the value; a machine with more than 8 GByte RAM can easily spare 128 MByte (which translates into a sysctl value of 131072, as it has to be given in kByte).

The common way to change the value is to edit /etc/sysctl.conf, and then to use the sysctl command to write the changes into the kernel.
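As a sketch, reserving 128 MByte would look like this (applying the setting requires root, so those lines are shown as comments):

```shell
# vm.min_free_kbytes is given in kByte, so 128 MByte is:
MIN_FREE_KB=$((128 * 1024))
echo "vm.min_free_kbytes = ${MIN_FREE_KB}"

# To persist and apply (as root):
#   echo "vm.min_free_kbytes = ${MIN_FREE_KB}" >> /etc/sysctl.conf
#   sysctl -p
```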

Posted in kernel | Tagged kernel, linux, memory, oom, sysctl

LRMd hangs on Ubuntu (Lucid and Maverick)

Posted by Flip

We’ve recently come across a case where stopping Pacemaker (in this case via /etc/init.d/heartbeat stop) didn’t work; and, similarly, crm configure property maintenance-mode=false wouldn’t work.

After some searching and testing the solution was found: Upgrading libglib2.0-0 to at least the natty version 2.28.6-0ubuntu1 fixed the problem.

The bug seems to have been a locking problem.

Posted in pacemaker | Tagged hang, lrmd, lucid, maverick, pacemaker, stop, ubuntu