The Cluster Guy

ǝɹǝɥ ʇxǝʇ lnɟʇɥƃısuı

TheClusterGuy blog is moving

Nothing against Tumblr, but with Posterous being shut down (even though I’m not using it here), I’ve decided to take back control of my content.

As a result I’ve started using Octopress to publish to blog.clusterlabs.org

Octopress is pretty nifty. It generates a static site (good for performance) that can either be hosted at GitHub or (if GitHub ever goes dark) anywhere Apache can run. I was even able to easily import my old posts!

For now I’m taking the GitHub path with a custom domain name (not the same as this one so that the old links still work).

See you on the other side

Wed, Feb 13th

Pacemaker 1.0.13 now available

Thanks once again to the efforts of the fine folks from NTT, the latest bug fixes have been back-ported from 1.1 and another instalment of the Pacemaker 1.0 release series is now ready for general consumption.

Changesets	129
Diff	173 files changed, 12206 insertions(+), 767 deletions(-)

Important changes since Pacemaker-1.0.12 include:

cib: Don’t halt disk writes if the previous digest is missing
cib: Fix coverity RESOURCE_LEAK defect
Core: Avoid assertion error when underflowing days of the month in iso8601 date code
Core: Correctly determine when an XML file should be decompressed
Core: Ensure signals are handled eventually in the absence of timer sources or IPC messages
Core: Strip text nodes from on disk xml files
crmd: cl#5051 - Fixes file leak in pe ipc connection initialization.
crmd: cl#5057 - Restart sub-systems correctly (bnc#755671)
crmd: Fast-track shutdown if we couldn’t request it via attrd
crmd: Leave it up to the PE to decide which ops can/cannot be reload
crmd: Prevent use-of-NULL when free’ing empty hashtables
crmd: Supply format arguments in the correct order
Fix memory leak in cib when writing the cib contents.
legacy: Set to the minimum scheduling priority when using SCHED_RR policy (bnc#779259)
pengine: Bug #5007, Fixes use of colocation constraints with multi-state resources
pengine: Bug cl#5038 - Prevent restart of anonymous clones when clone-max decreases
pengine: Bug cl#5101 - Ensure stop order is preserved for partially active groups
pengine: cl#5069 - Honor ‘on-fail=ignore’ even when operation is disabled.
pengine: cl#5072 - Fixes monitor op stopping after rsc promotion.
pengine: Ensure post-migration stop actions occur before node shutdown
pengine: Fix coverity REVERSE_INULL defects
pengine: Fix use-after-free errors detected by coverity
pengine: Prevent segfault when ensuring unmanaged resources don’t prevent shutdown
pengine: Reload of a resource no longer causes a restart of dependent resources
RA: controld - use the correct dlm_controld when membership comes from corosync directly
tools: crm_resource - Fix coverity FORWARD_NULL defect
Tools: crm_shadow - Bug cl#5062 - Correctly set argv[0] when forking a shell process

You can also see the full changelog.

The next 1.0.x release will occur if and when needed (but probably not before mid-2013).

The source tarball is also available directly from GitHub.

Users of more recent distributions are encouraged to use the latest 1.1.x release - either from the 1.1 Build Area or from the distribution directly.

General installation instructions are available from the ClusterLabs wiki.

Tue, Oct 30th

Can Pacemaker 1.1.8 be used with…

Short answer: yes

Longer answer: seriously, yes :-)

Pacemaker 1.1.8 should be fully functional with all three current corosync release series (1.2.x, 1.4.x and 2.0.x) as well as Heartbeat.

We have not removed support for anything, so if something is not working for you, please let us know on the mailing list.

Pacemaker and Cluster Filesystems

There is some confusion out there on how to use Pacemaker with the OCFS2 and GFS2 cluster filesystems.

Sections 8.1 and 8.2 of Clusters from Scratch mention some of the issues involved in the context of CMAN, but the principles are generally applicable.

The most important take-away is that all parts of the stack must make decisions based on the same membership and quorum data.

There were/are three options to achieve this:

have everyone talk to pacemaker
have everyone talk to cman
have everyone talk to corosync
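
To see why a shared source of membership matters, here is a minimal sketch. It assumes a simple majority quorum rule and a five-node cluster. The node names, the two "views" and the has_quorum helper are all hypothetical and are not taken from Pacemaker, CMAN or corosync.

```python
EXPECTED_VOTES = 5  # assumed cluster size, for illustration only


def has_quorum(members, expected_votes=EXPECTED_VOTES):
    """Simple majority rule: strictly more than half of the expected votes."""
    return 2 * len(set(members)) > expected_votes


# Hypothetical case: two layers of the stack track membership separately.
resource_manager_view = {"node1", "node2", "node3"}
filesystem_view = {"node1", "node2"}  # e.g. has not yet noticed node3

decisions_split = (has_quorum(resource_manager_view), has_quorum(filesystem_view))

# Same partition, but both layers consult one shared membership source.
shared_view = {"node1", "node2", "node3"}
decisions_shared = (has_quorum(shared_view), has_quorum(shared_view))

print(decisions_split)   # (True, False)
print(decisions_shared)  # (True, True)
```

In the split case one layer believes the partition is quorate while the other does not. Each of the three options below is a way of avoiding that disagreement.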

Option 1 - Everyone Talks to Pacemaker

This option was written for and is maintained/supported by SUSE but didn’t really gain much traction outside of SLES. It also relies on a pacemaker plugin that gets loaded into corosync/openais, something that is no longer possible with corosync 2.x. It briefly appeared upstream, but once option 2 became possible, option 1 was removed (not by me).

Anyone not paying for OCFS2 on SLES is probably best advised to move to option 2 or 3.

Requirements:

Filesystems supported: OCFS2
Corosync: 1.x
Pacemaker: any
Other: openais

Option 2 - Everyone Talks to CMAN

This is what works on most distros (except openSUSE/SLES) today. By virtue of being part of RHCS and its age, cman is available on most of today’s enterprise distros and is supported by OCFS2 and GFS2.

By modifying Pacemaker to support it, we gained the ability to use GFS2 and OCFS2 “for free” - without the need for custom dlm, gfs and ocfs controld’s.

Requirements:

Filesystems supported: GFS2, OCFS2
Corosync: 1.x
Pacemaker: 1.1.6 or later
Other: cman, openais

Option 3 - Everyone Talks to Corosync 2.0

With RHEL6 to be the last hurrah for CMAN, this is where things are headed upstream. However, the only distro that ships this solution today is Fedora-17 (and shortly 18).

In this scenario, all components obtain membership and quorum directly from corosync. So far OCFS2 is the only component that hasn’t been updated to support this - they appear content to continue using their own messaging and membership layer.

Requirements:

Filesystems supported: GFS2
Corosync: 2.x
Pacemaker: 1.1.7 or later
Other: none

Which Is The Best Option For Me

If you’re a SLES customer looking to use OCFS2, absolutely take the Option 1 route. For everyone else, although Option 3 is architecturally superior, Option 2 is likely to be the safest approach for the next couple of years.
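
The requirements listed for each option can be written down as data and filtered. The sketch below is only an illustration: the Option class and the candidates function are hypothetical names, the "Other" requirements (openais, cman) are left out, and it says nothing about which distributions actually ship or support each combination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    filesystems: frozenset
    corosync_major: int
    min_pacemaker: tuple  # (major, minor, patch); () stands for "any"


# Values copied from the "Requirements" lists in this post.
OPTIONS = [
    Option("Option 1 - Pacemaker", frozenset({"OCFS2"}), 1, ()),
    Option("Option 2 - CMAN", frozenset({"GFS2", "OCFS2"}), 1, (1, 1, 6)),
    Option("Option 3 - Corosync 2.0", frozenset({"GFS2"}), 2, (1, 1, 7)),
]


def candidates(filesystem, corosync_major, pacemaker_version):
    """Return the names of the options whose listed requirements are met."""
    return [
        option.name
        for option in OPTIONS
        if filesystem in option.filesystems
        and option.corosync_major == corosync_major
        # Tuples compare element by element; () sorts before any version.
        and pacemaker_version >= option.min_pacemaker
    ]


print(candidates("OCFS2", 1, (1, 1, 8)))
print(candidates("GFS2", 2, (1, 1, 7)))
print(candidates("OCFS2", 2, (1, 1, 8)))
```

With these inputs, OCFS2 on corosync 1.x matches Options 1 and 2, GFS2 on corosync 2.x matches only Option 3, and OCFS2 on corosync 2.x matches nothing, which agrees with the lists above.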

Thu, Mar 29th

Pacemaker 1.1.7 Now Available

After much hard work, the latest installment of the Pacemaker 1.1 release series is now ready for general consumption.

Changesets	513
Diff	1171 files changed, 90472 insertions, 19368 deletions

As well as the usual round of bug fixes (see the full changelog), this new release brings:

Support for Corosync 2.0
Logging optimisations (less of it and less work performed for logs that won’t be printed)
The ability to specify that A starts after ( B or C or D )
Support for advanced fencing topologies: eg. kdump || (network && disk) || power
Resource templates and tickets have been promoted to the stable schema
Support for gracefully giving up resources depending on a ticket
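
The fencing topology expression above, kdump || (network && disk) || power, can be read as an ordered list of levels. Levels are tried in turn, a level succeeds only if every device in it succeeds, and fencing stops at the first level that succeeds. The sketch below models just that logic. The fence function and the try_device callback are hypothetical and are not Pacemaker's API or configuration syntax.

```python
# Levels in the order they would be attempted; the device names are taken
# from the example expression, everything else is illustrative.
TOPOLOGY = [["kdump"], ["network", "disk"], ["power"]]


def fence(topology, try_device):
    """Return the 1-based index of the first level whose devices all succeed.

    Returns None if every level fails. `try_device` is a caller-supplied
    function that takes a device name and returns True on success.
    """
    for index, level in enumerate(topology, start=1):
        # all() stops at the first failing device within a level.
        if all(try_device(device) for device in level):
            return index
    return None


working = {"network", "disk", "power"}  # pretend kdump is unavailable
print(fence(TOPOLOGY, lambda device: device in working))  # 2
```

If "disk" were also unavailable, level 2 would fail as a whole and the call would fall through to level 3 (power).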

As per our release calendar, the next 1.1 release is planned for mid-July.

Packages for all current editions of Fedora have been built and will be appearing shortly in the update channels. Other distributions will follow when their schedules allow it.

The source tarball (tar.gz) is also available directly from GitHub.

General installation instructions are available from the ClusterLabs wiki.

Thu, Nov 24th

Pacemaker 1.0.12 Released

Thanks once again to the efforts of Keisuke MORI from NTT, the latest bug fixes have been back-ported from 1.1 and another instalment of the Pacemaker 1.0 release series is now ready for general consumption.

Changesets	96
Diff	121 files changed, 8617 insertions(+), 988 deletions(-)

Important changes since Pacemaker-1.0.11 include:

cib: Call gnutls_bye() and shutdown() when disconnecting from remote TLS connections
cib: Remove disconnected remote connections from mainloop
crmd: Cancel timers for actions that were pending on dead nodes
crmd: Do not wait for actions that were pending on dead nodes
crmd: Ensure we do not attempt to perform action on failed nodes
PE: Correctly recognise which recurring operations are currently active
PE: Demote from Master does not clear previous errors
PE: Ensure restarts due to definition changes cause the start action to be re-issued not probes
PE: Ensure role is preserved for unmanaged resources
PE: Ensure unmanaged resources have the correct role set so the correct monitor operation is chosen
PE: Move master based on failure of colocated group
pengine: Correctly determine the state of multi-state resources with a partial operation history
PE: Only allocate master/slave resources once
Shell: implement -w, --wait option to wait for the transition to finish
Shell: repair template list command

You can also see the full changelog.

I have updated the release calendar and the next 1.0.x release is planned for mid-May 2012.

The source tarball is also available directly from GitHub.

Pre-built packages for Pacemaker are available immediately for current openSUSE (12.1, 11.4, 11.3) and Fedora (16, 15, 14) releases as well as EPEL-5 from the ClusterLabs Build Area.

Users of more recent distributions are encouraged to use the latest 1.1.x release - either from the 1.1 Build Area or from the distribution directly.

General installation instructions are available from the ClusterLabs wiki.

Thu, Oct 13th

New Version Control System

Since September, Pacemaker has been using Git for the 1.1 and devel trees.

There were some minor technical advantages over Mercurial (which I still personally prefer), but mostly the decision was driven by the pain associated with switching between SCMs multiple times a day.

The majority of development now happens on GitHub, which has some great features for reviewing patches and general collaboration.

The Pacemaker tree is also periodically sync’d to the Cluster Labs server in case GitHub is unavailable for any reason.

For those new to Git, GitHub has many tips for setting up Git, creating a local copy of the Pacemaker repo to work in, submitting your changes upstream (we use the Fork + Pull Model), and other assorted resources.

Be sure to configure email and user information so you get credit for your hard work too!

New Issue Tracker

Since it’s clearly not acceptable for our issue tracker to be offline for months at a time, it is time to replace the Bugzilla instance hosted by the Linux Foundation with something else.

One candidate that came close was the GitHub issue tracker, but alas it doesn’t support attachments. The end result is that we now have an instance of Bugzilla v4 at:

bugs.clusterlabs.org

Bug numbers start at 5000.
This avoids clashing with older ones and may enable us to import the old ones if the old Bugzilla instance ever comes back up again. I would advise people to assume this won’t happen and to re-create any unresolved issues.

Mon, May 2nd

Pacemaker 1.0.11 Released

The latest installment of the Pacemaker 1.0 release series is now ready for general consumption.

Changesets	85
Diff	500 files changed, 69642 insertions(+), 58270 deletions(-)

Thanks once again to the efforts of Keisuke MORI and NTT, the latest bug fixes have been back-ported from 1.1.

Important changes since Pacemaker-1.0.10 include:

cib: Repair the processing of updates sent from peer nodes
crmd: All pending operations should be recorded, even recurring ones with high start delays
crmd: Bug lf#2509 - Watch for config option changes from the CIB even if we’re not the DC
crmd: Bug lf#2528 - Introduce a slight delay when creating a transition to allow attrd time to perform its updates
crmd: Bug lf#2545 - Ensure notify variables are accurate for stop operations
crmd: Bug lf#2559 - Fail actions that were scheduled for a failed/fenced node
crmd: Cancel recurring operations while we’re still connected to the lrmd
crmd: Don’t abort transitions when probes are completed on a node
crmd: Ensure the CIB is always writable on the DC by removing a timing hole
crmd: Update failcount for failed promote and demote operations
PE: Bug lf#2495 - Prevent segfault by validating the contents of ordering sets
PE: Bug lf#2508 - Correctly reconstruct the status of anonymous cloned groups
PE: Bug lf#2544 - Prevent unstable clone placement by factoring in the current node’s score before all others
PE: Bug lf#2554 - target-role alone is not sufficient to promote resources
PE: Ensure fencing of the DC precedes the STONITH_DONE operation
PE: Ensure that fencing has completed for stop actions on stonith-dependent resources (lf#2551)
PE: Prevent clones from being stopped because resources colocated with them cannot be active
PE: Prevent use-after-free resulting from unintended recursion when choosing a node to promote master/slave resources
Shell: don’t create empty optional sections (bnc#665131)
Tools: Bug lf#2528 - Make progress when attrd_updater is called repeatedly within the dampen interval but with the same value
Tools: Prevent crm_resource commands from being lost due to the use of cib_scope_local

You can also see the full changelog.

As per our release calendar, the next 1.0.x release is planned for mid-September.

The source tarball is also available directly from Mercurial.

Pre-built packages for Pacemaker and its immediate dependencies are available immediately for openSUSE 11.2, 11.3, Fedora-13 and EPEL-5 from the ClusterLabs Build Area.

Users of more recent distributions are encouraged to use the latest 1.1.x - either from the 1.1 Build Area or the distribution directly.

General installation instructions are available from the ClusterLabs wiki.

Wed, Feb 23rd

Pacemaker 1.1.5 Released

The latest installment of the Pacemaker 1.1 release series is now ready for general consumption.

Changesets	184
Diff	605 files changed, 46103 insertions(+), 26417 deletions(-)

As well as the usual round of bug fixes (see the full changelog), S.U.S.E. has implemented support for ACLs. This means that you can now delegate permission to control parts of the cluster (as defined by you) to non-root users.

ACLs are still disabled by default, but you can read their documentation, provide feedback and decide if it’s something you want to use.

As per our release calendar, the next 1.1 release is planned for mid-April and 1.0.11 should be available in March depending on how quickly we can get the bugfixes from 1.1 backported.

Pre-built packages for Pacemaker and its immediate dependencies are available immediately for openSUSE 11.3, Fedora-14 and EPEL-5 from the ClusterLabs Build Area.

The source tarball is also available directly from Mercurial.

General installation instructions are available from the ClusterLabs wiki.