Saturday 14 November 2009

Linux HA Pt 2: Explained and Examples

Before going any further, a little history of Linux-HA is in order. For a long time development of Linux-HA was centred on the old Linux-HA web site and focussed primarily on a package called Heartbeat. Heartbeat was responsible for ensuring that a cluster of servers (referred to from here on as nodes) had quorum. It would then start or stop services such as Apache and MySQL depending on the state of the cluster. The old configuration format for making something like Apache highly available looked like this:

/etc/ha.d/ha.cf:
node ha1 ha2 # Refers to the hosts required to be in the cluster
bcast eth0 # The type of network communication chosen


/etc/ha.d/haresources:
ha1 192.168.1.100 httpd # ha1 is the preferred node, 192.168.1.100 is the virtual (floating) IP address and httpd is the service to control.


This is where we must introduce the concept of OCF scripts. These are enhanced startup scripts, much like standard Linux init scripts. They are used to start, stop and report the status of a resource (the cluster term for a service). Heartbeat would try to use an OCF script first and, if it couldn't find one, fall back to the init script of the specified name.
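
To make the start/stop/status idea concrete, here is a minimal sketch of an OCF-style script. It is not a real resource agent: the "service" is only a state file, and the names toy_agent and STATE_FILE are made up for illustration. The exit codes are the ones defined by the OCF resource agent API, and real agents implement further actions (such as meta-data) that are left out here.

```shell
#!/bin/sh
# Toy OCF-style resource agent (illustration only, not a real agent).
# The "service" is a state file; STATE_FILE is a hypothetical path.

# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

STATE_FILE="${STATE_FILE:-/tmp/toy-resource.state}"

toy_monitor() {
    # Report whether the resource is currently running.
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

toy_start() {
    # Starting an already-running resource still counts as success.
    touch "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

toy_stop() {
    # Stopping an already-stopped resource still counts as success.
    rm -f "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

toy_agent() {
    # Dispatch on the action name, which the cluster passes as $1.
    case "$1" in
        start)   toy_start ;;
        stop)    toy_stop ;;
        monitor) toy_monitor ;;
        *)       return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

Calling toy_agent start followed by toy_agent monitor returns 0, and after toy_agent stop the monitor action returns 7 (not running). The cluster makes its decisions from exit codes like these rather than from anything the script prints.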

Those were the bad old days: it was all rather limited, but it did what it said. Now we have Pacemaker.

Fast forward to now and Heartbeat has been pushed aside by OpenAIS and Pacemaker. These two separate pieces of software work together to give you a cluster with many more features than before. OpenAIS provides the low-level cluster messaging layer and supplies Pacemaker with information on the state of the cluster. Pacemaker replaces the Cluster Resource Manager found in later versions of Heartbeat and does everything else required: it starts and stops resources, decides where and how to run them according to policy, fences off misbehaving nodes and so on.

The configuration format of Pacemaker has evolved to the following:

# crm configure show

node ha1
node ha2
primitive MySQL ocf:heartbeat:mysql \
        params user="root" binary="/usr/bin/mysqld_safe" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" \
        op monitor interval="1m" timeout="20s"
primitive Virtual-IP ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.100" broadcast="192.168.1.255" cidr_netmask="24" \
        op monitor interval="1m" timeout="10s"

The node entries at the top aren't strictly required; the cluster sets them dynamically from what OpenAIS tells it. The primitive MySQL line defines a basic resource named MySQL (an arbitrary name). It is of class OCF (another valid class here is LSB, which refers to a standard init script), its provider is heartbeat (most of the OCF resource agents were originally taken from Heartbeat) and the agent itself is called mysql. You can see how this maps onto the filesystem like so:

root@ha1:~# ls /usr/lib64/ocf/resource.d/heartbeat/mysql*
/usr/lib64/ocf/resource.d/heartbeat/mysql  /usr/lib64/ocf/resource.d/heartbeat/mysql-proxy

So among the MySQL-related OCF resource agents (RAs) provided by Heartbeat we have mysql and mysql-proxy.
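
The mapping from the class:provider:type notation to a script on disk can be sketched as a small shell function. This is only an illustration of the naming convention, not how Pacemaker itself resolves agents. The function name ra_path is made up, the OCF directory is taken from the listing above (other systems may use /usr/lib/ocf/resource.d), and /etc/init.d is assumed as the location of LSB init scripts.

```shell
#!/bin/sh
# Map a "class:provider:type" resource spec to the agent script's path.
# OCF_RESOURCE_DIR is an assumption based on the listing in this post.
OCF_RESOURCE_DIR="${OCF_RESOURCE_DIR:-/usr/lib64/ocf/resource.d}"

ra_path() {
    spec="$1"
    class="${spec%%:*}"    # text before the first colon
    rest="${spec#*:}"      # text after the first colon
    case "$class" in
        ocf)
            provider="${rest%%:*}"
            agent="${rest#*:}"
            echo "$OCF_RESOURCE_DIR/$provider/$agent"
            ;;
        lsb)
            # LSB resources have no provider, only an init script name.
            echo "/etc/init.d/$rest"
            ;;
        *)
            echo "unknown resource class: $class" >&2
            return 1
            ;;
    esac
}
```

For example, ra_path ocf:heartbeat:mysql prints /usr/lib64/ocf/resource.d/heartbeat/mysql, which matches the file listed above.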


Next we have another primitive resource named Virtual-IP. This uses another OCF resource agent provided by Heartbeat, named IPaddr2. IPaddr2 adds and removes IP addresses, which lets the cluster move them between nodes as you specify. It's just a shell script, although one that does quite a lot. Have a look at how it works if you feel so inclined:


root@ha1:~# vi /usr/lib64/ocf/resource.d/heartbeat/IPaddr2


The parameters we pass to it are fairly self-explanatory. The next line, op monitor, tells the cluster to add a monitor operation: every minute it asks the resource agent whether the IP address is functioning correctly, and it waits up to 10 seconds for an answer before giving up and flagging a failure.
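
The meaning of interval and timeout can be illustrated with a rough shell sketch. This is not how Pacemaker is implemented (the scheduling happens inside its own daemons); it only mimics the behaviour described above. The function names monitor_once and monitor_loop are made up, and the sketch relies on the timeout command from GNU coreutils being available.

```shell
#!/bin/sh
# One monitor pass: run the check command, but give up after a time limit.
# Prints "ok" if the check exits 0 in time, otherwise "failed".
monitor_once() {
    limit="$1"; shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "ok"
    else
        # Either the check reported a problem or it ran out of time.
        echo "failed"
    fi
}

# Hypothetical loop mirroring interval="1m" timeout="10s":
# check, wait a minute, check again, forever.
monitor_loop() {
    while :; do
        monitor_once 10 "$@"
        sleep 60
    done
}
```

A check that hangs is treated the same as a check that reports a problem, which is why the timeout matters: an agent that never answers would otherwise hide a broken resource.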


With the features Pacemaker now provides, it's very easy to set up clusters with many nodes, limited only by your imagination. By default a resource can run on any node, but on only one node at a time. We can instead tell the cluster to run the resource on every node and then take advantage of Linux Virtual Server load-balancing. I'll post how to do that next time.
