Quorum Server Illustrated - updated
In two earlier posts [1] [2], I gave brief descriptions of the quorum server which seem to have created as much confusion as clarity. This post covers only the Linux-HA quorum server, and includes illustrations to make things clearer.
The Linux-HA Quorum API
In the Linux-HA quorum API, you can configure a number of quorum modules which are used as follows. If a quorum module returns HAVEQUORUM, then the cluster has quorum. If it returns NOQUORUM then the cluster does not have quorum. If a quorum module returns QUORUMTIE, then the next quorum module in the list is consulted. If the final module returns QUORUMTIE, then it is treated as a NOQUORUM event.
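The chaining rule above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described in this post, not the actual Linux-HA plugin interface: the `Quorum` enum, the `QuorumModule` callable type, and `evaluate_quorum` are hypothetical names made up for the example.

```python
from enum import Enum
from typing import Callable, Sequence


class Quorum(Enum):
    HAVEQUORUM = "HAVEQUORUM"
    NOQUORUM = "NOQUORUM"
    QUORUMTIE = "QUORUMTIE"


# Assumption: a quorum module is modelled as a zero-argument callable.
QuorumModule = Callable[[], Quorum]


def evaluate_quorum(modules: Sequence[QuorumModule]) -> Quorum:
    """Consult modules in list order; the first non-tie answer is final."""
    for module in modules:
        result = module()
        if result is not Quorum.QUORUMTIE:
            return result
    # Every module, including the last one, reported a tie:
    # a final QUORUMTIE is treated as NOQUORUM.
    return Quorum.NOQUORUM


if __name__ == "__main__":
    tie = lambda: Quorum.QUORUMTIE
    grant = lambda: Quorum.HAVEQUORUM
    print(evaluate_quorum([tie, grant]).name)  # HAVEQUORUM
    print(evaluate_quorum([tie, tie]).name)    # NOQUORUM
```

Note that a module which answers HAVEQUORUM or NOQUORUM ends the evaluation immediately, so later modules in the list only ever act as tie-breakers.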
The quorum server is normally used in conjunction with the normal arithmetic voting quorum module, so that it is only consulted when the number of active nodes is exactly half the number of nodes configured in the cluster. So, it is worth noting that the quorum server will never be consulted if a cluster has an odd number of nodes.
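To see why an odd-sized cluster never reaches the quorum server, here is a minimal sketch of arithmetic voting. It assumes the usual majority rule (more than half is quorum, exactly half is a tie, less than half is no quorum); `majority_vote` and `tie_possible` are hypothetical helper names, not Linux-HA functions.

```python
def majority_vote(active_nodes: int, configured_nodes: int) -> str:
    """Arithmetic voting: compare the active node count to half the configured count."""
    # Comparing 2 * active against configured avoids fractional arithmetic.
    if 2 * active_nodes > configured_nodes:
        return "HAVEQUORUM"
    if 2 * active_nodes == configured_nodes:
        return "QUORUMTIE"
    return "NOQUORUM"


def tie_possible(configured_nodes: int) -> bool:
    """True if some number of active nodes yields a tie for this cluster size."""
    return any(
        majority_vote(active, configured_nodes) == "QUORUMTIE"
        for active in range(1, configured_nodes + 1)
    )


if __name__ == "__main__":
    print(majority_vote(1, 2))  # QUORUMTIE
    print(tie_possible(3))      # False
    print(tie_possible(4))      # True
```

With an odd number of configured nodes, twice the active count can never equal the configured count, so the voting module always gives a definite answer and the next module in the list is never asked.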
Quorum Server Scenarios
Below, I'll go through the basic quorum server cases so you can see how all this works in more detail - with pictures, even!
Normal Situation - Everything up
In the picture above, everything is normal. The quorum server is up, and both sites are also up. Because the cluster has all its nodes up, the quorum server is irrelevant.
In the situation above, the "New Jersey" site is down, so the conventional voting quorum module sees a tie (1/2, exactly half of the nodes) and the quorum server is consulted. Since only New York is talking to the quorum server, the quorum server grants quorum to the New York site.
In the case above, the link between the sites has been lost, but both sites and the quorum server are all up. In this case, both New York and New Jersey contact the quorum server because each sees 1/2 nodes as being up - resulting in a tie condition.
The quorum server will then choose one of the two sites and grant quorum to it; here I assume that New York was chosen. Because New Jersey wasn't granted quorum, it will shut its resources down.
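The tie-breaking behaviour in this split-site case can be modelled with a toy arbiter. This is a sketch only: this post does not say how the real quorum server picks a site, so the first-come-first-served rule below is an assumption, and `ToyQuorumServer` is a hypothetical class rather than anything from Linux-HA.

```python
from typing import Optional


class ToyQuorumServer:
    """Grants quorum to at most one site of a cluster at any time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None  # site currently holding quorum

    def request(self, site: str) -> bool:
        # Assumption: the first site to ask wins; others are refused
        # for as long as the holder keeps its grant.
        if self.holder is None:
            self.holder = site
        return self.holder == site

    def release(self, site: str) -> None:
        # Called when the holding site goes away or gives up quorum.
        if self.holder == site:
            self.holder = None


if __name__ == "__main__":
    server = ToyQuorumServer()
    print(server.request("New York"))    # True
    print(server.request("New Jersey"))  # False
```

Whatever the selection rule, the property that matters is the one shown here: two sites that cannot see each other never both receive quorum.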
What happens when the quorum server goes down?
That is the situation shown above. Because New York and New Jersey are both up, they have 2/2 votes and both provide service as they should. This illustrates the point that the quorum server is not a single point of failure.
Multiple Failures -> Loss of Service
In this final case, multiple failures have occurred - both New Jersey and the quorum server are down. In this case, New York doesn't have quorum, so it shuts down services and none are provided by any node in the cluster. Of course, this situation can be overridden in the cluster configuration by changing the quorum policy, but from an automated perspective, this is all that can be (should be) done.
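The five scenarios in this post can be summarised as a small truth table. The sketch below assumes a two-node cluster (one node per site), majority voting with the quorum server as tie-breaker, and treats an unreachable quorum server the same as a refusal; `site_keeps_services` and the scenario labels are hypothetical names for illustration.

```python
def site_keeps_services(nodes_seen: int, nodes_configured: int,
                        server_grants: bool) -> bool:
    """Decide whether one site keeps running resources."""
    if 2 * nodes_seen > nodes_configured:
        return True           # clear majority: quorum server not consulted
    if 2 * nodes_seen < nodes_configured:
        return False          # clear minority: quorum server not consulted
    return server_grants      # tie: the quorum server's answer decides


# (scenario, site, nodes the site can see, quorum server grants to this site)
SCENARIOS = [
    ("everything up",         "New York",   2, True),
    ("New Jersey down",       "New York",   1, True),
    ("inter-site link lost",  "New York",   1, True),
    ("inter-site link lost",  "New Jersey", 1, False),
    ("quorum server down",    "New York",   2, False),
    ("quorum server down",    "New Jersey", 2, False),
    ("NJ and server down",    "New York",   1, False),
]

RESULTS = {
    (scenario, site): site_keeps_services(seen, 2, grants)
    for scenario, site, seen, grants in SCENARIOS
}

if __name__ == "__main__":
    for (scenario, site), running in RESULTS.items():
        print(f"{scenario:22} {site:11} {'runs' if running else 'stops'}")
```

Only the last row leaves the cluster with no service at all, which matches the double-failure case described above.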
Security Concerns
If you want to run your quorum server communications across networks which mig


