In this exciting episode of the Assimilation Project Neo4j schema series, I'll go over how the monitoring ring structures (which are key to its scalability) are represented in Neo4j. This is in some ways pretty simple, and in some ways kind of clever. Either way, it is one of the more critical parts of the schema to update correctly. Below is an example of a 3-node ring - after three nodes have joined and been connected up in a ring. This particular ring was produced by one of my test cases.
Some of you who read the Neo4j mailing list will remember my struggles with coming up with a schema for this - I mentioned that I was going to have to think harder, and finished with something like "I hate that". The schema below is the result of this harder thinking. Although there was some thinking involved, it wasn't so much harder thinking as it was retraining my brain to express my problem within the power and limitations of a graph database, and of Neo4j in particular. Although I didn't initially like the arbitrarily-named ring relationships, the idea has grown on me and I kind of like it by now. If you don't like it, then by all means let me know in the comments, and tell me how I should have done it.
Fans of my previous post will note the introduction of a new entity - a Ring - which I have named "The_One_Ring" (with a nod and apologies to J. R. R. Tolkien). Each of the Drones has a RingMember_The_One_Ring relationship pointing to the ring it belongs to. If a Drone belongs to more than one ring, then it will have more than one RingMember_* relationship, one per ring. Note that this is where the idea of a formal schema gets stretched a little thin. There is no pre-set limit on the types (or names) of RingMember_* relationships that might exist - you'll have one for every subnet, and one for every switch. But because Neo4j is schemaless, and our code is flexible and consistent, no one cares.
Then, there are the RingNext_The_One_Ring relationships. Each member that has a RingMember_The_One_Ring relationship also has exactly one RingNext_The_One_Ring relationship, and together these RingNext_* relationships always form a cycle. They represent exactly the monitoring rings which are described on the Assimilation web site, and which appear in our earliest blog post on the subject. The only difference is that in the packet world we have to construct separate links in each direction, whereas in Neo4j there is no reason to create two relationships, since you can traverse a relationship in either direction. However, they aren't truly bi-directional links, since there is an ordering of nodes on the ring.
Whenever a node is added to a monitoring ring, the corresponding RingNext_* relationships are added to the Neo4j model of the world at the same time.
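To make the shape of this concrete, here is a minimal in-memory sketch in Python of a ring and its two kinds of relationships. It is not the project's code and it does not talk to Neo4j: the Ring class, its method names, the self-loop for a lone member, and the choice of where a newcomer gets spliced in are all assumptions made for illustration. Only the RingMember_* and RingNext_* naming pattern comes from the schema described above.

```python
class Ring:
    """In-memory stand-in for one monitoring ring (illustration only)."""

    def __init__(self, name):
        self.name = name
        # Relationship type names are derived from the ring name.
        self.member_rel = "RingMember_" + name
        self.next_rel = "RingNext_" + name
        # drone -> the drone its outgoing RingNext_* relationship points to
        self.next_of = {}

    def join(self, drone):
        """Add a drone while keeping the RingNext_* relationships cyclic."""
        if drone in self.next_of:
            raise ValueError(drone + " is already a member of " + self.name)
        if not self.next_of:
            # Assumption: a lone member simply points at itself.
            self.next_of[drone] = drone
            return
        # Assumption: splice the newcomer in right after the first member.
        anchor = next(iter(self.next_of))
        self.next_of[drone] = self.next_of[anchor]
        self.next_of[anchor] = drone

    def members_in_order(self, start):
        """Follow RingNext_* from start until we arrive back at start."""
        order = [start]
        current = self.next_of[start]
        while current != start:
            order.append(current)
            current = self.next_of[current]
        return order

    def relationships(self):
        """List (from, type, to) triples as they would appear in the graph."""
        rels = [(d, self.member_rel, self.name) for d in self.next_of]
        rels += [(d, self.next_rel, n) for d, n in self.next_of.items()]
        return rels


ring = Ring("The_One_Ring")
for drone in ("drone1", "drone2", "drone3"):
    ring.join(drone)

for rel in ring.relationships():
    print(rel)
```

Each Drone ends up with one RingMember_* relationship to the ring and one outgoing RingNext_* relationship, and walking the RingNext_* relationships from any member visits every member once before returning to the start.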
There are two deficiencies in the current ring management code - which will be remedied as time permits. These are:
- The RingMember_* relationships ought to have an IPaddr attribute which says which IP address of the server is used in this particular ring. Because servers may have multiple interfaces, many rings have network topology affinity, and interfaces and IP addresses can change over time, it would be good to know for sure which address was used when it joined the ring. This is simple to cure, but it's not my priority this week ;-)
- The maintenance of rings both in the database and in the network needs to be protected with transactions. It is essential that these rings are never corrupted, even if the server crashes while updating them. It is also vital that the database and the network be in sync. You definitely don't want to have to do the equivalent of an fsck on the network and the database to figure out what was broken because of the crash. That would get a significantly high reading on the old suck-o-meter. This one is not nearly as simple to cure.
Neo4j has transactions, but the REST interface I'm currently using doesn't support them easily. It is also important that updates to the rings in the network match the database. If the Collective Management Authority (CMA) crashes while doing a ring update, it is essential that we know that the update was in progress when the system crashed so we can complete it. For this purpose, Neo4j transactions are not directly helpful, since they can't repair the rings out in the network when they are completed after a crash. Most of the other updates to the database are idempotent - that is, they can be repeated with no ill effects. However, maintaining a linked list in the database and in the network requires transactional semantics that go beyond the obvious database transaction paradigm. This will be an interesting puzzle to solve, and the right solution will require some thought. Although it might make my head hurt, it will be a good kind of hurt...
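One generic way to know that an update was in progress when the system crashed is to write down the intent durably before starting, and erase it only once everything is done. The Python sketch below shows that pattern and nothing more. It is not a design the project has adopted: IntentJournal, join_ring, recover and the file name are all hypothetical, and the scheme only works if the update function passed in is written so that repeating it is harmless.

```python
import json
import os
import tempfile


class IntentJournal:
    """Hypothetical write-ahead record of one in-progress ring update."""

    def __init__(self, path):
        self.path = path

    def begin(self, ring, drone):
        # Record the intent on disk *before* touching database or network.
        with open(self.path, "w") as f:
            json.dump({"ring": ring, "drone": drone}, f)
            f.flush()
            os.fsync(f.fileno())

    def finish(self):
        # The update is complete everywhere, so forget the intent.
        if os.path.exists(self.path):
            os.remove(self.path)

    def pending(self):
        # Return the interrupted update, or None if there wasn't one.
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)


def join_ring(journal, ring, drone, apply_update):
    journal.begin(ring, drone)
    apply_update(ring, drone)  # assumed to be safe to repeat
    journal.finish()


def recover(journal, apply_update):
    """At startup, complete any update that a crash interrupted."""
    intent = journal.pending()
    if intent is not None:
        apply_update(intent["ring"], intent["drone"])
        journal.finish()
    return intent


# Simulate a crash in the middle of an update, then a restart.
applied = []


def crashing_update(ring, drone):
    raise RuntimeError("simulated CMA crash")


def working_update(ring, drone):
    applied.append((ring, drone))


journal = IntentJournal(os.path.join(tempfile.mkdtemp(), "ring_intent.json"))
try:
    join_ring(journal, "The_One_Ring", "drone4", crashing_update)
except RuntimeError:
    pass
recovered = recover(journal, working_update)
```

The hard part that this sketch waves away is exactly the one described above: making the combined database-plus-network update repeatable, so that finishing it a second time after a crash does no damage.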
You may recall that in my previous post I mentioned that what I showed there could easily have been represented in a hierarchical database. Now, this part of the graph doesn't look hierarchical at all - and it's not even a DAG (Directed Acyclic Graph). There is always a cycle in it - and in a larger data center with multiple levels of rings, there will be many cycles by design. One could represent the rings and their ordering in a relational database, but it would be painful. Although Neo4j doesn't enforce my schema, it does enforce something very useful - nodes cannot be removed if that would leave dangling relationships. Although you can write something similar with relational database constraints, it is expensive to enforce - lots of looking up nodes in secondary indices before deleting and so on. Definitely slow and messy.
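As a toy illustration of that guarantee, here is a small Python model of a graph that refuses to delete a node while any relationship still touches it. This is not the Neo4j API, just the behavior described above in miniature. The TinyGraph class and its method names are made up for the example.

```python
class TinyGraph:
    """Toy graph that never allows dangling relationships."""

    def __init__(self):
        self.nodes = set()
        self.rels = set()  # (from_node, rel_type, to_node) triples

    def add_node(self, node):
        self.nodes.add(node)

    def relate(self, a, rel_type, b):
        # Both endpoints must already exist.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        self.rels.add((a, rel_type, b))

    def detach(self, node):
        # Remove every relationship that starts or ends at this node.
        self.rels = {r for r in self.rels if node not in (r[0], r[2])}

    def delete_node(self, node):
        attached = [r for r in self.rels if node in (r[0], r[2])]
        if attached:
            raise ValueError(
                "%s still has %d relationship(s)" % (node, len(attached))
            )
        self.nodes.remove(node)


g = TinyGraph()
for n in ("drone1", "drone2", "The_One_Ring"):
    g.add_node(n)
g.relate("drone1", "RingMember_The_One_Ring", "The_One_Ring")
g.relate("drone1", "RingNext_The_One_Ring", "drone2")

try:
    g.delete_node("drone1")
    refused = False
except ValueError:
    refused = True  # the node is still wired into the ring

g.detach("drone1")
g.delete_node("drone1")  # succeeds once nothing points to or from it
```

For the rings, the practical upshot is that a Drone cannot vanish from the database while it is still linked into a ring: the ring has to be repaired around it first.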
There are still a lot of things to cover - some that I've already put in the code, and some that are coming up soon on my radar screen. Stay tuned for another exciting episode in Assimilation Schema for Neo4j next week.