
13 August 2012



Michael Hunger


In general this is a sound approach, with two caveats. First, you create contention on your reference node when you create and assign new nodes from multiple threads. (One solution could be to batch the updates to this "secondary index" structure.)
Second, in some domains the reference node will accumulate hundreds of millions of relationships, and reading from it hurts, especially if you have traversals that accidentally touch the reference node (by following the instance_of relationship) and then explore all of its many other relationships in the next hop.
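
To make the batching idea concrete, here is a minimal sketch in plain Python. It is not the Neo4j API: `BatchedIndexer`, `submit`, `flush`, and the `ref_lock` that stands in for the write lock on the reference node are all hypothetical names chosen for illustration. Worker threads only enqueue their new nodes, and the lock on the reference node is taken once per batch rather than once per node.

```python
import queue
import threading


class BatchedIndexer:
    """Collects (type_name, node_id) pairs and links them to the
    reference node in batches. Hypothetical illustration only."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.pending = queue.Queue()      # thread-safe hand-off from worker threads
        self.ref_lock = threading.Lock()  # stands in for the reference node's write lock
        self.instances = {}               # type_name -> list of node ids
        self.lock_acquisitions = 0        # how often the reference node was locked

    def submit(self, type_name, node_id):
        """Called by worker threads; never touches the reference node."""
        self.pending.put((type_name, node_id))

    def flush(self):
        """Drain the queue, locking the reference node once per batch."""
        while True:
            batch = []
            while len(batch) < self.batch_size:
                try:
                    batch.append(self.pending.get_nowait())
                except queue.Empty:
                    break
            if not batch:
                return
            with self.ref_lock:
                self.lock_acquisitions += 1
                for type_name, node_id in batch:
                    self.instances.setdefault(type_name, []).append(node_id)
```

In this sketch a single caller invokes `flush()` periodically, or after a burst of creations. The trade-off is that new nodes are not reachable through the reference node until the next flush.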

Both can be alleviated by creating a tree structure. The first level can be concrete ref-nodes per type, and if those still collect too many relationships you can introduce further levels beneath them, partitioned e.g. by a domain aspect like time, location, subnet, customer, datacenter, you name it.

See http://docs.neo4j.org/chunked/milestone/cypher-cookbook-path-tree.html
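
As a rough illustration of the tree structure described above, the following sketch builds it as a plain in-memory graph in Python. It does not use Neo4j: the `Node` class, the helpers `get_or_create_child` and `attach`, and the relationship names `TYPE`, `PARTITION`, and `INSTANCE` are all assumptions made for the example. For simplicity it stores relationships from parent to child, whereas the post's instance_of relationship points from the instance toward the reference node.

```python
from collections import defaultdict


class Node:
    """A toy graph node; rels maps a relationship type to target nodes."""

    def __init__(self, name):
        self.name = name
        self.rels = defaultdict(list)

    def relate(self, rel_type, other):
        self.rels[rel_type].append(other)


def get_or_create_child(parent, rel_type, name):
    """Return the child with this name under parent, creating it if missing."""
    for child in parent.rels[rel_type]:
        if child.name == name:
            return child
    child = Node(name)
    parent.relate(rel_type, child)
    return child


def attach(root, node, type_name, partition):
    """Hang node under root -> per-type ref-node -> partition node."""
    type_node = get_or_create_child(root, "TYPE", type_name)
    part_node = get_or_create_child(type_node, "PARTITION", partition)
    part_node.relate("INSTANCE", node)
    return part_node
```

With this layout the root holds only one relationship per type, each type node holds one per partition, and a traversal that reaches a partition node fans out only to the instances in that partition.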


Alan R.

This is good information to know. I don't currently expect my problem domain to create large numbers of nodes of the same type in a short period of time on a routine basis. That would imply that lots of new hardware showed up all at once, which is likely to occur only during initial installation. But I much appreciate your expert opinion, and will keep your advice in mind as we go forward. Thanks much!

The comments to this entry are closed.
