Graph Database Feature Comparison Table

Pere Urbón Bayes has put together an excellent table comparing features of notable graph databases (post, table (PDF)). It covers Neo4j, HyperGraphDB, DEX, InfoGrid, Sones and VertexDB.

One can quibble, as one always can with tables like this, but it is the best comparison of graph databases so far that I’m aware of. Check it out.

Three’s a Crowd: Neo4j, Sones, Filament all implement InfoGrid’s FirstStep Example

Little did I know what I was starting when I put up InfoGrid’s FirstStep example. The example creates just a few nodes and a few edges to show, in principle, how to build a URL tagging application based on a graph database like InfoGrid.

Alex Popescu at MyNoSQL challenged the Neo4j folks to show how they would implement it, and they responded promptly. Then the guys at Sones implemented the same example themselves, and just now the Filament project did the same. Worth a blog post with the links!

Here they are:

for your reading and comparing pleasure.

I’m tempted to list my own observations, but I’d like to avoid a blogging contest in which — naturally — everybody will claim “but the way we do it is better”. Independent reviews anybody?

Access Control the InfoGrid Way

We do it very similarly in InfoGrid.

But we can go one big step further: have InfoGrid automatically enforce the access control rules that were set up. If we have the ACL information, why not use it and have the graph database do the enforcement for us? That functionality has been part of InfoGrid for a couple of years.

For some detailed examples of how this works, consult the security tests that are part of InfoGrid’s automated test suite (particularly MeshBaseSecurityTest5).

Here’s the basic idea. (Again, I am paraphrasing the code for easier readability; consult the above link for the full code.) When a graph database in InfoGrid is configured to run with an AclBasedAccessManager, we can do this:

First, we need a MeshObject that is going to be the owner of some access-controlled data object. Any MeshObject will do:

MeshObject owner = createMeshObject();

Then, we associate the owner with the current Thread. (Just like in UNIX, where the ownership of a process determines what the process can do.)

theAccessManager.setCaller( owner );

Now here comes the access-controlled data object.

MeshObject dataObject = createMeshObject();

Because the current Thread is associated with the owner MeshObject, InfoGrid automatically sets up an ownership relationship between the data object and the owner object — just like in UNIX, a newly created file automatically has an owner.

Going beyond UNIX, we can now put the data object into something we call a ProtectionDomain. It’s basically a collection of MeshObjects that all have the same access control policy. This is mainly for efficiency and ease of management.

MeshObject domain = createMeshObject( AclBasedSecuritySubjectArea.PROTECTIONDOMAIN );
domain.relateAndBless( AclBasedSecuritySubjectArea.PROTECTIONDOMAIN_GOVERNS_MESHOBJECT.getSource(), dataObject );

Now, let’s give another entity some access rights to the data object:

MeshObject actorMayReadNotWrite = createMeshObject();
actorMayReadNotWrite.relateAndBless( AclBasedSecuritySubjectArea.MESHOBJECT_HASREADACCESSTO_PROTECTIONDOMAIN.getSource(), domain );

Note that it is the owner of the object that needs to do that; others can’t.

So now we change ownership on the thread.

theAccessManager.setCaller( actorMayReadNotWrite );

This call will succeed:

dataObject.getPropertyValue( <some property type> );

while this call will throw a NotPermittedException:

dataObject.setPropertyValue( <some property type>, <some property value> );

If the thread were currently associated with the owner, both calls would succeed. Again, I refer you to the full code linked above. As you can see, it works very similarly to how permissions work in UNIX, although of course the underlying ACL information is represented as a MeshObjectGraph.
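To make the sequence of steps above easier to try out in isolation, here is a small self-contained sketch of the same idea in plain Java. It does not use InfoGrid at all: SketchAccessManager, SketchDomain, SketchObject and SketchNotPermittedException are hypothetical stand-ins invented for this illustration, and the rule that an object created without a caller is unprotected is an assumption of the sketch, not a statement about InfoGrid’s behavior.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the exception thrown on a refused operation.
class SketchNotPermittedException extends RuntimeException {
    SketchNotPermittedException(String message) { super(message); }
}

// Toy access manager: remembers the caller for the current thread.
class SketchAccessManager {
    private final ThreadLocal<SketchObject> caller = new ThreadLocal<>();
    void setCaller(SketchObject newCaller) { caller.set(newCaller); }
    SketchObject getCaller() { return caller.get(); }
}

// Toy protection domain: one shared read policy for all objects placed in it.
class SketchDomain {
    private final SketchAccessManager manager;
    private final SketchObject owner; // whoever was the caller at creation time
    private final Set<SketchObject> readers = new HashSet<>();

    SketchDomain(SketchAccessManager manager) {
        this.manager = manager;
        this.owner = manager.getCaller();
    }

    // Only the owner of the domain may hand out read access.
    void grantRead(SketchObject actor) {
        if (manager.getCaller() != owner) {
            throw new SketchNotPermittedException("only the owner may grant access");
        }
        readers.add(actor);
    }

    boolean mayRead(SketchObject actor) { return readers.contains(actor); }
}

// Toy data object: ownership is taken from the caller at creation time.
class SketchObject {
    private final SketchAccessManager manager;
    private final SketchObject owner; // null means "unprotected" in this sketch
    private final Map<String, Object> properties = new HashMap<>();
    private SketchDomain domain;

    SketchObject(SketchAccessManager manager) {
        this.manager = manager;
        this.owner = manager.getCaller();
    }

    void setDomain(SketchDomain newDomain) {
        requireOwner("change the protection domain");
        domain = newDomain;
    }

    Object getProperty(String name) {
        SketchObject caller = manager.getCaller();
        boolean allowed = owner == null || caller == owner
                || (domain != null && domain.mayRead(caller));
        if (!allowed) {
            throw new SketchNotPermittedException("caller may not read");
        }
        return properties.get(name);
    }

    void setProperty(String name, Object value) {
        requireOwner("write");
        properties.put(name, value);
    }

    private void requireOwner(String action) {
        if (owner != null && manager.getCaller() != owner) {
            throw new SketchNotPermittedException("caller may not " + action);
        }
    }
}

class AclSketchDemo {
    public static void main(String[] args) {
        SketchAccessManager accessManager = new SketchAccessManager();
        SketchObject owner = new SketchObject(accessManager);
        accessManager.setCaller(owner);

        SketchObject dataObject = new SketchObject(accessManager); // owned by owner
        SketchDomain domain = new SketchDomain(accessManager);
        dataObject.setDomain(domain);
        dataObject.setProperty("title", "hello");

        SketchObject actorMayReadNotWrite = new SketchObject(accessManager);
        domain.grantRead(actorMayReadNotWrite);

        accessManager.setCaller(actorMayReadNotWrite);
        System.out.println("read: " + dataObject.getProperty("title"));
        try {
            dataObject.setProperty("title", "changed");
        } catch (SketchNotPermittedException ex) {
            System.out.println("write refused: " + ex.getMessage());
        }
    }
}
```

The sketch keeps the ACL in ordinary Java collections for brevity. In InfoGrid, as described above, the same information lives in the graph itself as relationships between MeshObjects.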

If you like this, we can do even one better: the whole security mechanism is pluggable in InfoGrid. You don’t like the way we represent and enforce ACLs? Be our guest … write a new subclass of AccessManager, and it will work the way you want. (Did we say that InfoGrid is extremely pluggable?)

P.S. It’s great to see that we aren’t the only ones to think that security-related information is an excellent match for a graph database. There’s also the rather intriguing example of where Microsoft is going with their LDAP directory, which very much looks like evolution in the graph direction. Time to get on board graph databases!

Comparing FirstStep in InfoGrid and Neo4j

Alex Popescu has a great comparison of how the InfoGrid FirstStep example would look in Neo4j, another graph database. As I noted in an earlier post, there are far more similarities in our approaches to the basics of graph databases than there are differences.

A couple of comments addressing some of Alex’s notes. He says:

everything in Neo4j must happen inside a transaction even if it’s a graph traversal operation (this gives a very strong Isolation level). The InfoGrid traversal code seem to happen outside the transaction…

That’s correct. You can do the traversal inside a transaction if you like, but you are not required to. This gives application developers a choice of concurrency control: transactions, critical sections, or no protection at all.

Re InfoGrid terminology: its ancient roots are in object modeling (think UML). For example, we still talk about InfoGrid Models. However, over time it became clear that InfoGrid’s core ideas are distinct enough to warrant their own terms. So when we moved from InfoGrid V1 to V2 a few years ago, we changed terms. For example, an “instance” (in UML or programming terminology) aka “node” (graph terminology) can rarely have more than one type anywhere other than in InfoGrid. Think of a Java object that has more than one class, where you can dynamically add classes to and remove classes from the instance at run-time. So we call them MeshObjects rather than something people might have the wrong connotations with. The closest thing we are aware of is Perl’s “bless”, which is why we use that term.
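For readers who think in Java, the idea of an instance that carries several types at once, added and removed at run-time, can be approximated as below. This is only a toy: BlessableNode and its method names are made up for this sketch, types are plain strings, and none of it is InfoGrid’s actual API.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical node that can carry any number of types at the same time.
class BlessableNode {
    private final String identifier;
    private final Set<String> types = new LinkedHashSet<>();

    BlessableNode(String identifier) { this.identifier = identifier; }

    String getIdentifier() { return identifier; }

    // Add a type at run-time; blessing twice with the same type has no further effect.
    void bless(String type) { types.add(type); }

    // Remove a type at run-time; the node itself stays in place.
    void unbless(String type) { types.remove(type); }

    boolean isBlessedBy(String type) { return types.contains(type); }

    // Read-only view of the current types, in the order they were added.
    Set<String> getTypes() { return Collections.unmodifiableSet(types); }
}

class BlessSketchDemo {
    public static void main(String[] args) {
        BlessableNode node = new BlessableNode("http://example.com/");
        node.bless("WebResource");
        node.bless("Bookmark");      // the same node now has two types
        System.out.println(node.getTypes());

        node.unbless("Bookmark");    // and can lose one again later
        System.out.println(node.getTypes());
    }
}
```

A regular Java object cannot change its class after construction, which is exactly the limitation the paragraph above describes.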

the Neo4j uses also the LuceneIndexService for indexing both the tag and web resources nodes, but that’s only because the code there makes sure not to duplicate either tags or web resources (i.e. this functionality is not present in the InfoGrid code and I don’t know how that would look like)

Correction. You are invited to modify the example and attempt to create a second MeshObject with the same identifier. You can’t (it will throw an exception at you).
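The behavior described here, a store that refuses to create a second object under an identifier already in use, can be sketched in a few lines of plain Java. IdentifierStore, IdentifiedNode and DuplicateIdentifierSketchException are hypothetical names for this illustration only. How InfoGrid implements the check internally is not shown in this post, so treat the map-based approach as an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception for an identifier that is already taken.
class DuplicateIdentifierSketchException extends RuntimeException {
    DuplicateIdentifierSketchException(String identifier) {
        super("identifier already in use: " + identifier);
    }
}

class IdentifiedNode {
    private final String identifier;
    IdentifiedNode(String identifier) { this.identifier = identifier; }
    String getIdentifier() { return identifier; }
}

// Toy store that guarantees at most one node per identifier.
class IdentifierStore {
    private final Map<String, IdentifiedNode> nodes = new HashMap<>();

    IdentifiedNode create(String identifier) {
        if (nodes.containsKey(identifier)) {
            throw new DuplicateIdentifierSketchException(identifier);
        }
        IdentifiedNode node = new IdentifiedNode(identifier);
        nodes.put(identifier, node);
        return node;
    }

    // Returns the existing node, or null if the identifier is unknown.
    IdentifiedNode find(String identifier) { return nodes.get(identifier); }
}

class IdentifierSketchDemo {
    public static void main(String[] args) {
        IdentifierStore store = new IdentifierStore();
        store.create("tag:funny");
        try {
            store.create("tag:funny"); // second attempt with the same identifier
        } catch (DuplicateIdentifierSketchException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

With uniqueness enforced at creation time, the identifier itself is enough to find an existing tag or web resource again.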

But never mind the comparatively minor differences between Neo4j and InfoGrid. We should compare this to all the stuff one would have to do with a relational database to build the same thing. Object-relational mapping anybody? No thanks …