InfoGrid Screencasts on YouTube

InfoGrid is now on YouTube at InfoGridDotOrg.

Occasionally we’ll post a software demo. The first one is there already: a screencast of the MeshWorld example application for the InfoGrid graph database. It shows:

  • How to create and delete MeshObjects (ie. nodes)
  • How to relate them to each other (ie. edges)

It follows the FirstStep example, except that MeshWorld is a web application, while FirstStep is a command-line application. There’ll be more to come.

Graph Database Feature Comparison Table

Pere Urbón Bayes has put together an excellent table comparing features of notable graph databases: post, table (PDF): Neo4j, HyperGraphDB, DEX, InfoGrid, Sones and VertexDB.

One can quibble, as one always can with tables like this, but it is the best comparison of graph databases so far that I’m aware of. Check it out.

Operations on a Graph Database (Part 7 – Sets)

Graph Database Tutorial

Part 1: Nodes

Part 2: Edges

Part 3: Types

Part 4: Properties

Part 5: Identifiers

Part 6: Traversals

Part 7: Sets

Part 8: Events

Sets are a core concept of most databases. For example, any SQL SELECT statement in a relational database produces a set. Sets apply to Graph Databases just as well and are just as useful:

The most frequently encountered set of nodes in a Graph Database is the result of a traversal. For example, in InfoGrid, all traversal operations result in a set like this:

MeshObject    startNode     = ...; // some start node
MeshObjectSet neighborNodes = startNode.traverseToNeighbors();

We might as well have returned an array, or an Iterator over the members of the set, were it not for the fact that there are well-understood set operations that often make our jobs as developers much simpler: like set unification, intersection and so forth.

For example, in a social bookmarking application we might want to find out which sites both you and I have bookmarked. Code might look like this:

MeshObject me  = ...; // node representing me
MeshObject you = ...; // node representing you

TraversalSpecification ME_TO_BOOKMARKS_SPEC = ...;
    // how to get from a person to their bookmarks, see post on traversals
MeshObjectSet myBookmarks   = me.traverse( ME_TO_BOOKMARKS_SPEC );
MeshObjectSet yourBookmarks = you.traverse( ME_TO_BOOKMARKS_SPEC );

// Bookmarks that you and I share
MeshObjectSet sharedBookmarks = myBookmarks.intersect( yourBookmarks );

Notice how simple this code is to understand? One of the powers of sets. Or, if you know what a “minus” operation is on a set, this is immediately obvious:

// Bookmarks unique to me
MeshObjectSet myUniqueBookmarks = myBookmarks.minus( yourBookmarks );

This is clearly much simpler than writing imperative code which would have lots of loops and if/then/else’s and comparisons and perhaps indexes in it. (And seeing this might put some concerns to rest that NoSQL databases are primitive because they don’t have a SQL-like query language. I’d argue it’s less the language but the power of sets, and if you have sets you have a lot of power at your fingertips.)

To check out sets in InfoGrid, try package org.infogrid.mesh.set. Clearly much more can be done than we have so far in InfoGrid, but it’s a very useful start in our experience.

Welcome FlockDB

Nick Kallen at Twitter last night released FlockDB, Twitter’s social graph database. Source code is here.

If anybody doubted that graph databases are real, or are useful, this release is yet another good reason to investigate. Welcome FlockDB to the crowd.

I haven’t had time to take a detailed look, but it appears that FlockDB has a hard-coded schema developed specifically for the needs at Twitter. That makes a lot of sense for Twitter but less so as a general-purpose graph database. On the other hand, lots of people could probably benefit from that schema when building social applications. We’ll see.

So, welcome.

Big Data Workshop April 23, Mountain View, CA

I’m planning to be at Big Data Workshop, the first unconference on NoSQL and Big Data. If past events moderated by Kaliya Hamlin are any guide, it will be a great opportunity for everybody:

  • to explore together how the Big Data market will be coming together
  • to understand how the key technologies and projects work
  • what interfaces and interoperability standards are emerging and/or needed
  • how we can grow the overall market and make it easier for everybody to adopt these technologies for interesting new projects.

Arguably, without Internet Identity Workshop (also moderated by Kaliya) was the enabler for the stunning adoption rate over the past five years of OpenID, OAuth and related technologies (at last count, more than 1 billion enabled accounts). I hope history repeats itself here.

P.S. Feel free to corner me on InfoGrid, graph databases or any other subject. That’s the whole point of an unconference.