The Bless Relationships API

InfoGrid supports types on both nodes and edges, aka MeshObjects and Relationships. They are called EntityTypes and RelationshipTypes, respectively.

Assume we have two related MeshObjects, each blessed with a (hypothetical) type Person. Like this:

MeshObject obj1 = ...;
MeshObject obj2 = ...;
obj1.bless( PERSON );
obj2.bless( PERSON );
obj1.relate( obj2 );

Now if we want to express that the second Person is the daughter of the first Person, we cannot do this:

obj1.blessRelationship( PERSON_PARENT_PERSON, obj2 );

because we don’t know, from this API, whether obj1 or obj2 is the daughter or the parent.

Instead, in InfoGrid, we have to explicitly specify the direction, and we do this by specifying the “end” of the relationship type instead of the relationship type for the bless operation, like this:

obj1.blessRelationship( PERSON_PARENT_PERSON.getSource(), obj2 );

Hopefully, the documentation of the parent relationship type in the model indicates the semantics of the source and the destination of each relationship type. These “ends” are called RoleTypes, by the way.

One nice side effect of this approach to the API is that it becomes quite straightforward to extend it if InfoGrid ever decided to support ternary or N-ary relationship types: Just pick a RoleType beyond a source or destination RoleType.

If you bless relationships from the HttpShell — the simplest way of manipulating the graph from a web application — you thus need to specify the identifier of the RoleType, not of the RelationshipType. By convention, the source RoleType of a RelationshipType with identifier ID is ID-S, and ID-D for the destination.
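
The idea can be sketched with a few self-contained Java types. These are hypothetical stand-ins, not InfoGrid's actual classes: only getSource and the ID-S / ID-D convention come from the text above, while getDestination and getIdentifier are assumed for illustration.

```java
// Hypothetical stand-ins for illustration only; not InfoGrid's real classes.
final class RoleTypeSketch {

    /** One "end" of a relationship type. */
    static final class RoleType {
        private final String identifier;

        RoleType(String identifier) {
            this.identifier = identifier;
        }

        String getIdentifier() {
            return identifier;
        }
    }

    /** A binary relationship type that owns a source end and a destination end. */
    static final class RelationshipType {
        private final RoleType source;
        private final RoleType destination;

        RelationshipType(String id) {
            // Identifier convention described in the text: ID-S and ID-D.
            this.source = new RoleType(id + "-S");
            this.destination = new RoleType(id + "-D");
        }

        RoleType getSource() {
            return source;
        }

        // Assumed by analogy with getSource.
        RoleType getDestination() {
            return destination;
        }
    }

    public static void main(String[] args) {
        RelationshipType parent = new RelationshipType("PERSON_PARENT_PERSON");
        // Blessing with an end rather than the whole type is what fixes the direction.
        System.out.println(parent.getSource().getIdentifier());
        System.out.println(parent.getDestination().getIdentifier());
    }
}
```

Because each end is an object of its own, a ternary relationship type would simply own a third RoleType next to these two.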

Error Messages, Parameters and Other Resources In InfoGrid

InfoGrid internally configures all parameters and messages via resource files. This makes internationalization much easier, and gives application developers a way of changing internal parameters if so desired. InfoGrid includes an override mechanism that allows developers to use different values than InfoGrid’s default values without having to change InfoGrid’s property files.

Let’s take an example. Class org.infogrid.httpd.HttpServer implements a very simple HTTP server that has been quite useful, among other things, for automated testing. Its default listening port is not hard-coded, but configured via a resource. In the code, it looks as follows:

private static final ResourceHelper theResourceHelper
        = ResourceHelper.getInstance( HttpServer.class );

protected static final int DEFAULT_ACCEPT_PORT
        = theResourceHelper.getResourceIntegerOrDefault(
                "DefaultAcceptPort",
                8081 );

Class org.infogrid.util.ResourceHelper is basically a wrapper around the Java resource facilities, but with some additional facilities, such as the ability to parse integers and use defaults as can be seen from this code fragment. When this code runs, InfoGrid looks for an override first (more about this below), then for a property file called org/infogrid/httpd/HttpServer.properties. In that file, it looks for a line that looks like this:

DefaultAcceptPort=1234

If found, DEFAULT_ACCEPT_PORT will be assigned that value. If not found (as in the current InfoGrid version, as the line is commented out), it will use the default specified in the code, here 8081.

If a developer wishes to override the property in that file, they could of course change that property file, but that would effectively create an InfoGrid branch, which is undesirable. Instead, an override mechanism can be used:

The ResourceHelper can be initialized with an “application resource bundle”, which essentially is just another property file specific to the application. Resources specified in that file override all others. An application developer could specify, in that file:

org.infogrid.httpd.HttpServer!DefaultAcceptPort=80

and DEFAULT_ACCEPT_PORT will become 80. It is necessary to qualify the name of the resource with the fully-qualified name of the ResourceHelper so no naming collisions occur across modules or developers, hence the prefix separated by the !.
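
The lookup order described above can be sketched in plain Java with java.util.Properties. This is not ResourceHelper's actual implementation: the class and method names (OverrideLookupSketch, lookupInt) are made up for illustration, and falling back to the default on an unparsable value is an assumption. Only the order (override, then the class's own property file, then the default in code) and the ! separator come from the text.

```java
import java.util.Properties;

// Illustrative sketch only; not the real org.infogrid.util.ResourceHelper.
final class OverrideLookupSketch {

    /**
     * Resolves an integer resource: application override first, then the
     * owner's own properties, then the default given in code.
     */
    static int lookupInt(Properties appOverrides,
                         Properties ownProperties,
                         String ownerName,
                         String key,
                         int defaultValue) {
        // Overrides are qualified as <ownerName>!<key> to avoid naming collisions.
        String value = appOverrides.getProperty(ownerName + "!" + key);
        if (value == null) {
            value = ownProperties.getProperty(key);
        }
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException ex) {
            return defaultValue; // assumption: unparsable values fall back to the default
        }
    }

    public static void main(String[] args) {
        Properties app = new Properties();
        app.setProperty("org.infogrid.httpd.HttpServer!DefaultAcceptPort", "80");

        // Empty, as if the line in HttpServer.properties were commented out.
        Properties own = new Properties();

        int port = lookupInt(app, own,
                "org.infogrid.httpd.HttpServer", "DefaultAcceptPort", 8081);
        System.out.println(port);
    }
}
```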

The application resource bundle for an application can be set manually, by invoking ResourceHelper.setApplicationResourceBundle( ResourceBundle ) in the application’s startup code, or, preferably when using the InfoGrid Module Framework, in the application’s module advertisement as a parameter like this:

<parameter name="org.infogrid.util.ResourceHelper.ApplicationResourceBundle"
        value="com/example/ExampleApp" />

This identifies file com/example/ExampleApp.properties as the application resource file.

One More Word on GraphDBs and Schemas

myNoSQL picked up our recent post on why having a schema is A Good Thing. In the comments to his post, various other graph database vendors speak up. They are far more ambivalent on the matter than we are …

For now, I think of Brian Akers’s take on this subject as the final word. With the comment “I know where everything is… don’t touch”, he shows this picture:

[very messy office]

If you can afford code and data like this, be my (schema-less) guest. Personally, I can’t.

’nuff said.

Required vs. Optional Property Values

InfoGrid distinguishes between properties that must have a non-null value, and properties that may or may not be null.

When creating an InfoGrid model, a developer has to specify which by using the <isoptional/> tag in the model file.

Why?

By way of parallel, consider the following piece of Java code:

class Foo {
    private int max1 = 10;
    private Integer max2 = 20;

    public void doSomething() {
        for( int i=0 ; i<max1 ; ++i ) {
            //...
        }
        for( int i=0 ; i<max2 ; ++i ) {
            //...
        }
    }
}

Spot the problem? max2 of course might be null, which means our code will throw an exception in the innocent-looking second for loop. To get the code right, we will have to protect that section with an if-then-else section that checks for null first.
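
A minimal sketch of such a guard follows. The class is renamed GuardedFoo, max2 is passed in through a constructor, and doSomething returns an iteration count so the effect is visible; all three are illustrative choices, not part of the example above.

```java
// Illustrative variant of the Foo example; names and return value are assumptions.
final class GuardedFoo {
    private final int max1 = 10;   // primitive: can never be null
    private final Integer max2;    // boxed: may be null

    GuardedFoo(Integer max2) {
        this.max2 = max2;
    }

    /** Counts loop iterations, skipping the second loop when max2 is null. */
    int doSomething() {
        int iterations = 0;
        for (int i = 0; i < max1; ++i) {
            ++iterations;
        }
        // Without this check, unboxing a null max2 throws a NullPointerException.
        if (max2 != null) {
            for (int i = 0; i < max2; ++i) {
                ++iterations;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(new GuardedFoo(20).doSomething());   // prints 30
        System.out.println(new GuardedFoo(null).doSomething()); // prints 10
    }
}
```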

Of course, such a protection is often the right thing to do. But in this example, a “max” should hardly ever be null, so using an “int” as a data type like for max1 (which can’t be null) is much better than using an “Integer” like for max2 (which may be null).

It’s the same thing for properties in InfoGrid models. Some properties simply should never be null. For example, consider a time stamp indicating when a MeshObject was created. Given that the MeshObject was created, the time stamp must exist, and therefore a null value makes no sense. In that case, the property would be specified as “mandatory”. In contrast, a time stamp indicating when a MeshObject is likely to become obsolete is very likely optional: we might not know that time (yet), or the MeshObject might never become obsolete, so null values are fine.

If InfoGrid did not distinguish between required and optional values, application code would be littered with unnecessary tests for null values (or, failing that, would throw unexpected NullPointerExceptions). We think being specific is better when creating the model; higher-quality and less cluttered application code is the reward.

InfoGrid Storage Interface

We’ve been getting a lot of hits from search engines looking for InfoGrid’s “Storage Interface”. That interface is called Store and here is more info about it: entry on wiki, entry to JavaDoc.

Sometimes one has to manually fix Google and that’s the sole purpose of this post ;-)

ACID Transactions Are Overrated

For many years, the canonical example of why we need database transactions has been banking. If you move $100, you don’t really want the money to be subtracted from the first account, but never added to the second because of some problem in between. You want both the subtraction and the addition to happen, or neither.

Sounds good so far. Except that, apparently, this is not how banks work in the real world, even though they certainly use plenty of database systems that have ACID transactions. The Economist (July 24, 2010, “Computer says no”) quotes a former executive of the Royal Bank of Scotland saying: “The reality was you could never be certain that anything was correct.” Continuing: “Reported numbers for the bank’s exposure were regularly billions of dollars adrift of reality.” The article offers an explanation: “banks tend to operate lots of different databases producing conflicting numbers.” HSBC is quoted: 55 separate systems for core banking, 24 for credit cards, and 41 for internet banking.

According to traditional transaction wisdom, if a customer makes an internet transaction to pay off his credit card, it should be a single transaction: start transaction, take money from checking account, put it into credit card account, commit. But transactions generally cannot span systems. Because the system responsible for internet banking is separate in this real-world example from core banking and from credit card systems, no such single ACID transaction is possible. Given the numbers above, it looks very much like those money transfers that actually can follow the canonical ACID transaction pattern (like transferring from checking to savings) constitute only a very small fraction of all transactions.

If I look at my own banking, the vast majority of my banking activity isn’t even within the same bank, but with other banks: bills to pay usually have to be paid at other banks. No cross-bank ACID database transactions that I’ve ever heard of.

So banking software necessarily has to have functionality that keeps money from being deducted but never arriving, all without depending on database transactions. If we have to have this functionality anyway, why then are transactions “indispensable”, as some people still want to make us believe?

This pattern can be generalized: the more distributed and decentralized a system is, the less likely it is that we can use transactions that span the entire system. That is certainly true for the banking system, apparently also true for systems inside banks, and in many other places. ACID transactions were invented for the mainframe, the world’s most centralized computing construct. But computing, I’m afraid, is no longer “one mainframe” as it was in the sixties.

Instead of trotting out transactions as the answer, what we need for NoSQL databases is the ability to get the same benefits in a distributed system (“nothing is lost”) without relying on transactions. That’s where all of our efforts should be.

In the InfoGrid graph database, we have a weaker form of transactions for individual MeshBases, but then synchronize with the rest of the world by passing XPRISO messages. I’d be surprised if the eventual “transactional” architecture for large-scale distributed and decentralized systems looked very different.

Tip: Always RelateIfNeeded when using HttpShell

The InfoGrid HttpShell is something rather amazing when creating web applications. With just a little bit of HTML markup, we can modify the graph of nodes and edges in the GraphDB any way we want, all without writing any handler code for it.

For edges, it understands the keywords:

  • relate
  • relateIfNeeded
  • unrelate
  • unrelateIfNeeded

Relate will fail if the two nodes in question are related already. And in practice, that turns out to be more common than one would assume; often the nodes are related already and all that is needed is to bless the edge with the appropriate RoleType.

So here’s the tip of the day: unless there are good reasons not to, use “relateIfNeeded” and “unrelateIfNeeded” for your application code.
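
The difference between the strict and the lenient keywords can be sketched with a toy in-memory graph. This is a hypothetical model of the semantics described above, not HttpShell or the InfoGrid API: ToyGraph and its methods are made up for illustration, and both the exception type and the treatment of the untyped edge as direction-less are assumptions.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the semantics only; this is not HttpShell or InfoGrid code.
final class ToyGraph {
    private final Set<String> edges = new HashSet<>();

    // Assumption: the untyped edge has no direction, so (a,b) and (b,a) share a key.
    private static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    boolean isRelated(String a, String b) {
        return edges.contains(key(a, b));
    }

    /** Strict variant: fails if the two nodes are related already. */
    void relate(String a, String b) {
        if (!edges.add(key(a, b))) {
            throw new IllegalStateException("already related: " + a + ", " + b);
        }
    }

    /** Lenient variant: does nothing if the two nodes are related already. */
    void relateIfNeeded(String a, String b) {
        edges.add(key(a, b));
    }

    /** Strict variant: fails if the two nodes are not related. */
    void unrelate(String a, String b) {
        if (!edges.remove(key(a, b))) {
            throw new IllegalStateException("not related: " + a + ", " + b);
        }
    }

    /** Lenient variant: does nothing if the two nodes are not related. */
    void unrelateIfNeeded(String a, String b) {
        edges.remove(key(a, b));
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        graph.relateIfNeeded("obj1", "obj2");
        graph.relateIfNeeded("obj1", "obj2"); // second call is harmless
        try {
            graph.relate("obj1", "obj2");     // strict call fails here
        } catch (IllegalStateException ex) {
            System.out.println("relate failed: " + ex.getMessage());
        }
    }
}
```

The lenient variants are idempotent, which is what makes them the safer default for form submissions that may be repeated.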

InfiniteGraph Implementation of FirstStep

InfiniteGraph, the currently youngest member of the GraphDB party, has now also implemented our FirstStep example. Todd Stavish’s code is here. It joins implementations of the same example from InfoGrid, Neo4j, Sones, and Filament.

[Update: see Todd's comment below. Apparently there is checking in the second step.] On cursory examination, I’m surprised that InfiniteGraph allows you (requires you?) to create edges without source and destination nodes. Only after the creation of the edge does one assign the nodes to the edge. I’m unclear why this is an advantage for any particular scenario; however, I would think there’s a clear disadvantage as an application developer, because I now have to check for null pointers that I wouldn’t have to in most other graph databases (InfoGrid included).

Hope somebody more neutral than me will perform an API comparison using this example some day.

InfoGrid Now Supports Money as a Native Data Type

There’s a new DataType called CurrencyDataType, and corresponding CurrencyValue. A CurrencyValue consists of a fixed-point decimal number and one of the ISO 4217 currencies. For example, you can say

CurrencyValue newValue = CurrencyValue.create( 12, 34, CurrencyDataType.USD );
System.out.println( newValue );

which will, by default, print:

12.34 USD

It correctly handles currencies that have 0, 1, or 3 digits after the decimal point, too.
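
How the number of digits after the decimal point depends on the currency can be illustrated with the JDK alone. The sketch below uses java.util.Currency and java.math.BigDecimal rather than InfoGrid's CurrencyValue; the format helper and its argument in minor units (such as cents) are assumptions for illustration and say nothing about how CurrencyValue is implemented.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Currency;

// JDK-only illustration; this is not InfoGrid's CurrencyValue.
final class CurrencyDigitsSketch {

    /** Renders an amount given in minor units (e.g. cents) as "<amount> <ISO code>". */
    static String format(long minorUnits, String isoCode) {
        Currency currency = Currency.getInstance(isoCode);
        int digits = currency.getDefaultFractionDigits();
        if (digits < 0) {
            // The JDK reports -1 for pseudo-currencies without a minor unit.
            throw new IllegalArgumentException("no fraction digits defined for " + isoCode);
        }
        // Fixed-point value: unscaled integer plus a scale, no floating point involved.
        BigDecimal amount = new BigDecimal(BigInteger.valueOf(minorUnits), digits);
        return amount.toPlainString() + " " + currency.getCurrencyCode();
    }

    public static void main(String[] args) {
        System.out.println(format(1234, "USD")); // 2 digits: 12.34 USD
        System.out.println(format(1234, "JPY")); // 0 digits: 1234 JPY
        System.out.println(format(1234, "BHD")); // 3 digits: 1.234 BHD
    }
}
```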

In a JSP page, you can say, as you would expect:

<mesh:property
    meshObjectName="Subject"
    propertyType="org.infogrid.model.Test/AllProperties_OptionalCurrencyDataType" />

(or the identifier of a PropertyType in your model that is of type CurrencyDataType) which will print 12.34 USD in view mode, and automatically make it editable in edit mode.

It is easiest to try out by running the test app org.infogrid.jee.testapp. Currency support is currently only in trunk, but will get promoted over time.

Isn’t it much easier if your data platform does this than having to write currency parsing and rendering code all over your application?

Welcome Infinite Graph

It always looked like it was only a matter of time until the object database companies would try and become graph databases. Perhaps that is what they should have been all along. I’m speaking as somebody who tried several products almost 20 years ago and decided that they were just too much hassle to be worth it: graphs are a much better abstraction level than programming-level constructs for a database.

Today, Objectivity announced “Infinite Graph”, a:

strategic business unit is tasked with bringing [a] enterprise-ready and distributed graph database product to market

(I took the liberty of eliminating the “marketing” superlatives from the quote; the entire press release has a very generous sprinkling of them.)

Actually, they only announced a beta program, which I signed up for. InfiniGraph.com says:

X:\> BETA IS NOW OPEN

But then, on the screen behind, they say:

Over the next several days, we’ll be preparing our installer and documentation for distribution to the InfiniteGraph community. Stay tuned, and feel free to participate in the discussion on our beta blog!

Well, well, the difficulties of a launch. So I don’t know yet what they created. But it’s good to see another player legitimizing graph databases as a category. So, welcome Objectivity!