InfoGrid 2.9.5 Released

It’s been a while since the last release, but why change if users have been generally happy with it ;-) … Here’s what’s new:

Highlights:

  • Native support for currency values (e.g. $1.23 USD)
  • Even simpler JSON generation
  • StringDataTypes can now carry a regular expression, which is checked before assignment (e.g. makes it impossible to assign a String that isn’t an e-mail address to a property that must contain an e-mail address)
  • Support for cascading delete
  • InfoGrid compilation is now reported to work on Windows
  • Multi-tenancy support for LID
  • More flexible handling of HTTP status codes in Probe framework (e.g. what should happen upon a redirect)
  • Extensions to custom JEE tags and HttpShell to make JSP programming with InfoGrid even simpler, e.g. simple tags to create Javascript-based overlay forms
  • Improved error reporting
  • Additional tests
  • Lots of small usability improvements and bug fixes
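
To illustrate the regular-expression check on StringDataTypes mentioned in the highlights, here is a minimal sketch. It does not use the InfoGrid API: the class RegexStringType, its check method and the deliberately simplistic e-mail pattern are all hypothetical stand-ins, chosen only to show the idea of validating a value before it is assigned.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for a string data type that carries a regular expression. */
final class RegexStringType {
    private final Pattern pattern;

    RegexStringType(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /** Returns the candidate unchanged if it matches; otherwise rejects the assignment. */
    String check(String candidate) {
        if (candidate == null || !pattern.matcher(candidate).matches()) {
            throw new IllegalArgumentException(
                    "Value does not match " + pattern.pattern() + ": " + candidate);
        }
        return candidate;
    }
}

public class RegexCheckDemo {
    // Deliberately simplistic e-mail pattern, for illustration only (an assumption)
    static final RegexStringType EMAIL =
            new RegexStringType("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");

    /** True if the value would be accepted for assignment. */
    static boolean accepts(RegexStringType type, String value) {
        try {
            type.check(value);
            return true;
        } catch (IllegalArgumentException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(EMAIL, "alice@example.com")); // prints true
        System.out.println(accepts(EMAIL, "not an e-mail"));     // prints false
    }
}
```

The point of checking at assignment time is that invalid values never reach the graph, so code reading the property later does not need to re-validate it.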

Here are the gory details:

Core:

  • ModuleAdvertisementSerializer to generate relative paths to JARs
  • more types of MeshObjectSelector
  • added getLocalId to NetMeshObjectIdentifier
  • added guessFromExternalForm to IdentifierFactory
  • use “” not null for localId of home object
  • Augmented set of pre-defined BlobDataTypes
  • Improved BlobDataType to check for MIME compatibility with regexes
  • added CurrencyValue and CurrencyDataType to represent money as a primitive type
  • added convenience method to MeshObject: determine whether the relationship to a neighbor MeshObject is blessed with a certain RoleType
  • Moved OnDemandTransaction from HttpShell into utils
  • implemented cascading delete support
  • added TimeStampValue.earlierOf and TimeStampValue.laterOf for convenience
  • Special value NOW for TimeStampValue parsing
  • regex can be specified for StringDataTypes
  • added AbstractDelegatingLocalizedRuntimeException for symmetry and completeness
  • better error reporting for TransactionActionExceptions
  • Explicitly set latin1 character set in MySQL store to avoid key size error when the default character set is multi-byte.
  • TransactionAction.execute() now does not take a Transaction parameter any more, but provides it as a member variable
  • check raw id even when guessing MeshObjectIdentifier
  • when there’s a ParseException during parsing of store content, don’t return null but throw NoSuchElementException
  • Added MeshObjectHelper for simpler JSP programming with TypedMeshObjectFacades
  • Carry specific exceptions keyed by failed MeshObjectIdentifiers in MeshObjectAccessException, instead of swallowing all exceptions other than the first
  • MeshObjectsNotFoundException is not a subclass of MeshObjectAccessException any more
  • Correct singular/plural error messages for MeshObjectAccessException and MeshObjectsNotFoundException
  • NetMeshObjectAccessException now carries redirect NetMeshObjectAccessSpecifications
  • NetMeshObjectAccessSpecification is now an Identifier
  • Migrated sweeping functionality from the various MeshBase implementations into Sweeper. The policy functionality of Sweeper became SweepPolicy.
  • Better pingpong debugging support
  • created MeshObjectSet.hasSameContent
  • upgrade to warn if MySQL database cannot be detected/created
  • Id column in MySQL needs to be case-sensitive
  • don’t report exception in rollback that isn’t an error
  • default expiration for UnnecessaryReplicasSweepPolicy
  • Allow the sweep of a single MeshObject
  • MeshObject.delete and sub-methods have no business to throw NotPermittedException
  • Non-semantic “kill” of a ShadowMeshBase
  • Kill button in ShadowAwareAllMeshBasesViewlet
  • No need for public method initiateCeaseCommunications on Proxy
  • Don’t try to talk to Proxies when NetMeshBase is dead
  • Tolerate some IsDeadExceptions when dead (duh)
  • Use default mime type for BlobValue when none is given when parsing StringRepresentation
  • Avoid that IsDeadException kills meaningful error message
  • support typefield XML attribute for StringDataTypes as well
  • added StringDataType for e-mail
  • fixed automatic merge problem
  • allow MeshObject.equals( TypedMeshObjectFacade ) for simplicity
  • use get_XXX pattern in TypedMeshObjectFacade for non-Properties/non-pseudo-Properties to avoid name clashes with code-generated setters
  • renamed Rfc3339Util to DateTimeUtil and expanded to W3C timestamps
  • DefaultNetMeshBaseIdentifierFactory now uses pluggable “Scheme”s to support identifiers of various protocols, including http, https, file, xri, acct
  • renamed UnknownProtocolParseException to UnknownSchemeParseException
  • added ability to specify Maps in Resource files
  • When parsing NetMeshBaseIdentifiers, turn each String into canonical form as well, e.g. hTTp: is treated the same as http:
  • added toColloquialExternalForm() to Identifier; more sane
  • removed IdentifierStringifier.toColloquialIdentifier()
  • use guessFromExternalForm in more places for more tolerance (e.g. when reading from Store to be more tolerant of Scheme changes)
  • be more tolerant of Exceptions when restarting MeshBases; try best-effort
  • Better logging of start / die of InfoGridWebApps
  • when resetting PingPong, only increment token by 1
  • more defensive programming if Storage layer is corrupt
  • refactor message sending framework to always deliver all received messages together instead of singly
  • merge two XprisoMessages before processing if possible
  • added ByPropertyValueSelector and PropertyComparisonOperator
  • NotRelatedException not thrown any more when testing whether two MeshObjects are related by a given RoleType; was counter-intuitive and pedantic
  • StringRepresentationParameters must always be given.
  • Make it easier to generate MeshObjectIdentifiers with different lengths
  • upgraded Postgresql JDBC for 64bit support
  • Netbeans 7 support
  • Make sure Threadpool is always cleanly shut down. Now Tomcat won’t hang occasionally after undeploy.
  • support custom regex violation error messages for PropertyTypes that specify regular expressions
  • refactored L10Map into L10PropertyValueMap and L10StringMap, moved to utils package
  • more HTML class declarations in BlobValue HTML
  • added invert method to most MeshObjectSelectors
  • added MeshObjectSet.subset convenience method
  • Don’t include dead MeshObjects in MeshObjectSets
  • factored out ExternalCommand so invoking an external process becomes simpler
  • default MeshObjectIdentifier to contain _ instead of ~

Identity-related:

  • AccountManager needs a site parameter for determining account to support multi-tenancy
  • generate login event even if only the session gets renewed
  • XrdsProbe to cache the XRDS file
  • added content and content type as parameters to XmlProbe interface; enables XrdsProbe to cache the XRDS file
  • added getGroupNames to LidAccount
  • check group membership by group identifier as well as group name
  • also check group membership by groups set as request attribute
  • factored out some symbolic OpenID parameter names
  • fix when associations expire
  • Better integrated AccessManager and ThreadIdentityManager
  • root may do anything in AclBasedAccessManager
  • Split AclBasedSecurity into two: an ACL-based AccessManager implementation that does not rely on Guards, and an AccessManager implementation that evaluates Guards but is independent of a particular model (such as the AclBasedSecurity model)
  • Eliminated org.infogrid.model.AclBasedSecurity, and instead introduced org.infogrid.meshbase.security.aclbased, which also includes a model
  • DelegatingAccessManager now supports delegating to multiple other AccessManagers, all of which must agree
  • added ThreadIdentityManager.suExec
  • When there isn’t a ProtectionDomain, it’s free for all in AclBasedAccessManager
  • moved org.infogrid.model.SecurityTest into the attic; currently not used
  • added expiring credentials to AuthenticationStatus
  • expired OpenID associations throw OpenIdAssociationExpiredException
  • OpenID dumb mode fixes
  • Added getCarriedExpiredCredentialTypes to LidClientAuthenticationStatus
  • added org.infogrid.lid.model.account/Account_LastLoggedIn which can come in handy
  • Removed Account Status value CREATED — not particularly useful and would have needed update logic
  • credential values are now separate from LidAccount, e.g. the StoreLidPasswordCredentialType knows where to look for the password
  • don’t need LidCredentialType.isRemote any more
  • allow password change from StoreLidPasswordCredentialType
  • removed realm from LID; never quite used
  • better support for multi-tenancy / virtual hosts in LID: carry siteIdentifier around in more places
  • distinguish one-time-tokens from other types of credentials in LID
  • more methods for setting, checking, resetting password credentials
  • support that MeshObject can own itself in AclUtils
  • A little helper utility to generate secret keys

Model-related:

  • Code-generate an entry for the SubjectArea itself, too
  • exactly one SubjectArea element is expected in each model element in the model XML
  • Added HTTP status code and location properties to WebResource
  • minor rename of SubjectArea user-visible name

Probe-related:

  • Handle HTTP redirects better when encountered by Probes; existing behavior should be unchanged however
  • Do not throw FactoryException as a direct cause of NetMeshObjectAccessException but its cause instead
  • Introduced ProbeException.HttpRedirectResponse
  • Added NetMeshBaseRedirectException, so ProbeException.HttpRedirectResponse does not need to be moved up to module org.infogrid.kernel.net
  • Introduced HttpMappingPolicy that enables different policies for mapping HTTP non-OK status codes into InfoGrid; default remains the same behavior
  • Change in API to set up ProbeManagers etc.: ProbeDirectory passed to ProbeManager instead of ShadowMeshBaseFactory, then ShadowMeshBaseFactory is notified where ProbeManager can be found
  • ProbeException.HttpErrorResponse not needed any more
  • Support HTTP redirects that give relative URLs as Location, against the HTTP spec (I’m looking at you, Google)
  • HttpMappingPolicy has moved from ProbeManager to ProbeDirectory
  • Fixes issue where wrong HttpMappingPolicy was set upon restore of Shadow from disk
  • Throw exception if ProbeManager is configured without a ProbeDirectory
  • added basic XRD support to InfoGrid
  • added acct: as one of the basic protocols supported by NetMeshBaseIdentifier
  • added Webfinger support
  • Don’t throw ProbeException.DontHaveNonXmlStreamProbe if the found content is of length zero

Viewlet/GUI-related:

  • Renamed NotIfViewletState to NotIfViewletStateTag for consistency
  • throw correct error message if no subject
  • fix status in AbstractSetIterateTag
  • colloquial=true by default
  • by default, sort alphabetically in TreeIterateTag
  • fix display of <HOME> for home object broken in last checkin
  • added potentiallyShorten to WebContextAwareMeshObjectIdentifierStringifier
  • instrument to track HTTP client-side calls
  • trim strings first to deal with sloppy JSP writing
  • added request property to JeeMeshObjectsToView
  • change default app server to Tomcat6
  • pass SaneUrl through to ViewletFactory, so it can access request attributes
  • created org.infogrid.jee.security.aclbased tag library for AclBasedSecurity Subject Area
  • added TextStructuredResponseSection.containsContent
  • replaced ViewletAlternativesTag.js with more general-purpose ToggleCssClass.js
  • support redirect to newly created object in HttpShell
  • Created BracketTags, which allow conditional generation of, say, <ul> and </ul> tags depending on whether or not the content tag has any non-whitespace content.
  • factored out AbstractSaneRequest.urlWithoutMatchingArguments for easier reusability
  • trim entered identifiers before trying to resolve them in HttpShellFilter
  • Better error reporting for HttpShell
  • IncludeViewletTag more robust if path not specified
  • Removed the List<Throwable> in request attribute for collecting processing exceptions; now abstracted into new interface ProblemReporter
  • Allow null identifiers for create access verb in HttpShell again
  • hyperlink on UnsafePostException’s message
  • TitleTag to use a separate section in the StructuredResponse
  • default mechanism for TitleTag with Viewlet name and app name, can be overridden with TitleTag
  • <tmpl:title> tag prints <title> tags itself
  • fixed userVisibleName on Viewlet
  • JeeViewlets’ default POST URL used to sometimes leave out reached MeshObjects, which would list all found-by-traversal MeshObjects after post, which was very confusing. Removed traversal spec in case there are no reached MeshObjects
  • Pass subject’s userVisibleString into default Viewlet title construction.
  • Created InvalidViewletActionException.
  • enable logging of when InfoGridWebApps start and stop
  • added HttpShellHandler to HttpShell
  • rollback Transaction if there is an exception during execution by HttpShell
  • support for radiobox-driven “which MeshObject should I be related to” relate/unrelate from the HttpShell
  • Added sweep and accessLocally to HttpShell HTML
  • Created ProxyTag in analogy to emitting other objects like MeshObjects etc.
  • Sweep support for the HttpShell
  • Added limit parameter to SetIterateTags in analogy to SQL limit
  • Added MeshObjectSetIterateBeyondLimitTag that gets invoked when a MeshObjectSet iteration stops due to a limit parameter so the JSP can display a “more” link or something like that
  • display trailing slash on URLs even if isColloquial; if not, user cannot distinguish URLs with and without trailing slashes
  • added IfNullTag and NotIfNull tag
  • allow to set properties via the HttpShell at the same time the MeshObject is only being created
  • working on allowing to set properties via the HttpShell at the same time the MeshObject is only being created
  • added allowNull attribute to PropertyTag to remove “remove” label for optional properties when so needed
  • fix ignore=”true” bug in MeshObjectTag
  • implemented a “JSP subroutine” set of custom tags
  • allow TypedMeshObjectFacade in place of any MeshObject for JSP custom tags; makes life simpler
  • use ‘block’ instead of ‘inline’ for create label in PropertyTags, so CSS can be used to move label vertically
  • added the just-completed Transactions to the invocation of HttpShellHandler, so implementations have a chance to understand what happened
  • added deleted MeshObjects to variable map passed to HttpShellHandler
  • tag to easily report a problem into the error log
  • tags to safeguard JSPs against inadvertent invocation by the wrong user or with the wrong user groups
  • support AccountStatus APPLIEDFOR and REFUSED
  • change default HTML title if Viewlet reports error; that way we don’t accidentally leak information if there is an access control problem
  • Re-implemented CSRF remedy: now match cookie and form instead of issuing form tokens
  • use lowercase cookie names
  • Added NotIfErrorsTag and NotIfInfoMessagesTag
  • Use StringRepresentation framework to format the label String for null PropertyValues
  • augmented HttpShellHandler to have invocable method before, after, at the beginning and at the end of Transactions
  • MeshType may be given directly in PropertyIterateTag
  • Change where <HOME> special identifier is defined. Stringifying MeshObjectIdentifier for HomeObject now produces “”, while stringifying MeshObject produces <HOME>
  • HttpShell now recognizes “not set” value because “” is now a valid value for a MeshObjectIdentifier
  • added JspoXXXTags
  • removed now unnecessary org.infogrid.jee.store project
  • carry incoming request in DefaultJeeNetMeshObjectsToView
  • Allow both bean and identifiers in AbstractRelatedTag and subclasses
  • always use multi-part encoding for Jspo forms so file uploads are always possible
  • make JSP tags more consistent that use MeshObjects: can now specify MeshObject directly, MeshObjectIdentifier (direct and as String), or name of bean
  • added MeshObjectBlessedByTag
  • allow optional parameters on Jspo and Jspf tags
  • Support $1.23 in addition to 1.23 USD for CurrencyValue
  • support different capitalizations for MeshObjectBlessedByTag
  • SingleMemberTraversalTag for convenience
  • additional keywords for HttpShell
  • allow HttpShell to override/specify direction of a redirect after shell execution
  • additional callback afterAccess for HttpShell
  • allow immediate activation of called JspoTag through property on call
  • Added Json viewlet. New viewlet that will write a directed acyclic object graph in Json to level n.
  • added custom tag that makes it easy to iterate over the intersection of two traversal sets
  • allow MeshObjectBlessedByTag to write to a variable instead of to stdout
  • rollback transaction when HttpShellHandler throws HttpShellException
  • added MissingArgumentException that turns out to be generally useful
  • Created custom tags to help matching URL arguments easily
  • fixed incorrect processing of RuntimeException in HttpShell (particularly from HttpShellHandlers)
  • DataType would sometimes invalidly truncate HTML output with maxLength specified. Hope this implementation is better.
  • slight refactoring of JeeViewlet-internal API to make it easier to remove URL parameters from Viewlet’s default Post URL
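
The two notations for currency values mentioned in this list ($1.23 and 1.23 USD) can be sketched as follows. This is not InfoGrid's CurrencyValue parser: the Money class, the parse method and the two patterns are hypothetical, and the sketch assumes that a bare "$" means USD.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CurrencyParseDemo {
    /** Hypothetical amount-plus-unit holder; not InfoGrid's CurrencyValue. */
    static final class Money {
        final BigDecimal amount;
        final String unit;

        Money(BigDecimal amount, String unit) {
            this.amount = amount;
            this.unit = unit;
        }
    }

    // "$1.23": dollar sign followed by a decimal number
    private static final Pattern SYMBOL_FORM =
            Pattern.compile("\\$(\\d+(?:\\.\\d+)?)");
    // "1.23 USD": decimal number followed by a three-letter code
    private static final Pattern CODE_FORM =
            Pattern.compile("(\\d+(?:\\.\\d+)?)\\s+([A-Z]{3})");

    /** Parses either notation; throws if the input matches neither. */
    static Money parse(String raw) {
        String s = raw.trim();
        Matcher m = SYMBOL_FORM.matcher(s);
        if (m.matches()) {
            return new Money(new BigDecimal(m.group(1)), "USD"); // assumption: "$" means USD
        }
        m = CODE_FORM.matcher(s);
        if (m.matches()) {
            return new Money(new BigDecimal(m.group(1)), m.group(2));
        }
        throw new IllegalArgumentException("Not a recognized currency value: " + raw);
    }

    public static void main(String[] args) {
        Money a = parse("$1.23");
        Money b = parse("1.23 USD");
        System.out.println(a.amount + " " + a.unit); // prints "1.23 USD"
        System.out.println(b.amount + " " + b.unit); // prints "1.23 USD"
    }
}
```

BigDecimal is used instead of double so that amounts such as 1.23 are represented exactly.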

MeshWorld/NetMeshWorld-related:

  • Show number of MeshObjects in the MeshBase in AllMeshObjectsViewlet
  • round borders for webkit-based browsers too
  • NetMeshWorld has no business depending on AclBasedSecurity at this time
  • introduced property AnetMeshBase.ALLOW_NON_LOCAL_MESHOBJECT_CREATION to allow non-local NetMeshObject creation — useful for test instrumentation, for example
  • created ModuleInventoryViewlet
  • added app resource file to MeshWorld and NetMeshWorld
  • Added Sweeping button into NetMeshWorld.
  • Added ability to filter MeshBase content in AllMeshObjectsViewlet
  • Enable AllMeshObjectsViewlet to view empty sets.
  • Added properties file for Log4jConfigurationViewlet
  • add direct link to PropertySheetViewlet from AllMeshObjectsViewlet
  • Added filtering by home proxy to AllNetMeshObjectsViewlet
  • NetMeshWorld uses AllNetMeshObjectsViewlet, not AllMeshObjectsViewlet now
  • added “requires Javascript” via <noscript> to templates in example and test apps
  • Sort EntityTypes alphabetically in AllMeshObjectsViewlet

What’s the biggest obstacle to GraphDB adoption?

The recent workshop on Graph Databases in Barcelona sparked an interesting debate among vendors of graph databases about how to accelerate graph database adoption.

I don’t think that debate has been resolved yet. Perhaps this post will help a bit.

Some opinions that I heard were:

  • there are significant differences between the various graph database products on the market (e.g. graphs, property graphs, properties on edges or not, hypergraphs etc.). Unless they are more similar, customers will fear lock-in and not buy any of them. Here is a variation:
  • relational databases only took off once there was agreement on the SQL standard among vendors. We need to cooperate to create a similar language, otherwise graph database adoption will not take off either.
  • most potential users of graph databases have never heard of them. How could they use any if they don’t know they exist?
  • even if possible users know of graph databases, they do not know what the use cases are because use cases and success stories have not broadly been documented.

What do you think the biggest obstacles are for graph database adoption?

The Bless Relationships API

InfoGrid supports types on both nodes and edges, aka MeshObjects and Relationships. They are called EntityTypes and RelationshipTypes, respectively.

Assume we have two related MeshObjects, each blessed with a (hypothetical) type Person. Like this:

MeshObject obj1 = ...;
MeshObject obj2 = ...;
obj1.bless( PERSON );
obj2.bless( PERSON );
obj1.relate( obj2 );

Now if we want to express that the second Person is the daughter of the first Person, we cannot do this:

obj1.blessRelationship( PERSON_PARENT_PERSON, obj2 );

because we don’t know, from this API, whether obj1 or obj2 is the daughter or the parent.

Instead, in InfoGrid, we have to explicitly specify the direction, and we do this by specifying the “end” of the relationship type instead of the relationship type for the bless operation, like this:

obj1.blessRelationship( PERSON_PARENT_PERSON.getSource(), obj2 );

Hopefully, the documentation of the parent relationship type in the model indicates the semantics of the source and the destination of each relationship type. These “ends” are called RoleTypes, by the way.

One nice side effect of this approach to the API is that it becomes quite straightforward to extend it if InfoGrid ever decided to support ternary or N-ary relationship types: Just pick a RoleType beyond a source or destination RoleType.

If you bless relationships from the HttpShell — the simplest way of manipulating the graph from a web application — you thus need to specify the identifier of the RoleType, not of the RelationshipType. By convention, the source RoleType of a RelationshipType with identifier ID is ID-S, and ID-D for the destination.
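
The idea of blessing with an “end” rather than with the relationship type itself can be sketched with a toy model. None of the classes below are InfoGrid classes: Node, RelationshipType and RoleType are hypothetical stand-ins, and the identifier PERSON_PARENT_PERSON is only an illustrative string. Which end means “parent” is left to the model documentation, as described above; the sketch only shows that the two ends are distinct and follow the ID-S / ID-D naming convention.

```java
import java.util.HashMap;
import java.util.Map;

public class RoleTypeDemo {
    /** One "end" of a relationship type (hypothetical stand-in). */
    static final class RoleType {
        final RelationshipType relationshipType;
        final boolean isSource;

        RoleType(RelationshipType relationshipType, boolean isSource) {
            this.relationshipType = relationshipType;
            this.isSource = isSource;
        }

        /** Follows the convention ID-S for the source end, ID-D for the destination end. */
        String identifier() {
            return relationshipType.identifier + (isSource ? "-S" : "-D");
        }

        /** The other end of the same relationship type. */
        RoleType opposite() {
            return isSource ? relationshipType.getDestination() : relationshipType.getSource();
        }
    }

    /** A binary relationship type with exactly two ends (hypothetical stand-in). */
    static final class RelationshipType {
        final String identifier;
        private final RoleType source;
        private final RoleType destination;

        RelationshipType(String identifier) {
            this.identifier = identifier;
            this.source = new RoleType(this, true);
            this.destination = new RoleType(this, false);
        }

        RoleType getSource() { return source; }
        RoleType getDestination() { return destination; }
    }

    /** A node that remembers which role it plays toward each neighbor. */
    static final class Node {
        final String name;
        final Map<Node, RoleType> roles = new HashMap<>();

        Node(String name) { this.name = name; }

        /** This node plays 'role'; the neighbor implicitly plays the opposite end. */
        void blessRelationship(RoleType role, Node neighbor) {
            roles.put(neighbor, role);
            neighbor.roles.put(this, role.opposite());
        }
    }

    public static void main(String[] args) {
        RelationshipType parent = new RelationshipType("PERSON_PARENT_PERSON");
        Node obj1 = new Node("obj1");
        Node obj2 = new Node("obj2");
        obj1.blessRelationship(parent.getSource(), obj2);
        System.out.println(obj1.roles.get(obj2).identifier()); // prints PERSON_PARENT_PERSON-S
        System.out.println(obj2.roles.get(obj1).identifier()); // prints PERSON_PARENT_PERSON-D
    }
}
```

Because the caller names the end it plays, the direction is never ambiguous, whichever of the two objects the call is made on.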

Required vs. Optional Property Values

InfoGrid distinguishes between properties that must have a non-null value, and properties that may or may not be null.

When creating an InfoGrid model, a developer has to specify which by using the <isoptional/> tag in the model file.

Why?

By way of parallel, consider the following piece of Java code:

class Foo {
    private int max1 = 10;
    private Integer max2 = 20;

    public void doSomething() {
        for( int i=0 ; i<max1 ; ++i ) {
            //...
        }
        for( int i=0 ; i<max2 ; ++i ) {
            //...
        }
    }
}

Spot the problem? max2 of course might be null, which means our code will throw an exception in the innocent-looking second for loop. To get the code right, we will have to protect that section with an if-then-else section that checks for null first.
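
A minimal sketch of that protection, using hypothetical method names: the guarded version treats a null bound as “no iterations”, while the unguarded version fails as soon as Java tries to unbox the null Integer in the loop condition.

```java
public class NullableMaxDemo {
    /** Guarded: checks for null before the Integer is unboxed in the loop condition. */
    static int countGuarded(Integer max) {
        int count = 0;
        if (max != null) {
            for (int i = 0; i < max; ++i) {
                ++count;
            }
        }
        return count;
    }

    /** Unguarded: auto-unboxing a null Integer throws NullPointerException. */
    static int countUnguarded(Integer max) {
        int count = 0;
        for (int i = 0; i < max; ++i) {
            ++count;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countGuarded(3));    // prints 3
        System.out.println(countGuarded(null)); // prints 0
        try {
            countUnguarded(null);
        } catch (NullPointerException ex) {
            System.out.println("NullPointerException from the unguarded loop");
        }
    }
}
```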

Of course, such a protection is often the right thing to do. But in this example, a “max” should hardly ever be null, so using an “int” as a data type like for max1 (which can’t be null) is much better than using an “Integer” like for max2 (which may be null).

It’s the same thing for properties in InfoGrid models. Some properties simply should never be null. For example, consider a time stamp indicating when a MeshObject was created. Given that the MeshObject was created, the time stamp must exist, and therefore a null value makes no sense. In that case, the property would be specified as “mandatory”. By contrast, a time stamp for when a MeshObject is likely to become obsolete is very likely optional: we might not know that time (yet), or the MeshObject might never become obsolete, so null values are fine.

If InfoGrid did not distinguish between required and optional values, application code would be littered with unnecessary tests for null values (or, failing that, unexpected NullPointerExceptions). We think being specific is better when creating the model; higher-quality and less cluttered application code is the reward.
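
As a rough illustration of what the distinction buys application code, here is a toy model in which mandatory properties can never hold null. It is not InfoGrid's implementation: PropertyType, Thing and the property names TimeCreated and TimeObsolete are hypothetical, and the use of a non-null default for mandatory properties is an assumption made to keep the sketch small.

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalPropertyDemo {
    /** Hypothetical property type that knows whether null is allowed. */
    static final class PropertyType {
        final String name;
        final boolean optional;
        final Object defaultValue;

        PropertyType(String name, boolean optional, Object defaultValue) {
            if (!optional && defaultValue == null) {
                throw new IllegalArgumentException(
                        "Mandatory property " + name + " needs a non-null default");
            }
            this.name = name;
            this.optional = optional;
            this.defaultValue = defaultValue;
        }
    }

    /** Hypothetical object holding property values. */
    static final class Thing {
        private final Map<PropertyType, Object> values = new HashMap<>();

        /** Rejects null for mandatory properties, so readers never need a null check. */
        void set(PropertyType type, Object value) {
            if (value == null && !type.optional) {
                throw new IllegalArgumentException(
                        "Property " + type.name + " is mandatory and cannot be null");
            }
            values.put(type, value);
        }

        /** Falls back to the type's default if no value has been set. */
        Object get(PropertyType type) {
            return values.getOrDefault(type, type.defaultValue);
        }
    }

    public static void main(String[] args) {
        PropertyType created = new PropertyType("TimeCreated", false, 0L);
        PropertyType obsolete = new PropertyType("TimeObsolete", true, null);
        Thing thing = new Thing();

        thing.set(obsolete, null);               // fine: optional
        System.out.println(thing.get(obsolete)); // prints null
        System.out.println(thing.get(created));  // prints 0
        try {
            thing.set(created, null);            // rejected: mandatory
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Code that reads the mandatory property can use the value directly; only the optional one needs a null test.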


InfiniteGraph Implementation of FirstStep

InfiniteGraph, the currently youngest member of the GraphDB party, has now also implemented our FirstStep example. Todd Stavish’s code is here. It joins implementations of the same example from InfoGrid, Neo4j, Sones, and Filament.

[Update: see Todd's comment below. Apparently there is checking in the second step.] On cursory examination, I’m surprised that InfiniteGraph allows you (requires you?) to create edges without source and destination nodes. Only after the creation of the edge does one assign the nodes to the edge. I’m unclear why this is an advantage for any particular scenario; however, I would think there’s a clear disadvantage as an application developer, because I now have to check for null pointers that I wouldn’t have to in most other graph databases (InfoGrid included).

Hope somebody more neutral than me will perform an API comparison using this example some day.

Welcome Infinite Graph

It always looked like it was only a matter of time until the object database companies would try and become graph databases. Perhaps that is what they should have been all along. I’m speaking as somebody who tried several products almost 20 years ago and decided that they were just too much hassle to be worth it: graphs are a much better abstraction level than programming-level constructs for a database.

Today, Objectivity announced “Infinite Graph”, a:

strategic business unit is tasked with bringing [a] enterprise-ready and distributed graph database product to market

(I took the liberty of eliminating the “marketing” superlatives from the quote; the entire press release has a very generous sprinkling of them.)

Actually, they only announced a beta program, which I signed up for. InfiniGraph.com says:

X:\> BETA IS NOW OPEN

But then, on the screen behind, they say:

Over the next several days, we’ll be preparing our installer and documentation for distribution to the InfiniteGraph community. Stay tuned, and feel free to participate in the discussion on our beta blog!

Well, well, the difficulties of a launch. So I don’t know yet what they created. But it’s good to see another player legitimizing graph databases as a category. So, welcome Objectivity!

InfoGrid 2.9.4 Released

This release contains some major improvements, in particular to the way the object graph is mapped to and from the web. Download or browse documentation here.

General:

  • improved stability and error reporting
  • lots of bug fixes
  • more tests
  • removed symbolic links from SVN; was an endless source of frustration

Core:

  • renamed TraversalDictionary to TraversalTranslator: it can be much more dynamic than a dictionary
  • implemented KeywordTraversalTranslator with fixed translation keywords
  • implemented XpathTraversalTranslator with a pseudo-subset of Xpath
  • introduced AllNeighborsTraversalSpecification instead of null TraversalSpecification; introduced StayRightHereTraversalSpecification
  • MeshObject’s userVisibleName now always returns value of a Property called “Name” if the MeshObject has one; this seems a sensible default for many applications
  • MeshObjectIdentifierFactory now has a pointer back to the MeshBase to which it belongs. Downside is one cannot use the same instance of MeshObjectIdentifierFactory for multiple MeshBases any more.
  • got rid of MeshStringRepresentationContext, which partially overlapped with the purpose of MeshStringRepresentationParameters; only historical reasons can explain why we had both
  • removed title/target/additionalParameters argument from HasStringRepresentation.toStringRepresentationLinkStart; now handled as parameters in StringRepresentationParameters
  • simplified StringRepresentation of PropertyValues and related
  • additional pre-defined StringRepresentations e.g. HttpPost
  • distinguish between formatting Properties (which may be null) and PropertyValues; no more muddling with funny MeshObject context parameters; formatting is now performed via DataType
  • use correct ClassLoader to load ResourceHelper default properties file during Module initialization
  • correct initialization of ResourceHelper in module.adv
  • expanded MeshObjectSet and MeshObjectSetFactory API

Identity-related:

  • refactored LID implementation to use an “instructions” based approach to the pipeline, instead of an exceptions-based approach. This is more flexible for users of the module.
  • fixed typo about credential vs. credtype.
  • renamed LidPersona to LidAccount; more natural to talk about it using that name
  • added LidAccountStatus and SiteIdentifier to LidAccount for multi-tenancy
  • factored account and session-related concepts from org.infogrid.lid.model.lid into new SubjectArea org.infogrid.lid.model.account.
  • new custom tag library that deals with identity

Model-related:

  • fixed code generation bug for descriptions of values in EnumeratedDataTypes
  • code generator to generate static constants for all EnumeratedValues.
  • added AccountCollection to account LID model
  • expanded TestModel to cover both optional and mandatory PropertyTypes; renamed PropertyTypes correspondingly
  • improved Test model for more comprehensive testing
  • better user-visible strings for EntityTypes Bookmark, Account and WebResource

Viewlet/GUI-related:

  • Viewlet framework and tag library extensions for including Viewlets in Viewlets; updated Viewlets accordingly; now allows in-context editing, change of viewlet types etc. for included JeeViewlets; no contiguous TraversalPath from top required
  • support REST-ful URLs on hierarchical Viewlets e.g. GraphTreeViewlet with multiple Viewlet alternatives in contained Viewlet
  • removed iframe in (Net)MeshWorld; hierarchical Viewlets is better approach
  • various HTML fixes and improvements related to the rendering and editing of PropertyValues
  • removed rootPath on all custom tags; not needed or used
  • sanitized formatting of Identifiers; it’s still not totally sane but a lot more so
  • eliminated PropertyValueTag.{css,js} and replaced with PropertyTag.{css,js}
  • removed -moz-opacity CSS value per recent Firefox updates
  • fix HTML doctype to make IE more happy
  • BlobViewlet to get its PropertyType from URL argument not POST argument
  • footer element for MeshObjectSetIterate tag
  • do not print iterateHeader and iterateFooter when set has no content in MeshObjectSetIterateTags
  • default POST behavior is now redirect-to-GET on same URL, so browser refresh is not as awful
  • added missing setter methods on StructuredResponse
  • created SetSizeTag to print the size of a MeshObjectSet in JSP
  • slight changes how MeshObjects are shown on screen by default (dropped annotation in which non-standard MeshBase they are)
  • overflow: auto; to support long CSS floats
  • added orderBy property to setIterate JSP tags
  • added ability to sort in the inverse direction
  • created propertymeter JSP tag for bar graphs or temperature graphs based on Properties
  • default sorting in JSP MeshObjectSet tags is by user-visible String
  • don’t set domains on cookies when run from localhost; makes life of developers hard
  • generate Javascript for PropertyValues
  • eliminating unnecessary projects by moving their code into other projects:
    • org.infogrid.jee.rest -> org.infogrid.jee.viewlet
    • org.infogrid.jee.rest.net -> org.infogrid.jee.viewlet.net
  • renaming projects:
    • org.infogrid.jee.rest.net.local -> org.infogrid.jee.viewlet.net.local
    • org.infogrid.jee.rest.net.local.store -> org.infogrid.jee.viewlet.net.local.store
    • org.infogrid.jee.rest.store -> org.infogrid.jee.viewlet.store
  • renamed meshObjectLoopVar to loopVar in custom tags
  • added arrivedAt property to Viewlet
  • enctype attribute on safeForm is all lowercase
  • created SaneUrl, new supertype of SaneRequest that allows to reuse API for URLs and servlet requests; slight API naming changes httpHost vs. server; allows us to get rid of OverridingSaneRequest nonsense
  • DEFAULT_LINK_START/END_ENTRY now consistently on StringRepresentation
  • removed RestfulRequest, replaced with a MeshObjectsToViewFactory that directly translates SaneRequest into MeshObjectsToView
  • an instance of MeshObjectsToViewFactory must now reside in Context
  • removed NetViewletDispatcherServlet; not needed any more
  • removed most redundant methods on Viewlet; better have one clear way how to do it only
  • upgraded ViewletFactoryChoice: now HasStringRepresentation and contains MeshObjectsToView; this means unfortunately that ViewletFactory setup in applications needs to pass MeshObjectsToView to their choices()
  • ViewedMeshObjects now keeps reference to MeshObjectsToView that it took its data from
  • removed unnecessary request attributes like JeeViewlet.VIEWLET_STATE_TRANSITION_NAME: can be obtained via Viewlet
  • made MeshObjectsToView an interface and subtyped to JeeMeshObjectsToView and NetMeshObjectsToView for cleaner model
  • renamed getMeshObjects to getViewedMeshObjects for consistency
  • ViewletState has moved from JeeViewlet to JeeViewedMeshObjects; added isDefaultState
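
For readers unfamiliar with the redirect-to-GET pattern mentioned in the list above, here is a minimal sketch in plain Java. It is not InfoGrid's implementation: the class PostRedirectGetSketch, its Response type and the handle method are made up for illustration, and the use of status 303 (See Other) is an assumption; the release notes do not say which redirect status InfoGrid sends.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of redirect-to-GET after POST; not InfoGrid's actual code. */
final class PostRedirectGetSketch {

    /** Minimal stand-in for an HTTP response: status code, headers and body. */
    static final class Response {
        final int status;
        final Map<String, String> headers = new LinkedHashMap<>();
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** In-memory state that a POST modifies. */
    private int submissions = 0;

    Response handle(String method, String url) {
        if ("POST".equals(method)) {
            submissions++;                       // the side effect happens once, here
            Response r = new Response(303, ""); // assumption: 303 See Other
            r.headers.put("Location", url);     // redirect to the same URL
            return r;
        }
        // GET has no side effects, so repeating it on refresh is harmless
        return new Response(200, "submissions=" + submissions);
    }

    public static void main(String[] args) {
        PostRedirectGetSketch app = new PostRedirectGetSketch();
        Response post = app.handle("POST", "/items");
        Response get = app.handle("GET", post.headers.get("Location"));
        Response refresh = app.handle("GET", "/items"); // a refresh repeats only the GET
        System.out.println(post.status + " -> " + get.body + " / " + refresh.body);
    }
}
```

Because the page the browser ends up on was fetched with GET, hitting refresh repeats the GET rather than re-submitting the form.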

Graph Database Scaling: InfoGrid’s Contrarian View

There’s an ongoing debate about how to scale graph databases horizontally, and I got “strongly encouraged” to present the InfoGrid point of view.

In a nutshell: to make it work, scale like the web, not like a database.

Let me explain:

If you start out with a single relational database server, and you want to scale it horizontally to thousands of servers, you have a problem: it doesn’t really work. SQL makes too many assumptions that one simply cannot make for a massively distributed system. Which is why instead of running Oracle, Google runs BigTable (and Amazon runs Dynamo etc.), which were designed from the ground up for thousands of servers.

If you start out with a single hypertext system, and you want to scale it horizontally to millions of servers, you have a problem, too: it doesn’t really work either. Which is why we got the world-wide-web, which has been inherently designed for a massively decentralized architecture from the get-go, instead of a horizontally-scaled version of Xanadu.

So here we are, and people are trying to scale graph databases from one to many servers. They are finding it hard, and so far I have not seen a single credible idea, never mind an implementation, even in an early stage. Guess what? I think massively scaling a graph database that’s been designed using a one-server philosophy will not work either, just like it didn’t work for relational databases or hypertext. We need a different approach.

With InfoGrid, we take such a fundamentally different approach. To be clear, its current re-implementation in the code base is early and is still missing important parts. But the architecture is stable, the core protocol is stable, and it has been proven to work. Turning it back on across the network is a matter of “mere” programming.

To explain it, let’s lay out our high-level assumptions. They are very similar to the original assumptions of the web itself, but unlike traditional database assumptions:

| Traditional database assumptions | Web assumptions | InfoGrid assumptions |
|---|---|---|
| All relevant data is stored in a centrally administered database | Let everybody operate their own server with their own content and create hyperlinks to each other | Let everybody operate their own InfoGrid server on their own data and share sub-graphs with each other as needed (see also note on XPRISO protocol below) |
| Data is “imported” and “exported” from/to the database from time to time, but not very often: it’s an unusual act. Only “my” data in “my” database really matters. | Data stays where it is, a web server makes it accessible from/to the web. Through hyperlinks, that server becomes just one in millions that together form the world-wide-web. | Data stays where it is, and a graph database server makes it look like a (seamless) sub-graph of a larger graph, which conceivably one day could be the entire internet |
| Mashing up data from different databases is a “web services” problem and uses totally different APIs and abstractions than the database’s | Mashing up content from different servers is as simple as an <a…, <img… or <script… tag. | Application developers should not have to worry which server currently has the subgraph they are interested in; it becomes local through automatic replication, synchronization and garbage collection. |

This approach is supported by a protocol we came up with called XPRISO, which stands for eXtensible Protocol for the Replication, Integration and Synchronization of distributed Objects. (Perhaps “graphs” would have been a better name, but then the acronym wouldn’t sound as good.) There is some more info about XPRISO on the InfoGrid wiki.

Simplified, XPRISO allows one server to ask another server for a copy of a node or edge in the graph, so they can create a replica. When granted, this replica comes with an event stream reflecting changes made to it and related objects over time, so the receiving server can keep the received replica in sync. When not needed any more, the replication relationship can be canceled and the replicas garbage collected. There are also features to move update rights around etc.

For a (conceptual) graphical illustration, look at InfoGrid Core Ideas on Slideshare, in particular slides 15 and 16.

Code looks like this:

// create a local node:
MeshObject myLocalObject = mb.getMeshObjectLifecycleManager().createMeshObject();
// get a replica of a remote node. The identifier is essentially a URL where to find the original
MeshObject myRemoteObject = mb.accessLocally( remote_identifier );

// now we can do stuff, like relate the two nodes:
myLocalObject.relate( myRemoteObject );

// or set a property on the replica, which is kept consistent with the original via XPRISO:
myRemoteObject.setProperty( property_type, property_value );

// or traverse to all neighbors of the replica: they automatically get replicated, too
MeshObjectSet neighbors = myRemoteObject.traverseToNeighbors();

Simple enough? [There are several versions of this scenario in the automated tests in SVN. Feel free to download and run.]
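
To make the replication mechanics described above a little more concrete, here is a toy, single-process model in Java. It is neither the XPRISO protocol nor the InfoGrid API: ReplicationSketch, Server, Node and all their methods are hypothetical names invented for this sketch, there is no network involved, and changes only flow from the original to its replicas (the real protocol can also move update rights around). It only illustrates the three steps: ask for a replica, receive change events, cancel and discard.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only; all names are hypothetical, not InfoGrid or XPRISO API. */
final class ReplicationSketch {

    /** A node with a single property value, identified by a URL-like id. */
    static final class Node {
        final String id;
        String value;
        Node(String id, String value) { this.id = id; this.value = value; }
    }

    /** A server holding nodes; some may be replicas of nodes held elsewhere. */
    static final class Server {
        final String name;
        final Map<String, Node> nodes = new HashMap<>();
        /** For each original held here: the servers that hold a replica of it. */
        final Map<String, List<Server>> subscribers = new HashMap<>();

        Server(String name) { this.name = name; }

        Node create(String id, String value) {
            Node n = new Node(id, value);
            nodes.put(id, n);
            return n;
        }

        /** Ask another server for a copy of a node and subscribe to its changes. */
        Node replicateFrom(Server origin, String id) {
            Node original = origin.nodes.get(id);
            if (original == null) {
                throw new IllegalArgumentException("No such node: " + id);
            }
            origin.subscribers.computeIfAbsent(id, k -> new ArrayList<>()).add(this);
            Node replica = new Node(id, original.value);
            nodes.put(id, replica);
            return replica;
        }

        /** Change the original and forward the change event to every replica holder. */
        void update(String id, String newValue) {
            Node original = nodes.get(id);
            if (original == null) {
                throw new IllegalArgumentException("No such node: " + id);
            }
            original.value = newValue;
            List<Server> subs = subscribers.getOrDefault(id, Collections.emptyList());
            for (Server s : subs) {
                s.onRemoteChange(id, newValue);
            }
        }

        /** Apply an incoming change event to the local replica, if still held. */
        void onRemoteChange(String id, String newValue) {
            Node replica = nodes.get(id);
            if (replica != null) {
                replica.value = newValue;
            }
        }

        /** Cancel the replication relationship and drop the local replica. */
        void release(Server origin, String id) {
            List<Server> subs = origin.subscribers.get(id);
            if (subs != null) {
                subs.remove(this);
            }
            nodes.remove(id);
        }
    }

    public static void main(String[] args) {
        String id = "http://a.example/n1"; // made-up identifier
        Server a = new Server("a");
        Server b = new Server("b");
        a.create(id, "v1");

        Node replica = b.replicateFrom(a, id);
        a.update(id, "v2");
        System.out.println("replica after update: " + replica.value);

        b.release(a, id);
        a.update(id, "v3"); // no longer forwarded to b
        System.out.println("b still holds replica: " + b.nodes.containsKey(id));
    }
}
```

In this toy model the server that holds the original keeps the subscriber list; that is a design choice of the sketch, made so that cancelling a replication relationship is a single removal on each side.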

What does this contrarian approach to graphdb scaling give us? The most important result is that we can assemble truly gigantic graphs, one server at a time: each server serves a subgraph, and XPRISO makes sure that the overlaps between the subgraphs are kept consistent in the face of updates by any of the servers. Almost as important is that it enables a bottom-up bootstrapping of large graphs: everybody who feels like participating sets up their own server, populates it with the data they wish to contribute (which doesn’t even need to be “copied” by virtue of the InfoGrid Probe Framework), and links to others.

Now, if you think that makes InfoGrid more like a web system than a database, we sympathize. There’s a reason we call InfoGrid an “internet graph database”. Or it might look more like a P2P system. But other NoSQL projects use peer-to-peer protocols, too, like Cassandra’s gossip protocol. And I predict: the more we distribute databases, the more decentralized they become, the more the “database” will look like the web itself. That’s particularly so for graph databases: after all, the web is the largest graph ever assembled.

We do not claim that this approach addresses all possible use cases for scaling graph databases. For example, if you need to visit every single node in a gazillion-node graph in sequence, this approach is just as good or bad as any other: you can’t afford the memory to get the entire graph onto your local server, and so things will be slow.

However, the InfoGrid approach elegantly addresses a very common use case: adding somebody else’s data to the graph. “Simply” have them set up their own graph server, and create the relationships to objects in other graph servers: XPRISO will keep them maintained. Leaving data in the place where its owners already have it is a very powerful feature; without it, the web arguably could not have emerged. It further addresses control issues and privacy/security issues much better than a more database’y approach, because people can remain in control over their subgraph: just like on the web, where nobody needs to trust a single “web server operator” with their content; you simply set up your own web server and then link.

Want to help? ;-)

InfoGrid Interview Published

Pere Urbon published his e-mail interview with me about InfoGrid:

This is the fourth in his series on graph databases:

It’s rather apparent that while these projects are all GraphDBs, they differ substantially in what they are trying to accomplish, and why, and therefore how they do it. This is a good resource for developers investigating GraphDBs and trying to understand their alternatives.

InfoGrid Screencasts on YouTube

InfoGrid is now on YouTube at InfoGridDotOrg.

Occasionally we’ll post a software demo. The first one is there already: a screencast of the MeshWorld example application for the InfoGrid graph database. It shows:

  • How to create and delete MeshObjects (i.e. nodes)
  • How to relate them to each other (i.e. edges)

It follows the FirstStep example, except that MeshWorld is a web application, while FirstStep is a command-line application. There’ll be more to come.