May 26th, 2010
·
It always looked like it was only a matter of time until the object database companies would try and become graph databases. Perhaps that is what they should have been all along. I’m speaking as somebody who tried several products almost 20 years ago and decided that they were just too much hassle to be worth it: graphs are a much better abstraction level than programming-level constructs for a database.
Today, Objectivity announced “Infinite Graph”, a:
strategic business unit is tasked with bringing [a] enterprise-ready and distributed graph database product to market
(I took the liberty of eliminating the “marketing” superlatives from the quote; the entire press release has a very generous sprinkling of them.)
Actually, they only announced a beta program, which I signed up for. InfiniGraph.com says:
X:\> BETA IS NOW OPEN
But then, on the screen behind, they say:
Over the next several days, we’ll be preparing our installer and documentation for distribution to the InfiniteGraph community. Stay tuned, and feel free to participate in the discussion on our beta blog!
Well, well, the difficulties of a launch. So I don’t know yet what they created. But it’s good to see another player legitimizing graph databases as a category. So, welcome Objectivity!
May 13th, 2010
·
This release contains some major improvements, in particular to the way the object graph is mapped to and from the web. Download or browse documentation here.
General:
- improved stability and error reporting
- lots of bug fixes
- more tests
- removed symbolic links from SVN; was an endless source of frustration
Core:
- renamed TraversalDictionary to TraversalTranslator: it can be much more dynamic than a dictionary
- implemented KeywordTraversalTranslator with fixed translation keywords
- implemented XpathTraversalTranslator with a pseudo-subset of Xpath
- introduced AllNeighborsTraversalSpecification instead of null TraversalSpecification; introduced StayRightHereTraversalSpecification
- MeshObject’s userVisibleName now always returns value of a Property called “Name” if the MeshObject has one; this seems a sensible default for many applications
- MeshObjectIdentifierFactory now has a pointer back to the MeshBase to which it belongs. Downside is one cannot use the same instance of MeshObjectIdentifierFactory for multiple MeshBases any more.
- got rid of MeshStringRepresentationContext, which partially overlapped with the purpose of MeshStringRepresentationParameters; only historical reasons can explain why we had both
- removed title/target/additionalParameters argument from HasStringRepresentation.toStringRepresentationLinkStart; now handled as parameters in StringRepresentationParameters
- simplified StringRepresentation of PropertyValues and related
- additional pre-defined StringRepresentations e.g. HttpPost
- distinguish between formatting Properties (which may be null) and PropertyValues; no more muddling with funny MeshObject context parameters; formatting is now performed via DataType
- use correct ClassLoader to load ResourceHelper default properties file during Module initialization
- correct initialization of ResourceHelper in module.adv
- expanded MeshObjectSet and MeshObjectSetFactory API
Identity-related:
- refactored LID implementation to use an “instructions” based approach to the pipeline, instead of an exceptions-based approach. This is more flexible for users of the module.
- fixed typo about credential vs. credtype.
- renamed LidPersona to LidAccount; more natural to talk about it using that name
- added LidAccountStatus and SiteIdentifier to LidAccount for multi-tenancy
- factored account and session-related concepts from org.infogrid.lid.model.lid into new SubjectArea org.infogrid.lid.model.account.
- new custom tag library that deals with identity
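The difference between an exceptions-based and an "instructions"-based pipeline, mentioned in the first item above, can be illustrated with a minimal sketch. This is not the actual LID module API: the class names `PipelineInstructions` and `AuthenticationStage`, the `/login` URL and the in-memory credential map are all assumptions made up for illustration. The idea is only that a stage returns an object describing what should happen next, instead of throwing an exception for every non-standard outcome.

```java
import java.util.Map;

/** Hypothetical result of a pipeline stage: it tells the caller what to do next. */
final class PipelineInstructions {
    final boolean mayProceed;   // true if the request may continue down the pipeline
    final String redirectTo;    // URL to redirect the client to, or null
    final String problem;       // description of what went wrong, or null

    private PipelineInstructions(boolean mayProceed, String redirectTo, String problem) {
        this.mayProceed = mayProceed;
        this.redirectTo = redirectTo;
        this.problem = problem;
    }

    static PipelineInstructions continueProcessing() {
        return new PipelineInstructions(true, null, null);
    }

    static PipelineInstructions redirect(String url) {
        return new PipelineInstructions(false, url, null);
    }

    static PipelineInstructions reject(String problem) {
        return new PipelineInstructions(false, null, problem);
    }
}

/** Hypothetical pipeline stage; the names are illustrative only. */
final class AuthenticationStage {
    private final Map<String, String> credentialsByIdentifier;

    AuthenticationStage(Map<String, String> credentialsByIdentifier) {
        this.credentialsByIdentifier = credentialsByIdentifier;
    }

    /** Never throws for expected outcomes; the caller decides how to act on the result. */
    PipelineInstructions process(String identifier, String credential) {
        if (identifier == null || identifier.isEmpty()) {
            // No identifier given: ask the caller to send the client to a login page.
            return PipelineInstructions.redirect("/login");
        }
        String expected = credentialsByIdentifier.get(identifier);
        if (expected == null || !expected.equals(credential)) {
            return PipelineInstructions.reject("invalid credential for " + identifier);
        }
        return PipelineInstructions.continueProcessing();
    }
}
```

A caller can inspect `mayProceed`, `redirectTo` and `problem` and choose its own response, which is where the additional flexibility for users of such a module would come from.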
Model-related:
- fixed code generation bug for descriptions of values in EnumeratedDataTypes
- code generator to generate static constants for all EnumeratedValues.
- added AccountCollection to account LID model
- expanded TestModel to cover both optional and mandatory PropertyTypes; renamed PropertyTypes correspondingly
- improved Test model for more comprehensive testing
- better user-visible strings for EntityTypes Bookmark, Account and WebResource
Viewlet/GUI-related:
- Viewlet framework and tag library extensions for including Viewlets in Viewlets; updated Viewlets accordingly; now allows in-context editing, change of viewlet types etc. for included JeeViewlets; no contiguous TraversalPath from top required
- support REST-ful URLs on hierarchical Viewlets e.g. GraphTreeViewlet with multiple Viewlet alternatives in contained Viewlet
- removed iframe in (Net)MeshWorld; hierarchical Viewlets is better approach
- various HTML fixes and improvements related to the rendering and editing of PropertyValues
- removed rootPath on all custom tags; not needed or used
- sanitized formatting of Identifiers; it’s still not totally sane but a lot more so
- eliminated PropertyValueTag.{css,js} and replaced with PropertyTag.{css,js}
- removed -moz-opacity CSS value per recent Firefox updates
- fix HTML doctype to make IE more happy
- BlobViewlet to get its PropertyType from URL argument not POST argument
- footer element for MeshObjectSetIterate tag
- do not print iterateHeader and iterateFooter when set has no content in MeshObjectSetIterateTags
- default POST behavior is now redirect-to-GET on same URL, so browser refresh is not as awful
- added missing setter methods on StructuredResponse
- created SetSizeTag to print the size of a MeshObjectSet in JSP
- slight changes to how MeshObjects are shown on screen by default (dropped the annotation indicating which non-standard MeshBase they are in)
- overflow: auto; to support long CSS floats
- added orderBy property to setIterate JSP tags
- added ability to sort in the inverse direction
- created propertymeter JSP tag for bar graphs or temperature graphs based on Properties
- default sorting in JSP MeshObjectSet tags is by user-visible String
- don’t set domains on cookies when run from localhost; doing so makes life hard for developers
- generate Javascript for PropertyValues
- eliminating unnecessary projects by moving their code into other projects:
  - org.infogrid.jee.rest -> org.infogrid.jee.viewlet
  - org.infogrid.jee.rest.net -> org.infogrid.jee.viewlet.net
- renaming projects:
  - org.infogrid.jee.rest.net.local -> org.infogrid.jee.viewlet.net.local
  - org.infogrid.jee.rest.net.local.store -> org.infogrid.jee.viewlet.net.local.store
  - org.infogrid.jee.rest.store -> org.infogrid.jee.viewlet.store
- renamed meshObjectLoopVar to loopVar in custom tags
- added arrivedAt property to Viewlet
- enctype attribute on safeForm is all lowercase
- created SaneUrl, a new supertype of SaneRequest that allows the same API to be reused for URLs and servlet requests; slight API naming changes (httpHost vs. server); allows us to get rid of the OverridingSaneRequest nonsense
- DEFAULT_LINK_START/END_ENTRY now consistently on StringRepresentation
- removed RestfulRequest, replaced with a MeshObjectsToViewFactory that directly translates SaneRequest into MeshObjectsToView
- an instance of MeshObjectsToViewFactory must now reside in Context
- removed NetViewletDispatcherServlet; not needed any more
- removed most redundant methods on Viewlet; better to have only one clear way to do it
- upgraded ViewletFactoryChoice: now HasStringRepresentation and contains MeshObjectsToView; this means unfortunately that ViewletFactory setup in applications needs to pass MeshObjectsToView to their choices()
- ViewedMeshObjects now keeps reference to MeshObjectsToView that it took its data from
- removed unnecessary request attributes like JeeViewlet.VIEWLET_STATE_TRANSITION_NAME: can be obtained via Viewlet
- made MeshObjectsToView an interface and subtyped to JeeMeshObjectsToView and NetMeshObjectsToView for cleaner model
- renamed getMeshObjects to getViewedMeshObjects for consistency
- ViewletState has moved from JeeViewlet to JeeViewedMeshObjects; added isDefaultState
May 2nd, 2010
·
Pere Urbon published his e-mail interview with me about InfoGrid:
This is the fourth in his series on graph databases:
It’s rather apparent that while these projects are all GraphDBs, they differ substantially in what they are trying to accomplish, and why, and therefore how they do it. This is a good resource for developers investigating GraphDBs and trying to understand their alternatives.
April 5th, 2010
·
I’m planning to be at Big Data Workshop, the first unconference on NoSQL and Big Data. If past events moderated by Kaliya Hamlin are any guide, it will be a great opportunity for everybody:
- to explore together how the Big Data market will be coming together
- to understand how the key technologies and projects work
- to work out what interfaces and interoperability standards are emerging and/or needed
- to figure out how we can grow the overall market and make it easier for everybody to adopt these technologies for interesting new projects.
Arguably, Internet Identity Workshop (also moderated by Kaliya) was the enabler for the stunning adoption rate over the past five years of OpenID, OAuth and related technologies (at last count, more than 1 billion enabled accounts). I hope history repeats itself here.
P.S. Feel free to corner me on InfoGrid, graph databases or any other subject. That’s the whole point of an unconference.
March 4th, 2010
·
Whether programming systems should be strongly typed or weakly typed has been one of the longest-running controversies in the history of computer science going back something like 50 years. Generally speaking, strongly typed systems tend to require more programmer effort up-front, in exchange for earlier or more definite error reports.
We also need to distinguish between static typing and dynamic typing: a dynamically typed system enables changes of types at run-time, while a statically typed system can’t do that.
Not surprisingly, typing for graph databases (or any other kind of NoSQL database) can be implemented in different ways, too:
| | Weakly typed | Strongly typed |
| --- | --- | --- |
| Dynamically typed | At development time: types may be declared but are not checked except perhaps rudimentarily. At run-time: errors may occur, which may or may not be discovered; mis-interpretations of data are possible; data corruption is likely in case of programming errors. | At development time: types are declared and checked as well as possible. At run-time: all operations are checked for type safety; types can be discovered dynamically; type mis-interpretations are not possible. |
| Statically typed | At development time: only rudimentary checking, if at all. At run-time: errors may occur, which may or may not be discovered; mis-interpretations of data are possible; data corruption is likely in case of programming errors. | At development time: all type errors are caught; additional developer effort is required; some types of data are hard to represent. At run-time: no checking required due to “correctness by construction”. |
Let’s insert some systems into this table:
| | Weakly typed | Strongly typed |
| --- | --- | --- |
| Dynamically typed | Most NoSQL systems | InfoGrid |
| Statically typed | | SQL database (if used as intended) |
Side note: when NoSQL proponents argue that weakly typed systems are much better than stronger-typed SQL, they sometimes throw out the baby with the bath water: there are four choices, not two. We agree that statically, strongly typed systems like a typical SQL database have considerable disadvantages in a fast-moving world, but so do weakly typed systems; the only difference is the type of disadvantage. In our view, a strong but dynamic type system is the best compromise for most applications with a non-trivial schema, which is why InfoGrid V2 implements it. (There are some applications that do not require a non-trivial schema; web caching for example.)
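To make the "strong but dynamic" quadrant more concrete, here is a minimal sketch in plain Java. It does not use the InfoGrid API; `SketchEntityType` and `SketchTypedObject` are hypothetical names invented for this illustration. The object's set of types can change while the program runs (dynamic), and every property write is checked against the types the object currently carries (strong).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical type descriptor: a name plus the Java class each property value must have. */
final class SketchEntityType {
    final String name;
    final Map<String, Class<?>> propertyTypes = new HashMap<>();

    SketchEntityType(String name) {
        this.name = name;
    }

    SketchEntityType withProperty(String propertyName, Class<?> valueClass) {
        propertyTypes.put(propertyName, valueClass);
        return this;
    }
}

/** Hypothetical graph node: types may be added at run time, and all writes are type-checked. */
final class SketchTypedObject {
    private final Set<SketchEntityType> types = new HashSet<>();
    private final Map<String, Object> values = new HashMap<>();

    /** Dynamic: a type can be added while the program is running. */
    void addType(SketchEntityType type) {
        types.add(type);
    }

    /** Types can be discovered at run time. */
    boolean hasType(SketchEntityType type) {
        return types.contains(type);
    }

    /** Strong: the write is rejected unless a current type declares the property with a matching class. */
    void setProperty(String propertyName, Object value) {
        for (SketchEntityType type : types) {
            Class<?> expected = type.propertyTypes.get(propertyName);
            if (expected != null) {
                // isInstance(null) is false, so this sketch also rejects null values.
                if (!expected.isInstance(value)) {
                    throw new IllegalArgumentException(
                            "Property " + propertyName + " requires " + expected.getSimpleName());
                }
                values.put(propertyName, value);
                return;
            }
        }
        throw new IllegalArgumentException("No current type declares property " + propertyName);
    }

    Object getProperty(String propertyName) {
        return values.get(propertyName);
    }
}
```

In this sketch, setting a `Name` property fails until a type declaring `Name` has been added to the object, and it keeps failing for values of the wrong class afterwards. A weakly typed store would accept both writes silently.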
In a graph database like InfoGrid, the following items can be typed:
In other graph databases, only a subset of these items may be typed. More in the next post on types.
February 17th, 2010
·
Building on the recent InfoGrid FirstStep example, here is another: FirstStepWithMySQL.
Using the same bookmarking/tagging application as FirstStep, it shows how to persist the same MeshObjectGraph using MySQL as a key-value store. It consists of two apps:
- the first app initializes the store, and creates a graph of objects
- the second app retrieves the graph from the store, and traverses the graph to retrieve information.
It’s of course a trivial example, but it illustrates:
- how easy it is in InfoGrid to keep the same application running against different storage backends with minimal code changes (in the initialization only)
- some of the advantages of graph databases compared to other types of storage technologies: note how simple it is to traverse the graph in all directions.
Annotated source code is here.
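Independent of the actual example source, the general idea of keeping a graph in a key-value store can be sketched in a few lines of plain Java. Everything here is an assumption made for illustration: `SketchStore`, `InMemorySketchStore` and `SketchGraph` are hypothetical names, not InfoGrid classes, and a `HashMap` stands in for what could be a MySQL table with a key column and a value column. Each node is stored under its identifier, with its neighbors serialized into the value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical key-value store interface; only the backend behind it would need to change. */
interface SketchStore {
    void put(String key, String value);
    String get(String key);
}

/** In-memory stand-in for a persistent backend. */
final class InMemorySketchStore implements SketchStore {
    private final Map<String, String> rows = new HashMap<>();

    public void put(String key, String value) {
        rows.put(key, value);
    }

    public String get(String key) {
        return rows.get(key);
    }
}

/** Hypothetical graph layer on top of the store; node identifiers must not contain commas. */
final class SketchGraph {
    private final SketchStore store;

    SketchGraph(SketchStore store) {
        this.store = store;
    }

    /** Record the relationship on both nodes so it can be traversed from either side. */
    void relate(String a, String b) {
        addNeighbor(a, b);
        addNeighbor(b, a);
    }

    /** Return the neighbors of a node, in the order in which they were related. */
    List<String> neighbors(String id) {
        String value = store.get(id);
        if (value == null || value.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(value.split(",")));
    }

    private void addNeighbor(String from, String to) {
        List<String> current = neighbors(from);
        if (!current.contains(to)) {
            current.add(to);
        }
        store.put(from, String.join(",", current));
    }
}
```

One `SketchGraph` instance can write the bookmark-to-tag relationships, and a second instance created later on the same store can traverse from a bookmark to its tags or from a tag back to its bookmarks. This mirrors the two-app split described above, with the choice of backend confined to the line that constructs the store.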
February 17th, 2010
·
Available for download here. This is mainly an incremental improvement/bug fix release, except:
- new capabilities in the ig-lid project related to what Randy Farmer called the tripartite identity pattern.
- new example application: FirstStepWithMySQL (see separate post).
Summary of changes:
- fixed endless loop when Transaction open at MeshBase die time
- Moved (Net)MeshObjectIdentifierFactory to .mesh.net packages. That allows us to tighten permissions a bit.
- Check that localIds are at least 4 chars long. Not having this check created confusing results for users of MeshWorld where the GUI assumes this, but not the graph db.
- Replaced StringRepresentationParseException with java.text.ParseException in most places.
- More specific subclasses of LidInvalidCredentialException for better error reporting
- More resilient when Gpg home dir cannot be created
- Better TimeStampValue.toString()
- Make it easier to create “su” Transactions with elevated privileges
- collect all outgoing data into the same XprisoMessage; this prevents sending more than one message in response to a single incoming message
- improved abilities to freshen Replicas
- added ForwardReferenceTest9
- added DelegatingNetMeshObjectIdentifierFactory as a convenience class
- Major refactoring in module ig-lid to implement what Randy Farmer called the tripartite identity pattern. Nickname is what he calls Public ID. HasIdentifier is used to represent Login ID. LidPersona represents what he calls Account and also manages the relationships to the other items. There is a corresponding Model.
- Added identifier-as-entered to LidAuthenticationStatus for better error reporting
- Re-introduced LidPersona as a major concept
- TransactionAction now carries a few member variables (MeshBase, MeshBaseLifecycleManager, MeshObjectIdentifierFactory) in order to make the writing of transactional code more concise
- Split org.infogrid.probe.test into several test modules; makes it more manageable
- removed a bunch of unnecessary files from ig-vendors; they only take up space and bandwidth
- fixed multiple ModelBase bug
- failed to load model under some circumstances when not running under the Module framework
- added isIdentifiedBy method to org.infogrid.util.HasIdentifier
- changed MeshObjectSet.contains( MeshObjectIdentifier ) to MeshObjectSet.contains( Identifier )
- added FirstStepWithMySQL
- localization for LidAbortProcessingPipelineException
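For readers unfamiliar with the tripartite identity pattern referenced above, the three-way split can be sketched as a small data structure. This is not the ig-lid implementation or its Model; `SketchAccount` and its methods are hypothetical names, and allowing several login identifiers per account is an assumption of this sketch. The point is only that the internal account key, the identifier(s) used to log in, and the publicly shown name are kept separate.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical account record following the three-way split; not the ig-lid classes. */
final class SketchAccount {
    private final String accountId;                           // internal key, never shown to other users
    private final List<String> loginIds = new ArrayList<>();  // identifiers used to authenticate
    private String publicId;                                  // name shown to other users

    SketchAccount(String accountId, String publicId) {
        this.accountId = accountId;
        this.publicId = publicId;
    }

    /** Attach another identifier the user may log in with. */
    void addLoginId(String loginId) {
        if (!loginIds.contains(loginId)) {
            loginIds.add(loginId);
        }
    }

    boolean acceptsLoginId(String loginId) {
        return loginIds.contains(loginId);
    }

    /** The public name can change without touching the account key or the login identifiers. */
    void changePublicId(String newPublicId) {
        this.publicId = newPublicId;
    }

    String getAccountId() {
        return accountId;
    }

    String getPublicId() {
        return publicId;
    }
}
```

Because the three identifiers are independent, a user in this sketch can change the public name or add a second login identifier while all data keyed on the account identifier stays valid.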
January 26th, 2010
·
The new FirstStep example application allows you to get an InfoGrid application running literally in 60 seconds or less.
FirstStep shows the essence of how a tagging application like delicious would be implemented using InfoGrid.
Instructions and annotated source code are here: http://infogrid.org/wiki/Examples/FirstStep.
January 26th, 2010
·
InfoGrid 2.9.2 is focused on the new project layout of the code base. This new layout has also been documented on the wiki, starting with the front page and continuing to the projects page.
The new layout will make it easier for newcomers to find their way around InfoGrid, and to selectively include only those parts of InfoGrid required for a given application. Its top-level structure is as follows:
Below, you find directories such as:
- modules: contains the functionality of the project
- tests: automated tests for the project
- testapps: web applications testing the project
- etc.
Enjoy!
January 21st, 2010
·
Good line of reasoning in 10gen’s blog post:
One reason why NoSQL, or some iteration, is here to stay is that the way computer architectures are heading, having systems that can run across multiple machines is going to be an absolute requirement. The limitations of vertical scaling are going to get worse and worse. You’re going to get new chips that have more and more CPU cores on them, but the speed isn’t much higher. And they’re going to be cheaper too so you can get more computers but you’re not going to be able to get one computer that’s really fast at any price. But you’re going to be able to get 1000 computers that are not terribly fast really cheaply. So the question is, at the data storage layer, can you leverage that? The traditional approach is no, not without a lot of manual effort. But changing computer architectures, as well as the growth of cloud computing, necessitates a better set of database systems built to achieve scale. These new solutions are going to solve that and it’s going to be critical. We want a new set of tools for the data storage layer that work well with those cloud principles, which are things like infinite scalability, low to 0 configuration, and ease of development without friction.