Graphs, Graphs, Graphs, …

Last week, Datablend open-sourced two new Tinkerpop Blueprints implementations: blueprints-mongodb-graph and blueprints-datomic-graph. Tinkerpop is an open source project that provides an entire stack of technologies within the Graph Database space. At the core of this stack is the Blueprints framework. Blueprints can be considered as the JDBC of Graph Databases. By providing a collection of generic interfaces, it allows to develop graph-based applications without introducing explicit dependencies on concrete Graph Database implementations. Additionally, Blueprints provides concrete bindings for the Neo4J, OrientDB and Dex Graph Databases. On top of Blueprints, the Tinkerpop team developed an entire range of graph technologies, including Gremlin, a powerful, domain-specific language designed for traversing graphs. Hence, once a Blueprints binding is available for a particular Graph Database, an entire range of technologies can be leveraged.

 

1. mongoDB Graph

The mongoDB Blueprints implementation provides users with a scalable, distributed Graph Database implementation. mongoDB graph does not require any information on the underlying physical mongoDB setup, so read/write scalability can easily be achieved through mongoDB’s natively supported replication and sharding functionalities. Some initial benchmarks (writing/reading 100.000 vertices where each 2 vertices are connected through an edge) on a single mongoDB node shows the following performance:

Not too shabby for running it on a single mongoDB instance. Currently, each write is executed as a singular commit to the mongoDB store. Although this process could be improved by performing writes through optimized batches, we need to ensure that the transactional semantics of the Blueprints stack are still respected. Further performance optimizations will be released soon, so keep an eye on the blueprints-mongodb-graph github project.

 

2. Datomic Graph

Datomic is a novel distributed database system designed to enable scalable, flexible and intelligent applications, running on next-generation cloud architectures. Datomic employs a powerful data model (based upon the concept of Datoms) and an expressive query language (based upon the concept of Datalog. Additionally, it introduces an explicit notion of time, which allows for the execution of queries against both the previous and future states of the database. The RDF and SPARQL feel of the Datomic data model and query approach makes it an ideal target for implementing a property graph. Hence, the blueprints-datomic-graph Blueprints implementation.

Clever use the time-aware nature of the Datomic datastore, makes Datomic Graph the very first distributed, temporal graph database: users can perform queries against a specific version of the graph in the past. The code sample below illustrates how a time-aware social graph can be created, stored and queried through the Datomic Graph implementation.

 

As one would expect, this outputs the following information:

 

Datomic Graph does not only support versioning of the vertices and edges, but also on the properties of individual vertices/edges. Pretty slick, not? We are currently enhancing the implementation with additional time-based operations, including the easy comparison of subgraphs over time. So stay tuned!

Leave a Reply

Your email address will not be published. Required fields are marked *