Should Big Data always be Big?
Yesterday evening the first BigData.be MeetUp was organized at the IBBT in Ghent. The intention of these meetings is to bring together Belgian Big Data and NoSQL enthusiasts. It’s an ideal opportunity to share thoughts and experiences with a mix of people with different backgrounds and levels of expertise in Big Data and NoSQL.
On various occasions during the meeting, the Bigness of Big Data was discussed. Questions were raised concerning the number of nodes people deploy within their Big Data cluster, the number of gigabytes of data people are storing, and so on. Although these are all valid questions, I sometimes get the feeling that people are focusing too much on the ‘Big’ aspect of Big Data; if you explain to people that you are running a NoSQL application on a single node, they generally do not consider it to be a Big Data solution.
For me personally, being able to easily scale horizontally is just one side of the Big Data story. The other side, namely the ‘alternative data model’ aspect, is equally important. At first glance, key-value and wide-column data stores may give you the impression that you are still working with fairly typical row-oriented data. Yet these alternative data models allow you to solve problems in a conceptually very different way. Graph databases for instance, which use the powerful notion of nodes and edges, allow you to model your data and solve your problems in a truly elegant fashion.
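To make the nodes-and-edges idea concrete, here is a minimal sketch in plain Python (no NoSQL library involved; the `Graph` class and the social-network example are my own illustration, not tied to any particular graph database). Entities become nodes, relationships become edges, and a query like “friends of friends” falls out naturally from the structure, even on a single machine with a small data set:

```python
from collections import defaultdict

class Graph:
    """A toy undirected graph: the data model behind graph databases."""

    def __init__(self):
        # adjacency list: node -> set of neighbouring nodes
        self.edges = defaultdict(set)

    def add_edge(self, a, b):
        # an undirected "knows" relationship between two people
        self.edges[a].add(b)
        self.edges[b].add(a)

    def friends_of_friends(self, person):
        # neighbours of neighbours, excluding the person
        # and their direct friends
        direct = self.edges[person]
        result = set()
        for friend in direct:
            result |= self.edges[friend]
        return result - direct - {person}

g = Graph()
g.add_edge("alice", "bob")
g.add_edge("bob", "carol")
g.add_edge("alice", "dave")
g.add_edge("dave", "erin")
print(g.friends_of_friends("alice"))  # {'carol', 'erin'}
```

Expressing the same query over a row-oriented table would typically require a self-join; in the graph model it is a simple traversal, which is the kind of conceptual simplification I have in mind.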
I guess what I’m trying to say is that you don’t need to own millions of data records to be eligible to use NoSQL and Big Data technologies. By cleverly applying some of the alternative data models, you can simplify the software architecture of various day-to-day applications and unlock the full potential of your (maybe limited) data set.