The power of Manchester City: a data analysis

What makes Manchester City such a great team? The infographic below illustrates one of the team's most powerful characteristics: its successful passing capability. The visualisation is based upon the Opta dataset released in August 2011, containing the highly detailed Bolton vs Manchester City match statistics. The data has been loaded in the neo4j graph databases


Coalition-Cocktail – Hacking the Elections @ Engagor

Last weekend, Engagor organised their hacktheelections hackathon. The Datablend team (Quentin, Stijn and Davy) was joined by Marc Broos, Tim Coene and Josbert van de Zande with one goal in mind: trying to visualise the (pre-arranged?) political coalition and, if possible, also predict the formation period. Technically, we extracted over 160K tweets through the Engagor API.


Datablend launches vk14-bingo.be

Are you also being overwhelmed with information about the upcoming elections? Are you, like so many other citizens, looking for a simple alternative that lets you see at a glance what each party stands for? Look no further and use vk14-bingo.be. For you, we have the various party programmes word for


Yelp graph: checkin-based business clustering

Recently, Yelp made available a sample dataset from the greater Phoenix metropolitan area including around 11,000 businesses, 8,000 check-in sets, 43,000 users and 230,000 user reviews. With the help of this data, data scientists can run real-life experiments with various data mining and machine learning algorithms. In our case, we are interested in finding out whether it is possible


Counting triangles smarter (or how to beat Big Data vendors at their own game)

A few months ago, I discovered Vertica’s “Counting Triangles” article through Prismatic. The blog post describes a number of benchmarks on counting triangles in large networks. A triangle is detected whenever a vertex has two adjacent vertices that are also adjacent to each other. Imagine your social network; if two of your friends are also friends

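The triangle definition in the excerpt above can be illustrated with a minimal Python sketch. It is a naive neighbour-pair check on a small in-memory graph, not the approach benchmarked in the Vertica article or developed in the full post; the function name `count_triangles` and the sample edge list are assumptions made purely for illustration.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Build an undirected adjacency map, ignoring self-loops.
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    hits = 0
    for neighbours in adjacency.values():
        # A triangle exists when two neighbours of a vertex
        # are also adjacent to each other.
        for a, b in combinations(neighbours, 2):
            if b in adjacency[a]:
                hits += 1

    # Every triangle is found once from each of its three vertices.
    return hits // 3


# Hypothetical friendship graph: 1, 2 and 3 all know each other; 4 only knows 3.
sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(count_triangles(sample_edges))
```

This brute-force check is only practical for small graphs, since it inspects every pair of neighbours of every vertex.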

Similr: blazingly fast chemical similarity searches

Today, Datablend announces that Similr is available for beta sign-up. Similr allows scientists (from both academia and enterprise) to quickly search for compounds that exhibit a particular chemical structure. It employs a wide range of fingerprinting algorithms which, combined, make it possible to identify matching compounds in milliseconds. Similr’s functionalities are available through a flexible and


Redis and Lua: a NoSQL power-horse

Recently, I’ve started implementing a number of Redis-based solutions for a Datablend customer. Redis is frequently referred to as the Swiss Army Knife of NoSQL databases and rightfully deserves that title. At its core, it is an in-memory key-value datastore. Values that are assigned to keys can be ‘structured’ through the use of strings, hashes,

