The Hadoop Dating Game: Hooking Up to Get Ahead

As summer starts to wind down and our family begins to look forward to another school year, I find myself reflecting on old friends, and thinking about new ones. Seems to me there are at least three categories of friends:

  • Summertime friends who float in and out in a season. These are the casual yet passionate friends that you meet randomly at a barbecue or game, get to know quickly, laugh with deeply, and then generally forget once you’re busy again with “more important” matters in the autumn.
  • Childhood friends that last a lifetime. These are the ones that helped shape who you are today, that you can still call years later and know they will be there for you, ready to listen and talk as if no time at all has passed. They get rarer as time goes by, but the mutual bond is no less strong.
  • Convenient friends that are almost entirely situational. These are fellow parents from your kid’s school, the guy with the good jokes at the office, the woman you talk to during breaks at your kickboxing class. They have interesting perspectives, but maybe little in common with you outside the time and place where you happen to meet.

Companies in the broadly defined big data marketplace are exhibiting all of these relationships. Clearly partnerships are critical to success in IT; no one component of the technology stack does anything useful all by itself. As you’ll know from previous ESG research or your own experience, an integrated, multi-disciplinary team approach is also necessary for a new initiative to succeed.

So what to make of the dynamic social networks of the IT vendors here? If we look at the Hadoop distributions as the team captains or the popular new kids, the same friendship patterns emerge.

The business relationships fall readily into groupings such as technology alliances, go-to-market sales channels, and related service delivery partners, and these are often analogous to the types of friendships above.

Weeding through all the "who is seeing whom" gossip could take the rest of the summer, but perhaps a simple count of popularity will give us some insights on overall momentum...

Cloudera racks up 236 friends of varying types on its website today.

MapR recognizes 137 partners, 34 of which offer ready-to-go modules in the MapR App Gallery.

Pivotal is buddies with 93 companies, including the strong federation with EMC, VMware, and RSA.

Hortonworks has a whopping 333 connections by my count, making it the biggest social butterfly of the four.

Now obviously many of these aren’t exclusive relationships. Some big industry players like Cisco, Dell, HP, Informatica, SAP, SAS, and VMware are happy to collaborate with most, if not all, of the big Hadoop distributions, at either the software or the hardware level. Some pairings are fairly monogamous, at least within the class, such as Hortonworks and Microsoft. What will be most interesting to see is which of the relationships are hot now, which will withstand the test of time, and which are merely temporarily convenient.

As the battle for Hadoop ecosystem supremacy continues, these shifting alliances are going to be important influences on which platforms gain acceptance and market share. Big data requires interoperability at a bare minimum, but really thrives with proper integration and combined innovation.

Which alliances do you see as the most strategic for big data in your business?
