Riding high at Strata+Hadoop World

This was my fifth go-round at Strata (see below for links to past coverage), so you'd think I'd be able to ride that bull by now. Yet again I found myself quickly thrown by the rapid turning and bucking of the young big data market. 

A few of the more interesting observations:

  • Spark continues to gain mindshare. A recent ESG survey showed Hadoop narrowly leading Spark in current adoption, but trailing in future interest. While the two aren't directly competitive choices, many at Strata presented Spark as the central building block of their solutions for streaming, SQL, graph, and machine learning analytics. I'd expect this momentum to continue, though skeptics are already wondering which shiny object will be the hot topic next. In any case, many big data vendors like Cloudera, MapR, BlueData, and MemSQL are now emphasizing their strengths in working with Spark, while Databricks keeps improving as the leader.
  • Machine learning is heating up. Machine learning is not exactly new, but the range of solutions out there was suddenly much more prominent, which suggests it's getting more mainstream traction. A big question is whether machine learning needs to be "democratized" for everyone, or whether tools should instead enable the real experts to be more efficient. Dato, Skytree, Google (TensorFlow), IBM (Watson), and more were promoting what they can do in these directions. As IoT grows in popularity, I'd expect machine machine learning to accompany it. Repetition intentional.
  • Data warehousing is fair game. You hear a lot about optimizing, offloading, or complementing data warehouses with Hadoop, but it looks like another market shift is beginning. Now I see technology buyers increasingly weighing whether they can outright replace a traditional data warehouse. As Hadoop matures, this is looking a bit more realistic, though Teradata, Oracle, and others argue that it's more a synergistic design approach than a truly competitive enterprise offering (as yet). Of course, these data warehouse vendors also have Hadoop offerings of their own, so any market risks are well hedged. If Hadoop solutions can deliver the quality and a friendly front-end for BI, it's starting to sound more possible that this trend could grow.
  • User interface matters a lot. Continuing on the last theme, the connection to a familiar or intuitive application for analytics, visualization, and collaboration is increasingly important. Some Hadoop players are building out their linkage to tools like Tableau and Qlik, while others are choosing to build out their own GUIs. For example, Dell's Statistica has all the math you'd ever need and is now getting more user-friendly, while Interana and Platfora are full-range platforms, not just analytics and discovery engines. Yet many here are also going for end user choice in this area.
  • Cloud is gaining traction. Our research shows around 35-45% of buyers are now considering cloud-based big data, analytics, databases, BI, and data warehouse solutions. The view at Strata seems to back this up. But the reasons seem inverted from the past, when clouds were considered less expensive but raised worries about security. Now a lot of the market says it considers cloud services to be potentially more costly for big data but also better managed. IBM Cloud Data Services, Amazon Web Services EMR, Microsoft Azure, Databricks, Snowflake, and others were all representing the cloud-first deployment model.

There's lots more going on; these were just a few highlights. Do let me know if you'd like to hear more or debate any of the points above.

Past Strata coverage:

Strata + Hadoop World = Big Data x Innovation

5500 Viewpoints at Strata+Hadoop World (Includes Video)

A Striated Strategy at Strata

Lysi-Strata - Big Data’s Act I

Topics: Data Platforms, Analytics, & AI; ESG on Location