Second half predictions for big data & analytics

As summertime rolls on, we can enjoy a little sun, a little rest, and a big opportunity to reflect on the key trends to watch in the second half of 2016. Here are a few of my predictions of what comes next:

  • A renewed interest in the database and data warehouse, as some of the limitations around unstructured data become more apparent. There are plenty of mature relational databases that just keep getting better (like Microsoft SQL Server, Oracle, Teradata, and MySQL), as well as some rapidly emerging favorites (like Cassandra, MongoDB, PostgreSQL, Redis, MariaDB, Amazon DynamoDB, Google BigTable, and MemSQL). Data warehouse automation platforms like Amazon Redshift, Google BigQuery, Microsoft Azure SQL Data Warehouse, and the greatly facilitating TimeXtender DWA will also benefit.

    Column stores, in-memory processing, scale-out, and other features will grow in popularity. Databases on Apache Hadoop will also continue to gain traction, including open and vendor distributions of HBase, Hive, Impala, and MapR-DB. ESG has upcoming research on database trends; let me know if you want to get in on the action.
  • Deeper focus on machine learning (ML), including AI and cognitive computing, as the amount of information available rapidly surpasses human understanding. This will show in two ways: more machine learning embedded in applications as an invisible helping hand, and more effort to make machine learning easier for the few data scientists, analysts, and statisticians who can manipulate it directly.

    Toolsets like R, H2O, Apache Spark MLlib, TensorFlow, and Dato will all grow rapidly, as will ready solutions like Microsoft Azure ML, Amazon ML, Google ML APIs, and Watson. ESG is also looking to study this area much more closely this fall.
  • Clouds will gather steam. No surprise here, maybe, as most of the big data and analytics offerings above are IaaS-ready, if not full-fledged cloud platforms themselves. Industry titans Amazon, Google, IBM, Microsoft, and Oracle are going to deliver ever better experiences that many enterprises will readily embrace, or that will at least make it much harder to justify staying on-premises. Full-service solutions like Databricks and Qubole will also thrive for the more discerning buyer. I've written recently on some of the factors driving this movement to the cloud here.
  • The Hadoop market will consolidate and standardize, which is not to say commoditize. At this point, we're really coming down to Cloudera, MapR, Hortonworks, and their many ODPi alliance partners. Apache Hadoop independent of a specific vendor distribution will still thrive, but increasingly it'll be one of the choices above plus purely open source projects. Each vendor will differentiate on important ancillary functions (think cluster management, security, and governance) as well as try to build the dominant unified big data solution (be that an enterprise data hub or a converged or connected data platform).

    This trend will actually help three audiences (consumers, ISVs, and the distribution vendors themselves) by simplifying the competitive landscape without stifling innovation.

Other trends are going to intersect in the big data space as well, especially IoT, and there will be many subplots and niche directions explored, too, but these are the hot items to watch unfold. I'll bring the popcorn.

Topics: Internet of Things; Data Platforms, Analytics, & AI