The Big BLU Data BI-Analytics Juggernaut

Les Rechan, GM of IBM’s Software Business Analytics group, openly stated that the BI-analytics business at IBM is expected to account for about $20 billion of revenue by 2015. Mr. Rechan made his prediction at an IBM event focused on big data, held at IBM’s Almaden Research Center during the first week of April. Naturally, this $20 billion will stretch across all product and service lines, from software to services to hardware and perhaps even to financing. In my recently published report, “Business Intelligence and Analytics Platforms in the Big Data Era: Do Big Data and Hadoop Really Alter the Balance?”, I opined that IBM was, by far, the largest BI/analytics vendor in the world, all in. My model estimates $11 billion for IBM in business analytics for 2013, so either my estimate is conservative, or IBM’s business analytics revenue will have to grow at a CAGR of roughly 35% over 2014-2015 to reach $20 billion. Regardless, business analytics drives something north of 10% of IBM’s overall business. Why has IBM done so well?

  • Established footprint: IBM has been doing BI-analytics for decades in one form or another, and the resulting customer base and relationships keep the IBM pipeline well-greased.
  • Ongoing strategic commitment: IBM has underscored its commitment to BI-analytics time and time again through strategic acquisitions like Cognos and Netezza, plus its own R&D. About half of IBM’s $6 billion research budget focuses on BI-analytics-related work.
  • Full solution provider, with flexibility: IBM has nearly one of everything you might need for a BI-analytics solution, from storage to software to services, plus the vertical domain expertise. But IBM also maintains a huge ecosystem, for example, offering truly value-added reselling of SAP HANA. IBM’s flexibility to be only part of a BI-analytics deal has served it well over the years, contributing significantly to its preeminent market position.
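
As a quick check on the growth arithmetic behind the $11 billion (2013) and $20 billion (2015) figures quoted earlier, here is a minimal Python sketch. The function name cagr is just an illustrative helper, and the two-year span is an assumption that treats 2013 as the base year and 2015 as the target year.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate needed to go from start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures from the article, in billions of dollars; two compounding years (2014, 2015).
rate = cagr(11.0, 20.0, 2)
print(f"Implied CAGR: {rate:.1%}")  # Implied CAGR: 34.8%
```

If the 2013 base were lower than $11 billion, the required growth rate would be correspondingly higher.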

But big data keeps morphing, and not even IBM can afford to fall far behind the technology curve. A report I published a few months ago (see infographic) discussed how newer “Not Only SQL” databases, aka NoSQL, were well positioned for big data-style BI-analytics and other modern applications. Even the most established enterprise database providers have reacted to the NoSQL movement with offerings obtained either through acquisition or their own R&D, such as Oracle NoSQL, Microsoft SQL Server 2012 xVelocity, SAP HANA and Sybase IQ, Teradata Aster, and Terracotta BigMemory (a Software AG subsidiary). But where was IBM?

Before last week’s announcements, IBM already had several offerings that were not purely relational/SQL. For analytics, for example, IBM could point to Netezza, but Netezza is delivered as an appliance rather than as a general-purpose software offering à la DB2. For time series and spatial use cases beyond purely relational/SQL, IBM offers customers the venerable but rather long-in-the-tooth Informix. DB2 offers native variants for XML, popular for Web and document-oriented implementations, and for RDF (Resource Description Framework) and SPARQL, enabling IBM to address the growing demand for graph analytics.

Despite these offerings, however, IBM had no answer for the blazing-fast, non-appliance columnar option necessary for advanced analytics. It also lacked an obvious performance jolt to keep up with the many entrants in the NoSQL segment, all of which seem to use performance as a differentiator, if not against one another, at least against legacy relational databases for non-OLTP workloads. Also, IBM did not offer support for the quite pervasive JSON, the rising star of RESTful APIs that has been pushing XML aside in Web/content-oriented applications.

IBM provided the answers last week. While IBM announced a technology preview for JSON support, the highlight was what IBM calls BLU Acceleration ("BLU"). BLU is homegrown IBM R&D that steps up the performance of IBM DB2 10.5 using a variety of technologies, including in-memory processing, columnar storage, and advanced compression. It is definitely a NoSQL approach, and one that should help advanced analytics processing fly.
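
To make the columnar idea concrete, here is a toy Python sketch of why column-oriented storage and compression suit analytic scans. It is not IBM's implementation and says nothing about BLU internals; the sample rows, column names, and helper functions are all hypothetical, and dictionary encoding stands in for "compression" only as one simple, well-known technique.

```python
# Hypothetical sample rows (order_id, region, amount); not taken from the article.
rows = [
    (1, "EMEA", 120.0),
    (2, "APAC", 75.5),
    (3, "EMEA", 42.0),
    (4, "AMER", 310.0),
    (5, "EMEA", 18.5),
]

def to_columns(rows, names):
    """Pivot row tuples into one list per column (a column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}

def dictionary_encode(values):
    """Replace repeated values with small integer codes plus a lookup table."""
    lookup = sorted(set(values))
    code_of = {value: code for code, value in enumerate(lookup)}
    return lookup, [code_of[value] for value in values]

def sum_amount_for_region(amounts, lookup, codes, region):
    """Scan only the two columns the query needs, comparing integer codes."""
    target = lookup.index(region)
    return sum(amount for amount, code in zip(amounts, codes) if code == target)

columns = to_columns(rows, ["order_id", "region", "amount"])
lookup, codes = dictionary_encode(columns["region"])
emea_total = sum_amount_for_region(columns["amount"], lookup, codes, "EMEA")

print(lookup)      # ['AMER', 'APAC', 'EMEA']
print(codes)       # [2, 1, 2, 0, 2]
print(emea_total)  # 180.5
```

The query touches only the region and amount columns and never reads order_id, and the region filter compares small integers rather than strings. Those two effects, at much larger scale, are the general reason columnar engines tend to do well on analytic workloads.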

IBM wanted to make using BLU extremely easy, so a simple registry setting, DB2_WORKLOAD=Analytics, turns it on. That one setting ensures that all subsequent database definitions will default to a columnar format and that all of the acceleration technologies are brought to bear. IBM has plans to spread BLU Acceleration throughout its product line, including future availability for z/OS databases.

ESG believes that BLU Acceleration subtly but decisively underscores the architecture for DB2 going forward: a workload-driven, multi-data-model approach that not only best matches each application use case, but also optimally marshals resources like memory and storage as appropriate for the workload class. IBM is not alone in this thinking: Amazon Web Services, and to some degree Microsoft through SQL Server 2012 and Azure, have taken the approach that a single data management or data service layer should support multiple data models to best serve different workload types. The highly anticipated Oracle Database 12c, while architecturally headed in that direction, will support only the relational model. SAP cites HANA for both transactional and operational, real-time BI-analytics purposes, but ESG believes that for advanced analytics most companies will want to keep the transactional water separated from the analytical oil. Regardless, IBM has not only made up ground but is moving towards the head of the class if you are a DBA or CIO who values the approach of one logical database that offers optimizations and data models for a wide range of workloads.

IBM also unveiled its plans to release a new PureData System for Hadoop appliance, using IBM’s InfoSphere BigInsights Hadoop distribution, in the latter half of 2013. For customers who want to use Hadoop but don’t have the patience, skills, or desire to add an increasingly unwieldy server farm to the data center, or who have been surprised by how long it actually takes to deploy Hadoop using “commodity nodes,” appliances like PureData offer an attractive alternative. ESG has begun to see some customers using Hadoop as a data warehouse add-on: keeping the existing data warehouse for structured data, BI, and basic analytics, but plugging in Hadoop to deal with less structured data sources and more complex analytics. It looks like the upcoming PureData System for Hadoop should offer customers who want to apply Hadoop in that fashion a fast on-ramp.

The other item that really caught my attention was IBM’s commitment to helping fill the big data skills gap by partnering with over 200 universities to add or augment big data-related courses in their curricula. In addition, IBM is hosting, which has registered over 75,000 students. Not only is this a great way to start addressing the well-known shortage of data scientists and data analysts, it should also create some loyalty that will continue to feed the world’s largest purveyor of BI-analytics.
