Big Data Strategy

‘It was easy to cover up ignorance by the mystical word “intuition.”’ Foundation’s Edge, 1982, Isaac Asimov

Business-driven Outburst of Demand for Analytics AKA Big Data
More than ever, businesses and non-profits crave meaningful, timely, and actionable information, preferably on a continuous basis. A single successful analytics project is not the objective; one good project begets the desire for more projects, plus a deeper desire for increasingly real-time analytics.

The IT shift from transactional to informational started in the early days of reporting/BI, continued through the web homepage, portal, search, and informational sites like Wikipedia, and was more recently trumpeted by social and rich media. It has now come around to analytics applied to massive volumes of data (by historical standards) as well as significant data diversity. This hyperbolic, data-intensive scenario for analytics and applications is referred to as Big Data. Yes, editor, deservedly capital B, capital D.

Hadoop is the Heart of Big Data

The catalyst for this business craving of analytics comes from the lemming effect associated with public displays of success of Big Data projects. While business periodicals like the Wall Street Journal are probably the most responsible for spreading news of Big Data successes (for a recent compelling WSJ blog post that does not require a subscription, see The Morning Download: Big Data Hits Main Street), you can find the best quick repository for such proof points at

The virtual Vatican of the Big Data analytics movement lives not in a conclave inside Rome but rather at the Apache Hadoop project. Although an ESG survey from earlier this year found that only a small minority of companies have pursued Hadoop-based projects, make no mistake about it: Hadoop sits squarely at the heart of Big Data, spiritually, inspirationally, technically, and in terms of ecosystem.

Not All Big Data Projects or Solutions are Hadoop-based or Purely Analytics For that Matter

It took me some time to come around to Big Data ≠ Hadoop, though most certainly Big Data ∩ Hadoop (∩ being the set symbol for “intersects with,” as in a Venn diagram). Similarly, Big Data ≠ Analytics, though definitely Big Data ∩ Analytics. Perhaps this is blasphemous, but there are applications and solutions that require vast quantities of data yet don’t fit neatly into the Hadoop-based analytics project model. One example is the NetApp Seismic Processing Solution. Similarly, one doesn’t have to use Hadoop to do Big Data analytics successfully: there are several predictive analytics engines that will work with data from a variety of sources in a variety of forms.

Much of the publicity and R&D around Big Data springs from or is closely related to Apache Hadoop, but to be fair and objective we should consider all use cases for Big Data technologies. I believe in the separation of Church and State, though in this case if we lacked a church (Hadoop) there wouldn’t be that much of a state (Big Data).

Nobody Knows How Big Big Data Is or Will Be, Still, Ignore At Your Own Risk

I have seen some market sizing and forecasting SWAGs of Big Data. I know it is a dirty job and someone has to do it, and those tables, charts, and graphs inevitably show up in presentations for venture capitalists, boards of directors, and product planning groups. But my approach is more straightforward, and this is what I would tell business leaders and their IT partners about Big Data:

“For the first time there are technologies to help organizations take real advantage of the mountains of data they sit on, plus the ever-increasing data volumes that flow through the organization every day. Many of your competitors have already jumped into these technologies -- ignore them, both your competitors and Big Data technologies, at your own considerable risk.”

If asked, “Anyway, how big will Big Data be?” the only logical answer is, “Big Data is Bigger than Big.” Should someone insist on my producing a more precise Big Data forecast, how about, “Billions and Billions.”

Big Data Is THE IT Game Changer for Business and Society

You use IT to transact. You use IT to express and gather basic information. You use IT to collaborate and communicate. Now you can use IT to find out what all this means, and how you can model and perhaps implement serious change. The human mind excels at pattern recognition. Finally IT, data scientists, leaders, analysts, and decision-makers, through the many forms of Big Data-related technologies, are offered a counterpart to human nature’s approach for solving problems and reaching decisions.

RationalWiki – Pattern Recognition

“Pattern recognition is the task of classifying raw data using a computational algorithm (sometimes appropriate action choice is included in the definition). The term is from machine learning, but has been adapted by cognitive psychologists to describe various theories for how the brain goes from incoming sensory information to action selection.”

It is Still Morning at the Big Data Pool

The pool has been filled, the water heated and treated, the gates open, and people are trickling into the Big Data Pool replete with expectations of a delicious analytics dip. Not that many poolgoers actually know how to swim or tread water yet, and there remains a justified fear of the water. The fear is of drowning in complexity and costs, of projects that fail to float. Right now Big Data needs more lifeguards and swimming aides. The entire vendor side, software, infrastructure, and networking, is ramping up in order to make it easier for more businesses to do the Big Data crawl. "Making Big Data Comfortable for the Enterprise" encapsulates the market state.

Wow! What a great place for a dive if you are an industry analyst!

Next: Considering Control Points for the Hadoop-based Analytics Market
