It's Performance @ Scale, Not Performance + Scale

Want to know what's trite in big data marketing today? Pitches that focus on speed for speed's sake, as if getting enough zeroes into the headline will guarantee the sale. The second most common cliché, after the "go fast" crowd, comes from the vendors who rave about their scalability, as if size were all that counts. Both attributes matter, but either one taken alone misses the point. Customers' decision criteria for big data solutions include these capabilities, but almost always as a combined function, not as independent axes of evaluation.

Why is this, you ask? There are certainly use cases oriented entirely around sub-second response times, and others where you need to retain petabytes of storage for years to come, but the more common scenario is wanting to combine large volumes of data from different sources, do some analytical jiggery, and still get a "real time" answer. Real time itself has definitions ranging from nanoseconds to minutes, but it's fair to say that if the response comes back before the asker gets bored and wanders off, it has done its job. The ultimate goal of most organizations is interactive insight on demand, grounded in as much data as is relevant to the question at hand.

The goal of big data analytics vendors and customers alike should be the combination of performance at scale, not performance and scale.

Some vendors are starting to see this united value proposition, and they are wisely re-spinning their stories around that combined requirement. A few well-done examples I've heard recently include DataRPM, Velocidata, Corvil, Informatica, and the stealthy but exciting Interana.
