The Most Secret V of Big Data

Anyone who has spent time in the world of big data will by now be familiar with the 5 "V"s of Volume, Velocity, Variety, Veracity, and Value, and these are no doubt good descriptors of the new requirements. There is, however, another secret V to consider. Well, ok, it's really more of an L, but cock your head 45 degrees to the right and it'll start to look a bit like a V anyway.

I'm referring here to Longevity. While there is much talk of real-time analysis, some data actually gains significance as time passes. The trick is keeping it around long enough to have it when needed, while keeping it instantly accessible when that moment eventually arrives. Analysis of social media will instantly tell you if people like your product today, but what if you want more detail? You could deploy some sensors and collect time-series or geospatial data, which will certainly tell you more about the specific conditions of usage during incidents. But what if we're talking about a lot of products with a lot of sensors over a long period of time? And what if you don't know the actual questions that will matter in a year, five years, or 20 years?

This is where Longevity comes in, and it's not merely a storage pricing issue, where cost per TB is all that matters. Nor is it as simple as dumping everything into an ever-growing data lake (or data ocean, perhaps). Ultimately, economics will determine what can be kept. Yet more than a simple cost calculation or massive data consolidation, Longevity is about the value of the sum total of knowledge for any specified era, and about ensuring it can be found quickly enough on demand. This is a long-game topic, but for particular use cases it'll be critical. Think of aircraft engine service records, longitudinal health studies, or geological remote sensing data. These aren't point-in-time problems, and the Internet of Things is going to cause an exponential explosion in this kind of data.

I look forward to looking back many years from now and seeing the success of vendors that are building for these scenarios.

Tip of the hat to Peaxy here.
