At the Informatica analyst conference in Menlo Park, CA during the waning days of February, James Markarian, Informatica’s CTO, reignited a line of thinking for me around the concept of “The Data Economy” – my term, not his. You might legitimately ask, “How is the data economy any different from the information technology market?” My notion of the data economy overlaps with what most would consider the IT market, which I will discuss below. But first, here are a few of the points I took from Mr. Markarian’s talk, supplemented by some of my own, which spawned my thinking about The Data Economy and Informatica’s role therein:
- Few organizations know precisely what data they possess, where it resides, or why they keep it; e.g., organizations generate massive quantities of data that are seldom or never used.
- Typically, organizations seriously over-provision storage and related infrastructure for data – better safe than sorry, particularly if you don’t have a really good handle on your data.
- The term “data science” is overstated, in that scientific principles are not part of a data scientist's practices – we might be better off using the term “data artist.” On one hand this is nitpicking over job titles, but it highlights how far we still have to go in terms of managing and applying our data.
- The currently hot approach to big data analytics, Hadoop (and some are now opining that Google’s Dremel and its open source offshoots may soon start out-Hadooping Hadoop), involves dumping all kinds of data into a solution designed to deal with over-provisioning and to compensate for a lack of data understanding: Create three haystacks of data, and start looking for a needle. If you stripped “Hadoop” off the label and told somebody that this is how you were going to architect a solution for analytics, they might think you were crazy.
The summarized message goes something like this: Organizations would do well to invest in understanding their data to ensure that the right information arrives at the right time for the right people in the right context of consumption, and the practices that make this happen simultaneously decrease business risk and IT costs. The more far-reaching factor is that organizations that get their arms around their data can better participate in the data economy. Here are some bread crumbs about the data economy: