Pentaho's Three-Legged Race to Big Data (with HDS)

Pentaho World 2015 was held in sunny Orlando this year, with over 500 attendees, and was by all accounts a friendly and informative affair. About the only question no one could answer is why the company is called Pentaho, but a rose by any other name is still very nice. One thing that was quite clear is that the team is hitting its stride with HDS as a powerful running mate.

Topics: Analytics, Big Data, Hadoop, HDS, business intelligence, data integration

DAAC to the Future! – Dell Annual Analyst Conference

Thirty years ago, Back to the Future changed the history of the world as we know it, at least for impressionable movie buffs. This year, the Dell Annual Analyst Conference (DAAC) showed what could be coming next, no DeLorean required. Dell is obviously a big company with many different lines of business to consider, but I'd like to focus here on one breakout session in particular, titled "Helping Customers Cut through the Big Data Hype to Make Better, Faster Decisions." Snappy, eh?

Topics: Analytics, Big Data, Data Management & Analytics, Dell, data integration

Matters of Integration in the Data Economy

At the Informatica analyst conference in Menlo Park, CA during the waning days of February, James Markarian, Informatica’s CTO, reignited a line of thinking for me around the concept of “The Data Economy” – my term, not his. You might legitimately ask, “How is the data economy any different from the information technology market?” My notion of the data economy overlaps with what most would consider the IT market, which I will discuss below. But first, here are a few of the viewpoints I took away from Mr. Markarian’s talk, supplemented by some of my own, which spawned my thinking about The Data Economy and Informatica’s role therein:

  • Few organizations actually know precisely what data they possess – where it is, what it is, or why they keep it; for example, organizations generate massive quantities of data that are seldom or never used.
  • Typically, organizations seriously over-provision storage and related infrastructure for data – better safe than sorry, particularly if you don’t have a really good handle on your data.
  • The term “data science” is overstated, in that scientific principles are not yet part of most data scientists' practice – we might be better off using the term “data artist.” On one hand, this is nitpicking over job titles, but it highlights how far we still have to go in managing and applying our data.
  • The currently hot approach to big data analytics, Hadoop (and some are now opining that Google’s Dremel may soon start out-Hadooping Hadoop), involves dumping all kinds of data into a solution designed to deal with over-provisioning and to compensate for a lack of data understanding: create three haystacks of data, and start looking for a needle. If you stripped the “Hadoop” label off and told somebody that was how you were going to architect an analytics solution, they might think you were crazy (a minimal sketch of this pattern follows the list).
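
To make the haystack metaphor concrete, here is a minimal sketch of the pattern being criticized, written as a Hadoop Streaming mapper in Python. The dataset path, the search key, and the mapper file name are illustrative assumptions, not anything Markarian or Informatica presented: raw, schema-less records get dumped into HDFS, and finding one record of interest later means scanning every line.

    #!/usr/bin/env python
    # Hypothetical Hadoop Streaming mapper: scan every raw, schema-less
    # line in the "haystack" for the one "needle" we care about.
    # The search key below is an illustrative assumption.
    import sys

    NEEDLE = "customer_id=12345"  # the record we are hunting for

    for line in sys.stdin:
        # No schema, no index, no metadata: every byte gets read and tested.
        if NEEDLE in line:
            sys.stdout.write(line)  # emit the match for the reducer to collect

A job like this might be launched with something along the lines of hadoop jar hadoop-streaming.jar -input /raw/events -output /found -mapper mapper.py (the jar name and paths are placeholders). The brute-force scan is only necessary because nothing was learned about the data before it was stored – which is exactly the over-provisioning and lack of understanding described above.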

The summarized message goes something like this: organizations would do well to invest in understanding their data to ensure that the right information arrives at the right time, for the right people, in the right context of consumption; the practices that make this happen simultaneously decrease business risk and IT costs. The more far-reaching factor is that organizations that get their arms around their data can better participate in the data economy. Here are some breadcrumbs about the data economy:

Topics: Data Management & Analytics, Informatica, Enterprise Software, data integration