Think Economics -- Not Features -- When Evaluating Big Data Value


Traditional enterprise data warehouse solutions helped open the eyes of many organizations to the value of their data. Although these were significant investments, organizations quickly learned to monetize the actionable insight extracted from these systems, which led to the rampant growth of the industry. Big data did not get big just from data growth; it got big because of its potential value, opportunities, and savings.

The more cost-efficiently you can capture data, and the more ways you can analyze it, the more worthwhile all that data becomes. Value is results divided by costs. These (pseudo-)equations of big data value now extend not only to the disruptive power of transformative technologies like Hadoop, but also to increasingly popular cloud services for databases and data warehouses.
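The "value is results divided by costs" idea can be made concrete with a tiny sketch. All figures below are hypothetical placeholders, not numbers from the study:

```python
def data_value(result_dollars: float, cost_dollars: float) -> float:
    """Value as results divided by costs (a unitless ratio).

    A ratio above 1.0 means the data delivered more than it cost.
    """
    return result_dollars / cost_dollars

# Hypothetical example: $500K of insight-driven results
# on a $200K platform spend.
print(data_value(500_000, 200_000))  # 2.5
```

The point of the ratio is that value rises two ways: bigger results from more kinds of analysis, or smaller costs from more efficient capture and processing.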

It should be noted that ESG does evaluate and validate the economic impact of new approaches to analytics. (Vendors who are interested in how this works might like to read more here.) Simple ROI or TCO "back of envelope" calculations for an analytics initiative are a decent starting point, but often the assumptions aren't very clear or very credible. We prefer to explore in depth, comparing a prior mode of operation (what you did before) with measurable outcomes from a new approach. Sometimes we also look at alternatives, where good data is publicly available.
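A back-of-envelope TCO comparison of this kind might look like the sketch below: a prior mode of operation versus a new approach over the same period. Every number here is a hypothetical placeholder; the credibility of the result depends entirely on how those inputs are measured:

```python
def multi_year_tco(upfront: float, annual: float, years: int = 3) -> float:
    """Total cost of ownership: one-time spend plus recurring spend."""
    return upfront + annual * years

# Hypothetical prior mode: on-premises cluster with hardware upfront.
prior_tco = multi_year_tco(upfront=800_000, annual=400_000)

# Hypothetical new approach: cloud service, no upfront hardware.
new_tco = multi_year_tco(upfront=0, annual=350_000)

savings = prior_tco - new_tco
print(f"3-year savings: ${savings:,.0f}")  # 3-year savings: $950,000
```

Even a toy model like this makes the assumptions explicit, which is exactly what loose back-of-envelope estimates tend to hide.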

If you are interested in a detailed and rigorous economic study, please read the report we recently did on Google BigQuery. Our models, built on the results of validation with BigQuery customers, showed that organizations can expect to save between $881K and $2.7M over a three-year period by leveraging BigQuery instead of planning, deploying, testing, managing, and maintaining an on-premises Hadoop cluster. The models also show that BigQuery’s serverless design and simple pricing can provide a solution that is simpler to manage at a total cost that is between 56% and 82% less expensive than alternative cloud-based solutions that store data and perform queries.

Here is a list of considerations we explored as part of that study. We found that looking at the economics lets you account for more than just feature comparisons between vendors. The economics are at least half the equation.

  • Time to Value – Can your users access their analytics environment online quickly and easily?
  • Simplicity – Can users complete all major tasks related to analytics through an intuitive interface?
  • Scalability – Can your data platform readily scale up to petabytes or down to kilobytes depending on your size, performance, and cost requirements?
  • Speed – Can you quickly ingest, query, and export massive datasets leveraging the price/scale/cost efficiency of underlying cloud infrastructure?
  • Reliability – Can you ensure always-on availability and recoverability to avoid costly outages?
  • Security – Can you protect and control access to projects and datasets to avoid costly breaches?
  • Cost Optimization – Can you predict costs with transparent flat rate and/or pay-as-you-go pricing, and contain costs through the use of project and user resource quotas?

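The cost-optimization bullet above hinges on a simple trade-off between flat-rate and pay-as-you-go pricing. A hedged sketch, using entirely hypothetical rates rather than any vendor's actual price list, shows how the break-even point falls out:

```python
# Hypothetical rates for illustration only -- not actual vendor pricing.
FLAT_RATE_MONTHLY = 10_000.0   # flat monthly fee
ON_DEMAND_PER_TB = 5.0         # per-TB-scanned price

def monthly_cost(tb_scanned: float) -> float:
    """Return the cheaper of flat-rate and pay-as-you-go for one month."""
    return min(FLAT_RATE_MONTHLY, ON_DEMAND_PER_TB * tb_scanned)

# Break-even scan volume: above this, flat rate wins.
break_even_tb = FLAT_RATE_MONTHLY / ON_DEMAND_PER_TB
print(break_even_tb)              # 2000.0 TB/month
print(monthly_cost(1_000))        # 5000.0  (pay-as-you-go cheaper)
print(monthly_cost(3_000))        # 10000.0 (flat rate caps the bill)
```

This is why pricing transparency matters for predictability: with both rates known, the break-even volume is a one-line calculation, and quotas then keep actual usage on the intended side of it.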
Critical to these evaluations is understanding all of the trade-offs and costs associated with your big data solution. After all, your business does not depend on your owning and operating a big data solution. By using a cloud enterprise data warehouse such as BigQuery, you can focus on collecting valuable data and making it more valuable by quickly extracting insight from it.

Topics: Data Platforms, Analytics, & AI