Data Makes the World Go Round...

...Or at least data can be used to model the Earth's rotational vectors and predict trajectory locations over time. A few things have got me thinking about the world of data and the data of the world. First, I watched "Hidden Figures" with my girls last night. Amazon's machine learning models accurately predicted that we would like this film and positioned it strategically on our suggested titles list. There was a lot to like, including worthy themes around:

  • Math, science, engineering, analytics, and computers.
  • Women earning respect for their ability to excel in these areas.
  • Minorities demonstrating that diversity in teams improves outcomes.
  • Healthy patriotism in the context of advancing human potential.
  • Strategic government funding of research and innovation.
  • and Kirsten Dunst (assuming that her inclusion here doesn't undermine all my previous points).

Topics: Big Data Data Management machine learning data set

Database Forecast: Cloudy with Increasing Chances

ESG has recently published an overview on IT market adoption of cloud-based databases. Shall we just call them cloudbases? Perhaps not. A major trend is emerging. While relatively few are choosing cloud as their primary mode of deployment, majorities are currently running at least some of their production workload in the public cloud. Attitudes and adoption vary considerably by age of company (and age of respondent!), reflecting how deeply entrenched traditional on-premises offerings and processes may be for different businesses. How many, how many, and how much, you ask? ESG research subscribers can read the full report here.

Topics: Cloud Computing Big Data Data Management

Cloudera Builds Strength and Agility

Haters gonna hate on Hadoop, but they've confused 'tween growing pains with weakness. The broader Hadoop ecosystem continues to mature at a very healthy pace. Even if the players are starting to outgrow the labels of "Hadoop" and "big data," leading companies in this sector will continue to build on what is now established to be a strong core. Perhaps most prominent among these young varsity athletes is Cloudera.

Cloudera has long enjoyed the popular attention of the market. More than 1,000 customers use Cloudera in more than 60 countries today. Technology vendors and channel partners have associated themselves with the cool kid. At last count, Cloudera has over 2,800 partners, and that number includes 450 ISVs, of which 388 are certified, bringing 184 partner-developed solutions, of which 120 have been verified in production, and 44 are available in a ready-to-roll solutions gallery. Meanwhile, big names like Intel and Michael Dell have provided significant scholarships to fund the company's development.

Topics: Data Management & Analytics Data Management Cloudera IPO big data and analytics

Strata Data Conference Gets Good

Last week we saw the rebranding of Strata + Hadoop World as the new Strata Data Conference. This name change reflects the nature of the content, which is decreasingly about specific Hadoop projects and increasingly about how to get analytics value from any data anywhere. Beyond that, the show reinforced several key themes I've been predicting for some time.

Topics: Data Management Strata+Hadoop World Strata Data conference

The Google Machine Learns to Compete

Language can be frustratingly ambiguous. Or delightfully ambiguous. When you read the title of this blog, did you parse it as Google is a machine that is learning to compete? Or that machine learning will be how "the Google" competes? Both work, and both are true.

First meaning: there is clear evidence Google is making huge progress in cloud services to better compete against its rivals. Executives at the Google Next 17 conference cited a competitive win rate of 60% in the last quarter, with best results when the company gets a fair shot and customers dig deep into the technical differentiation. Sure, Microsoft is entrenched in most enterprises, and AWS has ridiculous momentum, but Google has invested $29 billion over the last three years to innovate in its own way. Many of the services' advantages are subtle but impactful, such as more granular billing for data warehouse consumption with BigQuery, custom-configured compute instances, or the potential for API access to data services already within Google's domain. These have real benefits in reducing costs and increasing the value of data. Machine learning even helps Google be more efficient, like finding ways to reduce data center cooling costs by 50%. Since ESG research shows the financial cost/benefit equation is still the top perceived advantage for cloud-based databases, Google should win simply on price efficiency for compute and storage resources. See a past comparison of costs here. That assumes buyers take the time to understand this and don't default to their Microsoft sales teams or to Amazon's dominance with the DevOps audience.

Topics: Data Management & Analytics Data Management google machine learning

Big Data, Database, ML and AI Spending Trends: What You Need to Know

Psst...hey buddy, are you an IT professional? Maybe interested in big data, databases, data warehouses, BI, machine learning, and AI? Planning to invest this year? Have I got a sweet deal for you! I can show you what your peers are planning. Top trends. Surprise insights. Hot stuff, but in the sense of interesting, not stolen. Check out this little number:

61%

Topics: Big Data Data Management database artificial intelligence

The Big Data Bubble Rises

As the father of two girls, I have deep appreciation for a fascination with bubbles. They are easy to make, they're shiny, and they float carelessly with the breeze. That said, a big bubble is much harder to create and more fragile than dozens of little ones. The loss is felt more acutely.

Topics: Big Data Data Management

Y Evolve Your Data Protection Strategy in 2017 (Video)

Today’s message is brought to you by the letter “Y” and the numbers “1” and “7.”

Most data protection conversations are evolving from “backup” to “recovery” – and to do that, you need to evolve from simply using backup mechanisms to a combination of backups, snapshots, and replicas. And some vendors are there (YAY)!

But then, many vendors, and providers, and partners, and IT teams get stuck at a crossroads – where their data protection strategy must further evolve down one of two paths, with very few vendors able to offer journeys down both long and winding roads: Data Management & Data Availability.

Here is a short video on Y and how your data protection technologies must evolve.

Topics: Backup Data Protection Data Management BC/DR (business continuity/disaster recovery) High Availability Copy Data Management

Highlights from Spark Summit East

Over the last couple of days, I've enjoyed meeting with business and technology leaders around Apache Spark. Hosted in Boston this time, the Spark Summits in general have become great forums to learn about what's new, what's working, and what's coming up next. Here are my quick takes if you weren't able to attend (and missed the thunder snowstorm, WHAT?!?). Highlights included:

Topics: Data Management big data and analytics spark summit apache spark

2017 Big Data & Analytics Predictions: Part 3: Pipelines

What's coming down the pipe in 2017? The pipe itself. Expect even more gains around streaming data pipelines, not merely for analytics but for processing too. Market conditions are right and offerings are hitting the needs of more enterprise use cases. Watch the video below to learn more:

Topics: Data Management big data and analytics 2017 predictions pipelines