ESG's Delta-V awards recognize the top 20 companies that made an impact in big data and analytics in 2015. Here's the set of awards celebrating organizations for their contributions to Hadoop and Spark. These are listed in alphabetical order, so consider rebranding your company as "Aardvark Advanced Analytics" if you really want to be "first" next year.
To learn a little more about the awards, check out this overview.
Cloudera — I'm not exactly going out on a limb here, but Cloudera really does earn all the fanfare. This year the company:
- Introduced the exciting new Kudu storage.
- Launched RecordService for some governance (yay!).
- Re-launched Xplain.io as Navigator Optimizer.
- Offered Impala and Kudu to the Apache community.
- Hosted many excellent Strata events and the Wrangle conference.
- Announced the One Platform Initiative for Hadoop & Spark development together.
- Added support for Google Cloud Platform and Microsoft Azure.
- Teamed up with Teradata on a new appliance.
- Joined EMC Select.
- Brought out Ibis for Python.
- Embraced Kafka.
- Probably did a bunch of other stuff I'm forgetting at the moment.
There is no question that Cloudera is leading innovation in Hadoop and yes, Spark, too.
Databricks — Speaking of Spark, where would we be without the fine folks at Databricks? We'd be lamenting the limitations of MapReduce; that's where. Spark has incredible market momentum as an easier-to-program and versatile analytics engine for big data, offering streaming analytics, machine learning, SQL queries, and more. Databricks:
- Hired a great CMO (Hi Kav!).
- Got accredited by Amazon Web Services.
- Hosted the Spark Summits.
- Partnered with IBM and Intel.
- Added security features (yay!).
- Gained support for R.
- Introduced notebooks for collaboration.
- Brought out DataFrames.
- Oh yeah, GA'd its own cloud platform for Spark.
After Databricks left Berkeley for San Francisco, it was so sad and desolate there I moved away myself.
Hortonworks — Still speaking of Spark, Hortonworks also took up the charge:
- Rolling Spark 1.5.2 into the HDP distribution.
- Adding to Zeppelin.
- Building tighter links with other Hadoop components.
- Generally working on making it faster, easier, safer, and robuster. Yes, I said "robuster", deal with it.
Back on Hadoop, Hortonworks:
- Launched SmartSense for even more robusterness, while the Big Data Scorecard is an interesting exercise in planning initiatives.
- DataFlow (with Onyara's expertise on Apache NiFi) helps with data ingest, compliance, and security, all goodness for IoT.
- Another great new CMO here (Hi Ingrid!) and CTO (Now Gnau!).
- Partnerships with NEC, HDS, EMC, and 997 others.
- HDP 2.3 brought better management capabilities with Ambari, SQL semantics in Hive, Solr on YARN, and more governance (yay!) including Apache Atlas.
- Hosted the Hadoop Summit.
- Big play with IBM and Pivotal in forming the ODPi alliance.
- Survived its first year as a public company!
MapR — A bit like the Rodney Dangerfield of Hadoop, MapR doesn't get no respect, yet is doing some really cool stuff. Most recently:
- Streams is the culmination if not the end of the company's vision for a unified platform, for IoT among other things.
- MapR also introduced free training for Hadoop and Spark.
- Apache Drill 1.2 got bundled.
- Data Exploration Quick Start was quickly started.
- New president and COO (Hi Matt!).
- An in-Hadoop document database.
- More SAS support.
- NoSQL database improvements.
- Cloud options for AWS and Azure.
- Spark Quick Starts.
- MapR 5.0 with real-time data transport, MapR-DB table replication and C-language API, auto-provisioning templates, and more governance and security (yay!).
- Tie in to Teradata QueryGrid.
- Myriad to link up Mesos and YARN.
- Adding Spark 1.5.2 support.
- Native JSON support in MapR-DB.
There's a lot going on there.
Individually, each of these companies is defining and executing on a distinct path to success with Hadoop and Spark. Together, they are shaping the industry and ushering in a new era of analytics.