ESG's Mike Leone discusses the results of ESG Lab's recent testing of MapR-DB, the NoSQL database in the MapR Converged Data Platform.
Welcome to another ESG Lab video summary, where I'll be reviewing the results from our recent testing of MapR-DB, which is an integral part of the MapR Converged Data Platform. As data continues to grow at an unprecedented rate, organizations are turning away from traditional databases in favor of solutions better equipped to scale cost-effectively and meet the performance needs of Big Data. NoSQL databases offer high performance in a simplified, scalable structure, making them an attractive choice for many organizations. In fact, recent ESG research shows that 96% of organizations already use or plan to take advantage of NoSQL. The MapR Converged Data Platform is a collection of open-source engines, tools, and applications that leverage the purpose-built MapR Data Platform to deliver a scalable, reliable, and secure infrastructure for global, data-driven applications. MapR-XD for web-scale storage, MapR-ES for event streaming, and MapR-DB as the NoSQL database are the three core services that work together to make MapR a truly converged platform that supports all workloads on a single cluster.
Having already validated MapR-XD and MapR-ES, we focused this validation on MapR-DB, the NoSQL database capable of handling extreme scalability to provide real-time operations and analytics without the extra complexity and cost of a traditional database architecture. MapR-DB's flexibility and reliability give organizations a database solution with the enterprise-level size and speed to harness Big Data as part of the fully converged platform.
We looked to validate MapR-DB's ability to deliver high levels of performance in the cloud while comparing its results to two other open-source NoSQL database offerings, HBase and Cassandra. To make that comparison fair, a common hardware configuration was created in AWS. The Yahoo! Cloud Serving Benchmark (YCSB) was then used to evaluate the performance of key-value and cloud-based data stores. The testing consisted of a set of scenarios designed to emulate real-world workloads carried out by the client. Five workloads were run for 2 hours each on all three platforms, covering 100% read, read-update, and read-insert mixes. When analyzing operations per second across the five workloads and three database offerings, MapR-DB averaged 2.5x more operations per second than Cassandra and 5.5x more than HBase. Benefits with MapR-DB were recognized across all workloads, with the largest gains on the mixed workload of 50% read, 50% insert, where MapR-DB outperformed HBase by over 10x and Cassandra by as much as 3.5x.
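For anyone who wants to run the same kind of comparison on their own benchmark output, here is a minimal sketch of how an "average x-times more operations per second" figure could be derived. It assumes the average is the mean of per-workload throughput ratios. The summary does not say how ESG Lab computed its averages, so treat that as an assumption. The function name `average_speedup` and the workload labels are hypothetical, and the numbers are placeholders, not results from the report.

```python
from statistics import mean


def average_speedup(candidate_ops, baseline_ops):
    """Mean of per-workload throughput ratios (candidate / baseline).

    Both arguments map a workload name to measured operations per second.
    Assumption: the average is taken over ratios, not over raw throughput.
    """
    if candidate_ops.keys() != baseline_ops.keys():
        raise ValueError("both systems must report the same workloads")
    ratios = []
    for workload, ops in candidate_ops.items():
        base = baseline_ops[workload]
        if base <= 0:
            raise ValueError(f"non-positive throughput for {workload!r}")
        ratios.append(ops / base)
    return mean(ratios)


# Placeholder numbers only, NOT results from the ESG Lab report.
example_candidate = {"read_only": 200.0, "read_insert": 300.0}
example_baseline = {"read_only": 100.0, "read_insert": 100.0}
print(average_speedup(example_candidate, example_baseline))
```

Averaging ratios weights every workload equally. Dividing total throughput by total throughput would instead favor whichever workload produced the largest raw numbers, so the two approaches can give different headline figures.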
Just as important were the latency measurements. With a goal of yielding low, predictable latency at a consistent rate, latency was measured every 2 minutes during a 50% read, 50% write workload. MapR-DB delivered, never exceeding 3-millisecond latency during the 2-hour test. Cassandra delivered on predictability, but achieved an average latency 75% greater than MapR. And HBase did not achieve low or predictable latency, averaging 6.4x higher than MapR-DB.
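To make the latency comparison concrete, here is a small sketch of how periodic latency samples could be summarized and compared. It assumes one sample every 2 minutes over a 2-hour run, which works out to 60 samples, and it reads "75% greater" as a ratio of 1.75. The helper names `summarize_latency` and `relative_latency` are hypothetical and the sample values are placeholders, not measurements from the test.

```python
from statistics import mean

TEST_MINUTES = 120            # 2-hour test, as described above
SAMPLE_INTERVAL_MINUTES = 2   # one latency reading every 2 minutes
EXPECTED_SAMPLES = TEST_MINUTES // SAMPLE_INTERVAL_MINUTES  # 60 readings


def summarize_latency(samples_ms, ceiling_ms):
    """Return the average, the peak, and whether the peak stayed within the ceiling."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    peak = max(samples_ms)
    return {
        "avg_ms": mean(samples_ms),
        "max_ms": peak,
        "within_ceiling": peak <= ceiling_ms,
    }


def relative_latency(baseline_avg_ms, other_avg_ms):
    """Ratio of another system's average latency to the baseline's.

    A result of 1.75 means "75% greater than the baseline".
    """
    if baseline_avg_ms <= 0:
        raise ValueError("baseline average must be positive")
    return other_avg_ms / baseline_avg_ms


# Placeholder samples only, NOT measurements from the ESG Lab test.
summary = summarize_latency([1.0, 2.0, 3.0], ceiling_ms=3.0)
print(summary, relative_latency(2.0, 3.5))
```

Checking the peak rather than only the average is what captures predictability: a system can post a low average and still show occasional spikes that break a real-time service-level target.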
So why does all this matter? Organizations leverage NoSQL database solutions to keep up with the growth of data, and those solutions need performance and consistency to fulfill the needs of a real-time business. MapR delivers a Big Data storage and processing infrastructure that manages data from initial generation and ingestion to consumption and real-time insight through MapR services, open-source engines, and commercial applications. We validated that MapR-DB met the demanding performance requirements of operational and analytic workloads running in the cloud. The tested MapR-DB database, consisting of billions of records, was not only the fastest solution in our analyzed tests but also the most reliable by a sizable margin. By inheriting the high availability and reliability of the overall MapR platform, MapR-DB achieved speed and predictably low latency unmatched by its peers, Cassandra and HBase. For MapR, a high-performance, flexible NoSQL database was required to create a truly comprehensive platform of fully integrated data services. MapR-DB provides the reliability organizations expect in a modern database solution, turning MapR into a high-performing, ultra-reliable, enterprise-grade converged data platform that meets the demands of real-time operational and analytical processing on a global scale. If you'd like to learn more, read the full report.