
ESG Technical Validation: Performance Analysis of Databases Running on Nutanix Hyperconverged Infrastructure (HCI)

Introduction

This ESG Technical Validation documents the results of recent Nutanix performance testing that focused on real-world performance scalability and sustainability improvements in support of mission- and business-critical database workloads.

Background

For business-critical applications and RDBMS, traditional infrastructure deployments are complex; provisioning is often a slow, multi-step process that consumes days or weeks and involves multiple infrastructure teams. Lengthy upgrades require significant downtime, and it is difficult to keep up with patching across database instances. Creating and managing database copies for multiple groups (test/dev, QA, and business intelligence) takes time and consumes costly space on storage arrays. And database restore/recovery operations require hours or days of rolling back snapshots and log files across fragmented resources. Ultimately, the way database infrastructure is deployed can impact productivity, causing delays in time to value for transactional and analytical database activities. It’s no surprise that nearly two-thirds of respondents (64%) to ESG’s annual technology spending intentions survey reported that IT is more complex than it was two years ago (see Figure 1).1

Hyperconverged technologies continue to replace legacy technology solutions and organizations’ buying criteria have continued to expand. They’re looking past the original promise of simplicity and cost savings; organizations are also prioritizing requirements such as performance, scalability, and reliability—recognizing that technologies like the cloud and software-defined storage will be far less complex and more cost-effective than a traditional siloed approach. In another ESG research study, 57% of respondents reported that they were using or planning to use hyperconverged infrastructure (HCI) solutions. This is not surprising, given the factors driving them to consider HCI. Deployment drivers most cited by respondents include improved scalability (31%), total cost of ownership (28%), ease of deployment (26%), and simplified systems management (24%).2 Organizations need a solution that can deliver both simplicity and consistent, mixed workload performance for critical database workloads without the need to tweak and tune the environment.

Nutanix Era

Nutanix offers software-defined, hyperconverged infrastructure for databases that provides simplicity, agility, high availability, and efficiency. A key feature that makes Nutanix a simple and effective platform for databases is the software tool called Era, which helps customers with complete database lifecycle management at the click of a button. Databases can be deployed in minutes and configured with disaster recovery. Era also provides simple, efficient copy creation; easy patching and upgrades; automatic refresh; and simple rollback to any point in time with Nutanix Era Time Machine.

With built-in best practices, Era delivers a distinct advantage for databases running on Nutanix HCI when compared with the traditional setup and tuning that can take days or weeks of administrative effort. The simplicity enables non-DBAs to provision complex, multi-cluster databases with ease. Era fits in well with the promise of hyperconverged infrastructure, which was designed to simplify infrastructure deployments for applications.

Nutanix HCI is designed to deliver a complete, software-driven IT infrastructure stack with the agility, scalability, and simplicity of the cloud combined with the security, performance, and cost predictability of a traditional on-premises infrastructure. The architecture is a scale-out, fully distributed software platform leveraging web-scale engineering principles innovated by leading cloud companies such as Google, Facebook, and Amazon. The software integrates the compute, virtualization, and storage environments into a single solution. This integration eliminates the complexity of traditional SAN and NAS environments, costly, special-purpose hardware, and the specialized skill sets they require. The Nutanix HCI platform with new Blockstore and Intel’s SPDK technology, combined with other technologies like Autonomous Extent Store (AES), which was introduced in a prior version of Nutanix HCI, brings HCI performance to the next level by capitalizing on the optimized architecture of Nutanix HCI. These innovations optimize for high-throughput, low-latency applications, and they are uniquely designed to deliver maximum benefits of new media such as NVMe and storage-class memory.

ESG Technical Validation

ESG validated the ease of use during a remote demonstration of Nutanix Era, including simplifying operations from a single pane of glass for provisioning, cloning/refresh of space-efficient copies, patching, and Time Machine recovery.

Simplicity

From the dashboard view, we got an overview of all database instances and details of space savings, sources, clones by age, Time Machine snapshots, and alerts.

We quickly provisioned an Oracle database using four easy screens and several mouse clicks. We clicked on Database/Provision and selected Oracle as the engine; we had the choice of selecting a single instance or a multi-node cluster. Next, we chose the Nutanix cluster on which to place the database and selected the Oracle version, followed by choosing the compute profile (templated into small, medium, and large in terms of vCPUs, cores, etc.), network profile (vLAN), and public key for access. Next, we gave the database a name, selected the disk group size, and entered the storage system password. There were spaces available to insert pre- and post-commands if desired, such as for adding data masking. Finally, we specified the Time Machine Gold policy, which was configured to save 30 days of continuous transaction logs, plus 30 daily, four weekly, 12 monthly, and four quarterly snapshots. The last step was to click Provision, and the task began.
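The Gold policy described above can be written down as a small data structure. The following is a minimal sketch in Python; the class and field names are hypothetical and are not taken from Era's API. Only the retention counts come from the demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical model of a Time Machine retention policy (names are illustrative)."""
    name: str
    continuous_log_days: int  # days of transaction logs kept for point-in-time recovery
    daily: int
    weekly: int
    monthly: int
    quarterly: int

    def snapshot_count(self) -> int:
        """Upper bound on the number of snapshots retained at steady state."""
        return self.daily + self.weekly + self.monthly + self.quarterly


# Retention counts as described for the Gold policy in the demo.
GOLD = RetentionPolicy(
    name="Gold",
    continuous_log_days=30,
    daily=30,
    weekly=4,
    monthly=12,
    quarterly=4,
)

print(GOLD.snapshot_count())  # 50
```

The count is an upper bound because the report does not say whether, for example, a weekly snapshot can also serve as that day's daily snapshot.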

At the end of every task, Era offers an API Equivalent button, which brings up the equivalent API calls, formatted for various programming languages. In addition, details could be viewed from the blue icons on every screen.

Our demo also explored how easy patching was; it involved selecting a clone, clicking the Update Available message, choosing the upgrade from a list, and choosing to upgrade now or at a scheduled time. From the Operations screen, we could see the provision and patching task steps being executed, with time stamps.

The Time Machine feature provides snapshot restore by rolling back to any point in time, down to the second, by creating a clone from the snapshot. For a CRM database, we viewed the calendar of snapshots, color coded for continuous, daily, weekly, monthly, and quarterly snapshots.

Restoring was simply a matter of selecting a date on the calendar, choosing either a daily snapshot or the hour/minute/second from which to restore, choosing the location on which to create the clone, and providing a name and database profile (small, medium, or large). Pre- and post-command fields and the API Equivalent button were also available.
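Conceptually, a point-in-time restore of this kind selects the most recent snapshot taken at or before the requested time and then replays transaction logs up to the target. The sketch below illustrates that general technique in Python; it is not Era's implementation, and the function name and sample timestamps are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def plan_restore(snapshots, target):
    """Return (base_snapshot, log_replay_window) for a point-in-time restore.

    snapshots: sorted list of datetimes at which snapshots were taken.
    target: the datetime to restore to.
    """
    # Index of the first snapshot strictly after the target.
    idx = bisect_right(snapshots, target)
    if idx == 0:
        raise ValueError("no snapshot exists at or before the requested time")
    base = snapshots[idx - 1]
    # Logs covering this window must be replayed on top of the base snapshot.
    return base, target - base


# Hypothetical daily snapshots taken at midnight.
daily = [datetime(2020, 6, day) for day in (1, 2, 3)]
base, replay = plan_restore(daily, datetime(2020, 6, 2, 14, 30, 15))
print(base, replay)
```

In this example the restore would start from the June 2 snapshot and replay 14 hours, 30 minutes, and 15 seconds of logs, which is why continuous log retention determines how far back second-level recovery can reach.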

Why This Matters

Databases are critical, business-driving applications for many organizations, for both transactional and analytical use cases. Traditional infrastructure deployments for databases cause complexity in provisioning, updating, cloning, and refreshing, causing delays that inhibit time to value.

ESG validated that Nutanix with Era simplifies database provisioning, cloning, refresh, patching, and restore/recovery from a simple GUI, with options for automation using the CLI or API. The interface is so simple and intuitive that non-DBAs can easily accomplish any task across the entire database lifecycle. Also, Time Machine functionality dramatically simplifies restore and refresh to any point in time.


Performance

ESG audited complete and detailed results from performance tests using a four-node Nutanix NX-8170-G7 cluster populated with eight Intel DC P4510 Series 4TB NVMe devices per node that examined both synthetic raw performance and realistic database workloads. The testing used the Nutanix tool to demonstrate raw performance capabilities of the platform and industry-standard database workload generation tools that exercised the Nutanix HCI using live SQL Server and Oracle databases. The workloads we look at in this report include:

  • Raw Performance — This test generated random reads and random writes, with a goal of demonstrating peak burst performance.
    • I/O Profile — 8KB random reads and writes, 1MB sequential reads and writes.
  • SQL Server Performance
    • I/O Profile — Dell’s Benchmark Factory was used to generate an OLTP database workload that emulated users in a typical online brokerage firm as they generated trades, performed account inquiries, and executed market research. The workload was composed of multiple transaction types with a defined ratio of execution—some performed database updates, requiring both read and write operations, while others were read-only. The estimated read/write I/O ratio was 90% reads to 10% writes.
  • Oracle I/O Performance
    • The Silly Little Oracle Benchmark (SLOB) was used to efficiently generate realistic system-wide, random, single-block, and application-independent SQL queries. The tool exercised all components of the server and storage subsystems by stressing the physical I/O layer of Oracle through SGA-buffered random I/O, without being limited to a specific load-generating application.

ESG Testing

First, we tested the cluster’s raw IOPS performance, a common assessment of the basic horsepower of the system, and compared the results to testing performed in 2017. The system tested in 2017 was an all-flash, four-node Nutanix NX-3460-G5 cluster running Nutanix HCI 5.0 with two Intel Xeon E5-2680v4 processors (14 cores at 2.4GHz), 256GB RAM, and six 1.92TB SSDs per node. The current Nutanix system under test was a four-node Nutanix NX-8170-G7 cluster running Nutanix HCI 5.15 LTS with two Intel Xeon 8280 processors (28 cores at 2.7GHz), 768GB RAM, and eight 2TB NVMe devices per node.

As shown in Figure 7, Nutanix showed a 5.3x performance improvement in random reads, and a 4.3x improvement in random writes.

SQL Server Performance

Next, we compared SQL Server OLTP performance between the same two systems. The current tests used the latest software stack: Windows 2019, SQL Server 2019 CU6, and Benchmark Factory 8.3. Four agents were used to generate a total of 80 concurrent users per VM (totaling 320 cluster-wide users), so that all users interacted with the database as quickly as possible (no think time). Test runs were completed for each VM count (one to four) to highlight predictable performance scalability as the demanding OLTP workload exercised more resources in the cluster. It should be noted that IOPS and transactions/sec do not have a 1:1 correspondence. In most cases, a single transaction comprises multiple read and/or write I/Os. Another important metric difference is latency. Storage latency is often associated with IOPS, while the transaction response time as reported in this analysis is specific to the OLTP workload, which exercises both compute and storage. As shown in Figure 8, ESG analyzed the transactions/sec and average transaction response in seconds.

ESG reviewed data showing consistent performance scaling as the concurrent database instances increased from one to four, while average transaction response times remained low. The total number of transactions per second (TPS) averaged 6,559 per database instance, with the lowest-yielding SQL Server VM producing 5,844 TPS and the highest-yielding SQL Server VM producing 7,010 TPS.

This showed a twofold benefit: not only near-linear OLTP performance scalability with a remarkably small variance of just 6% between all instances as more nodes were added, but also an even workload distribution that predictably consumed resources without impacting the other SQL Server instances. Even more impressive was the average transaction response time. The Nutanix solution consistently delivered an average response time of 0.012 seconds (12 ms) per transaction with all four nodes running the workload.
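The reported throughput and response time can be cross-checked with Little's Law, which relates concurrency N, throughput X, and response time R as N = X * R for a closed system. The following is a rough sketch, assuming each VM's 80 users issue one transaction at a time with zero think time; the variable names are illustrative.

```python
# Cross-check using Little's Law: N = X * R, so X = N / R.
users_per_vm = 80          # concurrent users per SQL Server VM (from the test setup)
response_time_s = 0.012    # average transaction response time reported
reported_avg_tps = 6559    # average TPS per database instance reported

predicted_tps = users_per_vm / response_time_s
gap = abs(predicted_tps - reported_avg_tps) / reported_avg_tps

print(f"predicted {predicted_tps:.0f} TPS per VM, {gap:.1%} from reported")
```

Under these assumptions the predicted value is about 6,667 TPS per VM, within 2% of the reported average, so the throughput and response time figures are consistent with each other.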

The average I/O results were gathered during the execution of the test when all four virtual machines were being tested at the same time. The small increase in write latency is most likely due to the much higher number of transactions per second being processed by the system. Said another way, transactions increased by 146% at the expense of a 1.8% increase in storage write latency. Such a small increase could be offset by some tuning of SQL Server or the Nutanix platform, but with a transaction response time reduction of 61%, this is an outstanding performance result.

Oracle Performance Driven by SLOB

Next, ESG compared results of an insert/update/read workload driven by SLOB running on an Oracle database between our current four-node NX-8170-G7 cluster and an all-flash Nutanix NX-9460-G4 cluster tested in 2017. The NX-9460-G4 cluster contained dual Intel Haswell E5-2680v3 processors (12 cores at 2.5GHz), 256GB of RAM, and six 1.6TB SSDs. Eight total VMs (running Red Hat Enterprise Linux [RHEL] 7.2 with six vCPUs and 32GB of RAM) were configured with a single-instance Oracle database. Each VM was given a 100GB vDisk for RHEL, a 100GB vDisk dedicated for the Oracle Cluster Registry (OCR), and sixteen 125GB vDisks for Oracle database data files and online redo logs. The four-node NX-8170-G7 cluster ran an updated software stack: Oracle 19.3, Oracle Enterprise Linux 7.7, and SLOB 2.5.2.4.
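For sizing purposes, the vDisk layout above can be totaled per VM and across the eight-VM configuration. This is a minimal arithmetic sketch; the variable names are illustrative, and the totals are provisioned (logical) sizes, since the report does not state how much physical capacity was consumed.

```python
# Provisioned vDisk capacity per Oracle VM, from the configuration described above.
os_vdisk_gb = 100     # RHEL vDisk
ocr_vdisk_gb = 100    # Oracle Cluster Registry vDisk
data_vdisks = 16      # vDisks for data files and online redo logs
data_vdisk_gb = 125
vm_count = 8

per_vm_gb = os_vdisk_gb + ocr_vdisk_gb + data_vdisks * data_vdisk_gb
total_gb = per_vm_gb * vm_count

print(per_vm_gb, total_gb)  # 2200 17600
```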

Performance was recorded using Oracle Automatic Workload Repository (AWR) to provide the performance analysis from Oracle’s point of view, with Oracle’s data.

Additional performance highlights with one DB per Nutanix node include greater than 1,700 MB/s read throughput at 0.76 ms latency and greater than 630 MB/s write throughput at 1.23 ms latency.

Again, Nutanix demonstrated near-perfect linear scalability for reads and writes: reads showed just a 5.6% variance and writes showed 6.3%. Latency stayed quite low throughout the tests, and performance increased by 53% overall, which is again an outstanding result.

When we increased the ratio of Oracle VMs to nodes to 2:1, performance still improved significantly. We saw IOPS increase by 14% and bandwidth increase from 2,334 MB/s to 3,360 MB/s while keeping read latency below 1 ms.
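The bandwidth figures above can be expressed as a relative gain for comparison with the other percentages in this report. The helper below is a minimal sketch; the function name is illustrative, and only the two MB/s values come from the test results.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# MB/s figures reported for the 2:1 VM-to-node test.
bandwidth_gain = percent_change(2334, 3360)
print(f"{bandwidth_gain:.1f}%")  # 44.0%
```

Note that this is a bandwidth gain and is a separate metric from the 14% IOPS increase; the report does not give the I/O sizes needed to relate the two.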

Why This Matters

Delivering high levels of performance is a requirement for IT environments that rely heavily on mission- and business-critical databases. This is especially important in dynamic environments where data growth is constant and continuous accessibility is a requirement. The ability to easily meet these performance and scalability requirements is essential for anyone evaluating hyperconverged infrastructures. The challenge is that some organizations feel there is too much overhead between the virtualization and the essential underlying services that must always be running to not only ensure proper functionality of the hyperconverged infrastructure, but also meet strict application performance SLAs.

ESG confirmed that Nutanix HCI with Era significantly improves I/O efficiency and performance compared to previous generations. Nutanix has improved application performance and reduced latency, validated in synthetic and real-world testing. Our tests exercised both storage and compute to highlight the type of performance organizations can expect in their own OLTP database environments. Nutanix showed improvement in every test scenario, improving raw IOPS by nearly 5x overall, improving SQL Server transaction processing by 146%, and reducing response time by 61%. Oracle performance improved by 53% overall, while keeping response times extremely low.

This can easily translate into support for significantly higher density and performance of supported databases and applications.


The Bigger Truth

In a world where digital transformation, DevOps, and agile development are driving both efficiency and complexity, organizations want the simplicity and scalability of the cloud provided by HCI, but they need to predict costs for business- and mission-critical databases with high performance SLAs. High levels of reliable and scalable enterprise-class performance are no longer optional.

To address these not-always-aligned challenges, Nutanix has:

  • Made the platform simple to deploy and manage. Nutanix provides all the tools and dashboards to manage the environment from a single pane of glass, automatable via APIs.
  • Streamlined the I/O stack while leveraging technologies like NVMe, up to 100GbE top of rack (ToR) switching support, and RDMA support to maximize performance.
  • Provided for complete, simplified database lifecycle management—provision, patch, backup, clone, refresh, and other day two activities.

ESG has validated that Nutanix has addressed these issues with its latest-generation HCI clusters and Nutanix HCI. Testing confirmed that Nutanix meets the demanding performance requirements of dynamic, mission-critical databases. The Nutanix HCI platform delivered significant IOPS and latency improvements in all our tests. Synthetic and real-world testing exercised both compute and storage resources to meet the high-transaction, low-latency demands of scalable OLTP database deployments in both Microsoft SQL Server and Oracle environments.

The results presented in this document are based on testing in a controlled environment. Due to the many variables in each production data center, it is important to perform planning and testing in your own environment to validate the viability and efficacy of any solution.

As hyperconverged technologies continue to mature, Nutanix continues to expand the boundaries of what is possible by not only adopting and developing cutting-edge technology but also providing software that simplifies life for IT admins and DBAs alike. If you’re looking to modernize your IT infrastructure to provide the benefits of today’s most highly performant compute and storage technology to your critical databases with the simplicity of HCI, ESG recommends you take a close look at Nutanix for relational database workloads.



1. Source: ESG Research Report, 2020 Technology Spending Intentions Survey, February 2020.
2. Source: ESG Master Survey Results, Converged and Hyperconverged Infrastructure Trends, October 2017.
This ESG Technical Validation was commissioned by Nutanix and is distributed under license from ESG.

ESG Technical Validations

The goal of ESG Technical Validations is to educate IT professionals about information technology solutions for companies of all types and sizes. ESG Technical Validations are not meant to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our objectives are to explore some of the more valuable features and functions of IT solutions, show how they can be used to solve real customer problems, and identify any areas needing improvement. The ESG Validation Team’s expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these products in production environments.
