ESG Validation

ESG Technical Review: Hitachi Content Platform: High-performance Object Storage for Tier-1 Workloads

Abstract

This report documents ESG’s validation of Hitachi Content Platform performance testing that demonstrates predictable, scalable high performance for object storage. The testing was done using a real-world configuration and demonstrates performance that customers can achieve in their data centers.

The Challenges

Organizations need more data, faster, to deliver insights that drive business decisions. Vast amounts of data have been archived on object storage, where they gain the benefits of scalability, protection, and cost efficiency. However, today organizations are increasing their investments in object storage for analytics workloads. According to ESG research, 53% of respondent organizations expect to accelerate spending for on-premises object storage. New data sources and modern application workloads are driving some of this object storage consumption. In fact, 32% of those organizations that are accelerating their investments in on-premises storage identified AI/ML as workloads that will be responsible for storage spending growth, while 38% of those organizations identified IoT and 29% identified data warehouses as drivers of spending growth (see Figure 1).1

As analytics usage increases, organizations want faster performance to speed data-focused insights that can inform business decisions. A high-performance object storage solution can enable additional modern and cloud-native applications to gain the benefits of its scalability, fast data retrieval, and cost effectiveness.

The Solution: Hitachi Content Platform (HCP)

HCP is object storage software for cost-effectively storing, protecting, and scaling unstructured data. Its hallmarks are ease of use, efficiency, and massive scalability, and now it has the horsepower to support high-performance workloads and cloud-native apps. HCP can be deployed on-premises, in a hybrid cloud, or in a multi-cloud environment to store and manage the unstructured data associated with modern S3-based applications and workflows (e.g., data lakes, analytics, AI, Hadoop, and IoT) as well as traditional applications that process massive amounts of unstructured data (e.g., file shares, email archives, Microsoft SharePoint repositories, backup data, e-discovery archives, and content distribution platforms). HCP can be deployed as a physical appliance, as a virtual appliance, or as software-defined storage, and supports both REST and Amazon S3. A single cluster can scale to more than one exabyte on-premises. Hybrid and all-flash configurations are also supported.

Recent updates to the HCP portfolio include:

  • All-flash HCP second-generation hardware models deliver faster throughput at lower cost than previous generations for applications with demanding performance requirements. The higher performance of an all-flash HCP deployment, or a tiered HCP platform configured with a mix of fast flash and cost-effective disk, appeals to the growing number of new applications that are being developed with a simple object storage protocol (e.g., AWS S3) instead of a traditional block or file storage protocol.
  • HCP S Series enhancements include increases in performance, scale, and capacity density.

Other HCP features include:

  • Multi-cloud.
  • Intelligent policy-based tiering.
  • Flexible data protection choices, geographically dispersed erasure coding, and encryption.
  • Compression and deduplication.
  • Robust compliance and governance.
  • Cloud-native support.
  • Robust metadata and built-in search capabilities.

In addition, ESG previously validated the low TCO that HCP can provide.2

ESG Tested

ESG audited Hitachi Vantara HCP software performance testing that used a test bed located in the company’s Center of Excellence in Oklahoma City, OK. Testing was designed to demonstrate read and write performance and scalability, plus time to first byte.

The test bed was based on a real customer deployment, and included 8-, 16-, and 24-node HCP G11 all-flash configurations. Four DS120 Hitachi Unified Compute Platform (UCP) nodes were used for the COSBench test harness, an open source cloud object storage benchmarking tool. Two traffic manager units were used for load balancing.

Retrieving data from object storage is done using the GET (i.e., read) task, while transferring data to the object store uses the PUT (i.e., write) task. This testing performed GET and PUT tests with large (10MB) and small (1KB) objects. Time-to-first-byte metrics were gathered in real time from the traffic manager.

Large Object Testing

The key metric for large objects is bandwidth, measured in GB/second. This testing used 10MB objects, similar to real-world objects such as image files or backup data. Performance and scalability were tested starting with an 8-node HCP configuration and scaling to 16 and 24 nodes.

Figure 3 shows the average performance for GETs and PUTs that HCP delivered for the three configurations. GET average performance scaled from almost 19 GB/sec to close to 35 GB/sec, while PUT performance scaled from 14 GB/sec to more than 40 GB/sec. In each case, HCP delivered more than 1.6 GB/sec/node. PUT performance scaled in a near-linear fashion as the number of nodes increased.
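
The per-node figure can be sanity-checked with simple division. The sketch below is illustrative only: it uses the rounded PUT endpoints quoted in the text (14 GB/sec at 8 nodes, and 40 GB/sec at 24 nodes, which the text gives as a lower bound), and the function and variable names are hypothetical rather than part of any HCP or COSBench tooling.

```python
def per_node_gb_per_sec(cluster_gb_per_sec: float, nodes: int) -> float:
    """Average bandwidth contributed by each node in the cluster."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return cluster_gb_per_sec / nodes


# Assumption: rounded PUT endpoints as quoted in the text;
# the 24-node value is a lower bound ("more than 40 GB/sec").
put_results = {8: 14.0, 24: 40.0}

for nodes, gb_per_sec in put_results.items():
    rate = per_node_gb_per_sec(gb_per_sec, nodes)
    print(f"{nodes} nodes: {rate:.2f} GB/sec/node")
```

With these inputs the PUT rate works out to 1.75 GB/sec/node at 8 nodes and roughly 1.67 GB/sec/node at 24 nodes, which is the sense in which PUT scaling is near-linear.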

Small Object Testing

Next, we reviewed throughput of 1KB objects measured in operations/second. A real-world example would be small data files or metadata. Metadata performance is critical to maintaining high-performance data access for object storage.
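
To see why bandwidth suits large objects while operations/second suits small ones, it can help to convert between the two. The following is a minimal sketch assuming decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes, 1 KB = 10^3 bytes); the helper names are hypothetical.

```python
KB = 10**3   # assumption: decimal units throughout
MB = 10**6
GB = 10**9


def ops_per_sec(gb_per_sec: float, object_bytes: int) -> float:
    """Objects transferred per second at a given bandwidth."""
    return gb_per_sec * GB / object_bytes


def gb_per_sec(ops: float, object_bytes: int) -> float:
    """Bandwidth implied by an operation rate and object size."""
    return ops * object_bytes / GB


# 14 GB/sec of 10MB objects is only 1,400 operations per second ...
large_ops = ops_per_sec(14, 10 * MB)
# ... and the same operation rate with 1KB objects moves about 0.0014 GB/sec.
small_bw = gb_per_sec(large_ops, 1 * KB)
print(large_ops, small_bw)
```

Because small objects carry so little payload, the per-request work dominates, so the operation rate is the more informative number.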

As Figure 4 shows, GET performance scaled from 44K ops/sec with eight nodes to more than 141K ops/sec with 24 nodes, exhibiting linear scalability. PUT performance increased with additional nodes as well, from 48K ops/sec to 67K ops/sec.
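
One way to quantify scaling is to divide the throughput speedup by the growth in node count; a value near 1.0 indicates linear scaling. The sketch below applies that ratio to the rounded endpoints quoted above (the 141K figure is a lower bound). The function name is hypothetical, and this is an illustrative calculation rather than ESG's own method.

```python
def scaling_efficiency(base_ops: float, scaled_ops: float,
                       base_nodes: int, scaled_nodes: int) -> float:
    """Throughput speedup divided by node-count growth (1.0 = linear)."""
    speedup = scaled_ops / base_ops
    node_ratio = scaled_nodes / base_nodes
    return speedup / node_ratio


# Assumption: rounded K ops/sec values from the text, 8 nodes vs. 24 nodes.
get_eff = scaling_efficiency(44, 141, 8, 24)
put_eff = scaling_efficiency(48, 67, 8, 24)

print(f"GET efficiency: {get_eff:.2f}")
print(f"PUT efficiency: {put_eff:.2f}")
```

On these figures GET efficiency is about 1.07 (linear or slightly better), while PUT efficiency is about 0.47, meaning PUT throughput grew with node count but well short of proportionally.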

Time to First Byte

Time to first byte measures the responsiveness of the object store to requests made for data—that is, how long it takes for the first byte of data to reach the requester, enabling the user to begin using the data. This provides insight into how object storage affects productivity.

Hitachi Vantara established a high-performance threshold of 15 milliseconds (ms) for this testing. Note that 20 ms is considered an excellent response time that would not be noticeable by users. During both the large and small object testing, this metric was collected in real time. The HCP deployment produced results between 5 and 17 ms, with most results between 5 ms and 10 ms (see Figure 5).
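
As an illustration of what the metric captures, time to first byte can be measured as the interval between issuing a request and receiving the first non-empty chunk of the response. The sketch below is a hedged, self-contained example: `fake_get` is a hypothetical stand-in that simulates a delayed response, not an HCP or S3 client, and in the audited testing the values were collected at the traffic manager rather than by client code like this.

```python
import time
from typing import Callable, Iterable, Tuple


def time_to_first_byte(
    open_stream: Callable[[], Iterable[bytes]]
) -> Tuple[float, bytes]:
    """Return (seconds until the first non-empty chunk, that chunk)."""
    start = time.perf_counter()
    for chunk in open_stream():
        if chunk:
            return time.perf_counter() - start, chunk
    raise ValueError("stream ended before any data arrived")


def fake_get(delay_s: float = 0.005, size: int = 1024) -> Iterable[bytes]:
    """Hypothetical GET: waits delay_s, then yields two chunks of data."""
    time.sleep(delay_s)
    yield b"x" * size
    yield b"y" * size


ttfb_s, first_chunk = time_to_first_byte(fake_get)
print(f"time to first byte: {ttfb_s * 1000:.1f} ms, "
      f"first chunk: {len(first_chunk)} bytes")
```

Only the first chunk is awaited; the rest of the object may still be in flight, which is why this metric reflects responsiveness rather than total transfer time.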

Why This Matters

Object storage is no longer relegated to archival usage focused solely on scale and cost per gigabyte. Today, organizations are storing mountains of data in object storage platforms and using it for multiple purposes—not just scalable content repositories and backup, but also analytics, AI, ML, and IoT. As a result, performance now matters in object storage.

ESG validated that HCP offers high-performance object storage. GET and PUT testing demonstrated average throughput of 14 to 40 GB/sec for large objects, and small object performance of 44K to 141K operations per second. These performance levels mean that organizations can count on HCP for production workloads, including the latest analytics applications. In addition, HCP delivered 15 ms or less in time to first byte, indicating excellent responsiveness and enabling users to be productive without delay.


Customer Examples of HCP Performance

ESG also reviewed several customer examples that demonstrate the real-world performance benefits of HCP object storage.

  • 1TB/minute with Exabyte Scale. A government customer used 54 HCP nodes to deliver 1TB/minute throughput for a 22PB Hadoop data lake using the S3A protocol. This customer was collecting huge amounts of streaming data including voice, data logs, machine logs, and security event logs for analysis. The HCP solution delivered faster, more accurate data insights across multiple data sources with cost efficiency. The customer expects to grow to more than 80 PB within two years, with all data retained for a year.
  • 1 trillion objects, 12 GB/second. Using an HCP all-flash configuration, a customer in the financial services industry with heavy growth ingested one trillion objects over 12 petabytes of storage, across 56 different applications, and maintained 12GB/sec small object (e.g., emails, PDFs, metadata) performance while ensuring full regulatory compliance. Metadata, indexing, and search functions supported all business use cases interfacing with the data, including legal hold, compliance, dispositioning, and high-performance search. Performance was critical, since this archive is part of the company’s primary customer interaction workflow. The customer extracted the data and metadata, offloaded and retired a mainframe environment, gained massive growth with performance at scale, and deployed new use cases. Furthermore, by consolidating multiple data sources onto HCP, they were able to save $100M in administrative and maintenance costs over five years.
  • 15 ms response time. Another government customer needed to host client data review processes and wanted to provide sustained performance of 50 ms or less for time to first byte, so that employees could work productively. The HCP all-flash solution exceeded this objective and delivered a sub-15 ms time to first byte to support the customer’s 1 PB of data, enabling higher productivity and delivering the needed customer experience.
  • High-performance consolidation. Another financial services organization wanted to consolidate regulated unstructured data from content management and data repositories, taking data from high speed Fibre Channel storage and NAS platforms. Using multiple all-flash HCP clusters, this company consolidated multiple service tiers into a single, high-performance tier.
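
For readers comparing these customer figures with the lab results, two quick unit conversions may help. The sketch assumes decimal units (1 TB = 1,000 GB, 1 PB = 10^15 bytes) and uses only the figures quoted in the examples above; the helper names are hypothetical.

```python
def tb_per_min_to_gb_per_sec(tb_per_min: float) -> float:
    """Convert TB/minute to GB/sec, assuming 1 TB = 1,000 GB."""
    return tb_per_min * 1000 / 60


def average_object_bytes(total_bytes: int, object_count: int) -> float:
    """Mean object size implied by total capacity and object count."""
    return total_bytes / object_count


PB = 10**15  # assumption: decimal petabyte

# 1TB/minute (government data lake example)
print(f"{tb_per_min_to_gb_per_sec(1):.1f} GB/sec")

# 1 trillion objects over 12 PB (financial services example)
print(f"{average_object_bytes(12 * PB, 10**12):.0f} bytes per object")
```

Under these assumptions, 1TB/minute is roughly 16.7 GB/sec, and one trillion objects across 12 PB implies an average object of about 12 KB, consistent with the small-object description of that workload.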

The Bigger Truth

Traditionally, organizations have turned to object storage for highly scalable, cost-efficient, long-term storage with easy data retrieval—but not for performance. But today, organizations need to use all their data, not have some data stored in slow silos, where it is available in an emergency but not for production applications. Too much insight is available from data lakes, content repositories, and email archives for organizations to waste money storing it without usable performance.

Hitachi Content Platform offers object storage for today’s workloads, including the latest analytics applications. HCP delivers the high performance and scalability that enterprises demand of their business-driving workloads.

ESG validated:

  • Average large object GET and PUT throughput of 14 GB/sec to 40 GB/sec.
  • Average small object GET and PUT performance of 44K to 141K operations/second.
  • Time-to-first-byte performance of 15 ms or less.

These kinds of performance results change the role of object storage. Now, the object storage advantages of massive scalability, fast data retrieval, and cost efficiency can be used with tier-1 production workloads. Just look at the results some HCP customers are experiencing:

  • 1TB/minute throughput maintained for a 22PB Hadoop data lake.
  • Consolidation of 56 archives, 12 PB of storage, to deliver 1 trillion objects with 12GB/sec throughput.

Clearly, this is a long way from traditional object storage. Hitachi Vantara has long been a trusted provider of solutions for enterprise customers, including large, complex environments with distributed employees. The company’s solutions are known for reliability, security, availability, and enterprise-class features. This HCP performance validation adds to the company’s strong resume.

Of course, your mileage may vary, as these tests were run in a controlled environment, and every organization should plan and test in its own data center to ensure the efficacy of the solution. But if you are looking for a storage solution that delivers scalable high performance with cost efficiency, ESG recommends that you take a good look at Hitachi Content Platform.



1. Source: ESG Master Survey Results, 2019 Data Storage Trends, November 2019.
2. For a detailed TCO analysis, please see the ESG Economic Validation, The Economic Value of Hitachi Content Platform Storage, July 2020.
This ESG Technical Review was commissioned by Hitachi Vantara and is distributed under license from ESG.