Scale-out 2.0: Simple, Scalable, Services-Oriented Storage

It is a story often repeated because it is a challenge IT has been living with for a very long time. The amount of data businesses need to store is skyrocketing, which drives corresponding growth in overall data storage costs in the form of storage systems, floor space, power, cooling, and the people required to manage it all.

Author(s): Terri McClure

Published: June 29, 2010

Storage Environments Under Pressure

With IT under constant pressure to find ways to reduce cost, taking a long, hard look at the storage environment makes sense. Today, more than ever, IT is turning to the storage environment and investing in new technology with a clear focus on reducing operational costs.

Figure 1. Changes in Justifying IT Investments Reflect Changing Business Priorities

Managing data growth was cited as a top priority by one quarter of those IT managers surveyed in ESG's 2010 spending intentions survey, putting it among the top five priorities for IT managers and only slightly behind improving security, backup, and network infrastructure.[1]

Business as usual in the storage environment is no longer acceptable; IT is starting to collapse under the weight of the sheer amount of data it has to store.  "Spreadsheet management systems" cannot track what data lives on which LUNs when there are tens or hundreds of thousands of LUNs in the networked storage environment. Separate storage growth forecasts based on block and file protocols are not useful when, at the end of the day, all anyone wants is storage capacity to use when and where they need it.  What's more, standalone silos with sub-50% utilization rates create too much waste in terms of management, floor space, and power and cooling.

Server virtualization is driving more complexity in the storage environment: disk LUNs need to be mapped to virtual machines, and users need to carefully monitor how many virtual systems are sharing a single LUN if they are to avoid performance bottlenecks.

It is incumbent on storage vendors to develop simpler, more flexible storage systems; to remove the complexity of traditional disk/LUN/volume performance management and tuning; to provide protocol-agnostic storage systems that present a single storage pool that can be leveraged in multiple ways; to offer tiers or pools of storage with differing price/performance/availability characteristics to meet varying application support requirements (and a way to migrate between the pools); and to simplify storage management by automating low-level tasks like provisioning and performance tuning.

The shift is starting to happen.  ESG is encountering more and more storage companies with a vision for creating simplicity at scale that meets many of the above requirements.  It will be a while before vendors meet all of the requirements for creating a mature, services-oriented storage environment, but the move is afoot and it seems to be particularly driven by scale-out storage vendors.

The Emergence of Scale-Out Storage

Scale-out 1.0

Scale-out storage really started in the NAS space: users needed systems that could support the massive throughput requirements of high performance computing (HPC) and media and entertainment. More recently, adoption has spread to medical and geographic imaging applications. Scale-out NAS systems can independently scale throughput and capacity by adding nodes that work in parallel to support throughput requirements and are managed within a single namespace as a single system image.
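The scaling model described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Node` and `Cluster` names, the `/shared` namespace, and all of the numbers are hypothetical, and aggregate throughput is treated as a simple sum of node throughput, which real clusters only approximate.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    capacity_tb: float        # usable capacity this node contributes
    throughput_gbps: float    # throughput this node contributes


@dataclass
class Cluster:
    namespace: str                              # one namespace for the whole cluster
    nodes: list = field(default_factory=list)

    def add_node(self, node):
        # Scaling out: a new node joins the existing single system image.
        self.nodes.append(node)

    @property
    def capacity_tb(self):
        return sum(n.capacity_tb for n in self.nodes)

    @property
    def throughput_gbps(self):
        # Assumption: nodes work in parallel and throughput adds linearly.
        return sum(n.throughput_gbps for n in self.nodes)


cluster = Cluster(namespace="/shared")
cluster.add_node(Node(capacity_tb=100, throughput_gbps=2))  # capacity-heavy node
cluster.add_node(Node(capacity_tb=100, throughput_gbps=2))  # capacity-heavy node
cluster.add_node(Node(capacity_tb=10, throughput_gbps=8))   # performance-heavy node
print(cluster.capacity_tb, cluster.throughput_gbps)         # 210 12
```

Mixing node types is what lets capacity and throughput grow independently: the third node adds little capacity but most of the throughput, while the namespace seen by users stays the same.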

Scale-out platforms have inherent benefits that offer a path to reducing operational costs.  They can typically scale into the multi-petabyte range under a single system image, providing an ideal platform for consolidation.  They help IT reduce management costs and footprint, which reduces floor space and power and cooling costs.  And consolidation onto a shared resource results in much higher utilization rates, so users get more bang for their storage buck.[2]
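To make the utilization argument concrete, here is a small back-of-the-envelope calculation. The figures are hypothetical and are not taken from ESG research: 400 TB of stored data, 40% utilization for standalone silos (consistent with "sub-50%"), and an assumed 80% utilization after consolidation.

```python
def raw_capacity_tb(stored_tb, utilization_pct):
    """Raw capacity needed to hold stored_tb at the given utilization."""
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in (0, 100]")
    return stored_tb * 100 / utilization_pct


stored_tb = 400                                # hypothetical amount of data
siloed = raw_capacity_tb(stored_tb, 40)        # standalone silos, sub-50% utilized
consolidated = raw_capacity_tb(stored_tb, 80)  # assumed rate after consolidation
print(siloed, consolidated)                    # 1000.0 500.0
```

Under these assumptions, the same data needs half the raw capacity, and with it less floor space, power, and cooling.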

Scale-out systems are seeing more and more enterprise interest as a platform that enables users to consolidate storage and contain costs.  In fact, ESG found 75% of the IT managers surveyed in late 2008 were either planning to deploy or investigating scale-out NAS.[3]

Figure 2. Scale-Out NAS Market Drivers

Scale-out NAS offers a number of benefits, but it has shortcomings as well, typically in three key areas:

  • Scale-out storage has its roots in vertical markets that require high-bandwidth throughput or fast parallel throughput for very large files, like those found in high performance computing and media and entertainment, so these systems are not designed for the more IO-intensive environments of traditional general purpose NAS.
  • The verticals that use scale-out storage systems in line-of-business applications have not required the enterprise-class features found in today's data centers, such as snapshots, remote replication, automated data tiering, and multiprotocol support. The systems have typically been NAS only, so many scale-out systems lag behind traditional scale-up systems in functionality such as multiprotocol support and synchronous remote mirroring.
  • Some systems can be complex; after all, it's not easy to build a true scale-out storage cluster that manages cache coherency, load balancing, autotuning, and tiered storage.  It is difficult to make scale-out truly simple to manage.

To truly meet enterprise needs, these systems must mature: vendors need to deliver "scale-out 2.0" systems, and we are indeed starting to see some solutions emerge.

The Rise of Scale-Out 2.0

Today, we're seeing advances in scale-out platforms that bring functionality and ease of use directly in line with enterprise IT environments.  Scale-out 2.0 systems are basically a mashup of enterprise unified storage functionality and scale-out architecture.  These systems are protocol-agnostic and support consolidation efforts for both block and file data, tiering across nodes with different price/performance profiles, and pooled storage that allows IT to manage storage as a shared IT resource, along with features such as remote replication, thin provisioning, and snapshots.  We've also seen advances in ease of use and automation that enable IT to truly create a dynamic, responsive storage infrastructure with minimal administrative overhead.

Figure 3. Scale-Out Storage 2.0

Scale-out 2.0 platforms will deliver the benefits of traditional enterprise-class storage systems and the efficiency and scalability of scale-out 1.0. These systems will:

  • Provide secure, multitenant storage pools with varying price/performance/availability profiles.
  • Provide a flexible platform that supports application virtualization and aggregation.
  • Scale performance by adding processors or drives to pools as needed.
  • Scale from terabytes to exabytes, growing with user requirements instead of ahead of them, optimizing the cost equation.
  • Scale dynamically, always on, in any direction.
  • Be protocol-agnostic, supporting multiple data access protocols and unifying the storage environment.
  • Virtualize the storage environment in such a way that users have an always-on architecture that provides data access in the event of everything from a component outage to a lease rollover.
  • Support enterprise-class storage features that are quickly becoming jacks-or-better requirements, such as thin provisioning, deduplication, read-only snaps, and synchronous remote copy.
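Two items on this list, storage pools with differing profiles and thin provisioning, lend themselves to a short sketch. The Python below is a toy model, not any vendor's implementation: the `Pool` class, the `migrate` helper, the pool and volume names, and the sizes are all hypothetical, and real systems track capacity at a much finer granularity.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    name: str
    physical_tb: float                                # capacity actually installed
    provisioned: dict = field(default_factory=dict)   # volume -> TB promised
    written: dict = field(default_factory=dict)       # volume -> TB consumed

    def provision(self, volume, size_tb):
        # Thin provisioning: promise capacity without reserving physical space.
        self.provisioned[volume] = size_tb
        self.written[volume] = 0

    def write(self, volume, tb):
        if self.written[volume] + tb > self.provisioned[volume]:
            raise ValueError("write exceeds the volume's provisioned size")
        if self.used_tb + tb > self.physical_tb:
            raise RuntimeError("pool is out of physical capacity")
        self.written[volume] += tb

    @property
    def used_tb(self):
        return sum(self.written.values())

    @property
    def oversubscription(self):
        # Ratio of promised capacity to installed capacity.
        return sum(self.provisioned.values()) / self.physical_tb


def migrate(volume, src, dst):
    """Move a volume between pools, for example from a fast tier to a cheap one."""
    if dst.used_tb + src.written[volume] > dst.physical_tb:
        raise RuntimeError("destination pool is out of physical capacity")
    dst.provisioned[volume] = src.provisioned.pop(volume)
    dst.written[volume] = src.written.pop(volume)


fast = Pool("fast", physical_tb=100)
cheap = Pool("cheap", physical_tb=1000)
fast.provision("db", 150)     # more than the pool physically holds
fast.provision("vm", 50)
fast.write("db", 30)
print(fast.oversubscription, fast.used_tb)   # 2.0 30
migrate("db", fast, cheap)
print(fast.used_tb, cheap.used_tb)           # 0 30
```

The point of the sketch is that physical capacity is consumed only as data is written, so capacity can grow with user requirements instead of ahead of them, and a volume can move between pools without changing what was promised to its owner.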

What does this mean for IT?  These scale-out 2.0 platforms provide the ability to deploy a dynamic storage infrastructure that is flexible and grows with its users, and that supports the transition from lots of fixed, stovepiped storage systems to a shared, services-oriented information storage infrastructure. In this new infrastructure, capacity can be quickly provisioned, shared, and managed with fewer resources, giving IT new levels of agility and supporting new heights of storage scalability and efficiency.  That's scale-out 2.0: a dynamic, unified storage architecture that provides simplicity at scale.

The Bigger Truth

Scale-out 2.0 is largely aspirational right now, but a number of vendors have the roadmap in place to make this vision a reality.  ESG expects to see vendors start to fill in the scale-out 2.0 functionality checklist this year, with some completing it in the next 12 months. This rather utopian vision of a malleable shared storage environment is closer to reality than most users realize.

Humans can no longer manage storage the way it has traditionally been managed.  It is just too big, with millions of LUNs being required to get traditional storage systems to petabyte scale; there are just too many elements to manage and there is just too much inefficiency in traditional storage architectures.  This inefficiency is driving users to look at new ways of doing things that are more efficient and flexible, to make IT a business enabler rather than an inhibitor.  It is driving discussion around cloud services to reduce costs and the creation of internal IT resource clouds and service catalogs to deliver IT-as-a-service.  Public cloud storage services are built on scale-out platforms, but many don't offer enterprise features.  ESG expects to see more and more cloud service providers and enterprise IT organizations embracing scale-out 2.0 platforms over time as they prove to deliver simplicity and flexibility at scale.

[1] Source: ESG Research Report, 2010 IT Spending Intentions Survey, January 2010.

[2] For more information, see ESG Brief, Scale-Out Storage, June 2010.

[3] Source: ESG Research Report, 2008 Enterprise Storage Systems Survey, November 2008.