Virtualization is a dominant trend across the IT landscape—and for good reason. However, adoption of virtualization has occurred in concert with the expansion in seamless, automated backup. In this world, it is crucial to understand the role of agents relative to host-based backup.
Published: August 27, 2012
Organizations seek enhanced economics through virtualization, but they will do nothing to risk the safety of their data. In fact, a recent ESG survey showed that improving data backup and recovery and expanding server virtualization were both top-ranked priorities (see Figure 1).
Indeed, the tie-in with server virtualization is notable: virtualization solves many problems, particularly from an IT server and service delivery perspective. However, a traditional backup methodology that doesn’t understand the unique nature of virtualized environments (for example, how VM sprawl affects storage consumption, with a great deal of duplication across virtualized operating systems) doesn’t adapt well to the needs of a virtualized infrastructure. So, although virtualization is a good thing, it can exacerbate problems in other aspects of IT, such as data protection.
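To make the duplication point concrete, here is a minimal Python sketch that counts how many fixed-size blocks are shared across several VM images. It is only an illustration under stated assumptions: the tiny byte strings stand in for virtual disks, the 4-byte block size is chosen so the example is readable, and the function name `block_counts` is hypothetical rather than part of any backup product.

```python
import hashlib


def block_counts(images, block_size=4):
    """Return (total_blocks, unique_blocks) across all images.

    Each image is split into fixed-size blocks, and each block is
    identified by its SHA-256 digest so identical blocks are counted once.
    """
    seen = set()
    total = 0
    for data in images:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            seen.add(hashlib.sha256(block).hexdigest())
            total += 1
    return total, len(seen)


# Hypothetical images: the same "operating system" blocks plus one
# application-specific block per VM.
os_image = b"AAAABBBBCCCC"
vm1 = os_image + b"app1"
vm2 = os_image + b"app2"
vm3 = os_image + b"app3"

total, unique = block_counts([vm1, vm2, vm3])
print(total, unique)  # 12 6
```

In this toy case half of the blocks are copies of the same operating system data, which is the kind of redundancy a backup method that treats every VM as an unrelated server would store repeatedly.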
In the world of virtualization, there have been three predominant ways to perform backup:
But it hasn’t always been easy to protect virtual environments; many approaches have been tried and are often still in use today (see Figure 2). Some tools back up data within each VM, using VM-specific agents on a per “guest” basis. Additionally, tools evolved that could make a copy of a VM’s system image. These tools usually operated at the level of the host (or array) as described earlier. That all sounds good, but in practice, guest-based backups were often overly complex and expensive, while host-based backups got better: they were not only providing a system image, but also enabling the restore of granular data from within whole VM backups.
Finding a better way to back up in a virtualized world has been complicated by a number of factors, not the least of which is the diversity of hypervisors, including multiple products and product versions from VMware, Microsoft, Citrix, etc.
Still, it is important to note that protection of virtualized environments is not “broken”—ESG surveys show that regardless of the method used, most IT organizations are relatively satisfied with their virtualized backups. In its 2012 Trends in Data Protection Modernization research report, ESG found the following levels of satisfaction with survey respondents’ primary method of backing up virtual machines:
Against that background, it’s important to recognize that not all host-based backups are the same.
The goal of any host-based backup solution is application-consistent protection of virtualized applications. With Windows Server-based applications such as SQL Server, Exchange, and SharePoint being among the most frequently virtualized, a closer look at host-based backups must focus on Microsoft Volume Shadow Copy Service (VSS).
VSS has been part of Windows for years. It supports backup activities and snapshotting, and it provides a framework of interoperability between backup agents, production applications, and underlying storage capabilities. Along with the VSS foundation within the Windows OS, the three active components are:
The three VSS components then crunch through a series of seven steps:
That is how it works with a physical server. For this process to work in a virtualized environment, from the host perspective, two tiers of data conversations should occur—first between the backup server and the hypervisor (host), then between the guest OS(s) and the host and/or backup server.
In the case of Microsoft Hyper-V, the hypervisor has its own VSS Writer for its “workload” of virtualizing machines.
This is how the eight steps look from a host perspective (using Hyper-V as the example):
If this were a SQL Server, it would prepare its data (the database) to be backed up. Because it is a hypervisor, its data to be backed up is a virtual machine. So, its preparation involves telling the guest’s Hyper-V Integration Components’ (ICs’) VSS Requester to be a backup agent—and the whole process happens inside the guest, which is why Microsoft refers to it as a Recursive VSS operation.
Remember, in a host-based backup model, the VM as a whole is the “data” to be backed up. So, just like any other VSS‑based workload, when it is ready, the original backup process continues.
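The recursive flow described above can be pictured with a small simulation. The Python sketch below is not the Windows VSS API; the names `AppWriter`, `Guest`, `HypervisorWriter`, and `host_backup` are hypothetical and exist only to show the order of events: the host-level request reaches the hypervisor's writer, each guest runs its own freeze and thaw cycle, the VM images are copied, and the completion notice travels back down to the guest applications.

```python
from dataclasses import dataclass, field


@dataclass
class AppWriter:
    """Stand-in for an application's writer inside a guest (e.g. a database)."""
    name: str
    events: list = field(default_factory=list)

    def freeze(self):
        # Quiesce the application so on-disk data is consistent.
        self.events.append("freeze")

    def thaw(self):
        # Resume normal I/O once the snapshot exists.
        self.events.append("thaw")

    def backup_complete(self):
        # Post-backup work such as log truncation would happen here.
        self.events.append("backup_complete")


@dataclass
class Guest:
    """A VM whose integration components act as the in-guest requester."""
    name: str
    writers: list

    def snapshot(self):
        for writer in self.writers:
            writer.freeze()
        image = f"{self.name}-consistent-image"
        for writer in self.writers:
            writer.thaw()
        return image

    def notify_complete(self):
        for writer in self.writers:
            writer.backup_complete()


@dataclass
class HypervisorWriter:
    """Host-side writer whose 'data' is the set of virtual machines."""
    guests: list

    def prepare(self):
        # Recursive step: every guest completes its own cycle first.
        return [guest.snapshot() for guest in self.guests]

    def backup_complete(self):
        for guest in self.guests:
            guest.notify_complete()


def host_backup(hypervisor):
    """Host-level requester: prepare, copy, then report success downward."""
    images = hypervisor.prepare()
    backed_up = list(images)  # stands in for copying images to backup storage
    hypervisor.backup_complete()
    return backed_up


sql = AppWriter("sql")
host = HypervisorWriter([Guest("vm1", [sql])])
print(host_backup(host))  # ['vm1-consistent-image']
print(sql.events)         # ['freeze', 'thaw', 'backup_complete']
```

The point of the model is the last event: the application inside the guest only learns that it was backed up because the host-level operation passes the result back through the guest.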
Figure 3 shows how VSS works. There are, of course, variations with regard to scheduling and manageability, deduplication of common objects across multiple VMs, integration with higher-level management, the ability to recover individual items from within a host-based backup, and most importantly, ensuring that the guest-based application knows that it has been successfully backed up—so that it can do its log truncation and other management tasks.
The key to the process is that it is VSS-enabled all the way from the host’s Hyper-V VSS Writer, through the VSS components inside the guests, and back again. It is important to note that although the VSS operations within the guest are similar, VMware’s approach in vSphere 5 and its vStorage APIs for Data Protection (VADP) is very different from a host perspective.
One key difference among approaches to virtualized application backups is how transactional post-processing happens. Microsoft VSS handles VSS recursively within each guest as well as from the host, including post-backup application log truncation; not all other hypervisors do. While ESG expects those API methods to continue to mature, leading backup solutions are recognizing those deficiencies and filling the gaps by placing their own “agents” or executable code into the guests prior to or during backups. They do this to facilitate the post-process notification or other metadata/indexing functions, and that isn’t a bad thing.
It is important to understand that not all agents are the same. Some may add more complexity than value; others are “good.”
Installing traditional backup agents may be necessary in a small number of situations with unique application requirements or a need for granular data selection. For most of us, however, installing such agents just creates excessive I/O demands, which affect that virtual machine as well as the host and other virtual machines. That’s why host-based backup is almost always better for virtualized infrastructures, and guest-based backups should be avoided in most cases.
On the other hand, while most hypervisors provide a good way to back up a VM and its applications correctly, they do not all provide the same means of notifying the application after a successful backup has been achieved. Microsoft Hyper-V environments do provide this ability due to their utilization of VSS within both the Hyper-V host and the Windows guests, but not all other hypervisors do. Without that notification, applications on those hypervisors won’t truncate their transaction logs or perform other database maintenance tasks. To compensate for this lack, some virtualization protection solutions will install additional agent-like executables (“good” agents) within the guests, not for backup, but simply for the necessary post-backup processes.
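The effect of a missing post-backup notification can be shown with a toy model. The Python sketch below is an assumption-laden illustration, not the behavior of any specific database: `TransactionalApp` and `run_cycles` are hypothetical names, and "truncation" is reduced to clearing a list once the application is told that a backup succeeded.

```python
class TransactionalApp:
    """Toy database that keeps log records until told a backup succeeded."""

    def __init__(self):
        self.log = []

    def write(self, record):
        self.log.append(record)

    def on_backup_complete(self):
        # Records captured by the backup are no longer needed in the log.
        self.log.clear()


def run_cycles(app, cycles, writes_per_cycle, notify):
    """Return the log length observed after each backup cycle."""
    sizes = []
    for cycle in range(cycles):
        for i in range(writes_per_cycle):
            app.write((cycle, i))
        # The backup itself succeeds either way; only the notification differs.
        if notify:
            app.on_backup_complete()
        sizes.append(len(app.log))
    return sizes


print(run_cycles(TransactionalApp(), 3, 100, notify=True))   # [0, 0, 0]
print(run_cycles(TransactionalApp(), 3, 100, notify=False))  # [100, 200, 300]
```

In the second run every backup still completes, yet the log keeps growing because the application never hears about it. Delivering that one message is the entire job of a “good” agent in this model.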
Virtualization makes many things in IT easier—but backup isn’t usually one of them. The real differentiation isn’t in “how” the backup was done, but rather which recovery options are available—such as instant recovery of a virtual machine from the backup storage pool or granular recovery of not only files, but also other objects from within whole-VM backups.
That being said, the “how” for VM backup matters, and while there are a variety of methods in use today for protecting virtual machines, most boil down to either backing up from the “inside” of a virtual machine (guest-based) or the “outside” (host- or array-based).
In the old days, physical servers were often undersubscribed, so they had plenty of extra CPU and I/O to give when a backup began. But with virtualization, multiple virtual machines consume almost every free resource on their shared host. With most legacy-style guest-based backups, it is not uncommon for the I/O and CPU spikes incurred within one VM to affect the performance of the underlying host as well as the other virtual machines on that host. So, as virtualization remains mainstream and VM density per host continues to grow, it is important to be more efficient and less intrusive, and that means host-based or array-based backups.
Seeing that as the future, the next important note is to understand that not all backups are equal—even if they mostly use VSS within the guests. Understanding this, and selecting a backup (and recovery) solution that is truly application aware, can be the difference between those applications continuing to perform well or actually being hindered because of how they were backed up.
The question isn’t whether you have an agent (or other differently named binary) in the guest. The question is: what is its job? If your hypervisor doesn’t enable all of the functionality that you need for well-behaved virtual applications, then your backup application may take up the slack; that is okay, as long as it is handling metadata and post-processing, and not actually moving data as if it were on a physical server. Ignoring this post-processing can leave transactional applications that the app owner must manage manually, and data that might only be recovered by first restoring the entire VM and its virtual disks. Clearly, this is an undesirable end.
So for now, the “how” for backup boils down to good agents and “bad” agents—and each should be recognized for what it is, and either accepted or avoided, respectively.