Salesforce Backup and Recovery Secrets Revealed...

There is a fundamental misunderstanding in the market that SaaS providers will take care of backup and recovery, archiving, and compliance for you. While there are some exceptions…they pretty much won’t. It’s your problem as an end-user. You are in charge of your data, but you are not in control of it. Someone used the term “Russian roulette” to describe this situation…it may be exaggerated, but you get the idea!

SaaS data protection is a different animal when it comes to backup and recovery. Traditional methods just don’t apply, and let’s be candid: the instrumentation is just not what users are used to in the on-premises world. This is where it gets interesting when looking at Salesforce.com.

The age-old principles of backup include, amongst others, the ability to capture a coherent and consistent point-in-time copy of the data. This is not a guarantee in Salesforce environments, since the platform does not lock or freeze the data per se and allows modifications while the backup is running. You can’t "freeze" a SaaS workload!

To support these workloads, end-users and vendors must rely on APIs and/or the service itself. 
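To make this concrete, here is a minimal sketch of an API-driven, timestamp-bounded export against the Salesforce REST query endpoint. The instance URL, access token, and API version below are placeholder assumptions. Since the platform cannot be frozen, the closest approximation of a point-in-time copy is to bound every query on a single SystemModstamp cutoff; even then, cross-object consistency remains approximate.

```python
# Minimal sketch: timestamp-bounded export via the Salesforce REST API.
# INSTANCE_URL, ACCESS_TOKEN, and the API version are placeholder assumptions.
import requests

INSTANCE_URL = "https://yourorg.my.salesforce.com"  # assumption
ACCESS_TOKEN = "00D...your-session-token"           # assumption
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
CUTOFF = "2016-05-10T00:00:00Z"  # ONE cutoff shared by every object in the run

def export_object(sobject, fields):
    """Page through all records of one object, as of the cutoff."""
    # SOQL datetime literals are unquoted; records modified after the
    # cutoff are simply excluded rather than snapshotted.
    soql = (f"SELECT {', '.join(fields)} FROM {sobject} "
            f"WHERE SystemModstamp <= {CUTOFF}")
    url = f"{INSTANCE_URL}/services/data/v52.0/query/"
    params = {"q": soql}
    records = []
    while url:
        resp = requests.get(url, params=params, headers=HEADERS)
        resp.raise_for_status()
        body = resp.json()
        records.extend(body["records"])
        # follow server-side pagination until the result set is exhausted
        url = INSTANCE_URL + body["nextRecordsUrl"] if not body["done"] else None
        params = None  # nextRecordsUrl already encodes the query
    return records

accounts = export_object("Account", ["Id", "Name", "SystemModstamp"])
```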

Salesforce does provide some protection for its customers, with real-time and near real-time replication of data, four copies of data distributed across data centers, and global monitoring. However, contrary to what many organizations believe, Salesforce only provides limited backup and recovery options. If the service becomes unavailable, what kind of RPO and RTO can you actually get? Incidents have happened in the past, such as the notable NA14 outage in 2016. Let me be clear, I absolutely believe that everything is being done to maximize SLAs...but stuff happens.

Salesforce maintains a copy of customer data for disaster recovery purposes, but it is considered a “last resort” for when no other copy of the data is available, and it comes at a high price tag. It costs a few thousand dollars, you only get partial data (in CSV format), and it might take weeks to get anything back. In my humble opinion, this has very limited value.

Beyond the data recovery use cases, there are additional reasons why backing up the platform is necessary, such as abating the risk of data corruption, performing data migration rollbacks, archiving (to offload data from the production environment), replicating to other data services (analytics, warehousing, or business intelligence), and performing development/QA-related snapshots. Compliance mandates also come into play here.

The toolset from Salesforce is very manual. This means that an end-user could run a well-scheduled and apparently successful (manual) backup/export of her Salesforce environment and still miss data, simply because she did not update her export list with newly added fields, for example.
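One way around the stale-export-list problem is to derive the field list from each object’s describe metadata at backup time, rather than hard-coding it. A sketch, reusing the placeholder INSTANCE_URL, HEADERS, and export_object from above:

```python
# Sketch: build the field list dynamically so newly added fields are
# never silently missed. Reuses the placeholder INSTANCE_URL / HEADERS
# (and the requests import) from the previous sketch.
def all_field_names(sobject):
    url = f"{INSTANCE_URL}/services/data/v52.0/sobjects/{sobject}/describe/"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    # every field the org has *today*, including ones created since the last run
    return [f["name"] for f in resp.json()["fields"]]

# feed the live field list into the export instead of a static column list
accounts = export_object("Account", all_field_names("Account"))
```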

In addition, the restoration process creates an asymmetry between the backup version and the restored version of the data. By design, Salesforce is likely to transform data during the restore process (audit fields for example), and customizations (e.g., workflow field updates) may also add data transformations, thus compounding the overall scope of transformation.
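Here is a small sketch of why this happens on a naive insert-based restore, again using the placeholders above. The field names and values are illustrative; the point is that audit fields and record Ids are regenerated by the platform unless special permissions (such as Salesforce’s “Set Audit Fields upon Record Creation”) are in play.

```python
# Sketch: a naive restore never reproduces the backed-up record exactly.
backup_row = {
    "Name": "Acme",
    "CreatedDate": "2015-01-01T00:00:00.000+0000",  # from the backup file
}

# CreatedDate is not writable on a standard insert, so it must be dropped;
# Salesforce stamps the restore time instead (unless the org enables the
# "Set Audit Fields upon Record Creation" permission).
payload = {k: v for k, v in backup_row.items() if k != "CreatedDate"}

resp = requests.post(f"{INSTANCE_URL}/services/data/v52.0/sobjects/Account/",
                     json=payload, headers=HEADERS)
resp.raise_for_status()
new_id = resp.json()["id"]  # a brand-new Id too, which breaks any lookup
                            # or master-detail reference to the old record
```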

The other very important dimension is scale. Traditional backup tools that claim Salesforce support typically can’t back up more than three levels of a data model, while most enterprises have dozens of levels, and sometimes many different Salesforce organizations in their infrastructure. Recovering only part of a record is not sufficient: it does not bring all the data and relationships back, and it exposes companies to significant compliance issues. We are talking about companies with over 100M records, but any org between 10M and 100M records is likely exposed as well.
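You can see the depth problem for yourself by walking the childRelationships section of each object’s describe metadata. A rough sketch, with the same placeholders as above; on a real enterprise org this walk goes far deeper than three levels:

```python
# Sketch: breadth-first walk of child relationships to measure how deep
# the org's data model actually goes. Placeholders as in earlier sketches.
from collections import deque

def relationship_depth(root_sobject, max_depth=20):
    seen, queue, deepest = {root_sobject}, deque([(root_sobject, 0)]), 0
    while queue:
        sobject, depth = queue.popleft()
        deepest = max(deepest, depth)
        url = f"{INSTANCE_URL}/services/data/v52.0/sobjects/{sobject}/describe/"
        resp = requests.get(url, headers=HEADERS)
        resp.raise_for_status()
        for rel in resp.json()["childRelationships"]:
            child = rel["childSObject"]
            if child not in seen and depth < max_depth:
                seen.add(child)
                queue.append((child, depth + 1))
    return deepest

print(relationship_depth("Account"))  # frequently far more than 3 levels
```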

One issue is that there are four SFDC APIs, and many considerations come into play as end-users/vendors decide which ones to use. One important factor we’d like to call out is the number of API calls that the platform will allow, since those are restricted and metered (“governor limits”). This is a critical factor to take into consideration when planning backup and recovery processes and platform consumption. Backup speed will also be influenced by a number of variables, such as the number of records, the types of fields, the API chosen, etc. In other words, you need to be super smart about how you back up, and you must parallelize backups at scale. Recovery is also a complex process that requires a thorough understanding of the metadata and data models.
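As an illustration, the REST API exposes a limits resource you can check before a run, and extracts can be parallelized by slicing the record set. The date-range slicing below is just one possible strategy, the thresholds are arbitrary, and the same placeholders as above apply:

```python
# Sketch: budget API calls against governor limits, then parallelize the
# extract. Placeholders as above; the slicing strategy is illustrative.
from concurrent.futures import ThreadPoolExecutor

def remaining_api_calls():
    url = f"{INSTANCE_URL}/services/data/v52.0/limits/"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["DailyApiRequests"]["Remaining"]

if remaining_api_calls() < 10_000:  # arbitrary safety margin
    raise RuntimeError("not enough API budget left for a full backup run")

def export_slice(start, end):
    """Extract one CreatedDate slice; slices run independently in parallel."""
    soql = (f"SELECT Id, Name FROM Account "
            f"WHERE CreatedDate >= {start} AND CreatedDate < {end}")
    resp = requests.get(f"{INSTANCE_URL}/services/data/v52.0/query/",
                        params={"q": soql}, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["records"]  # pagination omitted for brevity

slices = [("2015-01-01T00:00:00Z", "2016-01-01T00:00:00Z"),
          ("2016-01-01T00:00:00Z", "2017-01-01T00:00:00Z")]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda s: export_slice(*s), slices))
```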

Most organizations have to resort to a system integrator to customize processes on top of the APIs. That’s hundreds of man-hours per process, with little portability. And it takes top-notch experts, Salesforce Certified Technical Architects, to build these. How many Salesforce CTAs do you have on staff?

The net net: I suspect that none of the traditional backup vendors claiming Salesforce support can credibly back it up at scale, and that it takes a lot of expertise and thousands of customized project man-hours to get it done right.

This is just the tip of the iceberg, as we also need to look at compliance-driven data processes, in particular privacy-focused regulations like GDPR. Backup and recovery in Salesforce environments is really about the broader question of how one manages data.

Let’s open the conversation…
