Why Have Different Levels of Protection?
I like the saying that comes when people refer to an old classic car '
they don't make them like that anymore'. The thing is, the statement should be followed with the statement '
thank goodness'. Can you imagine putting up with that level of reliability, quality, safety, comfort level with a car you purchased from a dealer today that you would have got in years gone by?
Same goes with technology. I still wake-up some nights and shudder when remembering the old tape backup routine. This use to go in two ways, the first was the nightly scheduled tape shuffle which thankfully I was not tasked with. The other was the pre-rollout backup we would run before pushing any new service out. This use to comprise us kicking the job off, then trotting down the road to a golf driving range to knock a few buckets of balls into no mans land to kill the time (usually hours).
So looking at that, we had two different use cases but only one method at our disposal, the inglorious, forever unreliable tape backup. Fast forward 15 years (maybe more but I keep that quiet) and we have many different forms of protection at our disposal including storage replication, snapshotting, tape backup (but hopefully VTL, not those horrible tapes).
All these options serve a purpose and work together nicely. If we were to classify the different use cases into buckets they could be:
Business Continuity (DR) = Remote Replication, constant protection driven by RPO and RTO requirements
Operational Protection = Local Snapshotting (or continuous protection as is provided with RecoverPoint) typically done by set interval (storage level) or self service (VM level)
Long Term Retention = Traditional Backup style typically run once a day
Each of these has there purpose and specific business requirements that drive their implementation. Interestingly I see a lot of the BCM and Long Term Retention being positioned but little of the Operational Protection being catered for. Back to my scenario at the start, if I could of had snapshotting at my disposal (or if I was real lucky, EMC RecoverPoint Continuous Data Protection) my golf practice (aka time wasting) would not have happened as we could have snapshotted the services that we were updating and started the rollout straight away.
Self service snapshotting is easily available these days thanks to two things:
- Virtualisation with the VM level snapshotting (Checkpointing in a HyperV world)
- Automation tools for storage (EMCs Storage Integration Suite is a good example)
So that is all cool for those known events but what about those unknown such as data corruption. That is what scheduled snapshots protect you against, providing a much more granular way to protect your systems beyond the typical 24 hour cycle of Long Term Retention. You may only retain a short set of snapshots such as 1-7 days but they can provide good peace of mind for those services deemed worthy.
I do realise that replication can also provide a way of rolling back but typically it is 'to the last change' or committed IO operation, so a corruption could easily by on the remote site as well as the source. Also replication would require that traffic to come back down the wire which adds time to the recovery / rollback process.
Another benefit of Operational Protection is that it can provide an easy / quick way for copies of datasets such as those within a database to be presented to an alternate location such as a from production to a test/dev instance.
Anyway, I got that of my chest so I feel better. Operation Protection = Good!
Just as a last note on this, I did not include High Availability (HA) in this as I am more looking at where the old functions we used tape for have evolved. There is some real cool stuff that can be done with stretched high availability that spans physical locations as is supported with vSphere Metro Storage Clusters and Microsofts Failover Clustering with products such as EMCs VPLEX, but that is a big enough topic on its own.