Wednesday 20 February 2008

Zmanda, Moonwalk, S3 and offsite DR ...


Zmanda, Moonwalk and the rest are interesting because they allow you to move data offsite. Moonwalk is perhaps the most interesting, as it allows integration into an HSM-style, policy-based system. One could imagine a scenario where files are backed up to some local clustered storage, and the "last changed" files are then written to two further places: a local store, perhaps a tape-based archive, that keeps all the deltas, and an offsite location that keeps only the deltas and perhaps runs a consolidation task.
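To make the "last changed" idea concrete, here is a minimal sketch of how one might pick out the files modified since the previous backup run. It is not how Zmanda or Moonwalk do it; the function name `files_changed_since` and the use of file modification times as the change signal are assumptions made purely for illustration.

```python
import os


def files_changed_since(root, since_epoch):
    """Return sorted relative paths under root modified after since_epoch.

    Hypothetical helper: uses mtime as a stand-in for "last changed".
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(full) > since_epoch:
                    changed.append(os.path.relpath(full, root))
            except OSError:
                # File vanished between listing and stat; skip it.
                continue
    return sorted(changed)
```

The resulting list is what would be handed to both the local archive and the offsite store. A real policy engine would probably track more than mtime (checksums, ownership, policy tags), but the shape of the step is the same.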

Thus, in the event of a complete disaster, we can revert to a known state. Well, that's true for unstructured data. For structured data such as databases we need to do something different, such as writing dumps to multiple locations.

And of course for data we don't care a lot about we just do an rsync (or more accurately an rdiff) to ensure that we have a recent copy. (In fact, for the UK Mirror Service - basically a local cache of key internet resources - we did exactly that: no backup, rsync the servers, and if both sites lost the same files, pull them again the next time we did a consistency check against the location being mirrored. This of course worked because we assumed that the mirrored sites were backed up by their host sites.)

Simple.
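The logic of that mirror consistency check can be sketched in a few lines. This is only an illustration of the idea, not the code the UK Mirror Service ran; the function names and the assumption that each site can produce a plain list of file paths are mine.

```python
def files_to_repull(upstream, site_a, site_b):
    """Paths listed upstream that neither local site still holds.

    These have to be fetched again from the location being mirrored.
    """
    held_locally = set(site_a) | set(site_b)
    return sorted(set(upstream) - held_locally)


def files_to_resync(upstream, site_a, site_b):
    """Paths missing at exactly one site, which the other site can supply."""
    a, b = set(site_a), set(site_b)
    return sorted((a ^ b) & set(upstream))
```

Only the first list costs anything in external traffic; the second is just an rsync between the two sites.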

Then costs come into the equation. Backups are expensive. Infrastructure costs alone head towards the million dollar mark. For a secondary disaster recovery site, double it and then add some for network traffic, which you'd pay anyway if your data crossed a public network. So the cost-benefit analysis of using something like S3 comes down to: "are our costs going to be less if we outsource than if we do it ourselves?"
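That comparison is simple enough to write down. The sketch below is a back-of-envelope model only: the amortisation period, the per-gigabyte prices and all the parameter names are placeholders to be filled in from your own budget and your provider's current price list, not figures from any vendor.

```python
def in_house_annual_cost(infrastructure, lifetime_years, network_per_year,
                         dr_site=True):
    """Yearly cost of running backup yourself.

    infrastructure is the capital cost of one site; a secondary DR site
    doubles it. lifetime_years is an assumed amortisation period.
    """
    capital = infrastructure * (2 if dr_site else 1)
    return capital / lifetime_years + network_per_year


def hosted_annual_cost(stored_gb, price_per_gb_month,
                       delta_gb_per_month, price_per_gb_transfer):
    """Yearly cost of a hosted store: storage plus transfer of the deltas."""
    storage = stored_gb * price_per_gb_month * 12
    transfer = delta_gb_per_month * price_per_gb_transfer * 12
    return storage + transfer


def outsourcing_is_cheaper(in_house, hosted):
    """The whole cost-benefit question in one comparison."""
    return hosted < in_house
```

Note that the hosted figure scales with how much data you choose to store, so the answer depends heavily on how much of your data you decide is worth keeping offsite.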

If you are a relatively small site and you have enough bandwidth to easily trickle the deltas across, I'd guess yes; after all, it's only your critical unstructured data. If you're large it might be worth doing a cross-hosting deal.

But the key to containing costs is data classification - you only want to save the stuff you can't afford to lose.

The other key thing is not to get hung up on particular vendors' offerings - it's the technology that's important at this stage. Zmanda and Moonwalk in combination with S3 are demonstrations of what can be done. The message is that the economies of cloud computing architectures make real-time access to offsite data stores a distinct possibility.

After all, backup is only copying data and putting it somewhere safe ...
