Below you will find a short clip from my talk at VMworld Europe earlier this month. One thing it mentions is the issue of backup reliability.
Guesstimates vary, but the backup failure rates I’ve typically seen fall between 15% and 30%. Those figures always come from traditional file-based backup to tape, and they are frightening, even on the low end. Taken at face value, a 15% failure rate means roughly 15% of your data is not protected or recoverable at any given moment. Wow. That is a huge amount of exposure.
The complications and reliability issues of tape have a lot to do with those numbers. Media failures, drive issues, robotics problems – so many places for things to go wrong. The reliability issues of tape are what helped drive the wave of VTL adoption over the past five years or so, particularly in large enterprises. The idea was great: replace your tape with a faster and more reliable disk system that looks like tape. That had the great advantage of keeping your backup processes more or less the same, and VTLs were very successful. When disk consumption grew out of control, deduplication entered the picture.
Problem solved? Well, not quite. VTLs certainly enhanced reliability compared to tape, but they had their own issues. Even “virtual” tape drives were prone to hanging. Plus, the high cost of many VTLs kept them very much a big-company solution, usually found in expensive Fibre Channel SANs. But the biggest problem is that VTLs did nothing to help on the host side.
The core problem is that we continue to use technology that was developed decades ago, when a large server hard drive was 100 MB. To put it in context, that’s about 95% less storage than a 2 GB iPod Shuffle! Yet we’re still using backup techniques designed when that 100 MB drive was considered a lot of data.
No wonder reliability is a problem. If you tried to shovel 100 pounds of dirt with a plastic spoon, that wouldn’t be very reliable either.
Lack of reliability has pernicious effects. Not only does it leave you exposed, but it also eats into precious IT staff time. How much time do you spend troubleshooting backups and re-running them? Gartner recently listed backup troubleshooting and restarts as the third biggest backup complaint reported by their customers (from their August report, “Best Practices for Addressing the Broken State of Backup,” which I recommend highly).
So what to do? In my next post, I’ll talk some more about the specific data issues that create backup problems, and how Syncsort can solve those problems for you.
