October 2010

Below you will find a short clip from my talk at VMworld Europe earlier this month. One thing it mentions is the issue of backup reliability.

Guesstimates vary, but the backup failure rates I’ve typically seen cited fall between 15% and 30%. That’s always in the context of a traditional file-based backup to tape, and those numbers are frightening, even on the low end. A 15% failure rate means that at any given moment, 15% of your data may not be protected or recoverable. Wow. That is a huge amount of exposure.

The complications and reliability issues of tape have a lot to do with those numbers. Media failures, drive issues, robotics problems – so many places for things to go wrong. The reliability issues of tape are what helped drive the wave of VTL adoption over the past five years or so, particularly in large enterprises. The idea was great: replace your tape with a faster and more reliable disk system that looks like tape.  That had the great advantage of keeping your backup processes more or less the same, and VTLs were very successful. When disk consumption grew out of control, deduplication entered the picture.

Problem solved? Well, not quite. VTLs certainly enhanced reliability compared to tape, but they had their own issues. Even “virtual” tape drives were prone to hanging. Plus, the high cost of many VTLs kept them very much a big-company solution, usually found in expensive Fibre Channel SANs. But the biggest problem was that VTLs did nothing to help on the host side.

The core problem is that we continue to use technology that was developed decades ago, when a large server hard drive was 100 MB.  To put it in context, that’s 95% less storage than an iPod Shuffle! Yet we’re still using backup techniques designed when that 100 MB drive was considered a lot of data.

No wonder reliability is a problem. If you tried to shovel 100 pounds of dirt with a plastic spoon, that wouldn’t be very reliable either.

Lack of reliability has pernicious effects. Not only does it leave you exposed, but it eats into precious IT staff time. How much time do you spend troubleshooting backups and re-running them? Gartner recently listed backup troubleshooting and restarts as the third biggest backup complaint reported by their customers (from their August report, “Best Practices for Addressing the Broken State of Backup,” which I recommend highly).

So what to do? In my next post, I’ll talk some more about the specific data issues that create backup problems, and how Syncsort can solve those problems for you.


My topic today is vendor responsibility.  And yes, this comes out of VMworld Europe.

On Wednesday, I visited a session given by a competing vendor. We all do this when given the opportunity. Indeed, a representative of that company – which I won’t name – visited my session on Tuesday. All is fair at trade show events, and if you’re a paid attendee you have the right to visit any session you want.

The problem, however, was that more than once – I think three times in fact – the presenter made the claim (I’m paraphrasing) “no other product in the market can do this,” and then proceeded to describe a feature that is available in NetApp Syncsort Integrated Backup (NSB) and in fact has been available via Syncsort BEX software for many months, and in some cases years.

Now the data protection world is a big, big space. There are at least half a dozen 800-pound gorilla players, and a load of others ranging from medium down to just-opened-the-doors startups that have half a customer. Believe me, I know: it’s part of my job to keep up with this, and it’s an endless and impossible task. So I don’t fault Presenter X – as I’ll call him – for not knowing the features of a product, even if it happens to be my product (sad face!). But he shouldn’t be making such sweeping claims unless he’s actually checked every other product in the space.

It’s one thing to offer up standard marketing superlatives like “my product is the best or biggest or tastiest.” We all know what that’s about and everyone takes that with some salt. It’s quite another thing to say “this widget can do 60 revolutions per minute and no other widget can,” when in fact another widget goes 60 revolutions per minute.

I’ll give a precise example. Presenter X brought up the subject of file-level restores in a VMware environment when you are doing image-based backups (i.e. backups based on block-level data movement and snapshots, not file-level backups). He made the point that a lot of products claim they do file restore, but really what they give you operationally is a rather tedious task of mounting a snapshot to a server and manually browsing for a file via the operating system.  I was amused by that because I had made the exact same point in my session on Tuesday!  No big deal, it’s a pretty obvious issue if you know the space. I can name three products off the top of my head that work exactly that way, and all that showed was that Presenter X is paying attention. Fair enough.

But then he said, again, “my product is the only one that can do this…” and described how his product can read through the snapshot and present a file tree directly in the backup GUI. The user searches through that file tree instead of on a separate server. Nice and convenient, all done through the backup interface, and “the only product in the market” that can do this amazing thing!

Ummm… except Syncsort has been doing that for ages and does it in NetApp Syncsort Integrated Backup as well. We’ve always cataloged our block-level backups. And in fact, we go further. In the case of Product X, you have to initiate the file tree lookup for each snapshot separately. One at a time. In the case of Syncsort, ALL your snapshots have file trees directly in the product GUI and they are there all the time. I can drill into every one of them down to the file level, right through the Syncsort interface. And better still – much better still – is that you can search them. You can open a search window and, using standard methods such as wildcards, file size ranges and so on, search right through all the snapshot backups of a server to find the file or files you’re looking for.

(NetApp Syncsort Integrated Backup Search Window)
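
To make the idea of a searchable snapshot catalog a bit more concrete, here is a minimal Python sketch of the concept. It is not how NSB is implemented. The `CatalogEntry` type, the `search_catalog` function, and the sample paths and snapshot labels are all hypothetical names made up for illustration. It only shows the general shape of the thing: once every snapshot’s file tree sits in one catalog, a wildcard plus a size range can be checked against all snapshots in a single pass.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class CatalogEntry:
    """One cataloged file in one snapshot (hypothetical structure)."""
    snapshot: str  # snapshot label, e.g. the backup date
    path: str      # full path of the file inside the snapshot
    size: int      # file size in bytes


def search_catalog(
    entries: Iterable[CatalogEntry],
    pattern: str,
    min_size: int = 0,
    max_size: Optional[int] = None,
) -> Iterator[CatalogEntry]:
    """Yield entries whose file name matches a wildcard and whose size is in range."""
    for entry in entries:
        file_name = entry.path.rsplit("/", 1)[-1]
        if not fnmatchcase(file_name, pattern):
            continue
        if entry.size < min_size:
            continue
        if max_size is not None and entry.size > max_size:
            continue
        yield entry


# Made-up sample catalog covering three snapshots of the same server.
CATALOG = [
    CatalogEntry("2010-10-10", "/data/reports/q3_budget.xls", 48_000),
    CatalogEntry("2010-10-11", "/data/reports/q3_budget.xls", 52_000),
    CatalogEntry("2010-10-11", "/data/reports/notes.txt", 1_200),
    CatalogEntry("2010-10-12", "/data/reports/q3_summary.xls", 9_000),
]

# One query spans every snapshot; nothing is mounted or browsed by hand.
for hit in search_catalog(CATALOG, "q3_*.xls", min_size=10_000):
    print(hit.snapshot, hit.path, hit.size)
```

The point of the sketch is the design choice, not the code: with a catalog, finding every saved version of a file is one lookup, rather than one mount per snapshot.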

That mount-and-peck method of file finding isn’t bad if you’re retrieving a file you just deleted on your laptop (been there, done that), but when you’re searching an enterprise server that may have millions of files on it and you only have a vague idea of just when that file got deleted, or just how many versions were saved over a time period, you can futz around for hours until you find what you’re looking for. With NSB, a couple of clicks and you’ve got it.

So to all the Presenter X’s out there – and I need to be wary of such things myself, for sure – be careful when making sweeping “we’re the only ones who can do that” statements, because if your audience finds out that’s not true, it will hurt your credibility and make them wonder what else you’re not being straight about.

Of course, it’s possible that when Presenter X said his was “the only product in the market” that did that, he was thinking “Because NSB does quite a bit more.”  Yeah, maybe that’s what he meant.


On the Way to VMworld Europe

October 11, 2010

Another week, another plane ride. I’m somewhere between New York and Copenhagen right now, on my way to VMworld Europe. I would provide a more precise description of my location, except I don’t know where it is. There’s no flight map on this plane because there’s no in-seat video. I think this is the first US-to-Europe flight I’ve taken in several years that didn’t offer in-seat entertainment. And the headrests don’t have those little wings you can bend out to keep your head from flopping over when you sleep (as if I could sleep). And the wine came out of something resembling a milk carton. All this, and it wasn’t exactly cheap!

Fortunately, I’m pretty certain that the VMworld event won’t be as disappointing as this tired old Boeing 757-200 (I will politely decline to mention the airline). This is my first European VMware event, and I’m very excited to be taking part. I have a speaking session this time around (SP9724, Tuesday, Oct. 12 at 15:30, Room #19), talking about our recently announced NetApp Syncsort Integrated Backup solution and the topic of VM protection in general. Please drop in and visit us if you’re at the show (Booth #13, right next to NetApp), and if you attend my session stop and say hi.  

A good portion of my talk focuses on recovery. Speaking of which, the folks at Veeam recently conducted a survey of IT managers that returned some (to me) shocking results. As Dave Simpson at Infostor blogged, the survey showed the average user needs more than five hours to restore a virtual machine.

Five hours! I’ve been running around for over a year telling people how Syncsort can restore a virtual machine in five minutes (plus application boot time). I suppose it takes five hours if you’re still doing “traditional” (i.e., horse-and-buggy style) file-based backups to tape. Slow, complicated, unreliable – no, that’s not a description of my recently replaced ancient smartphone (well, yes it is), but a description of a type of data protection that just isn’t suited for the virtual world. If you can spin up a new virtual machine in minutes, why shouldn’t you be able to protect it and restore it in minutes?

Well of course you can, if you use technology suited to the situation. NetApp Syncsort Integrated Backup gives you that ability and a lot more. If you’re at VMworld Europe this week, drop by to learn more about it, and understand why it’s time to change your old backup solution.

Meanwhile, seems I’m about the only person not sleeping on this plane. And I know that’s one thing that’s not going to change!
