Data Protection

Innovation, and what succeeds in the market, is an endlessly interesting topic. I was reminded of this recently when I read a New Yorker magazine profile of Clayton Christensen, the business guru most famous for his work “The Innovator’s Dilemma.” The profile extends beyond his work: it covers his family background, his battle with cancer, his religious faith, and more. In all, it is a fascinating and inspiring profile that I highly recommend. At the moment, it’s behind a subscription wall, so if you have access you can get it here, or you can read it in the May 14, 2012, print edition.

Christensen’s notion of “disruptive innovation” applies across any industry. An interesting example is what is perhaps Christensen’s most famous “miss”: the iPhone, which he predicted would not succeed because it was just a fancy cell phone. What he realized later, after its phenomenal success, was that the iPhone was actually disruptive to laptops, not just to other cell phones. A great insight, albeit after the fact.

All of this got me thinking about changes in the backup world in the past few years, particularly two disruptive technologies, deduplication and snapshots.

Deduplication first made its mark in the form of deduplication appliances, single-purpose devices that were highly disruptive to tape as a backup target. Disk had long been used for backup, whether as plain disk or in the form of a VTL, but it remained a niche methodology because it was just too expensive. As a result, disk was limited to only a day or two of data retention, if used at all. Deduplication radically changed the economics by providing data reduction rates of 90% or more, which is another way of saying you could get ten times as much use out of the same amount of disk at 90%, and twenty times at 95%.
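The arithmetic behind those numbers is simple: if a fraction r of the data is eliminated, the same disk holds 1 / (1 - r) times as much. Here is a minimal sketch of that calculation. The function name is my own, chosen for illustration, and real-world reduction rates depend heavily on the data being backed up.

```python
def capacity_multiplier(reduction_rate: float) -> float:
    """Return how many times more backup data fits on the same disk.

    reduction_rate is the fraction of data eliminated (0.90 means 90%).
    """
    if not 0 <= reduction_rate < 1:
        raise ValueError("reduction_rate must be in [0, 1)")
    return 1 / (1 - reduction_rate)


# Compare two illustrative reduction rates.
for rate in (0.90, 0.95):
    print(f"{rate:.0%} reduction -> about {capacity_multiplier(rate):.0f}x the usable capacity")
```

Note how quickly the multiplier grows near the top of the range: moving from 90% to 95% reduction doubles the effective capacity.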

It changed the face of backup as far as tape was concerned, but interestingly, deduplication was not disruptive to the backup process. Users started replacing tape drives with disk, but everything else stayed the same. In the end, deduplication appliances were disruptive to only a portion of the backup process at the very end of the line. They were evolutionary, not revolutionary.

Snapshots have the potential to be truly revolutionary because they disrupt the entire traditional backup process, changing it from end to end, not just at the final step in the chain. But even though snapshots have been around for a long time, they are still not the leading way to protect data, despite all their advantages of speed and performance. A survey by UBM TechWeb (commissioned by Syncsort) showed only 25% of users made use of primary storage snapshots (you can get the full survey here).

Why the limited uptake? A few key reasons: 

  • Cost: snapshots are typically done on primary disk, which is expensive.
  • Performance: many disk arrays suffer significant performance degradation as snapshots accumulate.
  • Complexity of restore: snapshots are great at capturing data, but a lot of disk systems do not have convenient, easy-to-use workflows for recovering data, do not have a catalog, etc.
  • Limited retention time: because they are expensive, you normally can’t keep weeks or months of data on snapshots.

Maybe this is why snapshots haven’t been as disruptive to traditional backup as might have been expected. So are snapshots destined to remain a limited-use option, typically relegated to tier-1 applications and short retention times?

Not at all! There’s a disruptive technology in town now, and it’s called NetApp Syncsort Integrated Backup (NSB). How does NSB change things? It is quite simple. NSB takes the snapshots off the primary storage and puts them onto secondary storage, and then overlays them with easy recovery workflows and a catalog. This seemingly simple change in the design addresses all of the key reasons listed above for limited uptake.

I’ve written about this before here if you’re interested in more specifics.

For now, I will conclude with a concept from Clayton Christensen, who describes consumer product selection as people looking for a way to get their “jobs to be done.” Simply put, people don’t want products, they want to get something accomplished. The IT world is no different. None of us want backup software, really. What we want is for data to be protected and easily recoverable in a way that is cost-effective and reliable, and doesn’t demand too much of our attention. This is exactly what NSB delivers, as we heard recently from a user. It can do the same for you.


When it comes to evaluating technology, nothing speaks louder than the voice of the customer. Vendors can say what they want about a product, but what matters is how it works in the day-to-day world of IT, where everything that can go wrong sooner or later does.

Recently, Syncsort and NetApp jointly sponsored a webinar that featured a user of the NetApp Syncsort Integrated Backup (NSB) solution.  Fernando Mejia is the Senior Manager of IT Infrastructure for IPC, which is a Franchisee Purchasing Cooperative for the SUBWAY restaurant stores. IPC helps the 28,000 SUBWAY restaurants in the U.S. and Canada reduce costs by leveraging their collective purchasing power. IPC supplies everything from food to paper goods to IT processes.

Mr. Mejia was kind enough to join us on a webinar that you can view here. A brief registration is required, but it’s well worth it. And here’s a tip: the first part of the webinar is me talking. If you’re familiar with the NSB story then you can jump to the 25-minute mark, where Mejia begins speaking. It takes a minute or two for the webinar to load up.

I want to give you a sense of what IPC gained by moving to NSB. Their environment is about 350 servers and 70 TB of primary storage, most of it on NetApp FAS 6280 systems. They use NSB to back up that data to a clustered FAS 3160, which is dedicated to backup. Prior to NSB they were using Symantec NetBackup and having major headaches. Nightly incrementals started at 6:00 p.m. and finished up around 6:00 a.m. Weekend fulls started Friday night at 6:00 p.m. and lasted until Monday morning.

This led to problems, as Mejia said:

“It was always a challenge hoping and praying there weren’t any kind of gotchas, like there always are with backups, that would cause that window to extend. And often it did extend beyond the window and ran into standard business hours. And often times depending on which systems were affected we did have an impact on the performance of our systems, and users had a problem with productivity.”

Not only that, but management was a burden.

“We did have one full-time resource dedicated to just managing backups. That person’s sole purpose was to, in essence, babysit the backup process and make sure we were getting successful backups. It was a very labor intensive process.”

New applications that were coming online, and that would significantly extend production hours, pushed IPC to look for a solution “to meet our needs, particularly the one need of being able to potentially completely eliminate a backup window.”

They got it with NSB.  The average backup time for their servers is now between 1 and 15 minutes! The backup window is a non-issue. In addition, they gained a significant benefit from the NSB Instant Virtualization capability.  It’s not only useful for recovering systems, but it has dramatically enhanced IPC’s application development efforts.

“We have an in-house development staff and we do develop a good majority of the applications we use…  We’re able to leverage Instant Virtualization to bring entire applications, that are composed of multiple servers, from production into our development and staging environment. This minimizes the drift between development staging and production, in turn resulting in much quicker time to develop applications, much more streamlined testing and QA processes, and in the end a lot less issues and problems that make it out into production.”

That’s how NSB leverages the power of snapshots – launch entire applications in minutes using your most recent backup data. And because it’s using NetApp FlexClone to do it, there’s no additional storage required other than new writes to the system.  
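To see why a clone of backup data needs no extra storage until something is written to it, here is a toy model of the general copy-on-write idea. To be clear, this is not how NetApp FlexClone is implemented. The class and method names are hypothetical, and the "blocks" are just dictionary entries.

```python
class Snapshot:
    """Read-only point-in-time image: a mapping of block number -> data."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks[block_no]


class WritableClone:
    """Writable view of a snapshot that stores only blocks written after cloning."""

    def __init__(self, snapshot):
        self._base = snapshot  # shared with the backup, never copied
        self._delta = {}       # holds new writes only

    def read(self, block_no):
        # Prefer the clone's own copy; otherwise fall through to the shared snapshot.
        if block_no in self._delta:
            return self._delta[block_no]
        return self._base.read(block_no)

    def write(self, block_no, data):
        self._delta[block_no] = data

    def extra_blocks_used(self):
        return len(self._delta)


backup = Snapshot({0: "boot", 1: "app", 2: "data"})
dev_copy = WritableClone(backup)
dev_copy.write(2, "test data")

print(dev_copy.read(1))              # unchanged block, served from the snapshot
print(dev_copy.read(2))              # block rewritten in the clone
print(backup.read(2))                # the backup itself is untouched
print(dev_copy.extra_blocks_used())  # blocks consumed by the clone
```

In this sketch the clone starts out consuming nothing, and its footprint grows only with the blocks that development or testing actually changes.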

Maybe the best part of the new solution, however, was the management relief. Rather than the full-time IT resource required before, with NSB:

“It takes barely an hour a day to go over, manage and maintain the entire solution…  That was a great win, because I can re-assign those resources that were basically just doing maintenance and operational type work and put them into more important tasks and projects that are more crucial to the organization.”

How is this achieved?  Partly it’s how easy the solution is to use.

“NSB’s capabilities of leveraging NetApp technologies as well as being entirely integrated into virtualization technologies allowed us to collapse the amount of administration interfaces into one. That’s one great benefit of the solution. The second great benefit is that it’s extremely intuitive. It’s very easy to get in front of the interface and through a very short learning curve understand how the backup jobs are configured, understand how to perform recovery operations, understand how to generate reports. In the end that resulted in really lowering our operations overhead.”

The other key is reliability. A great deal of backup management ends up being troubleshooting and scrambling to recover from failed backup jobs. Not with NSB.

“I must say that our job failure rate is extremely low. If I get maybe one or two a week that’s a lot. And usually when we get a job failure it’s a problem with a particular server that was getting backed up. So over time we ended up not having a lot of focus on the backup solution because it just works so well… I remember in our NetBackup days I was hyper-focused on all kinds of detailed information on the backup because it was critical that you were on top of it all the time to make sure it was functioning correctly. With the Syncsort solution that really has changed.”

There’s more I could write, but I’ll leave you to listen to the webinar where you can hear it for yourself. If you have follow-up questions, the webinar will explain how you can get them to me, or just post a comment here.


At the end of March, Syncsort hosted a tweetchat to celebrate World Backup Day.  We were joined very actively by Jon Toigo who made a lot of thought-provoking comments. I blogged a bit on this earlier, here, but wanted to get back to some of Jon’s comments.

During the #backupjam, ESG analyst Jason Buffington launched various questions to solicit comments. For the sake of readability, I’m going to run together some of Jon’s tweets, but otherwise these are his verbatim responses. And then I’ll follow with some short comments of my own. This list is not comprehensive, and if you want to review the original tweets, you can reference Jon’s tweets on March 29, 2012.

Q: What are the top backup challenges facing customers?

@JonToigo: Identifying appropriate bu techniques based on poorly or un-defined restore targets.

My comment:  Jon makes a great point that you have to start with restore in mind. What are you really trying to achieve? From there, you can begin architecting a solution.   

Q:  How has virtualization impacted data protection strategies?

@JonToigo: My bigger concern, server virt vendors claim bu unneccesary. Just HA failover. Not true!  I have been told this over and over and so have my customers. It is wrong-headed.

My comment:  Completely agree! It seems this “all you need is failover” idea springs up every now and again, and it’s never the solution. Failover and HA schemes are there to keep applications running when hardware dies (or somebody pulls a plug). They do nothing to save you when you lose data at the logical level. 

Q: What’s the answer to the broken state of backup?

@JonToigo:  Depends entirely on what’s breaking it.  First, you need to get past the politics. Lose the Tape Sucks Move On bumper stickers for a start.

My comment: Love this response. It’s so Toigo!  First, the obvious fact: you can’t find the answer unless you know what the problem is.  Then the shift into a related issue, what you might call “sloganeering.”  While “tape sucks” might work as pay-attention-to-me style marketing, if you use it as a starting point in your solution design, you may very well be writing off a critical component.  Tape may not be the latest thing, but it remains a key part of many data protection strategies and dismissing it up front is foolish.  

Q: What are some of the main causes for lost data?

@JonToigo: User error. Malware. App error. HW error and Facility faults. In roughly that order.  And blowing through the question “Are you sure?”

My comment: A good list, and important to note that there is very little that hardware redundancy can do to help you with user error, malware or application error. You’ve got to have backup in place to deal with these issues. The final comment speaks to the issue of testing. “Are you sure?” is a simple question, but not at all easy to answer when you’re talking about your backup environment.  

Thanks again to Jon, Jason and everyone else for participating in the #backupjam. We really enjoyed it and look forward to organizing and participating in others like it in the future.


I want to point readers to a good blog post from one of Syncsort’s most successful partners, SwishData.

The Swish team focuses on the U.S. Federal government market, including the Department of Defense, and they have enormous expertise in multiple technology areas. Plus they really understand the unique needs that you find only in certain kinds of military and government scenarios.  Simply put, Swish really gets it.

One of the things they understand really well is backup and recovery. SwishData CTO Jean-Paul Bergeaux has a good blog post up about it: “Walk Before You Can Run: Ensure Backup and Recovery Aren’t Afterthoughts.”

Ok, so he mentions my name in it! That doesn’t take away from the useful information provided, as well as links to other helpful blog posts.

It’s worth a look. And if you’re in the Federal Government space and looking for help with any current or upcoming projects, you should consider contacting the certified smart guys over at SwishData.
