primary storage

Disaster Recovery has been on many people’s minds of late, and the trade press is full of stories. American Express OpenForum has a piece rating the most dangerous cities in America in which to start a new business, based on the risk of a disaster.  I found the rankings surprising (New York City is 5th?), but worth a read.

Network World delivers a story on how DR plans are getting more urgent based on an increase in the rate of disasters. They provide this alarming quote:

“Last year was the worst year we’ve had in the history of disasters,” said Al Berman, executive director of the Disaster Recovery Institute, an industry group.

They also note how the widely reported shutdown of the Amazon cloud service put paid to the idea that cloud services were invulnerable (if anybody had that idea in the first place).

Computerworld gets more into the IT side of things with “4 tech trends in IT disaster recovery,” which discusses cloud, virtualization, mobile networking and social networks.  I didn’t see that last one coming either, but they make a good case for why social networking is something you need to consider in the context of your DR planning.

Disaster Recovery is a huge subject, especially if you broaden it by adding “Business Continuity” into the mix. Then you’re getting into just about everything: personnel, legal, facilities planning, transportation, and on and on, to say nothing of IT and effective product mixes.

My concern is the IT side of things, and even that is too big a subject to scope out in one blog post.  Rather, I’ll focus on one aspect today and then continue to delve into disaster recovery issues in the weeks ahead.

One area with a significant disconnect is the relationship between backup and disaster recovery. Since those terms are flexible, let’s define backup as “making copies of data locally” and disaster recovery as “creating copies of data at an alternate location.”  Very often these two processes are separate, using different technologies and products. The most classic formulation is making tapes locally and then trucking them off to an off-site location. The downsides of this are well understood, and the only real upside is that you are using the same backup software product every step of the way.

Because tape is cumbersome, more and more users have moved to some form of electronic data transport, a.k.a. data replication over IP networks. It’s safer (no lost tapes), it’s efficient (no manual shipping), it’s faster. But it’s also often handled separately from your backup process. I could write a half-dozen blog posts on the different replication models, but the point is that if you have one process for backup and something completely different for DR, then you’re not maximizing efficiency in terms of time or money. And if you’re doing array-based replication in a multi-vendor disk environment, you’ll find yourself managing multiple separate replication processes and completely different recovery workflows. It’s a mess!

That’s why with NetApp Syncsort Integrated Backup (NSB) we’ve done some very effective things to make sure DR is a seamless part of the backup environment. 

First, we consolidate backups from any primary storage environment onto NetApp disk. Replication is handled by NetApp SnapMirror.   This centralizes all your replication onto a single platform no matter what mix of primary storage you have (it even includes data from internal boot disks).  Management is simplified and you can reduce costs by eliminating primary storage replication licenses.

Second, there is no additional impact from replication. NSB uses standard backup data on secondary disk as the source data for replication. No additional backup passes, no impact on your applications or primary storage at all. This is far more efficient than doing replication off your primary arrays, for example, or from your ESX servers. Why would you want to impact your production workloads with replication overhead when you don’t have to?  

Finally, all DR recovery processes are run from the same NSB console as your regular backups, and the recovery workflows are the same. If you know how to restore locally, you know how to restore at the DR site.  This reduces learning curves and management overhead.

Of course there are a lot more details I could get into but this gives you the idea of how NSB can really drive down the cost and complexity of disaster recovery. And it is easier than you’d ever think to test your disaster recovery. I wrote about DR testing here and will be talking more about this in the future.


Steve Duplessie at Enterprise Strategy Group recently posted a video blog entitled “The Future of Single Platform Backup Tools.” As is Steve’s way, he doesn’t mince words and he makes big statements. It’s definitely worth the five minutes it takes to watch. Here I’m going to pull two quotes and comment on them, but check out the whole thing to get the full context.

Steve D:  Data growth causes all of our problems. It eventually breaks every single process, procedure and product that you have on your floor.

Leave it to Steve to lay it on the line. Data growth changes everything, and perhaps backup and recovery most of all. I recently posted some video thoughts of my own on this, which you can see here.  

Steve D (talking about purpose-built backup tools for VMs):  People want another vendor in their shop like they want a hole in their head.

Classic Steve! He makes the point that users were driven to purpose-built, VM-only backup tools at first because the conventional backup vendors (you know the names) failed miserably out of the gate on VM backup.  At Syncsort, we were VM friendly before anybody even had VMs, because from the start our NetApp Syncsort Integrated Backup solution focused on reducing backup impact and dramatically limiting the amount of data that gets backed up.  That’s the problem with VM backup: too much backup impact, not enough free system resources.  Syncsort solved that problem back in 2005 when it first integrated its block-level backup with NetApp storage as a backup target (supports any primary storage, by the way).

I wrote more about this a year ago, ironically in another blog post inspired by Steve D.  (Ok, I admit it, I just read what Steve says and piggy back!)  

Steve goes on to make the point that organizations are looking for data protection solutions that deliver everything they need. I couldn’t agree more, and we’re finding users responding to our NSB solution because we bring it all to the table: support for physical and virtual machines; application integration; backup to disk with built-in support for tape; integrated disaster recovery and ROBO support. It’s all in there, as they say.

P.S. – Steve has one of the best (and funniest) Twitter feeds in the biz.  Follow him @stevedupe.


Innovation, and what succeeds in the market, is an endlessly interesting topic. I was reminded of this recently when I read a New Yorker magazine profile of Clayton Christensen, the business guru most famous for his work “The Innovator’s Dilemma.”  The profile extends beyond his work: it covers his family background, his battle with cancer, his religious faith, and more. In all, it is a fascinating and inspiring profile that I highly recommend.  At the moment, it’s behind a subscription wall, so if you have access you can get it here, or you can read it in the May 14, 2012, print edition.

Christensen’s notion of “disruptive innovation” applies across any industry. An interesting example is perhaps Christensen’s most famous “miss” about the iPhone, which he predicted would not succeed because it was just a fancy cell phone. What he realized later, after its phenomenal success, was that the iPhone was actually disruptive to laptops, not just to other cell phones. A great insight, albeit after the fact.   

All of this got me thinking about changes in the backup world in the past few years, particularly two disruptive technologies, deduplication and snapshots.

Deduplication first made its mark in the form of deduplication appliances, single-purpose devices that were highly disruptive to tape as a backup target.  Disk had long been used for backup, whether as plain disk or in the form of a VTL, but it remained a niche methodology because it was just too expensive. As a result, disk was limited to only a day or two of data retention, if used at all. Deduplication radically changed the economics by providing data reduction rates of 90% or more, which is another way of saying you could potentially get ten or even twenty times as much use out of the same amount of disk.  
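For anyone who wants to check the arithmetic behind reduction rates, here is a minimal Python sketch. The function name `capacity_multiplier` is my own, purely for illustration, and it assumes the reduction rate applies uniformly to all stored data, which real backup sets rarely do.

```python
def capacity_multiplier(reduction_rate: float) -> float:
    """Return how many times more logical data fits on the same physical disk.

    reduction_rate is the fraction of data eliminated (0.90 means 90%).
    Assumes the rate applies uniformly to everything stored.
    """
    if not 0.0 <= reduction_rate < 1.0:
        raise ValueError("reduction_rate must be in the range [0, 1)")
    # If only (1 - rate) of the data has to be stored, the same disk
    # holds 1 / (1 - rate) times as much logical data.
    return 1.0 / (1.0 - reduction_rate)


if __name__ == "__main__":
    for rate in (0.50, 0.90, 0.95):
        print(f"{rate:.0%} reduction -> about {capacity_multiplier(rate):.0f}x the usable capacity")
```

A 90% reduction works out to roughly ten times the usable capacity, and it takes about 95% to reach twenty times. Small differences in the reduction rate matter a great deal at the high end.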

It changed the face of backup as far as tape was concerned, but interestingly, deduplication was not disruptive to the backup process. Users started replacing tape drives with disk, but everything else stayed the same. In the end, deduplication appliances were disruptive to only a portion of the backup process at the very end of the line. They were evolutionary, not revolutionary.

Snapshots have the potential to be truly revolutionary because they disrupt the entire traditional backup process, changing it from end-to-end, not just at the final step in the chain. But even though snapshots have been around for a long time, they are still not the leading way to protect data, despite all their advantages of speed and performance.  A survey by UBM TechWeb (commissioned by Syncsort) showed only 25% of users made use of primary storage snapshots (you can get the full survey here).

Why the limited uptake? A few key reasons: 

  • Cost: snapshots are typically done on primary disk, which is expensive.
  • Performance: many disk arrays suffer significant performance degradation as snapshots accumulate.
  • Complexity of restore: snapshots are great at capturing data, but a lot of disk systems do not have convenient, easy-to-use workflows for recovering data, do not have a catalog, etc.
  • Limited retention time: because they are expensive, you normally can’t keep weeks or months of data on snapshots.

Maybe this is why snapshots haven’t been as disruptive to traditional backup as might have been expected. So are snapshots destined to remain a limited use option, typically relegated to tier-1 applications and short retention times?

Not at all! There’s a disruptive technology in town now, and it’s called NetApp Syncsort Integrated Backup (NSB).  How does NSB change things?  It is quite simple. NSB takes the snapshots off the primary storage and puts them onto secondary storage, and then overlays them with easy recovery workflows and a catalog. This seemingly simple change in the design solves all of the key reasons listed above for limited uptake.
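To make the idea of a catalog over snapshots a little more concrete, here is a toy Python sketch. It is not how NSB is implemented; the `Snapshot` and `SnapshotCatalog` names, their fields, and the lookup method are all hypothetical, invented only to show why an index makes recovery easier than hunting through raw snapshots.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of one snapshot held on secondary storage."""
    snapshot_id: str
    taken_at: datetime
    paths: FrozenSet[str]


class SnapshotCatalog:
    """Toy index from file path to the snapshots that contain it."""

    def __init__(self) -> None:
        self._by_path: Dict[str, List[Snapshot]] = {}

    def register(self, snap: Snapshot) -> None:
        # Index every path in the snapshot so lookups never scan all snapshots.
        for path in snap.paths:
            self._by_path.setdefault(path, []).append(snap)

    def latest_before(self, path: str, when: datetime) -> Optional[Snapshot]:
        # Return the newest snapshot containing `path` taken at or before `when`,
        # or None if no such snapshot exists.
        candidates = [s for s in self._by_path.get(path, []) if s.taken_at <= when]
        return max(candidates, key=lambda s: s.taken_at, default=None)


if __name__ == "__main__":
    catalog = SnapshotCatalog()
    catalog.register(Snapshot("snap-001", datetime(2012, 5, 7, 18, 0), frozenset({"/data/orders.db"})))
    catalog.register(Snapshot("snap-002", datetime(2012, 5, 8, 18, 0), frozenset({"/data/orders.db"})))
    hit = catalog.latest_before("/data/orders.db", datetime(2012, 5, 8, 9, 0))
    print(hit.snapshot_id if hit else "no snapshot found")
```

The point of the sketch is the question it answers: "which copy of this file do I restore from?" Without a catalog, that answer means browsing snapshots one at a time.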

I’ve written about this before here if you’re interested in more specifics.

For now, I will conclude with a concept from Clayton Christensen, who refers to the process of consumer product selection as people looking towards a way for “jobs to be done.”  Simply put, people don’t want products, they want to get something accomplished. The IT world is no different. None of us want backup software, really. What we want is for data to be protected and easily recoverable in a way that is cost-effective and reliable, and doesn’t demand too much of our attention. This is exactly what NSB delivers, as we heard recently from a user. It can do the same for you.


When it comes to evaluating technology, nothing speaks louder than the voice of the customer. Vendors can say what they want about a product, but what matters is how it works in the day-to-day world of IT, where everything that can go wrong sooner or later does.

Recently, Syncsort and NetApp jointly sponsored a webinar that featured a user of the NetApp Syncsort Integrated Backup (NSB) solution.  Fernando Mejia is the Senior Manager of IT Infrastructure for IPC, which is a Franchisee Purchasing Cooperative for the SUBWAY restaurant stores. IPC helps the 28,000 SUBWAY restaurants in the U.S. and Canada reduce costs by leveraging their collective purchasing power. IPC supplies everything from food to paper goods to IT processes.

Mr. Mejia was kind enough to join us on a webinar that you can view here. A brief registration is required, but it’s well worth it.  And here’s a tip: the first part of the webinar is me talking. If you’re familiar with the NSB story then you can jump to the 25 minute mark where Mejia begins speaking.  It takes a minute or two for the webinar to load up.

I want to give you a sense of what IPC gained by moving to NSB.  Their environment is about 350 servers and 70 TB of primary storage, most of it on NetApp FAS 6280 systems. They use NSB to back up that data to a clustered FAS 3160, which is dedicated to backup.  Prior to NSB they were using Symantec NetBackup and having major headaches. Nightly incrementals started at 6:00 p.m. and finished up around 6:00 a.m. Weekend fulls started Friday night at 6:00 p.m. and lasted until Monday morning.
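To put those windows in hours, here is a small Python sketch. The calendar dates are arbitrary and only there to make the arithmetic run, and the 6:00 a.m. Monday finish for the weekend full is my assumption, since the only detail given was "Monday morning."

```python
from datetime import datetime


def window_hours(start: datetime, end: datetime) -> float:
    """Length of a backup window in hours."""
    return (end - start).total_seconds() / 3600


# Nightly incremental: 6:00 p.m. to about 6:00 a.m. the next day.
nightly = window_hours(datetime(2012, 5, 7, 18, 0), datetime(2012, 5, 8, 6, 0))

# Weekend full: Friday 6:00 p.m. to Monday morning.
# ASSUMPTION: a 6:00 a.m. Monday finish; the exact time was not stated.
weekend = window_hours(datetime(2012, 5, 11, 18, 0), datetime(2012, 5, 14, 6, 0))

if __name__ == "__main__":
    print(f"Nightly incremental window: {nightly:.0f} hours")
    print(f"Weekend full window: {weekend:.0f} hours")
```

Under those assumptions the nightly window is 12 hours and the weekend full runs about 60, which leaves very little slack before backups collide with business hours.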

This led to problems, as Mejia said:

“It was always a challenge hoping and praying there weren’t any kind of gotchas, like there always are with backups, that would cause that window to extend. And often it did extend beyond the window and ran into standard business hours. And often times depending on which systems were affected we did have an impact on the performance of our systems, and users had a problem with productivity.”

Not only that, but management was a burden.

“We did have one full-time resource dedicated to just managing backups. That person’s sole purpose was to, in essence, babysit the backup process and make sure we were getting successful backups. It was a very labor intensive process.”

Some new applications that were coming on-line and would significantly extend production hours pushed IPC to look for a solution “to meet our needs, particularly the one need of being able to potentially completely eliminate a backup window.”

They got it with NSB.  The average backup time for their servers is now between 1 and 15 minutes! The backup window is a non-issue. In addition, they gained a significant benefit from the NSB Instant Virtualization capability.  It’s not only useful for recovering systems, but it has dramatically enhanced IPC’s application development efforts.

“We have an in-house development staff and we do develop a good majority of the applications we use…  We’re able to leverage Instant Virtualization to bring entire applications, that are composed of multiple servers, from production into our development and staging environment. This minimizes the drift between development staging and production, in turn resulting in much quicker time to develop applications, much more streamlined testing and QA processes, and in the end a lot less issues and problems that make it out into production.”

That’s how NSB leverages the power of snapshots – launch entire applications in minutes using your most recent backup data. And because it’s using NetApp FlexClone to do it, there’s no additional storage required other than new writes to the system.  

Maybe the best part of the new solution, however, was the management relief. Rather than the full-time IT resource required before, with NSB:

“It takes barely an hour a day to go over, manage and maintain the entire solution…  That was a great win, because I can re-assign those resources that were basically just doing maintenance and operational type work and put them into more important tasks and projects that are more crucial to the organization.”

How is this achieved?  Partly it’s how easy the solution is to use.

“NSB’s capabilities of leveraging NetApp technologies as well as being entirely integrated into virtualization technologies allowed us to collapse the amount of administration interfaces into one. That’s one great benefit of the solution. The second great benefit is that it’s extremely intuitive. It’s very easy to get in front of the interface and through a very short learning curve understand how the backup jobs are configured, understand how to perform recovery operations, understand how to generate reports. In the end that resulted in really lowering our operations overhead.”

The other key is reliability. A great deal of backup management ends up being trouble-shooting and scrambling to recover from failed backup jobs. Not with NSB.

“I must say that our job failure rate is extremely low. If I get maybe one or two a week that’s a lot. And usually when we get a job failure it’s a problem with a particular server that was getting backed up. So over time we ended up not having a lot of focus on the backup solution because it just works so well… I remember in our NetBackup days I was hyper-focused on all kinds of detailed information on the backup because it was critical that you were on top of it all the time to make sure it was functioning correctly. With the Syncsort solution that really has changed.”

There’s more I could write, but I’ll leave you to listen to the webinar where you can hear it for yourself. If you have follow-up questions, the webinar will explain how you can get them to me, or just post a comment here.
