primary storage

Innovation, and what succeeds in the market, is an endlessly interesting topic. I was reminded of this recently when I read a New Yorker magazine profile of Clayton Christensen, the business guru most famous for his book “The Innovator’s Dilemma.” The profile extends beyond his work: it covers his family background, his battle with cancer, his religious faith, and more. In all, it is a fascinating and inspiring profile that I highly recommend. At the moment, it’s behind a subscription wall, so if you have access you can get it here, or you can read it in the May 14, 2012, print edition.

Christensen’s notion of “disruptive innovation” applies across any industry. An interesting example is perhaps Christensen’s most famous “miss” about the iPhone, which he predicted would not succeed because it was just a fancy cell phone. What he realized later, after its phenomenal success, was that the iPhone was actually disruptive to laptops, not just to other cell phones. A great insight, albeit after the fact.   

All of this got me thinking about changes in the backup world in the past few years, particularly two disruptive technologies, deduplication and snapshots.

Deduplication first made its mark in the form of deduplication appliances, single-purpose devices that were highly disruptive to tape as a backup target. Disk had long been used for backup, whether as plain disk or in the form of a VTL, but it remained a niche methodology because it was just too expensive. As a result, disk was limited to only a day or two of data retention, if used at all. Deduplication radically changed the economics by providing data reduction rates of 90% or more, which is another way of saying you could get ten times as much use out of the same amount of disk, and potentially twenty times at a 95% reduction rate.
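To make the arithmetic concrete, here is a small Python sketch that converts a data reduction rate into an effective capacity multiplier. The function name is invented for illustration; the only thing it encodes is the formula 1 / (1 - reduction rate), so a 90% reduction works out to about ten times the usable capacity and a 95% reduction to about twenty times.

```python
def capacity_multiplier(reduction_rate: float) -> float:
    """Return how many times more data fits on the same amount of disk.

    reduction_rate is the fraction of data eliminated (0.90 means 90%).
    """
    if not 0 <= reduction_rate < 1:
        raise ValueError("reduction_rate must be in [0, 1)")
    return 1 / (1 - reduction_rate)


# Hypothetical reduction rates, chosen only to show how the curve behaves.
for rate in (0.50, 0.90, 0.95):
    print(f"{rate:.0%} reduction -> {capacity_multiplier(rate):.0f}x capacity")
```

Note how steep the curve gets near the top: the last five points of reduction, from 90% to 95%, double the effective capacity.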

It changed the face of backup as far as tape was concerned, but interestingly, deduplication was not disruptive to the backup process. Users started replacing tape drives with disk, but everything else stayed the same. In the end, deduplication appliances were disruptive to only a portion of the backup process at the very end of the line. They were evolutionary, not revolutionary.

Snapshots have the potential to be truly revolutionary because they disrupt the entire traditional backup process, changing it from end to end, not just at the final step in the chain. But even though snapshots have been around for a long time, they are still not the leading way to protect data, despite all their advantages of speed and performance. A survey by UBM TechWeb (commissioned by Syncsort) showed that only 25% of users made use of primary storage snapshots (you can get the full survey here).

Why the limited uptake? A few key reasons: 

  • Cost: snapshots are typically done on primary disk, which is expensive.
  • Performance: many disk arrays suffer significant performance degradation as snapshots accumulate.
  • Complexity of restore: snapshots are great at capturing data, but a lot of disk systems do not have convenient, easy-to-use workflows for recovering data, do not have a catalog, etc.
  • Limited retention time: because they are expensive, you normally can’t keep weeks or months of data on snapshots.

Maybe this is why snapshots haven’t been as disruptive to traditional backup as might have been expected. So are snapshots destined to remain a limited use option, typically relegated to tier-1 applications and short retention times?

Not at all! There’s a disruptive technology in town now, and it’s called NetApp Syncsort Integrated Backup (NSB). How does NSB change things? It is quite simple. NSB takes the snapshots off the primary storage and puts them onto secondary storage, and then overlays them with easy recovery workflows and a catalog. This seemingly simple change in the design addresses all of the key reasons listed above for limited uptake.

I’ve written about this before here if you’re interested in more specifics.

For now, I will conclude with a concept from Clayton Christensen, who describes consumer product selection as people looking for a way to get “jobs to be done.” Simply put, people don’t want products, they want to get something accomplished. The IT world is no different. None of us want backup software, really. What we want is for data to be protected and easily recoverable in a way that is cost-effective and reliable, and doesn’t demand too much of our attention. This is exactly what NSB delivers, as we heard recently from a user. It can do the same for you.


When it comes to evaluating technology, nothing speaks louder than the voice of the customer. Vendors can say what they want about a product, but what matters is how it works in the day-to-day world of IT, where everything that can go wrong sooner or later does.

Recently, Syncsort and NetApp jointly sponsored a webinar that featured a user of the NetApp Syncsort Integrated Backup (NSB) solution.  Fernando Mejia is the Senior Manager of IT Infrastructure for IPC, which is a Franchisee Purchasing Cooperative for the SUBWAY restaurant stores. IPC helps the 28,000 SUBWAY restaurants in the U.S. and Canada reduce costs by leveraging their collective purchasing power. IPC supplies everything from food to paper goods to IT processes.

Mr. Mejia was kind enough to join us on a webinar that you can view here. A brief registration is required, but it’s well worth it. And here’s a tip: the first part of the webinar is me talking. If you’re familiar with the NSB story, you can jump to the 25-minute mark, where Mejia begins speaking. It takes a minute or two for the webinar to load.

I want to give you a sense of what IPC gained by moving to NSB. Their environment is about 350 servers and 70 TB of primary storage, most of it on NetApp FAS 6280 systems. They use NSB to back up that data to a clustered FAS 3160, which is dedicated to backup. Prior to NSB they were using Symantec NetBackup and having major headaches. Nightly incrementals started at 6:00 p.m. and finished up around 6:00 a.m. Weekend fulls started Friday night at 6:00 p.m. and lasted until Monday morning.

This led to problems, as Mejia said:

“It was always a challenge hoping and praying there weren’t any kind of gotchas, like there always are with backups, that would cause that window to extend. And often it did extend beyond the window and ran into standard business hours. And often times depending on which systems were affected we did have an impact on the performance of our systems, and users had a problem with productivity.”

Not only that, but management was a burden.

“We did have one full-time resource dedicated to just managing backups. That person’s sole purpose was to, in essence, babysit the backup process and make sure we were getting successful backups. It was a very labor intensive process.”

Some new applications that were coming on-line and would significantly extend production hours pushed IPC to look for a solution “to meet our needs, particularly the one need of being able to potentially completely eliminate a backup window.”

They got it with NSB.  The average backup time for their servers is now between 1 and 15 minutes! The backup window is a non-issue. In addition, they gained a significant benefit from the NSB Instant Virtualization capability.  It’s not only useful for recovering systems, but it has dramatically enhanced IPC’s application development efforts.

“We have an in-house development staff and we do develop a good majority of the applications we use…  We’re able to leverage Instant Virtualization to bring entire applications, that are composed of multiple servers, from production into our development and staging environment. This minimizes the drift between development staging and production, in turn resulting in much quicker time to develop applications, much more streamlined testing and QA processes, and in the end a lot less issues and problems that make it out into production.”

That’s how NSB leverages the power of snapshots – launch entire applications in minutes using your most recent backup data. And because it’s using NetApp FlexClone to do it, there’s no additional storage required other than new writes to the system.  

Maybe the best part of the new solution, however, was the management relief. Rather than the full-time IT resource required before, with NSB:

“It takes barely an hour a day to go over, manage and maintain the entire solution…  That was a great win, because I can re-assign those resources that were basically just doing maintenance and operational type work and put them into more important tasks and projects that are more crucial to the organization.”

How is this achieved?  Partly it’s how easy the solution is to use.

“NSB’s capabilities of leveraging NetApp technologies as well as being entirely integrated into virtualization technologies allowed us to collapse the amount of administration interfaces into one. That’s one great benefit of the solution. The second great benefit is that it’s extremely intuitive. It’s very easy to get in front of the interface and through a very short learning curve understand how the backup jobs are configured, understand how to perform recovery operations, understand how to generate reports. In the end that resulted in really lowering our operations overhead.”

The other key is reliability. A great deal of backup management ends up being trouble-shooting and scrambling to recover from failed backup jobs. Not with NSB.

“I must say that our job failure rate is extremely low. If I get maybe one or two a week that’s a lot. And usually when we get a job failure it’s a problem with a particular server that was getting backed up. So over time we ended up not having a lot of focus on the backup solution because it just works so well… I remember in our NetBackup days I was hyper-focused on all kinds of detailed information on the backup because it was critical that you were on top of it all the time to make sure it was functioning correctly. With the Syncsort solution that really has changed.”

There’s more I could write, but I’ll leave you to listen to the webinar where you can hear it for yourself. If you have follow-up questions, the webinar will explain how you can get them to me, or just post a comment here.


ESG’s Steve Duplessie has a great new blog post this week titled IT Chasms, Gaps and A New World Order. Featuring Steve’s classic, straight-shooting style, it is well worth your while to give it a read. It focuses mostly on networking (the kind with routers, not meeting people for a drink), but he makes a very interesting point about storage that I think is important and want to explore further.

After discussing how important it is for vendors to help customers develop applications faster, Duplessie says this:

The bigger truth is telling a storage buyer that your stuff is awesome because he can go faster running VMware is cool, but telling the App owner that your storage features will enable them to cut test and Q/A time by 30% is where the money is.

Hats off to that!  Steve is dead-on here. And one of the ways to do this – I would argue the best way – is by using your backup storage.

Let’s step back a bit.  Normally, when you hear vendors talking about using storage for test/dev tasks they start talking about snapshots and clones, and that usually means doing this with your primary storage.  Does it work? It does, but there’s a price to pay. 

First, primary storage is expensive, and using up high-speed disk resources for tasks that do not require high-performance is spending money you’d rather not spend. Second, it impacts performance.  Many disk array snapshots create quite a bit of impact on performance because the copy-on-write model means two writes and one read every time a block is written. To provide a hypothetical example, if “Barry the Unruly Developer” wants to do a lot of test/dev work off your primary disk, you risk serious impact to production performance.  
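To illustrate where that extra I/O comes from, here is a toy Python model of a volume with a single copy-on-write snapshot. It is a sketch under simplifying assumptions, not how any particular array is implemented: the class and method names are invented, blocks are plain strings, and the model pays the penalty only on the first overwrite of each block after the snapshot is taken.

```python
class CopyOnWriteVolume:
    """Toy model of a volume with one active copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live production data
        self.snapshot_area = {}      # preserved originals, keyed by block index
        self.reads = 0               # physical reads caused by writes
        self.writes = 0              # physical writes caused by writes

    def write(self, index, data):
        # First overwrite since the snapshot: preserve the original block.
        if index not in self.snapshot_area:
            original = self.blocks[index]         # read the old block
            self.reads += 1
            self.snapshot_area[index] = original  # write it to the snapshot area
            self.writes += 1
        self.blocks[index] = data                 # write the new data
        self.writes += 1

    def read_snapshot(self, index):
        """Return the block as it looked when the snapshot was taken."""
        return self.snapshot_area.get(index, self.blocks[index])


vol = CopyOnWriteVolume(["a", "b", "c", "d"])
vol.write(0, "A")   # first overwrite: 1 read + 2 writes
vol.write(1, "B")   # first overwrite: 1 read + 2 writes
vol.write(0, "AA")  # block already preserved: 1 write
print(vol.reads, vol.writes)   # prints "2 5"
print(vol.read_snapshot(0))    # prints "a"
```

In this model, three application writes turn into two reads and five writes on disk. That is the overhead a busy test/dev workload would add on top of production traffic if both shared the same primary array.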

If you happen to use NetApp for your primary storage, you happily avoid this performance penalty because not all snapshots are created equal. But what if you don’t have NetApp primary storage?

That’s where NetApp Syncsort Integrated Backup (NSB) can help.  NSB lets you back up from any primary storage environment to a NetApp FAS device.  When NSB captures data, it stores it using NetApp Snapshots. And guess what? You have full access to cloning capabilities. The benefits of this are many.

1.  Everything is running on secondary storage. That means low-cost SATA drives with loads of capacity.

2. Everything is running on secondary storage. That means that no matter how many clones you spin up, no matter how hard “Barry the Unruly Developer” bashes away at the system, the impact to your production environment is zero, as in none whatsoever!

3. Everything is running on secondary storage.  That means it’s all consolidated onto a single hardware platform, no matter what mix of primary disk you have. It even protects boot drive data that’s not on a SAN, so Barry has access to all the application information, not just the data volumes.

4. Everything can also run on tertiary storage. Just use SnapMirror replication to send your backups to a DR site, and you can do all your test/dev over there.

5. It’s all super easy. NSB overlays the NetApp Snapshot and FlexClone features with super-simple workflows. That means the person dishing out the storage to the test/dev folks doesn’t have to know how a NetApp FAS works. How many steps does it take to provision a 2 TB SQL database volume clone to a dev? A couple of mouse clicks. You can see how it’s done here.

6. It’s physical. It’s virtual. It’s virtu-physical! NSB can take any backup from any server and boot it from a FlexClone into a new VM in about ten minutes, start to finish. That’s right. When “Barry the Unruly Developer” demands a SQL Server instance to work on, you can say “Ten minutes, Barry!” And ten minutes later Barry has a brand new VM running that he can play with all he likes. All running off a FlexClone, using zero extra storage footprint. And running – did I mention this? – from secondary storage or even tertiary, if you’d rather have Barry as far away as possible! To see how this works, click here.

I could go on, but I think you get the idea. We have users doing this every day, leveraging their backup data for tasks beyond recovery: development, testing, data mining, reporting, even virus scanning. In short, anything that requires copies of your data and that you would prefer to offload from production hardware.

Saves time. Saves money. So easy that your most inexperienced IT person can be designated as “the guy that Barry gets his data from.” (And not to worry, inexperienced IT person – you can schedule NSB to deliver Barry his data every day, automatically.)

It makes you smile. It makes Barry smile. What’s not to love?


There’s a rather shocking story from Computerworld today about email outages and other IT failures at the White House. It seems that shortly after the new CIO came on board in 2009, they suffered a 21-hour email outage. Yikes! Several more outages followed. Eventually they got their systems under control, but it should make you ask yourself: how resilient are my systems? How long would it take my organization to recover from a major application outage?

The article provides no specifics on what went wrong or what was done to restore services. I’d rather not speculate, since there could be a hundred different causes. The larger point is that they clearly didn’t have a mechanism in place to get things back fast. So let’s try a little thought experiment: What if the White House had been using NetApp Syncsort Integrated Backup?

Let’s use an imaginary but plausible timeline and scenario to see how things might have gone…

1:00 p.m.   Email goes down hard, a sudden catastrophic failure of the application. Email admins are looking at the problem.

1:05 p.m.  It’s been determined that there was a massive failure on the storage hardware that will take many hours to resolve. Critical communications are down and will be for a long time according to the storage admin. The IT manager asks for ideas.

1:06 p.m.  The backup admin notes that the physical email server is being backed up hourly by NSB and the last backup ran 22 minutes earlier.  He tells the team they can restart the server as a virtual machine.

1:08 p.m.  The backup admin opens the NSB console. He already has a recovery job created that will always use the most recent backup image. With one click he starts the Instant Virtualization recovery job which boots a new VM from a NetApp FlexClone snapshot. This is running from backup storage, not the primary storage which will take many hours to repair.

1:18 p.m.  Within ten minutes, the virtualized email server is up and running. Email flow restarts. The crisis over, the IT team begins the lengthy process of fixing the primary storage problem.
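The numbers in this imaginary timeline can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are invented, and it assumes the “22 minutes earlier” is measured from 1:06 p.m., when the backup admin checks the last backup.

```python
from datetime import datetime, timedelta


def t(hhmm: str) -> datetime:
    """Parse a 24-hour HH:MM time on an arbitrary (same) day."""
    return datetime.strptime(hhmm, "%H:%M")


failure = t("13:00")            # email goes down
backup_checked = t("13:06")     # admin notes the last backup
last_backup = backup_checked - timedelta(minutes=22)  # assumption: counted from 13:06
service_restored = t("13:18")   # virtualized server is up

data_at_risk = failure - last_backup   # changes made since the last backup
downtime = service_restored - failure  # total time email was unavailable

print(f"Last backup:  {last_backup:%H:%M}")
print(f"Data at risk: {data_at_risk.total_seconds() / 60:.0f} minutes")
print(f"Downtime:     {downtime.total_seconds() / 60:.0f} minutes")
```

Under that assumption the last backup ran at 12:44, so at most 16 minutes of changes were at risk and email was down for 18 minutes in total, compared with the many hours needed to repair the primary storage.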

Sound impossible?  NSB customers do this every day.  Any application that goes down, whether from a physical or virtual source, can be restarted in minutes as a virtual machine. When the most important thing is recovery time, then recovery time is the most important thing. How fast can you recover an application today? How fast could you do it if you had NSB?
