November 2011

It’s that time of year again, when we look into our rearview mirrors and our crystal balls.  I’ll save my 2012 data protection predictions for a future blog post, although I will note that my “100% Accurate, Guaranteed!” predictions for 2011 all came true!

For now, I want to reflect on the past year of blogging. Rather than picking the topics myself, I’ll let the blog readers do it.  By that, I mean I will now reveal the top five most read data protection blog posts on the Syncsort blog in 2011 as calculated by Google Analytics.

#5 Are Snapshots Backups? Yes Indeed!

I’ve done a number of posts around snapshots, and will continue to do so as I think they are critical to data protection. This post may have drawn a few extra eyeballs because at the time I was involved in a bit of back and forth with EMC blogger Mark Twomey, better known as Storagezilla. Mark continues his opinionated and always-interesting blogging, though I haven’t had any reason to respond to him lately. However, I’m still watching!

#4 Talking with Netgain at VMworld 2011

Scott Baynes, CTO of Netgain, tells a great story about how his company, a service provider, is gaining serious data protection benefits from the NetApp Syncsort Integrated Backup (NSB) solution. I really like these customer testimonial videos because the best advocates of NSB are the people who use and rely on it every day. If you like Scott Baynes’ video, check out this video with Campbell Alliance, also filmed at VMworld.

#3 Getting VMware Alignment Right

VMware alignment is a nice technical topic, the kind of thing I’d like to do more of in 2012.  I spent a lot of time in 2011 blogging about the higher-level benefits of NSB.  I’ll continue to do that, of course, but I think it’s time to dig a bit deeper into some topics.  This is also why we recently opened up our Syncsort Community site  to let Syncsort users and partners have detailed technical discussions around our products.  Join us!

#2 EMC Replaces 3 Solutions with 4

I have to say, this is a personal favorite!  I had a lot of fun with it, and I hope I drew some compelling comparisons between the EMC data protection portfolio and NSB.  At the time, I made the statement that NSB is beating EMC 80% of the time when customers conduct product evaluations. That was back in May, and we’re still maintaining that win rate.  That said, the folks at EMC are formidable competitors and they continue to work on their products. Of course, we do as well and I’m sure 2012 will find us knocking heads more than once. We might have a surprise or two in store, as well.

#1 P2V Migration in 10 Minutes

Perhaps it was the title that drew people in, with its “I can’t believe they can do that” effect. It’s true that NSB does indeed let you do 10-minute P2V migrations along with its data protection capabilities. What I find most interesting is how people continue to struggle with migrating to virtual machines despite the wide array of tools available and the fact that nearly everyone has at least some experience with the process by now. However, there’s always a better way of doing things, especially if you can do them in only ten minutes.

A sincere thanks to all of the loyal readers of our blog so far in 2011. We look forward to sharing our thoughts and exchanging ideas with you in the weeks ahead and as we move into 2012.

{ 0 comments }

I was on the Underground in London last week on my way back from visiting a financial services customer when I heard a couple of well-dressed gents carrying brollies (is it only us Brits that leave the house assuming it will rain no matter how nice it is outside?) musing over the old adage that the only things you can count on are taxes, death and trouble (as captured in this Marvin Gaye song).

Their conversation got me thinking that instead of trouble, there is actually another thing you can rely on today – that data is only going to get bigger. I would argue that the amount of useful information to be gleaned from this data is not growing at the same exponential rate. However, regardless of whether you consider your data ‘Big Data’ or not, you actually have to do a lot more “work” on your data as it grows to get business-relevant and valuable information from it.

A good example of this is close to the heart of those of us impacted by the Eurozone (I’m intentionally avoiding the long debate as to whether the UK is actually a member given we’ve kept our own currency but are paying to support the euro). The worldwide financial crisis rapidly accelerated new regulations and controls on markets and companies. In Europe, we already had Solvency II and Basel I, Basel II and now Basel III. These regulations are getting incredibly complex.

Calculations on “extreme” data volumes are required to remain compliant and keep senior executives from going to jail. In this case, picking the right ETL tool can be like receiving a “get out of jail free” card in Monopoly.

So why are the calculations required so complex? For starters, here in Europe we love complex rules, as evidenced by Commission Regulation (EC) No 2257/94, which states that bananas must be “free from malformation or abnormal curvature.” In the case of “extra class” bananas, there is no wiggle room, but “class 1” bananas can have “slight defects of shape” while “class 2” bananas can have full-on “defects of shape.” Yes, that’s right. We have regulations about the shape and curvature of bananas, and don’t even get me started on cucumbers (Commission Regulation (EEC) No 1677/88), where “class I” and “extra class” cucumbers are allowed a bend of 10mm per 10cm of length. Class II cucumbers can bend twice as much. So you can imagine how detailed our calculations must be for something like risk!

About two years ago, I was heavily involved with a very smart team working on industry models. To keep up with them, I decided I had to read and understand the Basel II regulations. All I will say is that whenever someone mentions they are working on a Basel project, it brings back horrible memories. I remember it being 4 a.m. on the first day of my “reading project” when I realised my brain hurt and that the scroll bar on the document didn’t look like it had moved. Tying this back to data integration, the point is that it’s definitely not just the volume of data that causes problems for customers. More often than not, it’s the complexity of the calculations or transforms they are dealing with.

Often when I’m speaking with people about data integration acceleration (a good example was the bank I visited earlier this week), they will respond that “our data isn’t really that big.” When pressed on how long it takes them to process their data and whether this satisfies the business, people usually pause and you can see the wheels turning in their heads. This is regularly followed by an admission that they are in fact exceeding their service level agreements. The next question is how much data growth they are seeing and whether they are prepared for it. After an even longer pause, the answer is usually something like “we plan for 20 percent growth” (a commonly accepted average). However, I’ve heard numerous companies admit that actual data growth could range from 10 percent all the way up to 600 percent! But no one ever says their data isn’t growing. Inevitably, the conversation ends up focusing on how much time they spend tuning their existing environment, how much hardware they are buying, and how they have no better option than to push transformations into the database.
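To put those growth figures in perspective, here is a rough Python sketch of how the planned and the admitted growth rates diverge once they compound. The 1 TB starting volume, the three-year horizon, and the assumption that the percentages are annual and compound yearly are all hypothetical choices made for illustration; none of them come from the conversations described above.

```python
def projected_volume(start_tb, annual_growth_pct, years):
    """Compound a starting volume (in TB) by a fixed yearly percentage."""
    return start_tb * (1 + annual_growth_pct / 100) ** years


START_TB = 1.0  # hypothetical starting volume, not a customer figure
YEARS = 3       # hypothetical planning horizon

# 20 percent is the planning figure quoted above; 10 and 600 percent are
# the two ends of the range companies admitted to.
for pct in (10, 20, 600):
    volume = projected_volume(START_TB, pct, YEARS)
    print(f"{pct:>3}% a year: {volume:,.2f} TB after {YEARS} years")
```

Even under these simple assumptions, the gap between the planned rate and the top of the admitted range grows by orders of magnitude within a few years, which is why the "are you prepared for it" question tends to produce the longer pause.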

It is always a bit amusing and always very satisfying when the same people who were saying they don’t have ‘Big Data’ are suddenly advocating for why data integration acceleration is needed and makes a lot of sense. Instead of reminding them of what they said in the first place, I simply smile and mention the amount of money they will be able to save from it, as well.

Perhaps I should revisit the title of my post. Three things that are guaranteed are death, taxes and data breaking your data integration infrastructure. If you are already using DMExpress, you can forget about the last one since we have you covered. Everyone else, you are invited to have one less thing to worry about. The whole death and taxes thing… we are sorry but can’t help there!

{ 3 comments }

Thanks to a recent post by my colleague Steven Totman, some of you may already know that I had the chance to participate on a panel about “The Breakpoints of Big Data” at the FIMA conference in London. One thing that really caught my attention during the entire conference was the number of attendees from the business side of the house. I think this represents further evidence that the information technology landscape continues to change dramatically. More than ever, the business stakeholders are willing to get their hands dirty to learn, participate and immerse themselves in the world of data management and information technology.

Some IT organizations may feel threatened by this trend and may choose to ignore it or minimize it. However, many others are already embracing it with significant benefits to the business. Ultimately, true collaboration between business users and IT empowers companies to make sense of ‘Big Data.’ I believe this holds true for most aspects of IT, especially those that are more strategic like analytics, business intelligence and, of course, data integration.

By enabling greater levels of collaboration between business users and IT, organizations can make a great leap forward and unleash the opportunities of ‘Big Data’ by:

  • Bringing together the right set of skills to understand the data and focus on the right areas
  • Accelerating development cycles, providing the business with the agility it needs to capitalize on opportunities faster than the competition
  • Understanding the business impact of diverse data services to prioritize resources and work towards a common goal
  • Generating more knowledgeable and satisfied business users, resulting in greater ROI and user adoption

That said, there is no perfect world. The new dynamics will require proper tools that allow and facilitate greater levels of collaboration, self-service, and reusability. That is why these concepts are core to Syncsort’s ETL 2.0 approach. Even then, there will always be sources of friction and areas of inefficiency. In this regard, it is really not any different than a couple that has been married for a long time.

The business user has arrived and there is no turning back. The good news for IT is that they just might have found the perfect ally for taming ‘Big Data’ and capitalizing on the opportunities it presents to organizations of all sizes.

{ 0 comments }

The Breakpoints of Big Data

November 9, 2011

Since this is my first post on the Syncsort blog, please allow me to briefly introduce myself as Steven Totman, Data Integration Business Unit Executive for Syncsort based out of the U.K.

I’m very lucky this week to be joined by Hal Lavender from Cognizant and Syncsort’s own Jorge Lopez (who you’ll no doubt recognize as one of our frequent bloggers) for the FIMA conference in London and a panel discussion we are hosting on “The Breakpoints of Big Data.” Syncsort also has a booth at FIMA, so if you are attending please make sure you stop over to see us and learn more about DMExpress.

Over dinner last night (there were no takers for a traditional British meal so we ended up at a curry house), the three of us along with Nejde Manuelian, Syncsort’s director of data integration sales for EMEA, got into an interesting discussion around when data actually became “big data.” One of our intro slides for our FIMA session charts back to the 1970s when data was stored on punch cards of 80 bytes each. Having a “big data” problem at that time meant you needed a bigger cupboard for your cards and you had paper cuts from handling too many of them!

In the 1980s, when 3.5-inch disks storing a massive 1.44 MB each were the norm, a “big data” problem meant your stack of disks with Monkey Island and Wing Commander spread across 20 of them fell over. In the commercial world, IBM came out with the 3380 storing an amazing 2.5 GB. As Jorge pointed out, Google now processes approximately 23 petabytes of new data a day! So when did “big data” actually start breaking our IT infrastructures?

Well, the reality is that “big data” has been breaking stuff for a while. At Syncsort, we regularly see customers who find that a mere terabyte of data is the breakpoint for many systems. For context, Hal has a few terabytes on his home desktop back in Texas, which he accesses from his iPad. It reminded me of a recent discussion with a CIO of a telecommunications company who explained that thanks to issues with doing ELT (where you push transformations into the database), he was going to have to ask the CFO for 40 percent more nodes (at $500k a pop) on his data warehouse database to handle the annual 10 percent growth. When the CFO asks him what he’s going to get for $2 million, the CIO is going to have to tell him that he will continue to get the same report he got yesterday with no improvements. Not surprisingly, this CIO was not exactly excited about presenting this “business case” to his CFO.
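For anyone who wants to check the CIO’s arithmetic, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above ($500k per node, a $2 million request, 40 percent more nodes, 10 percent data growth). The existing cluster size is not stated in the story; it is inferred here from those figures, so treat it as an assumption. The function and variable names are made up for illustration.

```python
COST_PER_NODE = 500_000    # dollars per node ("$500k a pop")
TOTAL_REQUEST = 2_000_000  # dollars the CIO has to ask the CFO for
NODE_INCREASE_PCT = 40     # percent more nodes requested
DATA_GROWTH_PCT = 10       # percent annual data growth


def implied_cluster(total_request, cost_per_node, node_increase_pct):
    """Infer how many nodes are being bought and how many already exist.

    The existing node count is an inference from the quoted figures,
    not something the CIO stated.
    """
    new_nodes = total_request / cost_per_node
    existing_nodes = new_nodes * 100 / node_increase_pct
    return new_nodes, existing_nodes


new_nodes, existing_nodes = implied_cluster(
    TOTAL_REQUEST, COST_PER_NODE, NODE_INCREASE_PCT
)

# Percent of extra hardware needed per percent of data growth.
hardware_per_growth_point = NODE_INCREASE_PCT / DATA_GROWTH_PCT

print(f"New nodes requested:    {new_nodes:.0f}")
print(f"Implied existing nodes: {existing_nodes:.0f}")
print(f"Hardware growth per point of data growth: {hardware_per_growth_point:.0f}x")
```

The last figure is the uncomfortable part of the “business case”: on these numbers, hardware has to grow several times faster than the data it serves.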

“Big data” that breaks IT infrastructure (especially ETL tools!) has been a dirty little secret for years and is just now generating mainstream awareness. The number of customers using DMExpress to “accelerate” their existing ETL tools is testimony to that, and in my view “big data” is as valid a description for a 5-person team with 10 terabytes of data as it is for a 500-person team with a petabyte.

If your company is combining data from multiple sources and it takes the IT team more than three months to add a new data source or create a new report (which was well above the average in a recent BeyeNETWORK survey) then chances are you have “big data.” The good news is that since back in the 1970s when punch cards had just been phased out, Syncsort has been enabling customers to seamlessly drop our software into their existing environments to accelerate and solve “big data” problems.

If “big data” is a new name for a long-standing problem, then Syncsort with our ETL 2.0 approach can be a key part of the solution. Please come visit us at FIMA. We’ve been solving “big data” breakpoints for years.

{ 0 comments }