March 2012

Syncsort kicked off our celebration of World Backup Day yesterday by hosting a Tweetchat with Enterprise Strategy Group Senior Analyst Jason Buffington. If you missed it, you can easily go back and check out the conversation by searching #backupjam.

One of the main participants who joined in was industry watcher and self-described “vendor watchdog” Jon Toigo. Jon made a number of interesting points during the discussion, and I want to touch on some of them. I’ll pick a few at a time over the next week or so and share some additional thoughts.

Jon’s opening remark was almost philosophical:

Backup defined: A humble acknowledgement of the fallibility of all technology, and of our increasing dependency upon it

That is certainly an interesting way to define “backup,” and there is nothing there I would argue with. If you live and breathe in the backup space like I do, sometimes it all starts to seem very normal. But think about it: literally billions of dollars are spent every year to protect information, to protect intangible, digital assets that exist ultimately as zeros and ones. And why do we spend that money? We do it because every digital tool fails eventually. Every single one!

Whether it’s your iPhone, a laptop or a multibillion-dollar data center project like the one Jon referenced during the Tweetchat, sooner or later something goes wrong. The result is that something very valuable to somebody vanishes, whether that’s your only copy of beloved vacation photos, your collection of five thousand songs, or a customer database that a business cannot function without. And if you’ve ever suffered through this – and who hasn’t at least once? – you know the panic it causes, the desperate too-late feeling of “I knew I should have backed that up.”

If you lose data and do have a backup of it, the sense of relief when those files start coming back is incredible. If you’re an IT person responsible for backup, it’s more than relief. It is not an exaggeration to say it can mean saving your job, or even saving your company and the jobs of many other people. It’s that important.

That’s why World Backup Day was started, so that we can all think about what we’d be losing if the inevitable decided to pick us as its next victim.  And that’s the thing about something that happens to everyone: it happens to you, too.

{ 0 comments }

Last week, I attended two days of sessions at GigaOM’s Structure:Data Conference in New York City, where over 700 attendees came together to discuss how Big Data is transforming businesses and industries, along with the latest technologies and approaches for managing it all.

What struck me this year is how the conversation has evolved. Big Data is no longer seen as an infrastructure-only issue. There is now a realization that the Big Data stack requires contributions from every layer, from the infrastructure at the bottom up through the applications at the top.

The following key themes emerged from the onsite discussions and will be the focus as the community continues to develop the Big Data stack:

1) It’s all about high-performance computing and speeding up analytics as data volumes grow exponentially. The pain points for unstructured versus structured data are different. While unstructured data requires better visualization, structured data requires more cleansing, making filtering and grouping much more critical. One of the speakers referenced a quote from Clay Shirky: “Information overload is not the problem. It’s filter failure.”

2) The line between personal and business behavior is blurring as analytics moves out of the IT realm and into the hands of business users. As a result, there is an expectation that data be delivered in forms that are easier to consume, such as through visualization and collaboration capabilities.

3) Real-time decision making through predictive analytics and machine learning is becoming essential, driven by sensor data, digital exhaust and the need to gain insight into consumer behavior.

As such, there’s a realization that the Big Data market is fragmented, and there is plenty of opportunity to contribute to building the Big Data stack. Software packages and tools need to be built on top of Hadoop, for example, to increase enterprise adoption. Currently, most of the available enterprise software is proprietary. Offering applications layered on top of Hadoop will spur the Big Data market, leading to more open source contributions and additional opportunities for startups.

Syncsort has a lot to offer in the areas of performance, data integration, and processing, all critical components of the Big Data stack. We can deliver and run ETL over Hadoop without requiring a brand new development team and skill set. One of the speakers suggested that businesses should consider adopting Hadoop only if they are willing to dedicate a separate team. Syncsort’s offering eliminates this requirement for the enterprise. We can also efficiently move data in and out of Hadoop, which, as John Webster points out in his CNET post, continues to be an issue.

To reach the holy grail of Big Data management, the focus needs to be on building a top-to-bottom Big Data stack, which will require different segments of the market to come together.

{ 0 comments }

Join us at the #Backupjam

March 29, 2012

Get your questions and comments ready, because it’s #backupjam time!

Syncsort is hosting a Tweetchat today in advance of World Backup Day.  We’ll be joined by Jason Buffington, Senior Analyst from ESG, and from 1:30 to 2:30 EDT we’ll be taking any and all questions and comments around data protection.  Wondering about the best new technologies to protect your data? Need to know how you can recover critical applications faster?  Have questions about the impacts of Big Data, Cloud and virtualization?  It’s all open for discussion.

To join us today:

Looking forward to “seeing” as many of you as possible there!

{ 0 comments }

I recently came across a blog post from Susan Hall over at IT Business Edge on the “Seven Keys to Becoming a Data Integration Expert.” Naturally, the headline caught my attention and I soon learned that it was based on a recent post from David Linthicum on “Obtaining Mad Data Integration Skills.”

As I read through both of these posts, I started thinking: instead of keeping the order in which the seven keys were originally listed, what if I ranked them by how much time and money they cost organizations during an average month? Here is what I came up with:

  1. Performance
  2. Data governance (this could arguably be number 1, but most organizations I have seen aren’t really doing wholesale data governance)
  3. Security
  4. Rules and routine
  5. Database concepts
  6. Interfaces to data
  7. Data mediation and transformation

David’s performance criterion is “…the ability to define how a data integration solution will perform over time. This is very important.” I couldn’t agree more! Building performance and scalability into a DI approach is not only important today, but also for the Big Data requirements of the future. David goes on to say that many DI approaches “become useless after several years.”

We see this every day with our customers and partners.  When they’ve hit the wall with their current approach, they often try one or more of the following:

  • Add hardware (CPU, memory) – this is expensive and adds to the software cost, and usually does not scale linearly
  • Fine-tune the approach/tool – this requires very senior IT staff and/or highly-skilled (read: expensive) consultants from the vendor
  • Rip out the logic and push it into the database – now you have an ELT approach, with the cost and complexity pushed into hundreds of lines of SQL and PL/SQL
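
To make the first point concrete, here is a minimal sketch of why added CPUs rarely deliver linear gains, using Amdahl's law. The 90% parallel fraction is an assumed figure chosen for illustration, not a measurement from any real workload, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Theoretical speedup when only part of a job can use extra CPUs."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    # The serial part of the job runs at the same speed no matter how many CPUs exist.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


if __name__ == "__main__":
    # Assumption for illustration: 90% of the integration job parallelizes cleanly.
    for cpus in (1, 2, 4, 8, 16, 32):
        print(f"{cpus:>2} CPUs -> {amdahl_speedup(0.9, cpus):.2f}x")
```

Under that assumption, going from 16 to 32 CPUs doubles the hardware but lifts the theoretical speedup only from about 6.4x to about 7.8x, and no amount of hardware gets past 10x.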

Syncsort helps customers solve their performance and scalability issues without needing to resort to stop-gap measures that accelerate costs.

Thanks to Susan and David for their posts and the inspiration they provided me to write this one.  I look forward to following the discussion on their blogs and reading what they write about next. In the meantime, feel free to leave a comment or challenge me if you want to debate the way I’ve ranked the list above.

{ 2 comments }