Hadoop, Mainframe & Syncsort: Simply the Best, Part 1
Hi, I’m Paige Roberts, the new Product Manager for big data. As the designated “new guy” here at Syncsort, I’ve been hearing a lot about Syncsort being “Simply the Best” at Hadoop mainframe integration. I sat down with Arnie Farrelly, VP of Global Support and Services, and asked him some pointed questions about what puts Syncsort in a whole different league when it comes to mainframe access and integration for Hadoop data lakes. Arnie has been with Syncsort for 26 years in roles ranging from support engineer to a number of management positions, and he currently manages the professional services, product support and pre-sales engineering teams for the big data business unit. Since Arnie’s job puts him in the middle of nearly every Syncsort customer engagement, he seemed like exactly the right person to fill me in on what makes Syncsort tick.
The interview was so chock-full of great information that we are delivering it as a three-part series. Hope you enjoy!
Let’s start with the motivation behind this big push. Why is it so urgent for customers to get their mainframe data into Hadoop?
There are a couple of major motivators. First, it’s the economics. Customers want more of their data accessible so they can do more analytics, and Hadoop is the most cost-effective platform to do this. Historically, mainframe data storage has been very expensive. Tape storage was good for storing large amounts of data, but it wasn’t accessible. Data was moved to tape just because it was too expensive to store it all in normal DASD storage. What we’re seeing now is people wanting to move even the tape-stored data into Hadoop. The economics of storing that data on Hadoop are excellent, and it’s also far more reliable. Over time, tape deteriorates. Most of all, it means the data is accessible in Hadoop, where customers can run analytics on it. Of course, getting that data into Hadoop to unlock its value is difficult.
So, having an active, accessible archive is one goal?
Yes, exactly. To have that data in a place where it can be analyzed quickly and cost-effectively. But that’s just one thing, and not even the main advantage of having mainframe data in Hadoop. The best thing customers get is insight into information that they may never have seen before. Mainframes contain the key data in a lot of businesses, but combining that data with other enterprise data is where the real value lies. Companies can make better and more informed decisions for the future, impacting their future profitability, growth and more. The economics and scalability of Hadoop change the game. Things that were impossible before, or not cost-effective to do… Now, anything is possible.
Is Syncsort DMX-h frequently used to access mainframe data and bring it into Hadoop? Are companies having trouble with that?
Yes, we’re seeing a lot of customers wanting to do that, and we’re also seeing them struggle with it. There aren’t a lot of utilities out there that are easy to use. There are some good tools in the Hadoop stack itself, but they require Hadoop skills to use, and those can be hard to find. Companies don’t necessarily want to spend money and time training a lot of their people in Hadoop development. They look to us to provide an easy-to-use utility that can quickly and cost-effectively move data from the mainframe, and a variety of other data sources, into Hadoop. We handle complex mainframe COBOL, VSAM and DB2 data head and shoulders better than anyone else. And what we’re hearing from customers is that compared to other products like Informatica, the learning curve on our product is far shorter. In a POC we were working on recently, a customer said, “It took me months to learn the Informatica interface.” He picked up our product and within a day he was using it effectively. The simplicity of our interface is unique in the market today. It really lets you move data off the mainframe and into Hadoop without a lot of hard-to-find skills.
So, finding people with Hadoop skills is a challenge that’s holding customers back. Is that a problem on the mainframe side at all?
Absolutely. In the mainframe space, you’ve got a diminishing pool of people with the right skills. That’s probably another reason why moving to a new environment for data processing is appealing. It’s also why Syncsort’s long history with mainframe and deep expertise in that area put us so far ahead.
Are there other mainframe-related challenges you’re seeing in the market?
One of the challenges that we see a lot is difficulty with metadata. Data in COBOL applications tends to drift over time, and if the copybooks aren’t kept up-to-date, then it can take weeks for a company to figure out why the data isn’t matching the metadata. One of the things our tool provides is a quicker way to identify where those mismatches are. Customers need solutions that can reduce the amount of time it takes to do that, so that they can get the data into Hadoop in a form that can be blended with other data sources and analyzed.
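To make the copybook-drift problem Arnie describes a little more concrete, here is a minimal sketch of the kind of check involved. It is not how DMX-h works, and every name in it (`Field`, `find_mismatches`, `LAYOUT`, the sample fields) is hypothetical. It assumes fixed-length records that have already been decoded to text, and a layout that was transcribed by hand from a copybook.

```python
from typing import List, NamedTuple


class Field(NamedTuple):
    """One field of a hypothetical fixed-length record layout."""
    name: str
    length: int
    numeric: bool  # True if the field is expected to hold only digits


def find_mismatches(record: str, layout: List[Field]) -> List[str]:
    """Return human-readable descriptions of where a record disagrees with its layout."""
    problems = []

    # A drifted record is often simply longer or shorter than the layout says.
    expected = sum(f.length for f in layout)
    if len(record) != expected:
        problems.append(f"record length {len(record)} != layout length {expected}")

    # Walk the layout and check that numeric fields really contain digits.
    offset = 0
    for f in layout:
        value = record[offset:offset + f.length]
        if f.numeric and not (value.isascii() and value.isdigit()):
            problems.append(f"{f.name}: expected digits, got {value!r}")
        offset += f.length

    return problems


# Assumed layout: the metadata still says the name field is 10 characters wide.
LAYOUT = [
    Field("CUST-ID", 6, True),
    Field("CUST-NAME", 10, False),
    Field("BALANCE", 7, True),
]

good = "000123" + "JANE DOE  " + "0004500"
# The application has since widened the name to 12 characters,
# but the layout above was never updated.
drifted = "000123" + "JANE DOE    " + "0004500"

for label, rec in (("good", good), ("drifted", drifted)):
    print(label, find_mismatches(rec, LAYOUT))
```

In the drifted record, the widened name field pushes the balance two characters to the right, so the check reports both a length mismatch and non-digit content in the numeric field. Real mainframe data adds complications this sketch ignores, such as EBCDIC encoding and packed decimal fields.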
Interested in hearing more from Arnie? Stay tuned for Part 2 of our blog series, where you’ll learn why DMX-h is “Simply the Best” for mainframe to Hadoop!