Syncsort Big Data Integration – Streaming and Kafka and Spark, Oh My!
In navigating big data integration technology, companies might start to feel like Dorothy in Oz, overwhelmed with the complexities of this new world, and bewildered by the constantly changing landscape. Syncsort’s goal is to simplify your job of navigating this wild maze of technology by providing the software equivalent of a yellow-brick road. Today, we’re announcing that the brand new version 9.0 of Syncsort DMX and DMX-h will be generally available in June 2016. Here are some of the ways it can make your big data integration job easier.
When Syncsort set out to build DMX/DMX-h 9.0, we had one overall goal: to make big data integration tasks simpler for our customers. To do that, we felt it was important to do some specific things.
Ensure DMX/DMX-h has the best mainframe access and integration capabilities in the world
First, Syncsort’s heart and greatest strength is in working with mainframes. Bringing mainframe data into your enterprise data hub or data lake should be just as smooth as bringing in all the rest of the data in your enterprise. And once it is in a Hadoop cluster, or other inexpensive, scalable data processing infrastructure, you need to be able to manipulate that mainframe data and blend it with the rest of your enterprise data.
We already announced the unique, new way we let companies save an unchanged archive copy of mainframe data on Hadoop. Once there, Syncsort teaches Hadoop to process mainframe data as if it were in native Linux and Hadoop encodings. We’ve packed the last two releases with features that make mainframe data a first-class citizen in modern data architectures, including support for complex mainframe data structures, removal of the 64K limit on record lengths, and an expanded set of supported mainframe data types. We are simply the best at mainframe access and integration with big data infrastructures, and each release just makes us better.
Enable DMX/DMX-h to handle streaming data sources
Syncsort also had the courage to take on the challenges of streaming data. DMX/DMX-h 9.0 fully supports Kafka, acting as both a consumer and a producer of Kafka topics. We also worked with MapR to help beta test MapR Streams, and were one of the first to receive MapR Streams certification.
Syncsort customers deal with huge streams of game event data flowing in from MMORPG data centers, financial data flowing in from live, online money transfers, and IoT data flowing in from hospital machines. That data needs to be consumed, processed, and in many cases pushed back out at streaming data speeds. Syncsort took on that challenge and conquered it in version 9.0.
What will simplify your life most about this new feature is that streaming data processing jobs can be defined in the same easy interface as batch jobs. You can even combine streaming and stationary data sources in the same workflow. There’s no need to learn yet another programming language or interface to handle streaming workloads.
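To make the idea of mixing a streaming source with a stationary one more concrete, here is a minimal Python sketch. It is not DMX/DMX-h code and does not use its interface; the names (event_stream, ACCOUNT_LOOKUP, enrich) and the sample records are hypothetical, and a plain generator stands in for a Kafka topic.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical stationary (batch) source: a lookup table loaded once.
ACCOUNT_LOOKUP: Dict[str, str] = {"a1": "EU", "a2": "US"}


def event_stream() -> Iterator[dict]:
    """Stand-in for a streaming source such as a Kafka topic (assumption)."""
    yield {"account": "a1", "amount": 25.0}
    yield {"account": "a2", "amount": 40.0}
    yield {"account": "a9", "amount": 5.0}  # no match in the lookup table


def enrich(events: Iterable[dict], lookup: Dict[str, str]) -> Iterator[dict]:
    """Join each streamed event against the static table as it arrives."""
    for event in events:
        region = lookup.get(event["account"], "UNKNOWN")
        yield {**event, "region": region}


if __name__ == "__main__":
    for record in enrich(event_stream(), ACCOUNT_LOOKUP):
        print(record)
```

Because enrich is a generator, each event is processed as soon as it is read, while the lookup table is loaded only once. That is the general shape of a workflow that blends moving and stationary data.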
Make it possible to customize and extend DMX/DMX-h quickly and seamlessly
When Dorothy found herself suddenly not in Kansas anymore, she had to be very adaptable to find her way back home. Customers facing the complex challenges of big data technology also need the ability to adapt to their unique circumstances. Syncsort added new frameworks to make DMX/DMX-h 9.0 highly extensible, without complicating job design.
This allows customers to call out to any other application that has a command line interface, and pass command line parameters right in the job designer interface. It also allows special small applications to be created for functionality needed only in specific instances. In both cases, the new plug-in functionality will show up in the user interface, and you can drop it into the job just like any built-in task.
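As a rough illustration of what calling out to a command-line application involves, the following Python sketch wraps an external program and passes it a parameter. It is only an analogy, not the DMX/DMX-h plug-in framework: the helper run_external_tool is hypothetical, and the Python interpreter itself plays the role of the external application.

```python
import subprocess
import sys
from typing import List


def run_external_tool(command: List[str], timeout_s: float = 30.0) -> str:
    """Run a command-line program and return its standard output.

    Raises subprocess.CalledProcessError if the tool exits with a
    non-zero status, so a failing tool fails the calling step too.
    """
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout_s,
    )
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical external application: a one-line Python script that
    # rounds the value passed as its first command-line parameter.
    cmd = [
        sys.executable,
        "-c",
        "import sys; print(round(float(sys.argv[1]), 2))",
        "3.14159",
    ]
    print(run_external_tool(cmd).strip())
```

Passing the command as a list rather than a single shell string avoids quoting problems when parameters contain spaces.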
Already, this extensibility has been put to use by Syncsort professional services. They’ve created new packages for special rounding, advanced math, and three different data pivot styles. Data scientists can use this framework for specialized algorithms. Once created, if you wish to share your new bit of functionality, you can place it in the Syncsort knowledge base. Already, the new capabilities that our own people have created are available in that library under Utilities. Over time, the library will expand, and when you need something unusual to take on your particular breed of flying monkeys, that will be the first place to look.
Add Apache Spark to the frameworks supported by Intelligent eXecution
The brains of DMX/DMX-h 9.0 are in the new addition to its Intelligent eXecution (IX) layer. IX already insulates the data processing job designer from the type of execution framework. If you’re trying to design a data integration job in a modern data architecture, you never know which framework that job will eventually have to run on. The last thing you want to do is rebuild all your design work when a new framework matures.
A few years back, MapReduce 1.x was the only way to execute data processing on a Hadoop cluster. Then, MapReduce 2.x with YARN became the standard. Now, Spark is the dominant framework for new cluster data processing development. Syncsort customers had the advantage of designing once, executing in MapReduce 1.x, then moving to MapReduce 2.x without doing anything more than changing a few settings. DMX/DMX-h 9.0 gives you the same ability with Spark. Syncsort’s goal is to support new frameworks as they become needed, so Syncsort customers don’t have to worry about what twists or turns in the path will come next.
Syncsort simplifies your big data integration challenge by future-proofing job designs. Design once, and you can deploy the same job on Linux, UNIX or Windows, on a server, in an on-premises cluster, or in the cloud, and now in MapReduce or Spark. Using Syncsort means never having to redo your data processing job design work, regardless of how much the big data technology landscape shifts.
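The "design once, run anywhere" idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Intelligent eXecution is implemented: the job is a list of hypothetical record-level steps, and the two "engines" (a sequential loop and a thread pool) merely stand in for frameworks such as MapReduce and Spark.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Step = Callable[[dict], dict]

# The job design: an ordered list of record-level steps, written once.
JOB: List[Step] = [
    lambda r: {**r, "amount": round(r["amount"], 2)},
    lambda r: {**r, "large": r["amount"] >= 100},
]


def apply_job(record: dict, job: List[Step]) -> dict:
    """Apply every step of the job to a single record, in order."""
    for step in job:
        record = step(record)
    return record


def run_sequential(records: List[dict], job: List[Step]) -> List[dict]:
    """Toy engine 1: process records one after another."""
    return [apply_job(r, job) for r in records]


def run_threaded(records: List[dict], job: List[Step]) -> List[dict]:
    """Toy engine 2: process records on a thread pool (order preserved)."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda r: apply_job(r, job), records))


# Hypothetical engine registry; choosing an engine is just a setting.
ENGINES: Dict[str, Callable[[List[dict], List[Step]], List[dict]]] = {
    "sequential": run_sequential,
    "threaded": run_threaded,
}


def run(records: List[dict], job: List[Step], engine: str) -> List[dict]:
    """Execute the same job design on whichever engine is selected."""
    return ENGINES[engine](records, job)


if __name__ == "__main__":
    data = [{"amount": 12.345}, {"amount": 250.0}]
    print(run(data, JOB, "sequential") == run(data, JOB, "threaded"))
```

The point of the sketch is that JOB never changes. Only the engine name passed to run does, which mirrors the "changing a few settings" experience described above.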
And More …
Version 9.0 has many more new features, including improved support for security, governance, and compliance, and expanded cloud capabilities such as Google Cloud Storage support.
Simply follow the yellow-brick … or simply use Syncsort DMX/DMX-h 9.0 and you’ll find your way.