Beyond MapReduce: An Easier Path to Hadoop!
After yet another snowstorm in the Northeast, I’m escaping to nicer weather in San Jose, California, where things are warming up at the Strata conference. I’m personally looking forward to seeing the huge round of announcements that usually accompanies the event.
This year is particularly special for us at Syncsort, as we just unveiled DMX-h Release 8, our biggest and most comprehensive Hadoop release, aimed at accelerating mainstream Hadoop adoption. That means Hadoop for (almost) everyone; Hadoop within reach!
Design Once & Deploy Anywhere!
So, what is so special about this new release? Well, for starters, it features a new Intelligent Execution Layer that allows you to visually design data transformations once and then run them anywhere, while maintaining the performance of a native implementation. This represents a huge step not only for organizations currently using Hadoop, but also for those concerned about how to leverage emerging compute frameworks in the future.
Syncsort DMX-h introduces an intelligent execution layer that allows users to design sophisticated data transformations, focusing solely on business rules, not on the underlying platform or execution framework.
Let me explain why this is so big. One of the most significant challenges for Hadoop adopters is selecting the right processing framework. In the beginning, there was MapReduce, the batch-oriented, general-purpose framework for processing Big Data. Organizations had to quickly learn about Mappers and Reducers to take advantage of Hadoop’s massively scalable processing capabilities. However, it soon became evident that one framework was not enough to cover all the emerging use cases for Big Data. Moreover, organizations demanded easier tools to scale adoption and development across their IT teams.
Fueled by new and existing requirements, the pace of innovation surrounding the Hadoop ecosystem has exploded. Today, new capabilities are added by the day, and new processing frameworks evolve and mature in just a few months. What were once considered state-of-the-art approaches are quickly becoming obsolete. So, where should you invest your money and effort? How can you make sure what you develop today is still usable tomorrow?
These are all relevant questions organizations need to ask sooner rather than later. For instance, teams using MapReduce today may already have to start thinking about a viable migration plan to Spark!
As Mark Grover describes in his article, Processing frameworks for Hadoop, “with the breadth of options now available, it can be tough to choose which framework to use for processing your Hadoop data”. But what if you didn’t have to commit to a particular processing framework?
Well, that’s the idea behind DMX-h Release 8 and its Intelligent Execution Layer. You can think of it as an abstraction framework that isolates users from the underlying complexities of Hadoop and its multiple execution frameworks, including MapReduce, Spark and Tez. As described by Mark, one of the key advantages of abstraction frameworks is that users can change the underlying processing framework without the need to rewrite their applications. This is huge!
However, Syncsort DMX-h Release 8 goes even beyond that. How? I’ll just mention a few things:
- DMX-h provides a drag-and-drop graphical development environment that is common to distributed and non-distributed environments. This means your applications can run with or without Hadoop.
- Most other abstraction frameworks introduce overhead, which is necessary to “translate” jobs so they can execute on the processing framework. DMX-h installs on every node, so jobs run natively within Hadoop, which means literally no overhead. Moreover, DMX-h will dynamically pick the best algorithms and execution path for the job at hand, depending on the underlying execution framework and run-time conditions.
- DMX-h provides enterprise-level capabilities such as security and metadata management.
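If the abstraction-layer idea sounds a bit abstract, here is a minimal Python sketch of the general pattern. To be clear, this is not DMX-h code or its API: `Job`, `run_local`, `run_partitioned` and `execute` are hypothetical names I made up for illustration, and the "partitioned" backend only imitates a MapReduce-style run inside a single process. The point is simply that the business rule is written once and the backend is chosen at run time.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# All names below are hypothetical and for illustration only.
# None of this is the DMX-h API.


@dataclass(frozen=True)
class Job:
    """A framework-neutral job: filter records, then sum a value per key."""
    keep: Callable[[dict], bool]
    key: Callable[[dict], str]
    value: Callable[[dict], int]


def run_local(job: Job, records: Iterable[dict]) -> Dict[str, int]:
    """Single-process backend: one pass over all records."""
    totals: Dict[str, int] = defaultdict(int)
    for rec in records:
        if job.keep(rec):
            totals[job.key(rec)] += job.value(rec)
    return dict(totals)


def run_partitioned(job: Job, records: Iterable[dict],
                    partitions: int = 3) -> Dict[str, int]:
    """MapReduce-style backend (simulated in one process):
    compute partial sums per partition, then merge them."""
    rows: List[dict] = list(records)
    chunks = [rows[i::partitions] for i in range(partitions)]
    partials = [run_local(job, chunk) for chunk in chunks]  # "map" phase
    merged: Dict[str, int] = defaultdict(int)
    for partial in partials:                                 # "reduce" phase
        for k, v in partial.items():
            merged[k] += v
    return dict(merged)


BACKENDS = {"local": run_local, "partitioned": run_partitioned}


def execute(job: Job, records: Iterable[dict],
            backend: str = "local") -> Dict[str, int]:
    """Run the same job definition on whichever backend is requested."""
    return BACKENDS[backend](job, records)


# The business rule, defined once: total positive sales per region.
job = Job(
    keep=lambda r: r["amount"] > 0,
    key=lambda r: r["region"],
    value=lambda r: r["amount"],
)

sales = [
    {"region": "east", "amount": 120},
    {"region": "west", "amount": 80},
    {"region": "east", "amount": -15},   # a refund, filtered out
    {"region": "west", "amount": 40},
    {"region": "north", "amount": 60},
]

local_result = execute(job, sales, backend="local")
partitioned_result = execute(job, sales, backend="partitioned")
```

Both calls take the same `job` object and produce the same totals; only the `backend` argument changes. Adding another backend in this sketch would mean registering one more function in `BACKENDS`, without touching the job definition.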
Break Free from Hadoop Complexity!
As we make our way into 2015, IT teams need to start thinking beyond MapReduce to a more comprehensive strategy for their data management architectures. Hadoop will certainly take on an even more important role in this strategy. By providing a common user experience across single and distributed environments, our newest DMX-h release allows many users, not just highly skilled developers, to create sophisticated data flows without worrying about mappers, reducers, or whether the data is big or small. Moreover, this layer sets a solid architectural foundation to future-proof your applications. With support for Windows, Unix, Linux and Hadoop MapReduce, and upcoming support for Spark, Tez and more, you no longer have to make tomorrow’s decisions with today’s sight. After all, you would never do that to your business, would you?
If you’re lucky enough to be in San Jose, stop by our Strata booth #923 and see DMX-h 8 in action, or you can check it out at www.syncsort.com/DMXh!