
Break Down Barriers to Hadoop Adoption with Syncsort DMX-h

Plenty of organizations are making significant Hadoop investments, but many of these projects are still in their earliest stages. That means the astounding potential business impact of analytics-based business processes is still going largely unrealized.

Data is flowing like never before, and organizations are trying to figure out how to get the maximum value from it in a reasonably timely manner. Hadoop represents a major step forward in leveraging big data, but making the most of Hadoop has largely been limited to organizations with big budgets and access to specific skill sets.


An October 2013 survey of big data practitioners by the Sandhill Group found that the top uses of Hadoop were basic analytics, business intelligence, and data preparation. The same survey found that the top challenges to Hadoop adoption were knowledge and experience, skills availability, and development effort.

An informal poll taken by Merv Adrian and Nick Heudecker of Gartner during a January 2014 webinar found that the top barriers to Hadoop adoption included an “undefined value proposition,” the cost of acquiring skills, and integration with the rest of the infrastructure. Even when a company has defined its big data value proposition, skills and infrastructure integration are likely to remain prominent barriers to Hadoop adoption.

Breaking Barriers with DMX-h for Hadoop ETL

Organizations want to make the most of the tsunami of data they have access to, but they’re reaching the architectural limits of their data processing infrastructure. So they’re increasingly turning to the Hadoop MapReduce framework to reduce cost while scaling data collection and processing. The problem is, most extract-transform-load (ETL) tools simply generate code that’s executed in Hadoop, without integrating Hadoop into their architecture. That forces organizations to chase down hard-to-find MapReduce skills, manually maintain mountains of code, and continue to add hardware.

Syncsort’s DMX-h for Hadoop ETL changes all that. DMX-h for Hadoop ETL is high-performance ETL software that lets users maximize the benefits of MapReduce without giving up the capabilities, ease of use, or use cases of traditional data integration tools.

What DMX-h Does

DMX-h for Hadoop allows organizations to connect to just about any data source (including mainframe); move data into and out of Hadoop up to six times faster than with the Hadoop “put” and Hive load commands, without writing scripts; create MapReduce ETL processes graphically, without coding; and integrate Hadoop seamlessly for sort and ETL operations. In other words, it allows organizations to unleash the immense potential of Hadoop using an architecture that runs ETL processes natively within Hadoop, without writing code.


MapReduce ETL Without Writing Code

DMX-h for Hadoop means you don’t have to be a developer to create ETL tasks that execute within the MapReduce framework. Instead of using complex Java programming or Pig scripting, users can employ an easy-to-use graphical environment. You can simplify development of applications that load data into the Hadoop Distributed File System (HDFS), or pull data from HDFS and load it into other systems.
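For readers who have not seen hand-coded MapReduce logic, the sketch below illustrates the kind of map, shuffle, and reduce steps a developer would otherwise write and maintain by hand. It is a plain Python simulation of the pattern, not Hadoop or DMX-h code: the record layout (`customer_id,amount`) and every function name here are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_record(line):
    """Map step: parse one 'customer_id,amount' line into a (key, value) pair.

    The record layout is an assumption made for this example.
    """
    customer_id, amount = line.strip().split(",")
    return customer_id, float(amount)


def shuffle(pairs):
    """Shuffle step: group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run the three steps in sequence over an in-memory list of lines."""
    pairs = [map_record(line) for line in lines if line.strip()]
    groups = shuffle(pairs)
    return dict(reduce_group(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ["a,1.5", "b,2", "a,2.5"]
    print(run_job(sample))
```

Even this toy aggregation needs three separate pieces of logic. A real job adds input and output formats, error handling, and cluster configuration, which is the development effort a graphical tool aims to remove.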

DMX-h is not a code generator. Instead, MapReduce calls it at runtime, and it executes natively as part of the Hadoop framework. It automatically optimizes CPU, memory, and I/O usage, so you can process more data in less time, with fewer servers.

Conclusion

Businesses want to use big data now that the data spigots are fully open, but many of them face technical and skills barriers when adopting Hadoop. Syncsort’s DMX-h for Hadoop ETL lowers those barriers significantly. With DMX-h, organizations can use their existing ETL skills to ramp up their Hadoop initiatives, creating ETL tasks that execute within MapReduce without writing code. Skills availability and infrastructure integration are two major barriers to Hadoop adoption. By addressing both, Syncsort DMX-h for Hadoop ETL puts the power of Hadoop into the hands of organizations of all sizes.

Hadoop and MapReduce experts can be hard to come by as big data grows in importance. Syncsort addresses the emerging need for tools that make Hadoop initiatives less technically daunting, and less expensive, while helping organizations make the most of what Hadoop offers. By helping customers overcome the architectural limits of the ETL and Hadoop environments, Syncsort empowers customers to get more from big data at a lower cost, while driving better business outcomes.
