
Offloading ELT Workloads with Hadoop: A No-Coding Approach

I often talk with Big Data and Data Integration architects at meetups, and many of them assume that moving or offloading expensive ELT (Extract, Load & Transform) processing to Hadoop requires highly skilled programmers in Java, Hive and other Hadoop technologies. This impression is even stronger when the work involves complex data structures, such as those found on the mainframe, or complex Data Warehouse SQL processes that need to be rewritten to run on Hadoop. This article offers an alternative approach and may help dispel that myth.

There is in fact a simple, comprehensive, graphical approach to offloading batch ELT processing and its associated data from your Enterprise Data Warehouse (EDW) and other source systems, such as the mainframe, to inexpensive Hadoop storage and processors. Once all of the graphically defined processing has run on Hadoop, the processed data is loaded back into the EDW.

To illustrate my point, I built a simple yet powerful video that shows how we took an ELT job with more than 900 lines of SQL and converted it to a simple, graphical Hadoop job. All of this is done in a single workflow with Syncsort DMX-h, Syncsort’s Hadoop ETL software, without writing a single line of Hive, Pig, or Java.

After all, not every company has an army of highly skilled Java, Hive, Pig, and SQL programmers at its disposal. And even if yours does, you may prefer to invest their time and effort in the cool, next-generation analytics: the kind that can help you find your next business opportunity or best-selling product. In either case, you can leave the complex ETL tasks to Syncsort.

So there you have it. If you like what you see, I recommend getting the free guide, which describes in much more detail a framework for offloading ELT workloads from the enterprise data warehouse and gives step-by-step instructions for creating your own Hadoop ETL jobs without coding, using Syncsort DMX-h.


