Propelling Big Data: Data Wrangling for Hadoop and the Enterprise Data Warehouse
New types of data, and far more of it, are straining the traditional enterprise data warehouse (EDW) and driving adoption of Hadoop to process the combination of structured and unstructured data. New applications demand more development flexibility. After all, businesses are constantly changing, and the data and workloads that drive business insights are changing even faster. Put simply, for data and workloads, "the only thing that is constant is change."
So how can data scientists and programmers develop and maintain thousands of data transformations that are constantly evolving? How can they manage mixed environments of programming languages and graphical tools? As my colleague Nikhil Kumar described in his previous post, the answer lies in on-demand, metadata-driven development with DTL.
At Syncsort, we work hard to give ETL developers, programmers and data scientists the tools they need to process data quickly and efficiently. Syncsort Data Transformation Language (DTL) delivers the goods for the programmer set. With DTL, we provide a full-featured, domain-specific programming language designed specifically for data transformations. ETL in the EDW and "data wrangling" on Hadoop can now be fully created and deployed using the time-tested Syncsort engine. Developers have complete flexibility to design sophisticated data flows in either a graphical environment using the Job and Task Editor, or a scripting environment using DTL.
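To give a flavor of what the scripting side of such a workflow can look like, here is a hypothetical sketch of a filter-and-copy task in a DTL-like style. Every option name, field definition, and even the comment marker below is an illustrative assumption for this post, not verbatim DTL syntax; the video and the product documentation show the real language.

```
# Hypothetical sketch only -- option names and layout are assumptions,
# not verbatim DTL syntax.
/INFILE sales.txt                        # fixed-width source records
/FIELDS region 1 10, amount 11 8         # name, start position, length
/CONDITION big_sale amount > 1000        # define a reusable predicate
/INCLUDE big_sale                        # keep only qualifying records
/OUTFILE big_sales.txt                   # write the filtered result
```

The appeal of a declarative task script like this is that the same definition can be version-controlled, generated from metadata, and deployed to either the EDW or Hadoop without rewriting the transformation logic.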
We have talked about DTL on the Syncsort blog before (here and here), so this time I wanted to give you a taste of DTL in a short video instead. The video includes a demo and pointers to where you can find more detailed information.
Please let me know what you think of the video, and any questions you may have about DTL. I hope you’re as excited as I am about having this new tool in your arsenal!