DMExpress “Pudding” Series: Reducing Elapsed Processing Time by 93 Percent
Earlier this year, I kicked off the “proof is in the pudding” blog series as a way to share results that DMExpress is achieving during proofs of concept (POCs) in real customer environments. The idea is to wow the loyal readers of the Syncsort blog with information about DMExpress’ speed, efficiency and ease of use.
It has been too long since I contributed to the series, but I promise to start posting more frequently. We’ve got a lot of exciting work going on behind the scenes and plenty of information to share.
For this post, I want to focus on a recent POC involving a customer that was running up against its nightly batch window. If anything failed during the evening run, the customer would not be able to refresh the data warehouse, leaving business users with data that was 24+ hours old. This was simply not acceptable to the business, and we knew that DMExpress was just the right solution for the job.
For this POC, the environment consisted of a four-core UNIX box with ETL coded in PL/SQL (while that’s really ELT, please forgive the semantics for right now). Another challenge this customer had was that this particular ETL flow involved nearly 900 lines of PL/SQL, which were incredibly complex and nearly impossible to maintain. In fact, they really only had one person capable of maintaining the code. What happens if that person leaves? Hopefully this doesn’t sound too familiar to you!
The stated goal of the POC was to reduce elapsed processing time by 33%. Additionally, we were looking to demonstrate that DMExpress could significantly reduce the complexity of building and maintaining the ETL.
The particular job involved 5 data sources, identifying changed records, performing multiple joins, enhancing the information via lookup, and loading the database. The POC ran on approximately 350,000 records, a relatively small amount of data. However, as you are about to find out, the results were quite impressive!
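For readers who have not built this kind of flow, here is a minimal Python sketch of two of the steps named above: identifying changed records and enhancing them via lookup. It is not the customer's PL/SQL and not how DMExpress implements the job. The sample rows, column names and helper functions are all hypothetical, and the multi-source joins and the database load are left out.

```python
import hashlib

def row_hash(row, columns):
    """Fingerprint the tracked columns of a row so versions can be compared."""
    joined = "|".join(str(row[c]) for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def changed_records(previous, current, key, columns):
    """Return rows in `current` that are new or differ from `previous`."""
    prev_hashes = {r[key]: row_hash(r, columns) for r in previous}
    return [r for r in current
            if prev_hashes.get(r[key]) != row_hash(r, columns)]

def enrich(rows, lookup, fk, target, default=None):
    """Add a `target` field to each row by looking up its `fk` value."""
    return [{**r, target: lookup.get(r[fk], default)} for r in rows]

# Hypothetical sample data: last night's snapshot and tonight's extract.
previous = [
    {"id": 1, "name": "Ann", "region_code": "E"},
    {"id": 2, "name": "Bob", "region_code": "W"},
]
current = [
    {"id": 1, "name": "Ann", "region_code": "E"},  # unchanged
    {"id": 2, "name": "Bob", "region_code": "E"},  # changed
    {"id": 3, "name": "Cy", "region_code": "W"},   # new
]
regions = {"E": "East", "W": "West"}  # hypothetical lookup table

delta = changed_records(previous, current, "id", ["name", "region_code"])
to_load = enrich(delta, regions, "region_code", "region_name")
print(to_load)
```

Only the changed and new rows reach the lookup step, which is why change detection usually sits early in a nightly refresh.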
The original process was taking 90 minutes, so the 33% reduction that the POC targeted meant that we had to reduce it to 60 minutes. How did DMExpress do? How about only 6 minutes! That’s a 15x improvement in throughput and a 93% reduction in elapsed time for those of you keeping score at home.
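For those who really are keeping score, the arithmetic can be checked with a few lines of Python. This is just a sketch using the three elapsed times quoted in this post; the function names are made up for illustration.

```python
def reduction_pct(before_min, after_min):
    """Percentage reduction in elapsed time."""
    return (before_min - after_min) / before_min * 100

def speedup(before_min, after_min):
    """How many times faster the new run is for the same workload."""
    return before_min / after_min

baseline = 90   # original elapsed time in minutes
target = 60     # POC goal in minutes
achieved = 6    # DMExpress result in minutes

print(f"Target reduction:   {reduction_pct(baseline, target):.1f}%")
print(f"Achieved reduction: {reduction_pct(baseline, achieved):.1f}%")
print(f"Speedup:            {speedup(baseline, achieved):.0f}x")
```

Since the record count is the same in both runs, a 15x drop in elapsed time is also a 15x gain in throughput.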
How about the 900 lines of PL/SQL? We converted them into just 2 DMExpress jobs, now built and maintained in a simple, easy-to-use graphical user interface. Needless to say, the customer was impressed.
Stay tuned for more results in the days and weeks ahead. In the meantime, don’t be shy about posting comments and questions.
We are also still open to taking on any challengers willing to put their solutions head-to-head against DMExpress in a benchmark. Of course, with results like the ones I’ve shared above, I guess it’s not a big surprise that we haven’t had any takers on that just yet…