The Proof is in the Pudding
The majority of technology sales, particularly in software, require some sort of proof of concept (POC) intended to prove out the product(s) based on a customer’s requirements. Syncsort is no stranger to POCs and we have a record of producing some really impressive results. Recently, we had the opportunity to present some of these results to a respected industry analyst. He suggested we share some of our POC results on a regular basis on the Syncsort blog. What a great idea!
In writing the first in what will be a series of posts throughout the year on POC results, I was inspired by my colleague and fellow Syncsort blogger Dave Nahmias. One of the phrases that those of us who work with Dave have no doubt heard him say at one time or another is, “the proof is in the pudding.” Time and time again, we have seen prospects pleasantly surprised (and even amazed!) when they get their hands on DMExpress and see up close just how fast, efficient and simple to use it really is.
Let’s start with a relatively straightforward POC performed on a Windows machine with 8 cores. Clearly this is not a large, powerful box. This will be important to keep in mind as I share the results. This particular job joined two data sources, performed two aggregations, loaded the data into SQL Server and Oracle, and wrote the output to a compressed file.
Here are some of the specifics:
- One of the data sources consisted of more than 100 million records (15GB of compressed data). The second data source was small (1,200 records). The reading of both files and the join took 2 minutes 25 seconds (about the same as the amount of CPU time). Only 35MB of memory was used!
- The first aggregation took just under 19 seconds, and used only 3 of the cores and 50 seconds of CPU time. This included the write to the compressed file and the load into Oracle.
- The second aggregation took 40 seconds, using only 2 cores and 45 seconds of CPU! This included the load into SQL Server.
Total job time: 3 minutes, 5 seconds!
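As a rough sanity check on these figures, dividing CPU time by elapsed time estimates how many cores each step kept busy on average. The sketch below is illustrative only. The step labels and helper names are made up for this post, the CPU time for the read and join is taken as equal to its elapsed time ("about the same"), and 19 seconds stands in for "just under 19 seconds." It also assumes the two aggregations overlapped after the join. That is an inference from the arithmetic, not something stated in the POC notes: 2:25 plus the longer, 40-second aggregation comes to 3:05.

```python
# Figures quoted above, in seconds: (elapsed, cpu).
# Assumption: read+join CPU time is taken as equal to elapsed ("about the same").
STEPS = {
    "read_and_join": (145, 145),
    "aggregation_1": (19, 50),
    "aggregation_2": (40, 45),
}


def average_cores_busy(elapsed_s, cpu_s):
    """CPU seconds per wall-clock second, i.e. average cores in use."""
    return cpu_s / elapsed_s


def format_mmss(seconds):
    """Render a whole number of seconds as m:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


for name, (elapsed, cpu) in STEPS.items():
    print(f"{name}: {format_mmss(elapsed)} elapsed, "
          f"~{average_cores_busy(elapsed, cpu):.1f} cores busy on average")

# Assumption: the two aggregations ran concurrently after the join,
# so the longer one determines the elapsed time of that phase.
total_elapsed = STEPS["read_and_join"][0] + max(
    STEPS["aggregation_1"][0], STEPS["aggregation_2"][0]
)
print("estimated total:", format_mmss(total_elapsed))
```

Under those assumptions, no step averages more than about 2.6 busy cores on the 8-core machine.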
So, what were we trying to beat? How about almost 4 hours of processing that was running in the database! Not only did we cut the run time by nearly two orders of magnitude, but the customer can now use a graphical interface to build and maintain the ETL. Perhaps more importantly, the customer can also move expensive processing cycles and staging tables out of the database.
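To put a number on that comparison, here is a quick back-of-the-envelope calculation. It rounds "almost 4 hours" up to exactly 4 hours, which is an assumption, so the true ratio is somewhat lower than what it prints.

```python
import math

baseline_s = 4 * 60 * 60   # assumption: "almost 4 hours" rounded up to 4 hours
dmexpress_s = 3 * 60 + 5   # total job time quoted above: 3 minutes, 5 seconds

speedup = baseline_s / dmexpress_s
print(f"speedup: about {speedup:.0f}x")
print(f"orders of magnitude: {math.log10(speedup):.1f}")
```

With these inputs the ratio comes out a little under 80, just short of two full orders of magnitude.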
Since we do this for a living and see results like this from DMExpress all the time, it is easy to lose sight of how impressive they are. What we believe makes these particular results even more compelling is that they were achieved without consuming the entire box.
Stay tuned for more results in the days, weeks and months ahead. We’d also love to hear from anyone who is interested in learning more or who has seen similar results with their tools (please feel free to post a comment). We are also willing to take on anyone interested in challenging us to a benchmark!