Why the Mainframe Matters to Your Data Ecosystem
The venerable mainframe has proven to be a stubborn and persistent animal. Predictions of its imminent demise have been circulating for years now, but the mainframe just refuses to go away. Seventy percent of Fortune 500 companies still use mainframes for their most critical business operations. And in a recent “State of the Mainframe” survey, more than half of the respondents indicated that their companies are continuing to create new mainframe applications.
The fact is that with all the drastic changes that have occurred in the IT landscape over the last half-century, the mainframe has managed to adapt so well that it remains a key part of the IT ecosystem. Not only has it maintained its role as the premier engine for large-scale batch and transaction processing tasks, but it is also becoming an increasingly rich data source for external big data and analytics platforms such as Hadoop.
The Mainframe and Big Data
When it comes to important parameters such as reliability, availability, scalability, data security, and concentrated computing power, the mainframe simply has no peer. Says Harvey Tessler, one of the founders of Syncsort, “The mainframe is the go-to computing platform for the overwhelming majority of our largest customers, who leverage its ability to be the backbone for billions of computationally intensive transactions for business-critical applications.”
All that activity generates huge amounts of data. And in an age when big data analytics and machine learning are at the forefront of IT innovation, the data accumulated by mainframes is of incalculable value to the companies that own those systems. That’s why the integration of mainframes with Hadoop has taken center stage in many corporate data centers.
The Mainframe-Hadoop Partnership
Because the mainframe’s monthly license charge (MLC) is keyed to peak CPU usage and can account for 30% or more of total mainframe operating costs, businesses have a strong incentive to offload as much processing from the mainframe as possible. Some CPU-intensive batch jobs, such as sorting and filtering, can be run much more cost-effectively in Hadoop without impacting performance.
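To make the offloading idea concrete, a filter step moved off the mainframe often amounts to little more than a Hadoop Streaming-style mapper. The sketch below is a hypothetical illustration, not Syncsort's implementation; the fixed-width record layout and the amount threshold are invented for the example:

```python
# Hypothetical fixed-width record layout (invented for illustration):
#   columns 0-9   account id
#   columns 10-21 transaction amount in cents, zero-padded
MIN_AMOUNT_CENTS = 10_000

def keep_record(line: str, threshold: int = MIN_AMOUNT_CENTS) -> bool:
    """Return True for records whose amount meets the threshold."""
    amount = int(line[10:22])
    return amount >= threshold

def run_mapper(lines):
    """Streaming-style mapper: emit qualifying records unchanged.

    In a real Hadoop Streaming job this would read stdin and write
    stdout; here it takes and returns an iterable for clarity.
    """
    return [line for line in lines if keep_record(line)]
```

In an actual cluster, a script like this would be wired in with Hadoop Streaming (roughly, `hadoop jar hadoop-streaming.jar -mapper filter.py ...`), letting commodity nodes absorb CPU cycles that would otherwise count against the mainframe's peak usage.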
More importantly, tasks associated with storing, retrieving, and correlating huge datasets to enable big data analytics, machine learning, and business intelligence applications can be accomplished much more efficiently in Hadoop, which was designed specifically for that purpose. Plus, because a Hadoop cluster built on low-cost commodity disk drives can store data far more economically than mainframe DASD (direct access storage devices), archival or seldom-used data that the mainframe previously had to commit to tape can often be kept online in Hadoop, immediately available when needed.
How Syncsort Enables the Mainframe-Hadoop Partnership
The partnership between the mainframe and Hadoop holds great promise, but it faces a major hurdle: the two don’t speak the same language and can’t communicate with one another. Hadoop was designed for the distributed processing environment, with no consideration given to the mainframe at all. In a fundamental way, Hadoop doesn’t even know that mainframes exist.
But Syncsort’s DMX-h knows all about both mainframes and Hadoop, and is specifically designed to bridge the gap between the two. DMX-h shines at integrating mainframe data into Hadoop’s file system in such a way that all formatting and other properties remain unchanged, but Hadoop is able to distribute, replicate, and process it just as it would any other data.
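Two of the classic translation problems any mainframe-to-Hadoop bridge must understand are EBCDIC character encoding and COBOL COMP-3 packed-decimal numbers. The sketch below is a minimal illustration of those formats in Python, not DMX-h's actual code (which, as noted above, can leave the data in its original mainframe form); the sample bytes are invented for the example:

```python
import codecs

def decode_ebcdic(raw: bytes) -> str:
    """Decode EBCDIC text using code page 037, a common US/Canada variant."""
    return codecs.decode(raw, "cp037")

def decode_packed_decimal(raw: bytes) -> int:
    """Decode a COBOL COMP-3 packed-decimal field.

    Each byte holds two 4-bit digits; the final nibble is the sign
    (0xC or 0xF means positive, 0xD means negative).
    """
    digits = []
    for byte in raw:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    sign_nibble = digits.pop()  # last nibble carries the sign
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value

# Example: the EBCDIC bytes for "HELLO" and the packed value +12345.
name = decode_ebcdic(b"\xc8\xc5\xd3\xd3\xd6")
amount = decode_packed_decimal(b"\x12\x34\x5c")
```

Without a tool that understands these layouts (and the COBOL copybooks that describe them), mainframe records land in HDFS as opaque bytes that Hadoop jobs cannot meaningfully process.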
Mainframes Are Firmly Entrenched in the Modern Data Ecosystem
The mainframe remains the go-to platform for handling business-critical workloads in the most cost-effective manner, and it is the source and repository for huge volumes of invaluable current and historical data used for business intelligence, big data analytics, and machine learning. For both reasons, it continues to occupy a central position in the data ecosystems of many companies.
Download our white paper and learn how to get the most out of your mainframe!