Syncsort Continues to Drive Mainframe and Distributed Data Integration
This post is an update of an article that originally appeared on the Dancing Dinosaur blog.
In the fall of 2015, Syncsort, a leading mainframe ISV, introduced a set of tools to facilitate data integration through Apache Kafka and Apache Spark, two of the most active big data open source projects for handling real-time, large-scale data processing, feeds, and analytics. Syncsort’s primary integration vehicle then revolved around the Intelligent Execution capabilities of its DMX data integration product suite with Apache Spark. Intelligent Execution allows users to visually design data transformations once and then run them anywhere – across Hadoop, MapReduce, Spark, Linux, Windows, or Unix, on premises or in the cloud.
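DMX's Intelligent Execution is a proprietary, visually designed product, so its internals are not public. Purely as an illustration of the design-once, run-anywhere idea, the hypothetical Python sketch below separates a transformation from the engine that executes it; all class and field names here are invented, not Syncsort's API.

```python
# Hypothetical sketch: define a transformation once, bind it to an
# execution engine at run time. Not Syncsort code; names are illustrative.
from typing import Callable, Iterable, List


def transform(record: dict) -> dict:
    """A transformation defined once, independent of any engine."""
    return {**record, "amount_usd": round(record["amount"] * record["fx_rate"], 2)}


class LocalEngine:
    """Runs the transformation in-process (stand-in for Linux/Windows/Unix)."""
    def run(self, fn: Callable[[dict], dict], data: Iterable[dict]) -> List[dict]:
        return [fn(r) for r in data]


class FakeClusterEngine:
    """Stand-in for a distributed engine such as MapReduce or Spark."""
    def run(self, fn: Callable[[dict], dict], data: List[dict]) -> List[dict]:
        # A real engine would partition `data` across workers; here we
        # only simulate partitioned execution in one process.
        partitions = [data[::2], data[1::2]]
        out: List[dict] = []
        for p in partitions:
            out.extend(fn(r) for r in p)
        return out


records = [{"amount": 10.0, "fx_rate": 1.1}, {"amount": 5.0, "fx_rate": 0.9}]
for engine in (LocalEngine(), FakeClusterEngine()):
    result = engine.run(transform, records)
```

The point of the pattern is that `transform` never changes when the target platform does; only the engine binding moves.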
Since then, in March, Syncsort announced that its big data integration solution, DMX-h, is integrated with Cloudera Director, enabling organizations to easily deploy DMX-h along with Cloudera Enterprise on Amazon Web Services, Microsoft Azure, or Google Cloud. By deploying DMX-h with CDH, Syncsort explained, organizations can quickly pull data into new, ready-to-work clusters in the cloud, vastly accelerating how quickly they can take advantage of big data cloud benefits, including cost savings and Data-as-a-Service (DaaS) delivery.
A month earlier, in February, Syncsort introduced new enhancements to its big data integration solution. Again, DMX-h enables organizations to accelerate business objectives by speeding development, adapting to evolving data management requirements, and leveraging rapid innovation in big data technology. In addition, new integrated workflow capabilities and Spark 2.0 integration dramatically simplify Hadoop and Spark application development, enabling organizations to extract maximum value from all their enterprise data assets regardless of where they reside, whether on the mainframe, in distributed systems, or in the cloud.
Syncsort’s new integrated workflow capability also gives organizations a simpler, more flexible way to create and manage their data pipelines. This is done through the company’s design-once, deploy-anywhere architecture with support for Apache Spark 2.0. This makes it easy for customers to take advantage of the benefits of Spark 2.0 and integrated workflow without spending time and resources redeveloping their jobs.
Building an end-to-end data pipeline can be time-consuming and complicated, with various workloads executed on multiple compute frameworks, all of which need to be orchestrated and kept up to date. For example, an organization might need to access a data warehouse or mainframe, run batch integration for large historical reference data in Hadoop MapReduce, and tap streaming analytics and machine learning workflows with Apache Spark. Delays in development, however, prevent organizations from getting the timely insights they need for effective decision-making.
Syncsort’s Integrated Workflow helps organizations manage various workloads, such as batch ETL on large repositories of historical data. This can be done by referencing business rules during data ingest in a single workflow, which simplifies and speeds development of the entire data pipeline, from accessing critical enterprise data, to transforming that data, and ultimately analyzing it for business insights.
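To make the single-workflow idea concrete, here is a minimal, hypothetical Python sketch of a pipeline in which a business rule is applied during ingest rather than as a separately orchestrated downstream job. Every name and record field is invented for illustration; this is not Syncsort's Integrated Workflow API.

```python
# Illustrative sketch of an integrated workflow: ingest (with a business
# rule), transform, and analyze expressed as one chained pipeline.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Workflow:
    steps: List[Callable] = field(default_factory=list)

    def step(self, fn: Callable) -> "Workflow":
        self.steps.append(fn)
        return self  # allow chaining

    def run(self, data):
        for fn in self.steps:
            data = fn(data)
        return data


def ingest(records):
    # Business rule enforced at ingest: reject records with no account.
    return [r for r in records if r.get("account") is not None]

def transform(records):
    # Convert balances from cents to dollars.
    return [{**r, "balance": r["balance"] / 100} for r in records]

def analyze(records):
    return {"accounts": len(records), "total": sum(r["balance"] for r in records)}


raw = [
    {"account": "A1", "balance": 12500},
    {"account": None, "balance": 900},   # rejected at ingest
    {"account": "B2", "balance": 7500},
]
summary = Workflow().step(ingest).step(transform).step(analyze).run(raw)
```

Because the rule lives inside the same pipeline as the transformation and analysis steps, there is one artifact to develop, schedule, and keep up to date, which is the simplification the integrated workflow capability is aiming at.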
Finally, in October 2016 Syncsort announced new capabilities in its Ironstream® software that allow organizations to access and integrate mainframe log data in real time with Splunk® IT Service Intelligence (Splunk ITSI). The integration of Ironstream with Splunk ITSI gives users new levels of visibility into the health and performance indicators of mainframe and mainframe-to-distributed IT services.
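Ironstream's forwarding mechanism is proprietary, but Splunk's standard ingestion path for machine data is the HTTP Event Collector (HEC), whose event envelope uses the metadata fields `time`, `host`, `source`, `sourcetype`, and `event`. The sketch below shows, hypothetically, how a parsed mainframe log record could be shaped for HEC; the sample record and the `"ironstream"` source name are invented for illustration.

```python
# Hedged sketch: wrap a parsed mainframe log record in a Splunk HTTP Event
# Collector (HEC) envelope. The sample record and source name are invented.
import json
import time


def to_hec_event(record: dict, sourcetype: str = "mainframe:syslog") -> str:
    """Return a JSON HEC event envelope for one mainframe log record."""
    envelope = {
        "time": record.get("timestamp", time.time()),
        "host": record.get("lpar", "unknown-lpar"),  # z/OS LPAR as the host
        "source": "ironstream",                      # illustrative source name
        "sourcetype": sourcetype,
        "event": record,                             # original record as payload
    }
    return json.dumps(envelope)


sample = {"timestamp": 1476700000, "lpar": "SYSA", "msg": "IEF404I JOB ENDED"}
payload = to_hec_event(sample)
# A real forwarder would POST this payload to the Splunk HEC endpoint
# (/services/collector/event) with an "Authorization: Splunk <token>" header.
```

Once events arrive in this shape, Splunk ITSI can key health and performance indicators off the `host` and `sourcetype` metadata.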
Syncsort followed up by announcing integration between Ironstream and Compuware Corporation’s Application Audit software to deliver audit data to Splunk® Enterprise Security (ES) for Security Information and Event Management (SIEM). The new integration dramatically improves an organization’s ability to detect threats against critical mainframe data, correlate them with related information and events, and satisfy compliance requirements.