New Application of Change Data Capture Technology Revolutionizes Mainframe Data Access and Integration into the Data Lake
Introducing Connect Change Data Capture (CDC), the only product available with mainframe “capture” AND Big Data “apply…”
Syncsort’s New CDC Product
Yesterday, we announced our new product offering, Connect Change Data Capture. This new offering, which works with Connect for Big Data, reliably captures changes from mainframe sources with low impact (more on this below) and automatically applies them to Hadoop data stores. I’d like to use this blog to explain what we announced and why it’s unique in the industry.
The Connect portfolio of products provides high-performance mainframe access and integration, and it understands mainframe data, including databases and complex copybooks. I once saw an 86-page mainframe copybook that Connect read in without any problem. Could any ETL tool do that? I highly doubt it!
Connect Change Data Capture is a unique application of CDC technology that provides unrivaled mainframe data access and integration to continuously, quickly and efficiently populate Hadoop data lakes with changes in mainframe data.
Achieving Low Impact on Resources AND Performance
Another key differentiator that saves a great deal of time and resources is that Syncsort’s data integration leverages our UI and dynamic optimization, which eliminates the need for coding or tuning.
Anyone associated with a mainframe is always concerned with mainframe MIPS/CPU impact because of the way mainframe usage is metered and charged for by IBM. Connect CDC doesn’t use database triggers, which can negatively impact performance and can also have an impact on MIPS.
Not only does Connect CDC have a small footprint on the mainframe to capture the logs, but the CPU impact – like everything we do – is minimal to keep costs low.
Connect for Big Data and Connect CDC together form a single offering. The “capture” collects the changes on the mainframe (getting the changes from the logs, not triggers). The “apply” writes created, updated and deleted records (yes, updates!) to Hadoop Hive, in any Hive file format including Avro, ORC and Parquet.
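To make the “apply” idea concrete, here is a minimal sketch of what applying captured changes to a keyed target looks like in general. It is not how Connect CDC is implemented: the `Change` record, the `apply_changes` function and the in-memory dictionary standing in for a Hive table are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One captured change (hypothetical shape, for illustration only)."""
    op: str                      # "insert", "update" or "delete"
    key: str                     # primary key of the affected record
    row: Optional[dict] = None   # new column values; unused for deletes


def apply_changes(target: Dict[str, dict], changes: Iterable[Change]) -> Dict[str, dict]:
    """Apply changes in log order to `target`, a dict standing in for a table."""
    for change in changes:
        if change.op in ("insert", "update"):
            # Inserts and updates both leave the latest row image under the key.
            target[change.key] = dict(change.row or {})
        elif change.op == "delete":
            # Deleting a key that is already gone is treated as a no-op.
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return target


if __name__ == "__main__":
    table = {"1001": {"name": "Ann", "balance": 50}}
    log = [
        Change("insert", "1002", {"name": "Bob", "balance": 10}),
        Change("update", "1001", {"name": "Ann", "balance": 75}),
        Change("delete", "1002"),
    ]
    print(apply_changes(table, log))
```

The point of the sketch is that updates and deletes need a key to find the existing record, which is exactly what makes applying them to file-based Hive stores harder than simply appending new rows.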
The transfer of data is also very reliable, even during a loss of connectivity between the mainframe and the Hadoop cluster. Connect for Big Data and Connect CDC can resume the transfer and update from the point where it stopped, without restarting the entire process.
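For readers who want a picture of “pick up where it stopped,” here is a rough sketch of the general checkpoint-and-resume pattern. Again, this is only an assumption about how such a restart can work in principle, not a description of Connect CDC internals; `transfer`, `send` and the integer checkpoint are hypothetical.

```python
from typing import Callable, Sequence


def transfer(log: Sequence[str], checkpoint: int, send: Callable[[str], None]) -> int:
    """Send entries of `log` starting at `checkpoint`; return the new checkpoint.

    The checkpoint advances only after an entry has been sent successfully,
    so an interrupted run can be resumed without resending earlier entries.
    """
    position = checkpoint
    try:
        for entry in log[position:]:
            send(entry)
            position += 1
    except ConnectionError:
        # Connectivity was lost: stop here and report how far we got.
        pass
    return position


if __name__ == "__main__":
    received = []
    attempts = {"count": 0}

    def flaky_send(entry: str) -> None:
        # Simulate a single network failure on the third send attempt.
        attempts["count"] += 1
        if attempts["count"] == 3:
            raise ConnectionError("link to cluster lost")
        received.append(entry)

    changes = ["change-1", "change-2", "change-3", "change-4"]
    saved = transfer(changes, 0, flaky_send)      # stops after two entries
    saved = transfer(changes, saved, flaky_send)  # resumes at the third entry
    print(saved, received)
```

In a real deployment the checkpoint would have to be stored durably so that it survives a restart; the sketch keeps it in a variable only to stay short.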
Our initial support will be for IBM DB2/z and VSAM data sets. We’ll add more data stores both on and off the mainframe. Mainframe data stores are unique in and of themselves, but when you can liberate that data AND add in the automatic apply to Hive (including updates!), this is truly a one-of-a-kind offering. I’m proud of what the team has accomplished in solving the challenges of getting critical mainframe data into the data lake to support real-time business insights!
For more information, download our eBook: Strategies for Change Data Capture