Unlocking Big Data Analytics Potential with Real-Time Mainframe Data
Splunk® provides the leading software platform for real-time Operational Intelligence. Splunk software and cloud services enable over 9,500 organizations to search, monitor, analyze and visualize machine-generated Big Data coming from websites, applications, servers, networks, sensors and mobile devices. Until recently, however, Splunk customers were unable to source real-time data from the mainframe, specifically IBM z/OS®.
This means the top banks, insurers, retailers, telcos and healthcare providers that rely on IBM z/OS are likely missing the z/OS view from their Splunk applications. After all, up to 80% of corporate data at these companies originates on the mainframe.
Take an ATM transaction. In many cases it takes a trip through the mainframe to check the balance, screen for fraud or perform some other check. If all of the log data for that transaction were included in the analytics data lake, you could get a complete picture of it. But loading mainframe data into Splunk can be challenging. There are many mainframe-specific data types, such as packed decimal and various binary formats, especially in records from IBM's prolific log data generator, Systems Management Facility (SMF). The integration, data conversion and general complexity of these mainframe data structures make the task of forwarding data to Splunk far from easy. Mainframe users also have concerns around security: if data is being moved off the mainframe platform, it has to be encrypted and transported securely.
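To make the conversion problem concrete, consider packed decimal (COMP-3), which stores two binary-coded-decimal digits per byte with the sign carried in the final nibble. Below is a minimal Python sketch of the decoding step; actual SMF field layouts and scale factors vary by record type, so the function name and arguments here are illustrative, not part of any product:

```python
def unpack_comp3(data: bytes, scale: int = 0) -> float:
    """Decode an IBM packed-decimal (COMP-3) field.

    Each byte holds two BCD digits; the low nibble of the final
    byte is the sign (0xD = negative, 0xC or 0xF = positive).
    """
    value = 0
    for byte in data[:-1]:
        value = value * 100 + ((byte >> 4) & 0xF) * 10 + (byte & 0xF)
    last = data[-1]
    value = value * 10 + ((last >> 4) & 0xF)   # final digit
    if (last & 0xF) == 0xD:                    # sign nibble
        value = -value
    return value / (10 ** scale)

# Bytes 0x12 0x34 0x5C encode +12345; with two implied
# decimal places that is 123.45.
print(unpack_comp3(b"\x12\x34\x5C", scale=2))  # 123.45
```

Converting millions of such fields, alongside EBCDIC text and binary timestamps, is the kind of work a forwarder has to do before the data is useful to Splunk.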
Splunk Enterprise makes it simple to collect, analyze and act upon the untapped value of big data.
Perhaps most importantly, what is the cost of moving the data to Splunk? Is there overhead incurred in moving hundreds of gigabytes, or even terabytes, a day to Splunk Enterprise? The last thing any of these organizations can tolerate is added latency, bottlenecks or additional mainframe processing requirements.
Those are some of the problems. But why do they need to be solved? Let's say an organization has visibility and analytics capabilities for its mainframe systems. Separately, it uses Splunk to gain visibility into its distributed-platform applications, such as the front end of the ATM application. Between the two, you might think you could solve any and all problems that come up. That's not typically the case. What is lacking is a single end-to-end repository where analysis can be done holistically and in a more timely manner.
Let's go back to the ATM example. The problem: a customer standing at an ATM couldn't complete a withdrawal. A call comes in to the Help Desk. The Help Desk contacts the applications team responsible for the distributed platforms hosting the front-end ATM application. They don't see any issue that could be causing the incomplete transaction, other than DB2 on the mainframe rejecting it. So they call the z/OS team, who look into the DB2 performance monitor. Things look fine to the z/OS team: DB2 is processing 10,000 transactions a second. Error messages for the rejected transaction are in a log, but that log has already been written out to a file. It may take a separate analysis that evening to really understand what went wrong, instead of having the full set of transaction data, including the mainframe data, in Splunk to investigate the problem as it happens.
In many cases, an ATM transaction takes a trip through the mainframe to check the balance or for fraud detection.
Later that evening, it is discovered that the DB2 transaction log in z/OS showed the transaction was rejected because a table was locked. As it turns out, an unrelated batch job was using the same table as the ATM app, but its job control had a bad password, and the job was resubmitted three times before anyone noticed, so DB2 locked the table and the ATM app could not complete for this customer. To fix the problem, the password is reset, the app is updated with the new password, the batch job control is corrected, and the customer can complete the ATM transaction. With separate monitors for each silo, you can fix one thing at a time, but you cannot see the impact of issues that span systems. If all the log data is in one place, cross-platform transactions can be correlated so issues are resolved quickly, in minutes rather than a day.
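The payoff of a single repository is that events from every platform can be lined up by a shared key. As a minimal sketch, assuming each log event carries a transaction ID (the field names txn_id, source and message are illustrative, not Splunk's schema), cross-platform correlation looks like:

```python
from collections import defaultdict

def correlate(events):
    """Group log events from all platforms by transaction ID."""
    by_txn = defaultdict(list)
    for event in events:
        by_txn[event["txn_id"]].append(event)
    return by_txn

# Events from the ATM scenario: the front-end app, DB2 on z/OS,
# and the batch job with the bad password, all in one place.
events = [
    {"txn_id": "ATM-7781", "source": "atm-frontend",
     "message": "withdrawal could not complete"},
    {"txn_id": "ATM-7781", "source": "zos-db2",
     "message": "transaction rejected: table locked"},
    {"txn_id": "ATM-7781", "source": "zos-batch",
     "message": "job resubmitted with bad password"},
]

for event in correlate(events)["ATM-7781"]:
    print(event["source"], "->", event["message"])
```

With all three events visible together, the bad-password batch job stands out immediately instead of surfacing in a separate analysis that evening.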
So now, let's look at what could be termed the final frontier in successfully integrating mainframe log data into Splunk: achieving this goal in real time, or near real time. Much of the value of Splunk comes from the fact that it analyzes streaming real-time data. Users want their data now; they want it delivered in seconds. Timeliness is critical in matters of security, IT operations and application performance, and that's what Splunk is especially good at.
The best way to make that happen in z/OS is to intercept the data at the moment it is created, not once it is written to flash, disk or tape. By accessing the data from memory via standard IBM exits before it is written to a log, the data can be prepped (converted, integrated and encrypted) using IBM zIIP specialty engines to minimize any processing cycles before being forwarded to Splunk. With this approach, neither mainframe performance nor cost is impacted.
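For the textual portions of a record, the "convert" step amounts to translating EBCDIC bytes into a text encoding Splunk can index. A minimal Python sketch follows, assuming the common US EBCDIC code page cp037; real SMF records interleave text, binary and packed fields, so a production forwarder needs per-field layout knowledge:

```python
# Three EBCDIC bytes that spell "SMF" in code page 037.
record = bytes([0xE2, 0xD4, 0xC6])

# Translate EBCDIC to a Unicode string before forwarding.
text = record.decode("cp037")
print(text)  # SMF
```

The decode itself is cheap; the real engineering is knowing, for each record type, which byte ranges are text, which are binary counters and which are packed decimal.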
Many large-scale IT organizations – government, banking, insurance, financial services, package delivery, telecommunication, retail, etc. – have built their enterprise systems around IBM z/OS. Syncsort Ironstream® software is a z/OS forwarder that ticks all the boxes for mainframe users – it simplifies data integration, provides security, intercepts data efficiently – and it executes with minimal CPU cycles, eliminating added processing cost and maintaining z/OS application performance.
Harvey Tessler at Splunk.conf 2014 on theCUBE
By pairing Ironstream® software with Splunk, organizations can build a Big Data strategy that unites Splunk's search technology and their existing Splunk data with mainframe data. Just imagine the insights possible when mainframe transactional data is combined with clickstream data, web logs, sentiment analysis and more.