If you’re like most organizations, you collect a lot of dark data: data that you gather but never put to work. But your data doesn’t have to stay dark. Keep reading for examples of how you can put dark data to use and gain new insights from it.
The Sources of Dark Data
Before discussing use cases involving dark data, let’s take a quick look at where dark data comes from and why it is so prevalent in many businesses.
Infrastructure and business operations generate much more data than many companies are equipped to interpret.
For example, your networking devices probably generate huge amounts of information. Even if you take the time to collect all that machine data, it remains dark unless you analyze it.
In other cases, data stays dark because you cannot work with it efficiently. If the data is stored in a format that your analytics tools don’t support, you cannot turn it into actionable information. Dark data may also sit on devices from which it is difficult to offload into analytics platforms.
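To make the format problem concrete, here is a minimal sketch of translating a record from an encoding that many analytics tools do not read directly into plain text fields. It assumes a hypothetical fixed-width record encoded in EBCDIC (code page 037, which Python ships as the `cp037` codec); the field layout, field names and sample values are invented for illustration and do not come from any real system.

```python
# Hypothetical fixed-width layout (an assumption for this sketch):
# an 8-character customer ID followed by a 12-character name.
FIELDS = [("customer_id", 0, 8), ("name", 8, 20)]


def decode_record(raw: bytes) -> dict:
    """Decode one EBCDIC (cp037) record and split it into named fields."""
    # cp037 is a single-byte encoding, so character offsets match byte offsets.
    text = raw.decode("cp037")
    return {name: text[start:end].strip() for name, start, end in FIELDS}


# Build a sample record by encoding readable text into EBCDIC bytes.
sample = "00012345JANE DOE    ".encode("cp037")
record = decode_record(sample)
print(record)
```

Once the record is a dictionary of ordinary strings, it can be written out as CSV or JSON, formats that most analytics tools accept.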
Putting Your Dark Data to Work
The crucial point to understand about dark data is that it doesn’t have to remain dark. The minute you take dark data and leverage it to gain insights, the data becomes actionable and is no longer dark.
To illustrate the point, consider the following examples of ways in which common forms of dark data can be used:
- Networking machine data. As noted above, servers, firewalls, network monitoring tools and other parts of your environment generate large amounts of machine data related to network operations. Keep this data out of the dark by using it to analyze network security and to monitor network activity patterns, so you can tell when your network infrastructure is under- or over-utilized.
- Customer support logs. Most businesses maintain records of customer-support interactions that include information such as when a customer contacted the business, which type of communication channel was used and how long the engagement lasted. Don’t make the mistake of leaving this data in the dark, or using it only when you need to research a customer issue. Instead, build it into your analytics workflows to understand when your customers are most likely to contact you and which contact methods they prefer.
- “Legacy” system logs. If you have mainframes or other older types of systems running in your environment, you may think that there is no way to use modern analytics tools to understand them. But you can. By offloading system logs and other data from these systems into an analytics platform like Hadoop, you can make sure you are not leaving this “legacy” data in the dark.
- Non-textual data. Most data analytics workflows are built around textual data, which is easier to ingest. However, you can also make use of video, audio or other non-textual files. You can analyze the metadata associated with them or, if appropriate, transcribe speech to text in order to gain more insight into the content of the data itself. The effort may not be worth it in all cases, but the bigger point is that your non-textual data doesn’t have to be dark data. There are ways to make it actionable if you need it to be.
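As a rough illustration of the customer support example above, the following sketch counts contacts by hour of day and by channel. It assumes the support records can be exported as CSV; the column names (`timestamp`, `channel`, `duration_minutes`) and the sample rows are hypothetical, not taken from any particular support system.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical export of support interactions; columns and rows are assumptions.
SAMPLE_LOG = """timestamp,channel,duration_minutes
2023-03-01T09:15:00,phone,12
2023-03-01T09:40:00,email,5
2023-03-01T14:05:00,chat,8
2023-03-02T09:20:00,phone,20
2023-03-02T16:30:00,phone,6
"""


def summarize_contacts(csv_text):
    """Count support contacts per hour of day and per channel."""
    by_hour = Counter()
    by_channel = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        started = datetime.fromisoformat(row["timestamp"])
        by_hour[started.hour] += 1
        by_channel[row["channel"]] += 1
    return by_hour, by_channel


by_hour, by_channel = summarize_contacts(SAMPLE_LOG)
print("Busiest hour of day:", by_hour.most_common(1)[0][0])
print("Most used channel:", by_channel.most_common(1)[0][0])
```

Even a simple tally like this turns records that were only consulted case by case into a view of when customers tend to reach out and how. A real workflow would likely run the same kind of aggregation inside whichever analytics platform you already use.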
Meeting the Dark Data Challenge
No matter which types of dark data your organization collects, or how it is stored, the key to keeping data out of the dark is to ensure that you have a means of translating data from one form to another and ingesting it easily into whichever analytics platform you use.
Syncsort’s suite of Big Data solutions, which includes data access, translation and integration tools like DMX-h, provides that functionality. It allows you to move data easily into Hadoop from environments where data has traditionally stayed dark, such as mainframes.
To learn more, check out Syncsort’s eBook, Bringing Big Data to Life, to discover best practices for managing dark data with Hadoop.