A Primer on Governing Unruly Data
Just as government officials probably wish their constituents always got along and never disagreed, you likely wish that your company’s data were easy and simple to manage. But it’s not. That’s why you need a data governance process for managing all the unruly data that your organization collects, stores and analyzes.
This article explains the basics of developing an effective data governance process for taming unruly data.
Defining Data Governance
Before we dive into the specifics, though, let’s define what we’re talking about.
Broadly speaking, data governance is what it sounds like: A set of processes and approaches that define how data is managed within an organization.
And by “unruly” data, I mean data that is difficult to manage because it is inconsistent in format, stored on legacy systems from which it is hard to offload, filled with errors and so on.
In a perfect world, unruly data would not exist. But in the real world, where companies rely on infrastructure built using a mix of modern and legacy systems, and where mistakes committed by both humans and machines are a fact of life, unruly data is virtually unavoidable.
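To make "inconsistent in format" and "filled with errors" concrete, here is a minimal Python sketch that normalizes dates arriving in several formats and sets aside values it cannot interpret. The list of formats, the sample records and the function name normalize_date are assumptions made for illustration, not part of any particular product.

```python
from datetime import datetime

# Hypothetical list of formats seen across source systems (an assumption for
# this sketch); a real dataset would need its own list.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(raw):
    """Return the date as ISO 8601 text, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None


# Made-up sample values standing in for one "unruly" column.
records = ["2017-03-01", "03/02/2017", " 03.03.2017 ", "March-ish", ""]

clean = []
rejected = []
for raw in records:
    iso = normalize_date(raw)
    if iso is None:
        rejected.append(raw)  # keep the bad value so someone can review it
    else:
        clean.append(iso)
```

Note that reading "03/02/2017" as month/day/year is itself a choice. Deciding which interpretation is correct for each source system is exactly the kind of question a governance process has to answer.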
3 Steps for Establishing Data Governance
Fortunately, an effective data governance process lets you tame unruly data. To establish one, follow these steps:
- Identify governance authorities. Once data governance procedures are established, everyone in your organization should learn about and adhere to them. But you need a group of people within the organization who are responsible for formulating those procedures and enforcing them. Ideally, these will be people who grasp both the technical challenges associated with unruly data (like the myriad data storage formats and systems on which data exists) and how data relates to the needs and operations of the organization.
- Identify best practices and procedures for managing data. The second step in establishing data governance is to plan how you want to manage your unruly data. Decide what your goals will be when working with unruly types of data, and the types of workflows that will allow you to achieve them.
- Implement the tools necessary for achieving your governance goals. Finally, identify and set up the tools you require to work with unruly data. Finding and implementing tools is the last step in the process because you should decide what you want to achieve first, then find the solutions you need to achieve it, rather than choosing a toolset first and then limiting yourself to the functionality it provides.
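One way to keep goals ahead of tooling, as the steps above recommend, is to write the rules down as plain data before choosing any product. The following Python sketch is only an illustration under that assumption: the field names, the rule keys (required, must_contain, allowed) and the check_record function are all hypothetical.

```python
# Hypothetical governance rules, written as plain data so they can be agreed on
# before any particular tool is chosen.
RULES = {
    "customer_id": {"required": True},
    "email": {"required": True, "must_contain": "@"},
    "country": {"required": False, "allowed": {"US", "CA", "MX"}},
}


def check_record(record, rules=RULES):
    """Return a list of human-readable rule violations for one record."""
    problems = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None or value == "":
            # Empty values only matter when the field is required.
            if rule.get("required"):
                problems.append(f"{field}: missing required value")
            continue
        if "must_contain" in rule and rule["must_contain"] not in value:
            problems.append(f"{field}: expected to contain {rule['must_contain']!r}")
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: {value!r} is not an allowed value")
    return problems
```

Calling check_record on a record returns an empty list when every rule passes. Because the rules live in a simple dictionary, the same definitions could later be translated into whatever tool the organization selects.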
Taming Data with Syncsort
The exact types of tools and procedures that you build into your data governance plan will vary according to your specific needs, of course. But there’s a good chance Syncsort’s data management and data quality tools will assist you in meeting your goals.
That’s particularly true when it comes to handling unruly data, because Syncsort offers Big Data solutions that provide both data offloading and data quality functionality. Tools like DMX-h streamline the process of accessing data from legacy mainframe environments and integrating it with modern platforms like Hadoop for next-generation analytics, while Trillium is a solution for finding and fixing errors as well as enriching data within databases.
For more information on achieving data governance, read the Trillium Software white paper “Start Small, Gain Support, Collaborate, and Quantify – The Keys to Data Governance Success.”