Big Data, Big Challenge: Got Governance for Hadoop?
This post was originally published on Cloudera VISION.
In the past few years, Apache Hadoop has become a central component of the enterprise data architecture and has changed the way organizations store and process their data. As Hadoop matures into an enterprise data platform, organizations are using it to store and process significantly more data (structured and unstructured), and, in turn, more users and tools are accessing the data. The opportunity to drive greater insights is remarkable. However, more data, users, and tools create a big governance challenge.
Tracking data, securing it, controlling access to it, and meeting regulatory requirements are more critical and more challenging than ever. Data governance in Hadoop, which includes auditing, lineage, and metadata management, requires a scalable approach that interoperates easily across multiple platforms.
The first challenge customers see is that traditional data governance frameworks are difficult to deploy and do not scale well with Hadoop. They also fail to work with many Hadoop-specific metadata sources, such as HCatalog and HDFS. The second challenge is that many legacy metadata management frameworks implement their own proprietary security and access protocols, which are hard to integrate with Hadoop security protocols. Hadoop promises faster time to value and increased business agility, and trying to fit these traditional governance frameworks into the emerging data architecture simply doesn’t work.
Cloudera recognized these challenges early on and developed Cloudera Navigator, the leading Hadoop-based metadata management and data governance solution. As a long-time Cloudera partner and a contributor to Hadoop open source projects including MapReduce, Sqoop, and Spark, we at Syncsort are excited to certify our DMX-h data integration product on Cloudera Navigator. Because DMX-h runs natively in Hadoop, it integrates with Cloudera Navigator out of the box: users can search for DMX-h jobs across a unified metadata repository and view data lineage within the Navigator user interface.
Our joint customers can now use DMX-h’s integration with Navigator as a single interface for accessing all enterprise data, including data from IBM z mainframes, with auditing, tracking, and lineage for that data across multiple platforms. DMX-h can be deployed via Cloudera Manager and supports Hadoop-based security protocols, such as Kerberos, for data security and privacy.
We look forward to continuing to work closely with Cloudera to provide our joint customers with a best-of-breed data management platform and an open data governance solution.
Learn more about Syncsort’s solutions for Cloudera, and experience them for yourself with a free test drive at www.syncsort.com/try.