Tony Baer, Principal Analyst at Ovum, recently spoke to us about trends in Big Data, the future of Hadoop and GDPR.
Part 1 discussed trends in Hadoop and Cloud data. In this part, Tony Baer discusses what’s next for Hadoop and cloud and whether we’re prepared for GDPR.
How do you see Hadoop continuing to evolve?
There are some interesting things coming down the pike. They’re not out yet, but they’ll probably be part of a future Hadoop 3.x point release.
One is that YARN is getting a little more granular in managing specific resource types. We’re also seeing better support for managing Docker containers with YARN – which is especially critical if we’re running in the cloud.
I haven’t yet heard much about Hadoop and Kubernetes, but that has to be coming soon, too. As soon as you start talking about containers, you have to handle container orchestration.
Let’s talk GDPR. How prepared are businesses for it, especially when it comes to data analytics?
I’m not an expert on the GDPR, but I’ll say this: it creates two major challenges for data workloads, namely data privacy and data sovereignty. These will necessitate far more granular strategies for managing Big Data.
In particular, we’ll have to work out how to run wide-area operations that analyze large amounts of data spread across multiple countries and multiple jurisdictions.
I’m seeing a lot of innovation from the cloud folks in handling these challenges through a multi-master replication architecture. That hasn’t happened yet in the Hadoop world, but we’ll need to see similar innovations there in order to handle the GDPR requirements. It’s a very complicated nut to crack, but we’ll need either vendors or the Apache community – or both – to address it.
What research are you currently working on?
My research is focused on two areas. First, I’m continuing to study cloud transformation. I’m interested in how data management changes when you go cloud-native: databases are architected differently, application architectures change, we rely more heavily on REST APIs, and so on.
I’m also looking at how to operationalize data science. I want to know how to integrate data science into the business, and how to help data scientists explain and evangelize what they do.
Make sure to check out our on-demand webinar, 2018 Big Data Trends: Liberate, Integrate & Trust Your Data, to discover 5 Big Data trends to watch for in the coming year.