Expert Interview (Part 3): James Kobielus on the Future of Blockchain, AI, Machine Learning, and GDPR

Since Syncsort recently joined the Hyperledger community, we have a clear interest in raising awareness of Blockchain technology. There’s a lot of hype out there, but not a lot of clear, understandable facts about this revolutionary data management technology. Toward that end, Syncsort’s Integrate Product Marketing Manager, Paige Roberts, had a long conversation with Wikibon Lead Analyst Jim Kobielus.

In the first part of the conversation, we discussed the basic definition of what the Blockchain is, and cut through some of the hype surrounding it. In the second part, we dove into the real value of the technology and some of the practical use cases that are its sweet spots. In this final part, we’ll talk about the future of Blockchain, how it intersects with artificial intelligence and machine learning, how it deals with privacy restrictions from regulations like GDPR, and how to get data back out once you’ve put it in.

Roberts: Where does Blockchain go from here? What do you see as the future of Blockchain?

Kobielus: It will continue to mature. In terms of startups, they’ll come and go, and they’ll start to differentiate. Some will survive to be acquired by the big guys, who will continue to evolve their own portfolios, while integrating those into a wide range of vertical and horizontal applications.

Nobody’s going to make any money off of Blockchain itself. It’s open source. The money will be made off of cloud services, especially cloud services that incorporate Blockchain as one of the core data platforms.

Believe it or not, you can do GDPR on Blockchain, but here’s the thing: the GDPR community is still working out exactly what you can do to delete data records consistently on the Blockchain. Essentially, you can encrypt the data and then delete the key.

Right. If you can’t decrypt it, you can’t ever read it.

Yeah. Inaccessible forevermore, in theory. That’s one possibility for harmonizing Blockchain architecture with the GDPR and other mandates that require the right to be forgotten. The regulators also have to figure out what is kosher there. I think there will be some reconciliation needed between the techies pushing Blockchain and the regulators trying to enforce the various privacy mandates.
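
The encrypt-then-delete-the-key idea described above is sometimes called crypto-shredding. Here is a minimal Python sketch of how it could work, under some loud assumptions: the class ErasableLedger and its methods are hypothetical names invented for illustration, the "chain" is just an in-memory append-only list, and the XOR keystream cipher is a toy stand-in for a real authenticated cipher such as AES-GCM. It is not how any particular Blockchain platform implements erasure.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts (XOR is symmetric). Illustrative only, not production crypto."""
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class ErasableLedger:
    """Hypothetical ledger: ciphertext lives on the chain, keys live off the chain."""

    def __init__(self):
        self.chain = []      # append-only list of (nonce, ciphertext); never modified
        self.key_store = {}  # off-chain and deletable: record index -> key

    def append(self, plaintext: bytes) -> int:
        """Encrypt a record under a fresh key and append only the ciphertext."""
        key = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        self.chain.append((nonce, xor_cipher(key, nonce, plaintext)))
        index = len(self.chain) - 1
        self.key_store[index] = key
        return index

    def read(self, index: int):
        """Return the plaintext, or None if the key has been deleted."""
        key = self.key_store.get(index)
        if key is None:
            return None
        nonce, ciphertext = self.chain[index]
        return xor_cipher(key, nonce, ciphertext)

    def forget(self, index: int) -> None:
        """'Right to be forgotten': delete the key; the chain itself is untouched."""
        self.key_store.pop(index, None)
```

In this sketch, calling forget() never rewrites the chain, so its immutability is preserved, while read() can no longer recover the record. Whether regulators accept key deletion as erasure is exactly the open question raised in the conversation.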

Just as important in terms of where it’s going, Blockchain platforms as a service, PaaS, will become ever more important components of the data providers’ overall solutions. Year by year, you’ll see the Microsofts, IBMs and Oracles of the world evolve Blockchain-based cloud services into fairly formidable environments.

There are performance issues with Blockchain now, in terms of speed of updates, but I also know that there is widespread R&D to overcome those. VMware just announced they’re working on a faster consensus protocol, so that different nodes on the Blockchain can come to consensus rapidly, allowing more rapid updates to the chain. Lots of parties are looking for better ways to do that. So, it might become more usable for transactional applications in the future.

Blockchain deployment templates are going to become the way most enterprise customers power this technology. AWS and Microsoft already offer these templates for rapid creation and deployment of a Blockchain for financial or supply chain or whatever. We’re going to see more of those templates as the core way in which people buy, in a very business friendly abstraction. There will be a lot of Blockchain-based applications for specific needs. We’ll see a lot of innovation in terms of how to present this technology and how to deliver it so that you don’t have to understand what a consensus protocol is or really give a crap about what’s going on in the Blockchain itself. It should be abstracted from the average customer.

Also going forward, you’ll see what I call “Blockchain domain accelerators.” There are Blockchain consultants everywhere now. There are national Blockchain startup accelerators. There are industry-specific Blockchain startup accelerators. There are Blockchain accelerators for innovation in cryptocurrency and the Internet of Things. We’re going to see more of these domain accelerator industry initiatives come to fruition using Blockchain as their foundation. They’ll analyze and set standards for how to deploy, secure and manage this technology according to specific industry and use case requirements. That definitely is the future.

As I mentioned before, it will become a bigger piece of the AI future, because of Blockchain-based distributed marketplaces for training data. Training data for building and verifying machine learning models for things like sentiment analysis has real value. There aren’t many startups in the world that already have massive training datasets. To build the best AI, you’ll need to go find the best training datasets for what you’re working on.

I talked about that a little with Paco Nathan at Strata, how labelled, valid, useful training datasets were incredibly valuable now, and AI companies recognize that. They will share their code with you, but not their data, not for free.

I really think you’ll see a lot more AI training dataset marketplaces with Blockchain as the backing technology. It’s going to become a big piece of the AI picture.

Blockchain security is another big thing going forward. The weak link in Blockchain is protecting your private keys, which provide you with secure access to your cryptocurrencies that are running on the chain. What we’re going to see is more emphasis on security capabilities that are edge-to-edge, in terms of securing Blockchains at the weakest link, which is the end user managing their keys. I think you’ll start to see a lot of Blockchain security vendors that help you manage your private keys, and also smart contracts. Smart contracts on the Blockchain have some security vulnerabilities in their own right. We’ll see a lot of new approaches to making these tamper-proof. There are already a lot of problems with fraud.

I think I’ve covered most of the big things I see coming. That is the really major stuff.

One more thing I’m curious about, since Blockchain is still fairly new to me. There’s a lot of conversation about how you store data on the Blockchain, and a lot of research into things like securing it and speeding up updates, but storing data is only half the story with data management. Once you’ve put all this data in, you have to then get it out. If I’ve got a Blockchain that has all this information I need, how do I go find and retrieve information from it? Do I use SQL?

There’s a query language in the core Blockchain code base.

So, it has its own specific query language, and people will have to learn a whole other way to retrieve data?

Basically, the core of Hyperledger has got a query language built in. It’s called Hyperledger Explorer. Hyperledger, in itself, is an ecosystem of projects that will evolve, just like Hadoop is and was. It’ll be adopted at various rates. Some projects will be adopted widely in production Blockchain deployments, and some very little.

There are some parallels with early Hadoop. Among the early things that Hadoop had under its broad scope was an initial query language that didn’t take off. They updated that, and improved it with HiveQL. Same thing with Spark. They started out with a query language, Shark, and switched to another one, Spark SQL.

We have to look at the entire ecosystem. Over time, some pieces may be replaced by proprietary vendor offerings, or different open source code that does these things better. It’s part of the maturation process. Five years from now, I’d like to see what the core Blockchain Hyperledger stack is. It may be significantly different. It may change as stuff gets proved out in practice.

Yeah, Hadoop changed a lot over the last decade.

Hadoop itself has become just part of a larger stack with things like TensorFlow, R, and Kafka for streaming. Innovation continues to deepen the stack. The NoSQL movement, graph databases, the whole data management menagerie continues to grow. We’ll see how the core protocol of Blockchain evolves too. It’s a work in progress, like everything else.

I’ve written a bunch of articles on this. It’s changing all the time.

I’ll be sure to include some links in the blog post, so folks can learn more. I really thank you for taking the time to speak with me. It was really informative.

No problem. I enjoyed it.

Jim is Wikibon’s Lead Analyst for Data Science, Deep Learning, and Application Development. Previously, Jim was IBM’s data science evangelist. He managed IBM’s thought leadership, social and influencer marketing programs targeted at developers of big data analytics, machine learning, and cognitive computing applications. Prior to his 5-year stint at IBM, Jim was an analyst at Forrester Research, Current Analysis, and the Burton Group. He is also a prolific blogger, a popular speaker, and a familiar face from his many appearances as an expert on theCUBE and at industry events.

Also, make sure to download our white paper on Why Data Quality Is Essential for AI and Machine Learning Success.
