
Expert Interview (Part 2): James Kobielus on Blockchain’s Sweet Spot in Practical Business Use Cases

Since Syncsort recently joined the Hyperledger community, we have a clear interest in raising awareness of Blockchain technology. There’s a lot of hype out there, but not many clear, understandable facts about this revolutionary data management technology. Toward that end, Syncsort’s Integrate Product Marketing Manager, Paige Roberts, had a long conversation with Wikibon Lead Analyst Jim Kobielus.

In the first part of the conversation, we discussed the basic definition of what the Blockchain is, and cut through some of the hype surrounding it. In this second part, we dove into the real value of Blockchain technology and some of the practical use cases that are its real sweet spots.

Roberts: The hype cycle tends to make all kinds of wild claims. It will do everything but wash your socks. Which claims for Blockchain do you feel have some validity?

Kobielus: First of all, since it doesn’t support CRUD, it’s not made for general-purpose database transactions. It’s made for highly specialized environments where you need a persistent, immutable record, like logging: logging of security-related events, logging of system events for later analysis and correlation, etc. Or where you have an immutable record of assets, such as video, music and so forth, in a marketplace where these are intellectual property that needs protection against tampering. If you have a tamper-proof distributed record, which is what Blockchain is, it’s perfect for maintaining vast repositories of intellectual property for downstream monetization. Or for tracking supply chains.

A distributed transaction record that can’t be repudiated, that can’t be tampered with, that stands up in legal situations is absolutely valuable. So, Blockchain makes a lot of sense in those kinds of applications. In addition to lacking the ability to delete and edit the data, Blockchain is slow. It’s not an online transactional database. Updates to the chain can take minutes or hours depending on how the chain is set up, and how extensive the changes are, so you can’t have a high concurrency of transactions. It’s just not set up for fast query performance. It’s very slow.
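To make the append-only, tamper-evident idea concrete, here is a minimal Python sketch. It is an illustration only, not taken from the interview and not how Hyperledger or any particular Blockchain platform is implemented: it leaves out distribution, consensus, and signatures entirely. The names `AppendOnlyLog`, `block_hash`, and `verify_chain` are hypothetical. Each entry stores the hash of the previous entry, so changing an earlier record makes verification fail.

```python
import hashlib
import json

# Placeholder "previous hash" for the very first entry (an assumption of this sketch).
GENESIS_HASH = "0" * 64


def block_hash(index, payload, prev_hash):
    """Hash an entry's contents together with the previous entry's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def verify_chain(blocks):
    """Return True only if every entry links to, and matches, its predecessor."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(blocks):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


class AppendOnlyLog:
    """A log that exposes append and read, but no update or delete."""

    def __init__(self):
        self._blocks = []

    def append(self, payload):
        prev_hash = self._blocks[-1]["hash"] if self._blocks else GENESIS_HASH
        index = len(self._blocks)
        block = {
            "index": index,
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
        self._blocks.append(block)
        return block

    def blocks(self):
        # Hand out copies so callers cannot modify the stored entries directly.
        return [dict(block) for block in self._blocks]

    def verify(self):
        return verify_chain(self._blocks)
```

A caller would append events such as `log.append({"event": "login", "user": "alice"})` and later run `log.verify()`. If someone alters the payload of an earlier entry in a copy of the log, `verify_chain` on that copy returns `False`, because the stored hash no longer matches the contents.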

Also, the world is moving towards harmonization around privacy protection, consistent with what the European Union has done with the General Data Protection Regulation (GDPR) and with the recent California privacy regulation that is similar to GDPR. GDPR requires that any personally identifiable information (PII) be capable of being forgotten, meaning people have the right to request deletion of their personal data, or to have it corrected if it’s wrong. In Blockchain, you can’t delete or edit a record once it’s written. There’s a vast range of enterprise applications that hold personally identifiable information. The bulk of your business, sales, marketing, customer service, HR, etc., has tons of PII.

So, Blockchain is not suitable for those core transaction processing applications. Any application that demands high performance queries will not be on the Blockchain. It’s not suitable for highly scalable real-time transactions of any sort, whether or not they involve PII data.

The way I see it, Paige, there’s a range of fit-for-purpose data platforms in the data management space. There are relational databases, all the NoSQL databases, HDFS, graph databases, key-value stores, real-time in-memory databases, and so on. Each of those is suited to particular architectures and use cases, but not to others. Blockchain is fundamentally a database, and it’s got its uses. It’s not going to dominate all data computing like a monoculture, no matter what John McAfee says. That’s not going to happen. It’s already limited, both technologically and by regulation. It’s a niche data platform that’s finding its sweet spot in various places.


Roberts: You mentioned a couple of good use cases like supply chain management. I’ve heard of uses like tracking diamonds from the mine to the jewelry store to be certain of their origins, that they’re not blood diamonds. All of the examples I had heard of in the past were based on the concept of Blockchain as a transactional ledger or even a sensor log. For example, you keep sensors on your food from the farm to the market to make sure that it never went above a certain temperature for a certain amount of time, that sort of thing. One of the use cases you mentioned was actually news to me, that you could store other sorts of data like application code, so you could do code change management with it. What other use cases do you see coming?

Kobielus: Actually, I recently published a few pieces on vertical-application-focused supply chain management. Blockchain startups are trying to grab a piece of the video streaming market. Essentially these services, a lot of which are still in an alpha or beta pre-release phase, use Blockchain in several capacities: first, for distributed video storage; second, for distributed video distribution over a peer-to-peer protocol; third, for distributed video monetization, using a Blockchain-based cryptocurrency that’s specific to each environment to help the video publishers monetize their offering; fourth, for distributed video transactions and contracts; and fifth, for distributed video governance.

Roberts: So are you talking about having something like Netflix bucks?

Kobielus: More and more Blockchain applications aren’t one hundred percent on the Blockchain. They handle things like PII off the chain, for instance, and put that in a relational database. Most architectures use fit-for-purpose data platforms for specific functions within a broader application. That is really where Blockchain is coming into its own.
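One way to picture the hybrid pattern described here is the Python sketch below. It is an assumption-laden illustration, not a description of any product mentioned in the interview: the `OffChainStore` class stands in for a relational database, a plain list stands in for the chain, and every function name is hypothetical. The PII lives only in the off-chain store, and the ledger entry carries only a salted hash of it. Whether such a design satisfies a given privacy regulation is a legal question the sketch does not answer.

```python
import hashlib
import secrets


def commitment(pii, salt):
    """Salted SHA-256 digest of the PII; only this value goes on the ledger."""
    return hashlib.sha256((salt + "|" + pii).encode("utf-8")).hexdigest()


class OffChainStore:
    """Stand-in for a relational database that holds the actual PII."""

    def __init__(self):
        self._rows = {}

    def put(self, record_id, pii):
        # A fresh random salt per record, kept off-chain next to the PII.
        salt = secrets.token_hex(16)
        self._rows[record_id] = {"pii": pii, "salt": salt}
        return commitment(pii, salt)

    def get(self, record_id):
        return self._rows.get(record_id)

    def erase(self, record_id):
        """Delete the PII and its salt; the ledger itself is never touched."""
        return self._rows.pop(record_id, None) is not None


def record_customer_event(store, ledger, record_id, pii, event):
    """Write PII off-chain and append a PII-free entry to the ledger."""
    digest = store.put(record_id, pii)
    entry = {"record_id": record_id, "event": event, "pii_commitment": digest}
    ledger.append(entry)
    return entry


def matches_ledger(store, entry):
    """Check that the off-chain PII still corresponds to the ledger entry."""
    row = store.get(entry["record_id"])
    if row is None:
        return False
    return commitment(row["pii"], row["salt"]) == entry["pii_commitment"]
```

In this sketch, calling `store.erase("cust-1")` removes the personal data and the salt from the off-chain store while the ledger keeps its immutable entry, which after erasure contains only an identifier, an event name, and a digest that can no longer be matched to the deleted data.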

Another specialized Blockchain use case is artificial intelligence, one of my core areas. I’ve been reading for a while now about the AI community experimenting with using Blockchain as an AI compute brokering backbone; there’s a company called Cortex. You can read my article on that. They use Blockchain as a decentralized AI training data exchange. They have data that has the core ground truths a lot of AI applications need to be trained on.


Roberts: So you’re saying they basically create really solid, excellent training datasets, doing all the data engineering to make sure these are good training datasets for AI ground truths, and then use Blockchain to exchange them with other AI developers?

Kobielus: It’s a Blockchain for people who built and sourced their training data to store it in a ledger so that others can tap into that data from an authoritative repository.

Roberts: Right. Okay. That makes sense. Seems like a valuable commodity to the AI community.

Kobielus: Several small companies are doing this. They’re converging training data into an exchange or marketplace for downstream distribution to data scientists, or whoever will pay for the training data. Blockchain is used as an AI middleware bus, an AI audit log, an AI data lake.

What I’m getting at, Paige, is that there are lots of industry-specific implementations of Blockchain. Industries everywhere are using this, some in production, but many of them are still piloting and experimenting with Blockchain in a variety of contexts including e-commerce, AI, video distribution, in ways that are really fascinating.

These are the same kinds of dynamics that we saw in the early days of Hadoop and NoSQL and other technologies. Each technology market grows by vendors finding a sweet spot, an application that their approach is best suited to.

We see a lot of hybrid data management approaches in companies that use two or more strategies in a common architecture.

One thing that’s missing from all that stuff is real-time streaming, continuous computing applications. Blockchain is very much static data; it’s almost the epitome of static data. You won’t see too many real-time applications for Blockchain alone, but that’s okay. Blockchain is good for the things that it’s good for.

Roberts: Blockchain will find its niche given time?


Be sure not to miss Part 3 where we’ll talk about the future of Blockchain, how it intersects with artificial intelligence and machine learning, how Blockchain deals with privacy restrictions from regulations like GDPR, and how to get data back out of the Blockchain once you’ve put it in.

Jim is Wikibon’s Lead Analyst for Data Science, Deep Learning, and Application Development. Previously, Jim was IBM’s data science evangelist. He managed IBM’s thought leadership, social and influencer marketing programs targeted at developers of big data analytics, machine learning, and cognitive computing applications. Prior to his 5-year stint at IBM, Jim was an analyst at Forrester Research, Current Analysis, and the Burton Group. He is also a prolific blogger, a popular speaker, and a familiar face from his many appearances as an expert on theCUBE and at industry events.

