Expert Interview (Part 2): Dr. Sourav Dey on Augmented Intelligence and Model Explainability
At the recent DataWorks Summit in San Jose, Paige Roberts, Senior Product Marketing Manager at Syncsort, had a moment to speak with Dr. Sourav Dey, Managing Director at Manifold. In the first part of our three-part interview, Roberts spoke to Dr. Dey about his presentation, which focused on applying machine learning to real-world requirements. Dr. Dey gave two examples of matching business needs to what the available data could predict.
Here in part two, Dr. Dey discusses augmented intelligence, the power of machine learning and human experts working together to outperform either one alone. In particular, AI as triage is a powerful application of this principle, and model explainability is the key to making it more useful.
Roberts: One of the big themes I’m seeing here, what the keynote talked about this morning, is that the best chess machine can beat the best human chess player, but both can be beaten by a mediocre chess player with a really good chess program working together. One of the things you talked about was that kind of cooperation between people and machines speeding up triage, and how that works.
Dey: Yeah, so this is what many people call augmented intelligence. I would say about 50% or more of the projects that we do at Manifold fall into the business pattern that I call “AI as Triage”. The predictions that the AI makes help to triage a lot of information that a single human can’t process. Then, the AI presents it in a way that a human can make a decision on it. That’s a theme that I’ve seen over and over again. Both of the examples I gave before fit that, for instance.
In the baby registry example, our client was collecting all of these signals that no single human can understand, all the web clicks, mobile clicks, marketing data, etc. The AI is triaging that and distilling it down so that a marketing person or the product person can make decisions on it.
In the oil and gas company example, it’s the same. The machines are generating fine-tick data from 54 sensors at thousands of locations across the country. No person (or even team of people) can look at that all the time.
Roberts: Nobody can make sense of that.
Dey: Yeah, but the AI can crush it down, and present it to humans in an actionable way. That can really speed up that triage process. So that’s the goal there.
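The "AI as Triage" pattern described above can be illustrated with a minimal sketch. It is not taken from the interview: the `Alert` class, the `triage` function, the machine names, and the threshold are all hypothetical, and the probabilities stand in for the output of some already-trained model. The sketch only shows the shape of the idea: score everything, then hand a human a short, ranked list.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    machine_id: str             # hypothetical identifier
    failure_probability: float  # assumed model output in [0, 1]


def triage(alerts, threshold=0.5, top_k=3):
    """Keep alerts at or above the threshold, highest risk first, capped at top_k."""
    flagged = [a for a in alerts if a.failure_probability >= threshold]
    flagged.sort(key=lambda a: a.failure_probability, reverse=True)
    return flagged[:top_k]


# Made-up scores for five machines; a real system would score thousands.
alerts = [
    Alert("pump-A", 0.12),
    Alert("pump-B", 0.91),
    Alert("pump-C", 0.67),
    Alert("pump-D", 0.40),
    Alert("pump-E", 0.58),
]

queue = triage(alerts, threshold=0.5, top_k=2)
for alert in queue:
    print(f"{alert.machine_id}: {alert.failure_probability:.2f}")
```

The human expert reviews only the two highest-risk machines here; the threshold and the list length are tuning choices that would depend on how many cases the team can actually handle.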
Roberts: I was impressed by one example you mentioned. You have these decision trees making a decision that something would fail, and that was kind of useful. But the person still had to figure out from scratch why it would fail, and how to repair it. Whereas, if the AI explained … how was that done?
Dey: The TreeSHAP algorithm, yeah. It explains how a decision tree came to a particular decision. It’s relatively recent that people are doing some good research into this. Essentially, there is the model that’s making the prediction. Then, you can make another model of that model that explains the original model. It tells you why it made that prediction.
Roberts: That WHY can be key.
Dey: There have been a few competing techniques out there. All of them had some issues, but this group at the University of Washington, inspired by game theory from economics, came up with a consistent explanation. It’s based on Shapley values. What’s nice is that they developed a fast version of it, called TreeSHAP, that can be used with tree-based models. It’s fantastic. We use it all the time now for explanations of why the model is making a particular individual prediction. For instance, today, you predicted .91 probability of failure. Why? You could also use it at the aggregate level, for something like: On the whole, across thousands of machines over five years, what was the importance of this feature in making the prediction?
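To make the Shapley idea concrete, here is a minimal sketch, not the method used in the interview. It computes exact Shapley values by brute force over every feature coalition, which is only practical for a handful of features; TreeSHAP exists precisely to avoid this cost for tree-based models. The toy decision tree, the feature names, the sensor readings, and the single "baseline" reading that stands in for missing features are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction, by enumerating all coalitions.

    Features outside a coalition are replaced with their baseline value.
    The cost grows as 2^n, so this is only suitable for tiny examples.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def failure_probability(z):
    """Hypothetical hand-written decision tree over three sensor features."""
    temperature, vibration, age = z
    if vibration > 5.0:
        return 0.91 if temperature > 80.0 else 0.60
    return 0.30 if age > 10.0 else 0.05


features = ["temperature", "vibration", "age"]
baseline = [70.0, 2.0, 3.0]   # assumed "typical healthy machine" reading
x = [95.0, 7.2, 3.0]          # the machine that scored 0.91

# Individual level: why did this machine get 0.91?
phi = shapley_values(failure_probability, x, baseline)
for name, contribution in zip(features, phi):
    print(f"{name}: {contribution:+.3f}")

# Aggregate level: mean absolute contribution per feature across machines.
rows = [x, [75.0, 6.0, 12.0], [60.0, 1.0, 15.0]]
all_phi = [shapley_values(failure_probability, r, baseline) for r in rows]
importance = [sum(abs(p[i]) for p in all_phi) / len(rows) for i in range(len(features))]
```

The individual contributions add up to the gap between this machine's prediction and the baseline prediction, which is the property that makes the explanation consistent. Averaging the absolute contributions over many rows gives the aggregate view of feature importance.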
Roberts: And then the person going to repair that equipment knows WHY it was predicted to fail, and therefore has a pretty good idea of what they have to fix.
Dey: Well, at least they have a much better idea. The maintenance engineers have a web app that they can then use to dig deeper into the historical time series. In addition, they can VPN into the physical machine. All in all, the explainable model allows them to do triage much faster, and, in turn, do the repair more quickly.
Roberts: Model explainability is incredibly useful for a lot of things. I know Syncsort has been doing a lot of work around GDPR, and I talked to a data scientist in Germany, Katharine Jarmul, about this. For example, if a person wants a loan, and you’ve got a machine learning model that says no, you can’t have that loan, you have to be able to explain why.
Dey: Totally, yeah. There are laws about that for important civil rights reasons.
Roberts: For what you’re doing, the reasons are less legal and more practical. If I’m going to use this prediction in order to take an action, such as a repair, it helps a lot if I know how the prediction was reached.
Dey: I can give another example of that. We did work for a digital therapeutics company. They make an app along with wearables that helps people get their diabetes under control. We were predicting whether, in 24 weeks, the patient’s blood sugar would go below a certain level. There’s a human in the loop, a human coach that you get as a part of this program. The coaches didn’t know what to do with the raw prediction probability. When we put in an explainable algorithm that let them know why that number was high or low, they could have much better phone calls with the patients.
Roberts: Because they knew WHY the blood sugar was likely to dip.
Dey: They could say things like, hey, I see that you’re not doing this food planning very much, or you haven’t logged into the app in a while. You used to log in seven times a week. What’s going on? They have the knowledge ready to have a high-bandwidth interaction with the patient.
So, I think there’s a lot there.
Roberts: The more I learn about model explainability, the more I see where it’s hugely useful.
Dey: There are a lot of folks doing cool things with deep learning. Those models are far harder to explain, but there’s work being done on that. Hopefully, in the next few years, there will be better techniques to explain those more complex models as well.
Tune in for the final part of this interview, where Roberts and Dey speak about the effect of data quality, as well as entity resolution, in conjunction with machine learning.