When we buy veggies at the local grocery shop, our choice of one over another is rarely driven by just "how tasty" it is or "how colorful" it looks. The veggies must pass several tests, ones that we (and in most cases our experienced moms) have devised, before they end up in our refrigerator. But for Data Scientists choosing models for a task, there are only a handful of criteria, like accuracy and precision, or top-5 and top-1 error rate, which aren't really a very robust basis for preferring one model over another.

But what if there was a way to know exactly on which data points two models differ, just as telling a spoilt tomato from a healthy one might require you to evaluate them by touch? The importance of this can't be overstated in the Data Science marketplace, where most prebuilt models are black boxes: it is very difficult, if not impossible, to interpret and compare the WHYs and the HOWs of their particular choices, since they have been trained on completely different datasets.
MODEL DIFFERENCING
Model differencing allows us to move past the accuracy-only paradigm for model selection. It lets us see precisely where two models differ and on what aspects. Suppose one image recognition model gets half of the cats wrong, while another gets most of them right. Looking at overall accuracy alone won't really help us here. Model differencing, done with the right techniques, can show what these two models are actually doing under the hood. Is the first one mistaking cat ears for dog ears, or is it confused by the image angle? Interpretable model differencing leads us there.
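The first step described above, finding exactly where two models diverge, can be sketched with a few lines of NumPy. The `diff_models` helper below is a hypothetical illustration (not from any particular library): given two models' predictions on the same test set, it reports how often they disagree and which model was right on each disputed point.

```python
import numpy as np

def diff_models(preds_a, preds_b, y_true):
    """Compare two models' predictions on the same test set.

    Returns the disagreement rate, plus the indices of disputed
    points broken down by which model got them right.
    """
    preds_a, preds_b, y_true = map(np.asarray, (preds_a, preds_b, y_true))
    disagree = np.nonzero(preds_a != preds_b)[0]  # indices where models differ
    return {
        "disagreement_rate": disagree.size / y_true.size,
        "a_right_b_wrong": disagree[preds_a[disagree] == y_true[disagree]],
        "b_right_a_wrong": disagree[preds_b[disagree] == y_true[disagree]],
    }

# Toy example: label 1 = cat, 0 = dog
y      = np.array([1, 1, 1, 1, 0, 0])
pred_a = np.array([1, 0, 0, 1, 0, 0])  # model A misses two cats
pred_b = np.array([1, 1, 1, 1, 0, 1])  # model B misses one dog

report = diff_models(pred_a, pred_b, y)
```

Even this crude diff is more informative than two accuracy numbers: it tells us the two models trade off different mistakes, which is exactly the question accuracy alone cannot answer.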

On the surface, model differencing isn't rocket science: the differences between two models are organized using something known as a Joint Surrogate Tree.
Let's dissect that term. What is a surrogate tree, anyway? It is simply a decision tree, much like a flowchart, trained to mimic the predictions of a more complex model so that its behavior becomes easy to read.
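The flowchart idea can be made concrete with a small sketch, assuming scikit-learn is available. Instead of fitting a surrogate to one model's outputs, a *joint* surrogate tree is fit to the pair of outputs from both models, so each leaf describes a region of the input space where the two models behave in a fixed way (agree or disagree). The label encoding below (packing two binary predictions into one of four classes) is one simple choice for illustration, not necessarily the exact construction used by any specific method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Two "black-box" models trained on the same binary task.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model_a = RandomForestClassifier(random_state=0).fit(X, y)
model_b = LogisticRegression().fit(X, y)

# Joint label: encode the pair (pred_a, pred_b) as a single class 0..3.
# Classes 1 and 2 are the interesting ones -- the models disagree there.
joint_label = 2 * model_a.predict(X) + model_b.predict(X)

# A shallow tree keeps the "flowchart" small enough to read.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, joint_label)
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(5)]))
```

Printing the tree with `export_text` yields a readable set of rules such as "if x2 <= 0.4 and x0 > 1.1, the models disagree here," which is the interpretable diff we were after.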