Burning down the black box of ML using SHAP

Max Ortega
Jan 23, 2020 12:00:00 AM

Spoiler alert: you should use SHAP to understand your complex machine learning algorithm.

In some projects, finding the features that are crucial for a model, and describing how they impact the model, can be just as important as the performance of the model itself. For example, consider the following case:

A company’s KPI is reported as a single number. The company is interested in finding the features that affect the value of this KPI, and they hire a data scientist for the task. The data scientist decides to build an ML model that predicts the value of the KPI from a set of features (input variables), and the performance of this model is satisfactory. How can we learn from this model in order to tackle the company’s goal? How can we tell which features are the most influential for the KPI? And how exactly are they influential?

The problem of feature importance is particularly relevant nowadays, when the algorithms used to solve problems can be extremely complex, and knowing the parameters that describe a model does not translate into an intuitive understanding of how the algorithm works. With simple models, such as linear regression, the impact of each feature is clear from the parameters. But this is not the case for more complex models, and it never will be: a feature does not have a single, fixed contribution to the model; rather, its contribution to the outcome depends on the combination of the values of all the features.
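For linear regression, this transparency can be made concrete: each feature’s contribution to a prediction, relative to the average prediction, is just its coefficient times its centered value, independent of the other features. A minimal sketch (the data here is synthetic, purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]

model = LinearRegression().fit(X, y)

# For a linear model, the contribution of feature j to prediction f(x),
# relative to the average prediction, is coef_[j] * (x_j - mean(x_j)).
x = X[0]
contributions = model.coef_ * (x - X.mean(axis=0))
base = model.predict(X).mean()

# The base value plus the contributions reconstruct the prediction exactly.
print(base + contributions.sum(), model.predict(x.reshape(1, -1))[0])
```

For a complex model no such per-coefficient decomposition exists, which is exactly the gap Shapley values fill.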

Shapley values

Imagine you are solving a binary classification problem. The model is working wonderfully: the predictions are accurate.

However, to obtain such good results you had to use an ensemble model, namely a Random Forest Classifier. How can we figure out the inner workings of this model?

We can figure this out by calculating the Shapley values. Shapley values express the contribution that features have on the output of a model. This post focuses on providing the elements needed to understand, interpret and construct Shapley values in the context of supervised machine learning projects. Here we go:

  • For every feature and every sample in the training set, we obtain a Shapley value. Therefore, there is no single Shapley value for a feature; rather, a feature has as many Shapley values as there are samples.

  • For a sample s, our model outputs the value model(s). For feature f, the Shapley value ϕ(s,f) is the contribution that feature f has on the output model(s). These contributions are given with respect to a base value, namely the average value of model(s) over all training samples s. In mathematical notation, we have:

model(s) = base value + Σ_f ϕ(s,f)

From the formula we can see that the Shapley values give the degree to which the features contribute to the value model(s).

The Shapley values of a single sample can be visualized via a force plot, shown below. This plot gives an idea of the contribution of each feature. In the plot, we observe that the base value is .3935.

The output value of our model for this sample is 1, and the force plot shows the positive Shapley values ϕ(s,f) in red and the negative ones in blue. The features worst concave points, mean concave points and worst radius contributed to an increase in the value of the prediction, while worst texture contributed to a decrease.

The real question is, of course: how do we calculate the values ϕ(s,f)? The formula, like the Shapley values themselves, comes from Game Theory, and you can find it here, along with the mathematical properties of Shapley values. It can take some time to understand the concepts on which the formula is based, and the motivations behind it, but it is intuitively sound. Here we highlight two crucial considerations in the context of machine learning:

  • The theoretical properties of Shapley values are very desirable: efficiency (which is the formula above), symmetry (the Shapley values are calculated with no influence from the order of the features), additivity (Shapley values of ensemble models can be calculated intuitively) and dummy detection (features with no influence have a Shapley value of 0). Together, these properties are referred to as a fair payout. Obtaining a feature importance algorithm that constitutes a fair payout was a mathematically challenging goal that motivated the creation of Shapley values.
  • Even for small projects, it can be impractical to calculate the Shapley values. The calculation of a single Shapley value ϕ(s,f) requires exponentially many computations with respect to the number of features, and this calculation needs to be done for every sample and every feature.
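To make this cost concrete, here is a brute-force computation of ϕ(s,f) straight from the Game Theory formula. This is a sketch, not a production implementation: “missing” features are marginalized by averaging the model over a background dataset (one common choice), and the subset enumeration is exponential, so it is only feasible for a handful of features:

```python
import itertools
import math

import numpy as np

def shapley_value(model, x, f, background):
    """Exact Shapley value phi(x, f) by enumerating all feature subsets.

    'Missing' features are marginalized by averaging the model over the
    background dataset (the independent-features view).
    """
    n = len(x)
    others = [j for j in range(n) if j != f]

    def v(subset):
        # Value of a coalition: average model output with the features in
        # `subset` fixed to x and the rest drawn from the background data.
        data = background.copy()
        data[:, list(subset)] = x[list(subset)]
        return model(data).mean()

    phi = 0.0
    for size in range(len(others) + 1):          # exponential loop!
        for S in itertools.combinations(others, size):
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            phi += weight * (v(S + (f,)) - v(S))
    return phi

# Sanity check on a linear model, where the Shapley value of feature j
# is known in closed form: w_j * (x_j - mean(background_j)).
rng = np.random.default_rng(0)
w = np.array([2.0, -1.0, 0.5])
model = lambda X: X @ w
background = rng.normal(size=(100, 3))
x = np.array([1.0, 2.0, 3.0])

phi0 = shapley_value(model, x, 0, background)
print(phi0, w[0] * (x[0] - background[:, 0].mean()))
```

With 3 features this enumerates 4 subsets per feature; with 30 features it would be over 500 million, per feature, per sample.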


SHAP brings Shapley values from the computationally expensive to the practical realm. SHAP is a method to estimate Shapley values, and it comes with its own Python package that provides a set of visualizations to describe them (like the plot above). With this tool we can disclose the feature importance of the model. The mathematics behind these methods can be summarized as follows:

      • SHAP contains algorithms that focus on specific types of models, and often calculate approximations of Shapley values.

      • To calculate a Shapley value, the number of computations is no longer exponential in the number of features! The exact number of computations depends on the type of model being analysed.

      • SHAP works extremely well for tree-based models. The variant of SHAP that deals with trees (TreeSHAP) calculates exact Shapley values, and does so fast.

      • The visualizations can be very expressive; one of them is the force plot in the figure above. Perhaps the most thorough visualization provided is the following, which shows all the Shapley values.

In this plot, the features are ordered top to bottom by how large their contributions are, so that the most influential feature for our model is worst concave points. For each feature f, the plot shows the distribution of Shapley values ϕ(s,f) over all samples s in the training set (since we are using a tree-based model, SHAP values and Shapley values are the same). The value that each sample has for that feature is represented by the color.

Therefore, from the plot we can see that large values of worst concave points typically contribute to the probability of belonging to class 1 by around .1, and in some cases by up to .2. We say typically because, as we have emphasized, the impact of a feature’s value depends on the whole sample, which is why we still see some red dots on the left side of worst concave points. Depending on the project, these cases may be worth looking into further.

Remember from the force plot that the base value for our model is .3935. So if a high value of worst concave points usually increases a sample’s output value by .1 or .2, then for a sample with a high value of worst concave points the expected output value would typically rise from .3935 to around .55 (if no extra information is given about the values of the other features).
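The arithmetic behind that estimate is just the base value plus the typical contribution:

```python
base = 0.3935          # average model output over the training set
low, high = 0.1, 0.2   # typical contribution of a high worst concave points

# Expected output for such a sample, absent any other information:
print(base + low, base + high)   # roughly .49 to .59, i.e. around .55
```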

Final note

So that’s how you can interpret the plots above and better understand your algorithm. I would like to add two warnings about interpretation:

      • For tree-based algorithms, drastically different models can produce very similar results; therefore, two tree-based models with very similar performance can have very different Shapley values. If the algorithm used has a random component, I suggest building several models and comparing the Shapley values obtained for each one. That way, you will be able to tell whether a variable is influential in every model.

      • Shapley values do not necessarily explain causation. Observe the correlations among the variables before drawing conclusions from the plots. A quick check of a correlation map can shed light on the real causal variables.

So remember, the next time you want to describe how your algorithm works, be sharp about it. Be SHAP.

The code can be found here.