Understanding Shapley Feature Importance
New algorithms keep arriving that push the accuracy of data science models ever higher, but on the flip side, explaining those models to the business becomes very tricky.
If we are using a linear model, then we can at the bare minimum explain that for every unit change in x there is a beta change in y, where x is the independent variable, y is the dependent variable and beta is the coefficient.
For tree-based models, explainability is more difficult still: the algorithm finds the best split and branches out from there, so we have no coefficients to show business people like we do in a linear model. So how do we tackle this problem?
Shap values, or shapley values, come to our rescue here.
Shap is a package that can be installed through pip or conda. To keep this article focused, I will concentrate on shap's interpretations alone.
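For reference, the plot snippets later in this article assume a minimal setup along the following lines; the random forest and the toy dataset are illustrative stand-ins, not from a real project:

```python
# pip install shap   (or: conda install -c conda-forge shap)
import shap
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy data and model -- purely illustrative stand-ins.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])
model = RandomForestRegressor(random_state=0).fit(X, y)

# TreeExplainer computes shap values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_instances, n_features)
```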
As a data scientist or consultant, the questions that come to mind are: what are shapley values? How are they calculated? How do we interpret them?
In short, shap values are calculated with the following logic: to find the importance of a feature f, we take N random samples and compare the model prediction with feature f against the prediction without feature f.
Shap value of f = prediction with f - prediction without f
The detailed procedure for calculating a shap value is as follows:
Shap takes a model M and data X. From X an instance x is selected along with a feature f, and a loop of N iterations is run. In each iteration, a random instance z is drawn from X and a random ordering of the features is chosen. Two hybrid instances are built: one keeps the values of x for feature f and every feature before it in the ordering, taking the remaining values from z; the other is identical except that feature f also takes its value from z. The two instances are the same except for the fth feature.
Use model M to predict on both instances, take the difference (prediction with f minus prediction without f), and average these differences over the N iterations. That average is the shap value of feature f for instance x.
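To make the procedure concrete, here is a rough numpy sketch of that sampling loop; the function name and details are my own illustration of the idea (in the spirit of Strumbelj and Kononenko's sampling method), not the shap package's actual internals:

```python
import numpy as np

def approx_shap_value(model, X, x, f, n_iter=100, seed=0):
    """Monte Carlo estimate of the shap value of feature f for instance x.

    Illustrative only: this mirrors the sampling procedure described
    above, not the shap package's internal implementation.
    """
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    total = 0.0
    for _ in range(n_iter):
        z = X[rng.integers(n)]             # random background instance
        order = rng.permutation(d)         # random feature ordering
        pos = np.flatnonzero(order == f)[0]
        x_with = np.array(x, dtype=float)
        # Features after f in the ordering take their values from z.
        x_with[order[pos + 1:]] = z[order[pos + 1:]]
        x_without = x_with.copy()
        x_without[f] = z[f]                # now also replace f itself
        total += (model.predict(x_with.reshape(1, -1))[0]
                  - model.predict(x_without.reshape(1, -1))[0])
    return total / n_iter
```

For instance, approx_shap_value(model, X, X.to_numpy()[0], f=0) would estimate the first feature's contribution to the prediction for the first instance.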
Now that we have shap values, the shap package offers several plots: the shap feature importance plot, the shap summary plot, the shap force plot and the shap dependence plot.
1. The shap feature importance plot shows each feature's mean absolute shap value, arranged in descending order; a high absolute value marks a more important feature.
In that plot, the first feature has the highest absolute shap value and is hence regarded as the most important feature.
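Assuming the setup sketched earlier, this plot is a one-liner; plot_type="bar" aggregates shap values to their mean absolute value per feature:

```python
# Bar chart of mean absolute shap values, sorted descending.
shap.summary_plot(shap_values, X, plot_type="bar")
```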
2. The shap summary plot combines the feature importance ranking with the shap values for each instance.
Features are ranked in order of importance, and each point is the shap value for one feature and one instance. For the first feature, the plot shows that as the feature value increases, the predicted probability is higher.
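The summary plot is the same call without the bar option:

```python
# Beeswarm: one dot per instance and feature, coloured by feature value.
shap.summary_plot(shap_values, X)
```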
3. The shap dependence plot shows one feature's values against its shap values. In the example plot, the lower the feature value, the higher the chance of predicting the target.
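With the toy setup from earlier, a dependence plot for a single feature (feature_0 is just the illustrative column name) looks like this:

```python
# Shap value of one feature plotted against its raw value.
shap.dependence_plot("feature_0", shap_values, X)
```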
4. The shap force plot, stacked over all instances, clubs together every instance and its shap values. The x-axis indexes the instances and the y-axis shows the shap values; red indicates instances with higher predicted values and blue indicates instances with lower predicted values.
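The stacked force plot can be reproduced as follows; shap.initjs() loads the JavaScript the interactive plot needs in a notebook:

```python
shap.initjs()  # enable the interactive JS visualisation in a notebook
shap.force_plot(explainer.expected_value, shap_values, X)
```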
Please forgive me for masking the feature names, as they are confidential. Please clap if you like my explanation.