Venue
Neural Information Processing Systems
Domain
machine learning, artificial intelligence
Understanding why a model made a certain prediction is crucial in many applications. However, with large modern datasets the best accuracy is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, a variety of methods have recently been proposed to help users interpret the predictions of complex models. Here, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which assigns each feature an importance value for a particular prediction. The key novel components of the SHAP framework are the identification of a new class of additive feature importance measures and theoretical results showing that there is a unique solution in this class with a set of desirable properties. This class unifies six existing methods, which is notable because several recent methods in the class lack these desirable properties; our framework can therefore inform the development of new methods for explaining model predictions. Building on this unification, we present new estimation methods and demonstrate that they offer better computational performance and closer agreement with human intuition than existing approaches.
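For reference, the class of additive explanation models and the unique attribution it admits can be written as follows (paraphrased from the paper's definitions; M is the number of simplified input features, z' ∈ {0,1}^M a coalition vector, x' the simplified input, and f_x the model evaluated with only the features present in the coalition):

```latex
% Additive feature attribution explanation model
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i, \qquad z' \in \{0,1\}^M

% SHAP values: the unique attribution in this class satisfying
% local accuracy, missingness, and consistency
\phi_i(f, x) = \sum_{z' \subseteq x'} \frac{|z'|!\,(M - |z'| - 1)!}{M!}
  \left[ f_x(z') - f_x(z' \setminus i) \right]
```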
The paper presents a unified framework for interpreting model predictions, SHAP (SHapley Additive exPlanations), which addresses the challenge of making complex model predictions interpretable. The authors argue that while complex models such as ensembles and deep neural networks can achieve high accuracy, they often lack interpretability, which is crucial for user trust, particularly in fields like medicine. Their framework unifies six existing feature importance attribution methods and shows that this class admits a unique solution satisfying three desirable properties: local accuracy, missingness, and consistency. The paper introduces SHAP values as a unified measure of feature importance and proposes several new methods for estimating them that outperform existing methods in computational efficiency and alignment with human intuition, as validated through user studies. The authors also emphasize the implications of their framework for the development of future explanation methods in machine learning.
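To make the attribution concrete, here is a minimal brute-force sketch of exact Shapley attributions for a toy model. This is not the authors' code: it handles "missing" features by substituting a fixed background point (one of several conventions), and its cost grows exponentially with the number of features.

```python
# Brute-force Shapley attributions for a small model (illustrative sketch).
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of f at point x, with absent features
    replaced by a fixed background reference (exponential in len(x))."""
    M = len(x)

    def value(subset):
        # Evaluate f with features in `subset` taken from x, the rest from background.
        z = [x[i] if i in subset else background[i] for i in range(M)]
        return f(z)

    phis = []
    for i in range(M):
        others = [j for j in range(M) if j != i]
        phi = 0.0
        for k in range(M):
            for S in combinations(others, k):
                weight = factorial(len(S)) * factorial(M - len(S) - 1) / factorial(M)
                phi += weight * (value(set(S) | {i}) - value(set(S)))
        phis.append(phi)
    return phis

# Toy usage: a linear model with one interaction term.
f = lambda z: 2 * z[0] + 3 * z[1] + z[0] * z[2]
print(shapley_values(f, x=[1.0, 1.0, 1.0], background=[0.0, 0.0, 0.0]))
```

The attributions sum to f(x) minus f(background), which is the local accuracy property under this convention.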
This paper employs the following methods:
- SHAP
- LIME
- DeepLIFT
- Layer-wise relevance propagation
- Shapley regression values
- Shapley sampling values
- Quantitative Input Influence
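Among the listed methods, the paper connects LIME and Shapley values through Kernel SHAP: with the loss weighting below (the Shapley kernel), LIME-style weighted linear regression recovers SHAP values. This is a restatement of the paper's Theorem 2, using h_x to map coalition vectors back to the input space:

```latex
% Shapley kernel: regression weight for a coalition z' with |z'| nonzero entries
\pi_{x'}(z') = \frac{M - 1}{\binom{M}{|z'|}\, |z'|\, (M - |z'|)}

% Weighted squared loss minimized over the explanation model g (with no regularization)
L(f, g, \pi_{x'}) = \sum_{z' \in Z} \bigl[ f(h_x(z')) - g(z') \bigr]^2 \, \pi_{x'}(z')
```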
The following datasets were used in this research:
- None specified
The key findings of this research are:
- SHAP values provide a unique measure of feature importance that satisfies local accuracy, missingness, and consistency.
- Several new estimation methods for SHAP values show better computational performance and greater consistency with human intuition compared to existing methods.
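As a usage illustration of the model-agnostic estimator (Kernel SHAP), here is a minimal sketch using the authors' open-source shap Python package; the model, data, and sample sizes are placeholders rather than choices from the paper.

```python
# Minimal Kernel SHAP usage sketch (placeholder model and data).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 4))                # toy feature matrix
y = 2 * X[:, 0] + X[:, 1] ** 2          # toy target
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Kernel SHAP estimates SHAP values via weighted linear regression over
# sampled feature coalitions, using a background set to simulate "missing" features.
explainer = shap.KernelExplainer(model.predict, X[:25])
phi = explainer.shap_values(X[:5], nsamples=200)
print(np.array(phi).shape)              # one attribution per feature per explained row
```

For each explained row, the attributions sum approximately to the model output minus the explainer's expected value, reflecting the local accuracy property.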
The authors identified the following limitations:
- Exact computation of SHAP values is computationally challenging, since the cost grows exponentially with the number of features.
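A rough count illustrating this limitation (assuming one model evaluation per feature coalition): the exact Shapley sum ranges over all subsets of the M features, so

```latex
2^{M} \ \text{evaluations of } f, \qquad 2^{20} \approx 10^{6}, \qquad 2^{30} \approx 10^{9},
```

which motivates the approximation methods the paper proposes, such as Kernel SHAP and Deep SHAP.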
- Number of GPUs: None specified
- GPU Type: None specified
model interpretability
SHAP
explainability
feature importance
machine learning models