Venue
Neural Information Processing Systems
Domain
machine learning, artificial intelligence
Understanding why a model made a certain prediction is crucial in many applications. However, with large modern datasets the best accuracy is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, a variety of methods have recently been proposed to help users interpret the predictions of complex models. Here, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which assigns each feature an importance value for a particular prediction. The key novel components of the SHAP framework are the identification of a new class of additive feature importance measures and theoretical results showing that there is a unique solution in this class with a set of desirable properties. This class unifies six existing methods, which is notable because several recent methods in the class lack these desirable properties; our framework can therefore inform the development of new methods for explaining model predictions. Building on this unification, we present new estimation methods and demonstrate that they offer better computational performance and closer agreement with human intuition than existing approaches.
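For reference, the class of additive explanation models and the unique attribution it admits can be written as follows (paraphrased from the paper's definitions; M is the number of simplified input features, z' ∈ {0,1}^M a coalition vector, x' the simplified input, and f_x the model evaluated with only the features present in the coalition):

```latex
% Additive feature attribution explanation model
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i, \qquad z' \in \{0,1\}^M

% SHAP values: the unique attribution in this class satisfying
% local accuracy, missingness, and consistency
\phi_i(f, x) = \sum_{z' \subseteq x'} \frac{|z'|!\,(M - |z'| - 1)!}{M!}
  \left[ f_x(z') - f_x(z' \setminus i) \right]
```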
The paper presents a unified framework for interpreting model predictions, SHAP (SHapley Additive exPlanations), which addresses the challenge of making complex model predictions interpretable. The authors argue that while complex models such as ensembles and deep neural networks can achieve high accuracy, they often lack interpretability, which is crucial for user trust, particularly in fields like medicine. Their framework unifies six existing feature importance attribution methods and shows that this class admits a unique solution satisfying three desirable properties: local accuracy, missingness, and consistency. The paper introduces SHAP values as a unified measure of feature importance and proposes several new methods for estimating them that outperform existing methods in computational efficiency and alignment with human intuition, as validated through user studies. The authors also emphasize the implications of their framework for the development of future explanation methods in machine learning.
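To make the attribution concrete, here is a minimal brute-force sketch of exact Shapley attributions for a toy model. This is not the authors' code: it handles "missing" features by substituting a fixed background point (one of several conventions), and its cost grows exponentially with the number of features.

```python
# Brute-force Shapley attributions for a small model (illustrative sketch).
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of f at point x, with absent features
    replaced by a fixed background reference (exponential in len(x))."""
    M = len(x)

    def value(subset):
        # Evaluate f with features in `subset` taken from x, the rest from background.
        z = [x[i] if i in subset else background[i] for i in range(M)]
        return f(z)

    phis = []
    for i in range(M):
        others = [j for j in range(M) if j != i]
        phi = 0.0
        for k in range(M):
            for S in combinations(others, k):
                weight = factorial(len(S)) * factorial(M - len(S) - 1) / factorial(M)
                phi += weight * (value(set(S) | {i}) - value(set(S)))
        phis.append(phi)
    return phis

# Toy usage: a linear model with one interaction term.
f = lambda z: 2 * z[0] + 3 * z[1] + z[0] * z[2]
print(shapley_values(f, x=[1.0, 1.0, 1.0], background=[0.0, 0.0, 0.0]))
```

The attributions sum to f(x) minus f(background), which is the local accuracy property under this convention.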
This paper employs the following methods:
- SHAP
- LIME
- DeepLIFT
- Layer-wise relevance propagation
- Shapley regression values
- Shapley sampling values
- Quantitative Input Influence
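Among the listed methods, the paper connects LIME and Shapley values through Kernel SHAP: with the loss weighting below (the Shapley kernel), LIME-style weighted linear regression recovers SHAP values. This is a restatement of the paper's Theorem 2, using h_x to map coalition vectors back to the input space:

```latex
% Shapley kernel: regression weight for a coalition z' with |z'| nonzero entries
\pi_{x'}(z') = \frac{M - 1}{\binom{M}{|z'|}\, |z'|\, (M - |z'|)}

% Weighted squared loss minimized over the explanation model g (with no regularization)
L(f, g, \pi_{x'}) = \sum_{z' \in Z} \bigl[ f(h_x(z')) - g(z') \bigr]^2 \, \pi_{x'}(z')
```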
The following datasets were used in this research:
- None specified
The key findings of this research are:
- SHAP values provide a unique measure of feature importance that satisfies local accuracy, missingness, and consistency.
- Several new estimation methods for SHAP values show better computational performance and greater consistency with human intuition compared to existing methods.
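As a usage illustration of the model-agnostic estimator (Kernel SHAP), here is a minimal sketch using the authors' open-source shap Python package; the model, data, and sample sizes are placeholders rather than choices from the paper.

```python
# Minimal Kernel SHAP usage sketch (placeholder model and data).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 4))                # toy feature matrix
y = 2 * X[:, 0] + X[:, 1] ** 2          # toy target
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Kernel SHAP estimates SHAP values via weighted linear regression over
# sampled feature coalitions, using a background set to simulate "missing" features.
explainer = shap.KernelExplainer(model.predict, X[:25])
phi = explainer.shap_values(X[:5], nsamples=200)
print(np.array(phi).shape)              # one attribution per feature per explained row
```

For each explained row, the attributions sum approximately to the model output minus the explainer's expected value, reflecting the local accuracy property.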
The authors identified the following limitations:
- Exact computation of SHAP values is computationally challenging, since the cost grows exponentially with the number of features.
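A rough count illustrating this limitation (assuming one model evaluation per feature coalition): the exact Shapley sum ranges over all subsets of the M features, so

```latex
2^{M} \ \text{evaluations of } f, \qquad 2^{20} \approx 10^{6}, \qquad 2^{30} \approx 10^{9},
```

which motivates the approximation methods the paper proposes, such as Kernel SHAP and Deep SHAP.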
- Number of GPUs: None specified
- GPU Type: None specified
model interpretability
SHAP
explainability
feature importance
machine learning models