WindowSHAP: An Efficient Framework for Explaining Time-series Classifiers based on Shapley Values

Paper Title

WindowSHAP: An Efficient Framework for Explaining Time-series Classifiers based on Shapley Values

Authors

Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Brandon Foreman, Vignesh Subbian

Abstract

Unpacking and comprehending how black-box machine learning algorithms make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful in high-stakes clinical applications, where the behavior of a prediction model must be understood. However, existing approaches to explaining such models are often specific to data whose features have no time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data and to improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms, Stationary, Sliding, and Dynamic WindowSHAP, each evaluated against the baseline approaches KernelSHAP and TimeSHAP using perturbation and sequence analysis metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury, TBI) and a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80% compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers higher-quality, more understandable explanations.
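To make the windowing idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a univariate series, fixed-length (stationary) windows, and a baseline series that stands in for "absent" windows. It also swaps in exact Shapley enumeration over windows for readability, whereas the abstract describes comparisons against approximation methods such as KernelSHAP. All function names here (`window_slices`, `mask_series`, `window_shapley`) are hypothetical.

```python
from itertools import combinations
from math import factorial


def window_slices(n_steps, window_len):
    """Split range(n_steps) into consecutive fixed-length windows.

    The last window may be shorter if window_len does not divide n_steps.
    """
    return [range(start, min(start + window_len, n_steps))
            for start in range(0, n_steps, window_len)]


def mask_series(x, baseline, windows, present):
    """Keep the time steps of the windows in `present`; use baseline elsewhere."""
    out = list(baseline)
    for w in present:
        for t in windows[w]:
            out[t] = x[t]
    return out


def window_shapley(model, x, baseline, window_len):
    """Exact Shapley value of each time window (illustrative assumption).

    Each window is treated as one player, so the cost is exponential in the
    number of windows rather than in the number of time steps.
    """
    windows = window_slices(len(x), window_len)
    n = len(windows)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = model(mask_series(x, baseline, windows, subset + (i,)))
                without_i = model(mask_series(x, baseline, windows, subset))
                phi[i] += weight * (with_i - without_i)
    return phi


# Toy usage: a "classifier score" that depends only on the last 4 time steps.
toy_model = lambda series: sum(series[-4:])
x = [float(v) for v in range(1, 13)]   # 12 time steps
baseline = [0.0] * len(x)
phi = window_shapley(toy_model, x, baseline, window_len=4)
print(phi)  # one attribution per window; only the last window matters here
```

In this toy case only the last window influences the model, so it receives essentially all of the attribution. The same grouping explains the complexity reduction mentioned above: with 120 time steps and windows of 10 adjacent points, each variable contributes 12 players instead of 120.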
