Characterizing and Understanding Software Security Vulnerabilities in Machine Learning Libraries

Paper Title

Characterizing and Understanding Software Security Vulnerabilities in Machine Learning Libraries

Authors

Nima Shiri Harzevili, Jiho Shin, Junjie Wang, Song Wang

Abstract

The use of machine learning (ML) libraries has increased tremendously in many domains, including autonomous driving systems, healthcare, and critical industries. Vulnerabilities in such libraries can have irreparable consequences. However, the characteristics of software security vulnerabilities in these libraries have not been well studied. In this paper, to bridge this gap, we take the first step towards characterizing and understanding the security vulnerabilities of five well-known ML libraries: TensorFlow, PyTorch, scikit-learn, Pandas, and NumPy. To do so, we collected a total of 596 security-related commits to explore five major factors: 1) vulnerability types, 2) root causes, 3) symptoms, 4) fixing patterns, and 5) fixing efforts of security vulnerabilities in ML libraries. The findings of this study can help developers better understand software security vulnerabilities across different ML libraries and gain better insight into their weaknesses. To make our findings actionable, we further developed DeepMut, an automated mutation testing tool, as a proof-of-concept application of our findings. DeepMut is designed to assess the adequacy of existing test suites of ML libraries against security-aware mutation operators extracted from the vulnerabilities studied in this work. We applied DeepMut to the TensorFlow kernel module and found more than 1k alive mutants not considered by the existing test suites. The results demonstrate the usefulness of our findings.
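The general idea of mutation testing with a security-aware operator can be illustrated with a minimal sketch. This is not DeepMut's implementation, and the abstract does not describe its operators in detail; the "remove bounds check" operator and every name below (safe_get, mutant_get, weak_suite, strong_suite, mutant_survives) are hypothetical and chosen only for illustration. A mutant is "alive" when the test suite still passes after the mutation is applied.

```python
def safe_get(values, index):
    """Original code: validates the index before accessing the list."""
    if index < 0 or index >= len(values):
        raise IndexError("index out of range")
    return values[index]


def mutant_get(values, index):
    """Hypothetical mutant: the bounds check has been removed."""
    return values[index]


def weak_suite(fn):
    """Test suite that only exercises a valid index."""
    return fn([10, 20, 30], 1) == 20


def strong_suite(fn):
    """Test suite that also requires a negative index to be rejected."""
    if fn([10, 20, 30], 1) != 20:
        return False
    try:
        fn([10, 20, 30], -1)
    except IndexError:
        return True
    return False


def mutant_survives(suite, mutant):
    """A mutant is alive if the suite still passes when run against it."""
    return suite(mutant)


if __name__ == "__main__":
    print("weak suite, mutant alive:", mutant_survives(weak_suite, mutant_get))
    print("strong suite, mutant alive:", mutant_survives(strong_suite, mutant_get))
```

In this sketch the weak suite leaves the mutant alive because it never probes an invalid index, while the strong suite kills it. An alive mutant of this kind points to a missing test for the corresponding validation check.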
