Paper title
SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning
Authors
Abstract
Testing video games is an increasingly difficult task, as traditional methods fail to scale with growing software systems. Manual testing is very labor-intensive and therefore quickly becomes cost-prohibitive. Automated testing with scripts is affordable, but scripts are ineffective in non-deterministic environments, and knowing when to run each test is another problem altogether. The complexity and scope of modern games, along with player expectations, are increasing rapidly, and quality control accounts for a large portion of production cost and delivery risk. Reducing this risk while still making production happen is currently a major challenge for the industry. To keep production costs realistic up to and after release, we focus on preventive quality assurance tactics alongside testing and data analysis automation. We present SUPERNOVA (Selection of tests and Universal defect Prevention in External Repositories for Novel Objective Verification of software Anomalies), a system responsible for test selection and defect prevention that also functions as an automation hub. By integrating data analysis functionality with machine learning and deep learning capabilities, SUPERNOVA assists quality assurance testers in finding bugs and developers in reducing defects, which improves stability during the production cycle and keeps testing costs under control. The observed direct impact is a reduction of 55% or more in testing hours for an undisclosed, shipped sports game title that used these test selection optimizations. Furthermore, using risk scores generated by a semi-supervised machine learning model, we can predict whether a change-list is bug-inducing with 71% precision and 77% recall, and provide a detailed breakdown of this inference to developers. These efforts improve workflow and reduce the testing hours required for game titles in development.