Automating Code Review Activities by Large-Scale Pre-training

Paper Title

Automating Code Review Activities by Large-Scale Pre-training

Authors

Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan

Abstract

Code review is an essential part of the software development lifecycle because it aims to guarantee code quality. Modern code review requires developers to view, understand, and even run programs to assess logic, functionality, latency, style, and other factors. As a result, developers spend far too much time reviewing their peers' code, and there is significant demand for automating the code review process. In this research, we focus on applying pre-training techniques to the tasks in the code review scenario. We collect a large-scale dataset of real-world code changes and code reviews from open-source projects in nine of the most popular programming languages. To better understand code diffs and reviews, we propose CodeReviewer, a pre-trained model that uses four pre-training tasks tailored specifically to the code review scenario. To evaluate our model, we focus on three key tasks related to code review activities: code change quality estimation, review comment generation, and code refinement. Furthermore, we establish a high-quality benchmark dataset for these three tasks based on our collected data and conduct comprehensive experiments on it. The experimental results demonstrate that our model outperforms the previous state-of-the-art pre-training approaches on all three tasks. Further analysis shows that our proposed pre-training tasks and the multilingual pre-training dataset both help the model understand code changes and reviews.
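All three tasks named in the abstract take a code change (a diff) as their starting point. As a rough illustration only, the sketch below splits a single unified-diff hunk into its old and new versions. The `DiffHunk` type and `parse_hunk` function are hypothetical names introduced here, and the input is assumed to be one hunk without `---`/`+++` file headers; none of this is taken from the CodeReviewer implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiffHunk:
    """Hypothetical container: a code change as its old and new line lists."""
    old_lines: List[str]
    new_lines: List[str]


def parse_hunk(hunk_text: str) -> DiffHunk:
    """Split one unified-diff hunk into old and new versions.

    Assumes a single hunk with no '---'/'+++' file header lines.
    """
    old_lines: List[str] = []
    new_lines: List[str] = []
    for line in hunk_text.splitlines():
        if line.startswith("@@"):
            # Hunk header (line ranges); carries no code.
            continue
        if line.startswith("-"):
            # Removed line: present only in the old version.
            old_lines.append(line[1:])
        elif line.startswith("+"):
            # Added line: present only in the new version.
            new_lines.append(line[1:])
        else:
            # Context line: present in both versions.
            text = line[1:] if line.startswith(" ") else line
            old_lines.append(text)
            new_lines.append(text)
    return DiffHunk(old_lines, new_lines)


example = "@@ -1,2 +1,2 @@\n def add(a, b):\n-    return a - b\n+    return a + b\n"
hunk = parse_hunk(example)
print("\n".join(hunk.old_lines))
print("\n".join(hunk.new_lines))
```

A model for any of the three tasks would still need its own encoding of such a change; this sketch covers only the shared input, not the model or its pre-training tasks.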
