Paper Title

Natural Language to Code Generation in Interactive Data Science Notebooks

Authors

Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

Abstract

Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions.
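To make the task concrete, here is a hypothetical sketch of the kind of multi-turn NL-to-pandas problem the abstract describes: each natural language intent is answered with code that builds on the notebook's prior cells. The dataframe, column names, and variables below are illustrative assumptions, not examples taken from the ARCADE benchmark.

```python
import pandas as pd

# Hypothetical notebook context: a toy sales dataframe (illustrative only,
# not drawn from ARCADE).
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "sales": [100, 250, 300, 200],
})

# NL intent (turn 1): "What are the total sales per region?"
totals = df.groupby("region")["sales"].sum()

# NL intent (turn 2, building on the previous turn's result):
# "Which region has the highest total sales?"
top_region = totals.idxmax()
print(top_region)  # -> West
```

The second turn only makes sense given the execution state left by the first (`totals`), which is the kind of rich multi-turn, multi-modal context the benchmark requires models to track.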
