SAIBench: Benchmarking AI for Science

Paper title

SAIBench: Benchmarking AI for Science

Authors

Li, Yatao; Zhan, Jianfeng

Abstract

Scientific research communities are embracing AI-based solutions to target tractable scientific tasks and improve research workflows. However, the development and evaluation of such solutions are scattered across multiple disciplines. We formalize the problem of scientific AI benchmarking, and propose a system called SAIBench in the hope of unifying the efforts and enabling low-friction on-boarding of new disciplines. The system approaches this goal with SAIL, a domain-specific language to decouple research problems, AI models, ranking criteria, and software/hardware configuration into reusable modules. We show that this approach is flexible and can adapt to problems, AI models, and evaluation methods defined in different perspectives. The project homepage is https://www.computercouncil.org/SAIBench

Paper title