Residual Recurrent CRNN for End-to-End Optical Music Recognition on Monophonic Scores

Paper Title

Residual Recurrent CRNN for End-to-End Optical Music Recognition on Monophonic Scores

Authors

Liu, Aozhi; Zhang, Lipei; Mei, Yaqi; Han, Baoqiang; Cai, Zifeng; Zhu, Zhaohua; Xiao, Jing

Abstract

One of the challenges of the Optical Music Recognition task is to transcribe the symbols in camera-captured images into digital music notation. The previous end-to-end model, developed as a Convolutional Recurrent Neural Network, does not exploit sufficient contextual information from full scales, and there is still considerable room for improvement. We propose an innovative framework that combines a Residual Recurrent Convolutional Neural Network block with a recurrent Encoder-Decoder network to map an image to the sequence of monophonic music symbols corresponding to the notation it contains. The Residual Recurrent Convolutional block improves the model's ability to capture rich contextual information. The experimental results are benchmarked on a publicly available dataset called CAMERA-PRIMUS and demonstrate that our approach surpasses the state-of-the-art end-to-end method based on a Convolutional Recurrent Neural Network.
