Paper title
Recognition of Handwritten Chinese Text by Segmentation: A Segment-annotation-free Approach
Paper authors
Abstract
Online and offline handwritten Chinese text recognition (HCTR) has been studied for decades. Early methods adopted oversegmentation-based strategies but suffered from low speed, insufficient accuracy, and the high cost of character segmentation annotations. Recently, segmentation-free methods based on connectionist temporal classification (CTC) and attention mechanisms have dominated the field of HCTR. However, people actually read text character by character, especially for ideograms such as Chinese. This raises the question: are segmentation-free strategies really the best solution to HCTR? To explore this issue, we propose a new segmentation-based method for recognizing handwritten Chinese text that is implemented using a simple yet efficient fully convolutional network. A novel weakly supervised learning method is proposed to enable the network to be trained using only transcript annotations; thus, the expensive character segmentation annotations required by previous segmentation-based methods can be avoided. Owing to the lack of context modeling in fully convolutional networks, we propose a contextual regularization method to integrate contextual information into the network during the training stage, which can further improve the recognition performance. Extensive experiments conducted on four widely used benchmarks, namely CASIA-HWDB, CASIA-OLHWDB, ICDAR2013, and SCUT-HCCDoc, show that our method significantly surpasses existing methods on both online and offline HCTR, and exhibits a considerably higher inference speed than CTC/attention-based approaches.