Paper Title

Can You Read Me Now? Content Aware Rectification using Angle Supervision

Authors

Amir Markovitz, Inbal Lavi, Or Perel, Shai Mazor, Roee Litman

Abstract

The ubiquity of smartphone cameras has led to more and more documents being captured by cameras rather than scanned. Unlike flatbed scans, photographed documents are often folded and crumpled, resulting in large local variance in text structure. Document rectification is fundamental to the Optical Character Recognition (OCR) process on documents, and a rectifier's ability to overcome geometric distortions significantly affects recognition accuracy. Despite great progress in recent OCR systems, most still rely on a preprocessing step that ensures the text lines are straight and axis-aligned. Recent works have tackled the problem of rectifying document images taken in the wild using various supervision signals and alignment means. However, they focused on global features that can be extracted from the document's boundaries, ignoring various signals that could be obtained from the document's content. We present CREASE: Content Aware Rectification using Angle Supervision, the first learned method for document rectification that relies on the document's content, the location of the words and specifically their orientation, as hints to assist in the rectification process. We utilize a novel pixel-wise angle regression approach and a curvature estimation side-task for optimizing our rectification model. Our method surpasses previous approaches in terms of OCR accuracy, geometric error and visual similarity.
