Paper Title

PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR

Authors

Yiwen Shao, Yiming Wang, Daniel Povey, Sanjeev Khudanpur

Abstract

We present PyChain, a fully parallelized PyTorch implementation of end-to-end lattice-free maximum mutual information (LF-MMI) training for the so-called "chain models" in the Kaldi automatic speech recognition (ASR) toolkit. Unlike other PyTorch- and Kaldi-based ASR toolkits, PyChain is designed to be as flexible and lightweight as possible so that it can be easily plugged into new ASR projects, or other existing PyTorch-based ASR tools, as exemplified respectively by a new project, PyChain-example, and by Espresso, an existing end-to-end ASR toolkit. PyChain's efficiency and flexibility are demonstrated through such novel features as full GPU training on numerator/denominator graphs and support for unequal-length sequences. Experiments on the WSJ dataset show that, with simple neural networks and commonly used machine learning techniques, PyChain can achieve competitive results that are comparable to Kaldi and better than other end-to-end ASR systems.
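
For readers unfamiliar with the objective behind the numerator/denominator graphs mentioned in the abstract, the following is a brief background sketch of the standard LF-MMI criterion (general background, not text or notation taken from the paper itself): each utterance has its own numerator graph encoding the transcription supervision, while a single denominator graph derived from a phone-level language model approximates the sum over all competing label sequences.

```latex
% Standard LF-MMI training objective (background sketch; notation is generic,
% not quoted from the PyChain paper).
% x_u      : acoustic feature sequence of utterance u
% G_num^u  : utterance-specific numerator graph (encodes the supervision)
% G_den    : shared denominator graph (phone-level language model)
\mathcal{F}_{\mathrm{LF\text{-}MMI}} =
  \sum_{u} \left( \log p\!\left(\mathbf{x}_u \mid \mathcal{G}^{u}_{\mathrm{num}}\right)
                - \log p\!\left(\mathbf{x}_u \mid \mathcal{G}_{\mathrm{den}}\right) \right)
```

Both terms are evaluated with the forward algorithm over the corresponding graph, and the gradient with respect to the network outputs reduces to the difference between numerator and denominator state-occupancy posteriors, which is the quantity a full-GPU implementation such as the one described above computes in parallel across a batch of (possibly unequal-length) sequences.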
