Paper Title

Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization

Authors

Jiachen Lian, Alan W. Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

Abstract

Articulatory representation learning is fundamental research in modeling the neural speech production system. Our previous work established a deep paradigm that decomposes articulatory kinematics data into gestures, which explicitly model the phonological and linguistic structure encoded in the human speech production mechanism, and corresponding gestural scores. We continue this line of work by raising two concerns: (1) the articulators are entangled in the original algorithm, so that some articulators do not leverage effective movement patterns, which limits the interpretability of both gestures and gestural scores; (2) EMA data is sparsely sampled from the articulators, which limits the intelligibility of the learned representations. In this work, we propose a novel articulatory representation decomposition algorithm that takes advantage of guided factor analysis to derive articulator-specific factors and factor scores. A neural convolutive matrix factorization algorithm is then applied to the factor scores to derive the new gestures and gestural scores. We experiment with the rtMRI corpus, which captures fine-grained vocal tract contours. Both subjective and objective evaluation results suggest that the newly proposed system delivers articulatory representations that are intelligible, generalizable, efficient, and interpretable.
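As a rough illustration of what a convolutive matrix factorization models, the sketch below shows only the linear reconstruction step: a data matrix is approximated by summing time-shifted products of gesture templates and a gestural score. It is not the authors' neural implementation and omits the guided factor analysis stage entirely. The function names (`shift_right`, `conv_reconstruct`), the array shapes, and the toy dimensions are all assumptions made for this example.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t frames, zero-padding on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def conv_reconstruct(W, H):
    """Convolutive reconstruction: X_hat = sum_t W[t] @ shift_right(H, t).

    W: (T, D, K) array, K gesture templates spanning T frames over D channels
       (hypothetical layout chosen for this sketch).
    H: (K, N) array, activation of each gesture over N frames (gestural score).
    Returns an array of shape (D, N).
    """
    T = W.shape[0]
    return sum(W[t] @ shift_right(H, t) for t in range(T))

# Toy example: one gesture activated once, at frame 2.
rng = np.random.default_rng(0)
D, K, T, N = 4, 2, 3, 10
W = rng.standard_normal((T, D, K))
H = np.zeros((K, N))
H[0, 2] = 1.0
X_hat = conv_reconstruct(W, H)
```

With a single unit activation at frame 2, the reconstruction is zero everywhere except frames 2 through 4, where it reproduces the three frames of gesture 0. This is the sense in which a gestural score records when each gesture fires.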
