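Many time delay estimation algorithms for microphone arrays degrade in noisy and reverberant environments. This paper proposes a time delay estimation algorithm based on multiple linear prediction (MLP). Transfer function ratio estimation removes the asymmetry between the channel transfer functions and raises the correlation between the signals; spatial prediction brings in the redundant information of the array, and the correlation coefficient matrix serves as the objective function for the delay search, which makes the delay estimate more reliable. Experimental results show that the MLP algorithm estimates delays more accurately and more robustly. Compared with several classical algorithms, the MLP algorithm raises the rate of correct estimates by more than 5% and 30% in noisy and reverberant environments, respectively.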
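The idea of searching for a delay with the correlation coefficient matrix as the objective can be illustrated with a minimal sketch. It is not the MLP algorithm of the abstract: it omits transfer function ratio estimation and linear prediction, and it assumes a simplified model in which microphone k receives the source delayed by k times a common integer number of samples. Taking the determinant of the correlation coefficient matrix as the quantity to minimize is also an assumption, since the abstract does not state the exact criterion. All function names and parameters are hypothetical.

```python
import random


def simulate_array(n_mics, true_delay, n_samples, max_delay, noise_std, seed=0):
    """Synthetic data (assumption): mic k hears the source delayed by k * true_delay samples."""
    rng = random.Random(seed)
    pad = (n_mics - 1) * max_delay
    source = [rng.gauss(0.0, 1.0) for _ in range(n_samples + 2 * pad)]
    channels = []
    for k in range(n_mics):
        start = pad - k * true_delay  # shifting the read position delays the channel
        channels.append([source[start + n] + rng.gauss(0.0, noise_std)
                         for n in range(n_samples)])
    return channels


def correlation_matrix(signals):
    """Correlation coefficient matrix of equal-length sequences."""
    n = len(signals[0])
    centered = []
    for s in signals:
        mean = sum(s) / n
        centered.append([v - mean for v in s])
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    m = len(signals)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(m)] for i in range(m)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def estimate_delay(channels, max_delay):
    """Pick the candidate delay whose time-aligned channels are most mutually correlated."""
    m = len(channels)
    margin = (m - 1) * max_delay
    length = len(channels[0]) - 2 * margin
    best_tau, best_det = None, float("inf")
    for tau in range(-max_delay, max_delay + 1):
        # Advance channel k by k * tau samples; at the true delay all channels line up.
        aligned = [ch[margin + k * tau: margin + k * tau + length]
                   for k, ch in enumerate(channels)]
        d = determinant(correlation_matrix(aligned))
        if d < best_det:  # well-aligned channels give a near-singular matrix
            best_tau, best_det = tau, d
    return best_tau
```

Because every channel enters the matrix at once, the search uses the redundancy of the whole array rather than a single microphone pair. A call such as `estimate_delay(simulate_array(3, 3, 400, 5, 0.1), 5)` exercises the sketch on three synthetic channels.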
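A new generalized sidelobe canceller (GSC) structure is proposed. It uses generalized singular value decomposition (GSVD) to estimate the transfer functions from the sound source to the microphones indirectly, through a transformation of the generalized singular vectors. Experiments in different noise environments show that, compared with existing GSC algorithms, the proposed algorithm suppresses reverberation and noise more effectively and introduces the least distortion into the enhanced speech.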
An English speech recognition system was implemented on a chip, called a speech system-on-chip (SoC). The SoC included an application-specific integrated circuit with a vector accelerator to improve performance. The recognition algorithm, which uses sub-word models based on continuous density hidden Markov models, ran on a very low-cost speech chip. Starting from a two-stage fixed-width beam-search baseline system, the algorithm adds a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce the recognition time. Tests show that this method reduces the recognition time nearly 6-fold and the memory size nearly 2-fold compared to the original system, with less than 1% accuracy degradation for a 600-word recognition task and a recognition accuracy of about 98%.
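Beam pruning in a frame-synchronous search can be sketched as follows. This is only a toy illustration, not the SoC implementation: it works on HMM states rather than words, it takes the per-frame beam widths as an input because the abstract does not say how the variable width is chosen, and the two-state model and every name in it are hypothetical.

```python
import math


def prune(hypotheses, beam_width):
    """Keep hypotheses whose log score lies within beam_width of the frame's best."""
    best = max(score for score, _ in hypotheses.values())
    return {s: h for s, h in hypotheses.items() if h[0] >= best - beam_width}


def viterbi_beam(log_init, log_trans, log_emit, beam_widths):
    """Frame-synchronous Viterbi search with one beam width per frame."""
    n_states = len(log_init)
    active = {s: (log_init[s] + log_emit[0][s], [s]) for s in range(n_states)}
    active = prune(active, beam_widths[0])
    for frame_scores, width in zip(log_emit[1:], beam_widths[1:]):
        expanded = {}
        # Only hypotheses that survived pruning are extended, which saves work.
        for prev, (score, path) in active.items():
            for nxt in range(n_states):
                cand = score + log_trans[prev][nxt] + frame_scores[nxt]
                if nxt not in expanded or cand > expanded[nxt][0]:
                    expanded[nxt] = (cand, path + [nxt])
        active = prune(expanded, width)
    best_score, best_path = max(active.values(), key=lambda h: h[0])
    return best_path, best_score


def toy_model():
    """Hypothetical two-state model whose four frames favor states 0, 0, 1, 1."""
    log_init = [math.log(0.5), math.log(0.5)]
    log_trans = [[math.log(0.8), math.log(0.2)],
                 [math.log(0.2), math.log(0.8)]]
    favored = [0, 0, 1, 1]
    log_emit = [[math.log(0.9) if s == f else math.log(0.1) for s in range(2)]
                for f in favored]
    return log_init, log_trans, log_emit
```

Passing `[math.inf] * 4` as `beam_widths` disables pruning and gives the exhaustive search, so the effect of a narrower or frame-dependent beam on the decoded path can be compared directly against it.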
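This paper studies text-independent speaker recognition that uses maximum likelihood linear regression (MLLR) transform matrices, taken from the field of speaker adaptation, as features. An MLLRSV-SVM speaker recognition algorithm based on a universal background model is introduced, and high-level phoneme clustering is applied on top of it to further improve recognition performance. With several channel compensation techniques applied, on the NIST SRE 2006 one-training-segment, one-test-segment same-channel and cross-channel data, the system based on MLLR features performs close to the other best systems and is strongly complementary to them; simple linear fusion greatly improves recognition performance.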