Research and Application of Speech Emotion Recognition Technology

Harbin Institute of Technology Master of Engineering Thesis

Abstract

This thesis proposes a semantics-combined method for recognizing emotion in natural speech. The method exploits the complementarity between the emotional information carried by the speech signal and that carried by the text, with the aim of obtaining more accurate and reliable recognition results than the speech signal alone provides.

After surveying and summarizing a large body of related work, this thesis proposes a dimension-discretized emotion model that builds on two existing kinds of emotion model, the dimensional model and the discrete model. On this basis, a speech emotion recognition model that combines semantic and acoustic features is constructed. An SVM (Support Vector Machine) serves as the base recognition model for the acoustic-feature channel, and emotion recognition based on an affective dictionary serves as the text channel, with two textual affective tendencies: positive and negative. The recognition results of the two channels are combined by weighted multi-channel fusion to form the semantics-combined emotion recognition model.

To build the system, this thesis draws on the Chinese emotional speech database created by the ambient intelligence center of the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, and reorganizes it to establish the emotional speech database used here. Using the dimension-discretized emotion model, an emotion recognition system based on acoustic speech features is built first. The system performs endpoint detection, feature extraction, classifier training, and recognition. Classifying in several stages along different dimensions effectively increases the amount of training data and improves the recognition rate on the data set, and the system can then be combined with semantic recognition methods to build a more accurate emotion recognition model.
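As an illustration of the first stage listed above, the following is a minimal sketch of endpoint detection. The abstract does not say which detection method the system uses; thresholding on short-time energy is assumed here only because it is a common baseline. The function names, the frame length, the hop size, and the threshold are all hypothetical toy values, not taken from the thesis.

```python
from typing import List, Optional, Tuple


def short_time_energy(samples: List[float], frame_len: int, hop: int) -> List[float]:
    """Return the mean squared amplitude of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_endpoints(
    samples: List[float],
    frame_len: int = 4,       # toy value; real systems use frames of tens of milliseconds
    hop: int = 2,             # toy value
    threshold: float = 0.01,  # assumed energy threshold, would be tuned on real data
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices spanning the first to the last
    frame whose energy exceeds the threshold, or None if no frame does."""
    energies = short_time_energy(samples, frame_len, hop)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    start = active[0] * hop
    end = active[-1] * hop + frame_len  # exclusive end index
    return start, end


# Toy signal: silence, a short burst, silence.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
print(detect_endpoints(signal))
```

The returned span is frame-aligned, so it can start slightly before and end slightly after the burst itself. Only the samples inside the span would be passed on to feature extraction.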

Building on this emotion recognition system, the thesis further proposes a semantics-combined speech emotion recognition method based on the dimension-discretized emotion model. An existing speech recognition algorithm first transcribes the emotional speech. The emotion words in the transcribed sentence are then looked up in an emotion dictionary to obtain their emotion labels and emotion values, and from these the emotional orientation of the text in different dimensions is derived.
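The dictionary lookup described above can be sketched as follows. This is a minimal illustration, not the thesis's algorithm: the toy lexicon, its values, the averaging rule, and the "neutral" fallback for sentences with no dictionary hits are all assumptions introduced here, and the input is assumed to be already tokenized.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> signed emotion value
# (greater than 0 is positive, less than 0 is negative).
TOY_LEXICON: Dict[str, float] = {
    "happy": 1.0,
    "great": 0.8,
    "sad": -1.0,
    "terrible": -0.9,
}


def text_polarity_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Average the values of the tokens found in the lexicon.

    Returns 0.0 when no token is in the lexicon.
    """
    values = [lexicon[t] for t in tokens if t in lexicon]
    return sum(values) / len(values) if values else 0.0


def text_polarity(tokens: List[str], lexicon: Dict[str, float]) -> str:
    """Map the averaged score to a polarity label."""
    score = text_polarity_score(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumed fallback; the abstract names only two tendencies


print(text_polarity("i feel happy today".split(), TOY_LEXICON))
print(text_polarity("what a terrible sad day".split(), TOY_LEXICON))
```

A real emotion dictionary for Chinese text would also require word segmentation before lookup, which this sketch leaves out.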

Finally, a multi-channel fusion algorithm combines the acoustic emotion recognition results with the semantic emotion recognition results at the level of emotional orientation. The fusion weights are determined through repeated experiments, and the fused model achieves a measurable improvement in recognition rate, yielding an optimized semantics-combined model.
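Weighted fusion of the two channels can be sketched as below. This is a minimal illustration under stated assumptions: each channel is assumed to output a score per orientation label, the fusion is assumed to be a convex combination controlled by a single weight, and the function names and example scores are hypothetical. The thesis determines its weights experimentally; the values used here are arbitrary.

```python
from typing import Dict


def fuse_scores(
    acoustic: Dict[str, float],
    text: Dict[str, float],
    w_acoustic: float,
) -> Dict[str, float]:
    """Combine per-label scores as w * acoustic + (1 - w) * text.

    A label missing from one channel contributes 0.0 for that channel.
    """
    if not 0.0 <= w_acoustic <= 1.0:
        raise ValueError("w_acoustic must lie in [0, 1]")
    labels = set(acoustic) | set(text)
    return {
        label: w_acoustic * acoustic.get(label, 0.0)
        + (1.0 - w_acoustic) * text.get(label, 0.0)
        for label in labels
    }


def fused_label(
    acoustic: Dict[str, float],
    text: Dict[str, float],
    w_acoustic: float,
) -> str:
    """Return the label with the highest fused score (ties broken alphabetically)."""
    fused = fuse_scores(acoustic, text, w_acoustic)
    return max(sorted(fused), key=lambda label: fused[label])


# Hypothetical channel outputs for one utterance.
acoustic_scores = {"positive": 0.4, "negative": 0.6}
text_scores = {"positive": 0.9, "negative": 0.1}

for w in (0.0, 0.5, 1.0):
    print(w, fused_label(acoustic_scores, text_scores, w))
```

Sweeping `w_acoustic` over a held-out set and keeping the value with the best recognition rate is one simple way to pick the weight experimentally.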

Keywords: emotion recognition, speech emotion recognition, multi-channel fusion

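Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Research Background

1.2 State of Related Technology in China and Abroad

1.3 Main Work of This Thesis

Chapter 2 Overview of Speech Emotion Recognition

2.1 Dimensional and Discrete Emotion Description Models

2.2 Emotional Speech Databases

2.3 Speech Emotion Feature Extraction Algorithms

2.3.1 Prosodic Features

2.3.2 Spectral Features

2.3.3 Voice Quality Features

2.4 Research Progress in Speech Emotion Recognition Algorithms

2.5 Chapter Summary

Chapter 3 Building the Speech Emotion Recognition System

3.1 Construction of the Emotional Speech Database

3.2 The Dimension-Discretized Emotion Model

3.3 Speech Emotion Feature Extraction and Feature Selection

3.3.1 Endpoint Detection

3.3.2 Feature Extraction

3.3.3 Feature Selection Algorithm

3.4 Construction of the Basic Speech Emotion Recognition Model

3.4.1 The Conventional Support Vector Machine

3.4.2 Multi-Level SVM Emotion Classification Based on the Dimension-Discretized Emotion Model

3.5 Experimental Results and Analysis

3.6 Chapter Summary

Chapter 4 Semantics-Combined Speech Emotion Recognition

4.1 Research Progress in Multi-Channel Emotion Recognition Algorithms

4.2 Semantic Emotion Recognition Algorithms

4.2.1 Emotion Dictionary Construction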

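4.2.2 Dictionary-Based Semantic Emotion Recognition Algorithm

4.3 Construction of the Semantics-Combined Speech Emotion Recognition Model

4.4 Experimental Results and Analysis

4.5 Chapter Summary

Conclusion

References

Harbin Institute of Technology Statement of Originality and Usage Rights for the Dissertation

Acknowledgements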