Vision-Based Real-Time Gesture Recognition: Techniques and Applications

Abstract

With advances in technology, users increasingly demand natural, user-friendly human-computer interaction. As a key technology for such interaction, gesture recognition has become a research hotspot in recent years. Gesture recognition methods fall into two categories: vision-based and inertial-sensor-based. Of the two, vision-based gesture recognition is the more important because it is intuitive, convenient, and unconstrained. In this thesis, we study the key techniques of vision-based gesture recognition: gesture segmentation, gesture extraction, static gesture recognition, and dynamic gesture recognition. Finally, we apply the resulting technology in a human-computer interaction system to test its performance. The system recognizes three gestures, "scissors", "rock", and "paper", and lets the user play a rock-paper-scissors game.

Gestures are segmented accurately by combining skin color information with motion information. Skin color is detected in real time with a Gaussian skin color model in the YCbCr color space, and the motion region is extracted by background subtraction. The skin color region and the motion region are then fused into a skin-colored motion region, and morphological operations denoise the result. A search algorithm extracts the gesture contour, and connected region analysis removes the remaining noise to yield the final gesture. For static gesture recognition, we extract HOG features from the training samples, train an SVM model, and classify samples with the trained model. We also analyze the HOG features of the training samples experimentally. The results show that this method recognizes the same gesture in different orientations and achieves an average recognition accuracy of 93.08%.
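The segmentation steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the Gaussian mean and covariance, both thresholds, and all function names are hypothetical values chosen for illustration, and the morphological and contour steps are omitted. The RGB-to-CbCr conversion uses the standard full-range BT.601 coefficients.

```python
import math

# Hypothetical CbCr Gaussian parameters (illustrative only, not from the thesis).
MEAN = (115.0, 155.0)
COV = ((160.0, 12.0), (12.0, 300.0))


def rgb_to_cbcr(r, g, b):
    """Convert an 8-bit RGB pixel to its Cb and Cr components (full-range BT.601)."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def skin_likelihood(r, g, b):
    """Unnormalized Gaussian likelihood exp(-0.5 * Mahalanobis distance squared)."""
    cb, cr = rgb_to_cbcr(r, g, b)
    d0, d1 = cb - MEAN[0], cr - MEAN[1]
    (a, b_), (c, d) = COV
    det = a * d - b_ * c
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    m = (d0 * (d * d0 - b_ * d1) + d1 * (-c * d0 + a * d1)) / det
    return math.exp(-0.5 * m)


def skin_mask(frame, threshold=0.3):
    """Mark pixels whose skin likelihood reaches the (assumed) threshold."""
    return [[1 if skin_likelihood(*px) >= threshold else 0 for px in row]
            for row in frame]


def gray(px):
    """Luma of an RGB pixel."""
    r, g, b = px
    return 0.299 * r + 0.587 * g + 0.114 * b


def motion_mask(frame, background, threshold=30):
    """Background subtraction: mark pixels that differ from the background."""
    return [[1 if abs(gray(p) - gray(q)) > threshold else 0
             for p, q in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def fuse(skin, motion):
    """Keep only pixels that are both skin-colored and moving."""
    return [[s & m for s, m in zip(skin_row, motion_row)]
            for skin_row, motion_row in zip(skin, motion)]
```

Each frame here is a list of rows of `(r, g, b)` tuples. Fusing the two masks with a logical AND discards skin-colored background objects that do not move as well as moving objects that are not skin-colored.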

Because HOG features vary with scale, this thesis adopts the volume semantic local binary patterns (VSLBP) algorithm to extract features and uses an SVM as the classifier for real-time hand gesture recognition. For the three gesture classes, "scissors", "rock", and "paper", the LBP algorithm extracts the features used to train the SVM model, and test samples are classified with the trained model. The experimental results show an average recognition accuracy of 94.42%. Finally, the algorithm is applied in a human-computer interaction system that recognizes the three gestures "scissors", "rock", and "paper" and lets the user play a rock-paper-scissors game, which shows that our method can recognize gestures in real time and can be used in practical applications.
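As a rough illustration of the kind of feature involved, the sketch below computes the basic 3x3 local binary pattern and a normalized histogram of its codes. It shows the basic LBP operator only, not the VSLBP variant used in the thesis, which the abstract does not define. The function names and the clockwise bit ordering are assumptions made for this example.

```python
# Neighbor offsets (dy, dx), clockwise from the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit LBP code: bit i is set when neighbor i is >= the center pixel."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    height, width = len(img), len(img[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            hist[lbp_code(img, y, x)] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist
```

The image is a list of rows of grayscale intensities. Because each code depends only on comparisons with the center pixel, adding a constant brightness offset to the whole image leaves the histogram unchanged. A histogram like this could serve as the feature vector passed to an SVM classifier.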

Keywords: support vector machine, human-computer interaction, histogram of oriented gradient feature, gesture recognition, volume semantic local binary pattern

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Project Origin, Research Background, and Significance

1.2 Current State of Research in China and Abroad

1.2.1 Gesture Segmentation

1.2.2 Feature Extraction

1.2.3 Gesture Recognition

1.3 Research Challenges

1.4 Research Content and Chapter Organization

Chapter 2 Principles of Gesture Segmentation

2.1 Principles of Skin Color Detection

2.2 Motion Detection

2.2.1 Background Modeling

2.2.2 Segmentation of Moving Gestures

2.3 Fusion of Motion and Skin Color

2.4 Morphological Processing

2.5 Contour Extraction

2.5.1 Contour Extraction Using the Eight-Neighborhood Search Algorithm

2.5.2 Connected Region Analysis

2.6 Experimental Results of Gesture Segmentation

2.7 Chapter Summary

Chapter 3 Static Gesture Recognition

3.1 Principles of HOG Features

3.1.1 Introduction to HOG

3.1.2 Principles of HOG Features

3.1.3 Rotation Variance of HOG Features

3.2 Support Vector Machines

3.2.1 Introduction to Support Vector Machines

3.2.2 SVM Kernel Functions

3.3 Convolutional Neural Networks

3.4 Design of the Gesture Classification System

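3.4.1 Building the Gesture Sample Library

3.4.2 HOG Feature Extraction for Gestures

3.4.3 Training the Gesture Classification System

3.4.4 Predicting Gesture Sample Classes

3.4.5 Gesture Classification Based on Convolutional Neural Networks

3.4.6 Analysis of Test Results

3.5 Chapter Summary

Chapter 4 Real-Time Gesture Recognition and Application

4.1 Principles of LBP and Extended LBP

4.1.1 Principles of Local Binary Patterns

4.1.2 Volume Semantic Local Binary Patterns

4.2 VSLBP-Based Gesture Recognition System

4.2.1 Gesture Design and Sample Library Construction

4.2.2 Framework of the Gesture Recognition System

4.2.3 Training the System

4.3 Training Results

4.4 Gesture Recognition Application

4.4.1 Hardware Environment

4.4.2 Software Environment

4.5 Implementation of the Human-Computer Interaction Interface

4.6 Chapter Summary

Conclusion

References

Harbin Institute of Technology Declaration of Originality and Authorization for Use of the Dissertation

Acknowledgements

Author Biography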
