Research on 3D Modeling of Indoor Environments and Object Recognition for Mobile Robots

Harbin Institute of Technology, Thesis for the Professional Master's Degree in Engineering

Abstract

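With advances in robotics and rising expectations for quality of life, demand for mobile service robots keeps growing. The main obstacle today, however, is that robot autonomy and intelligence still fall far short of what people expect, so making mobile robots more "human-like" has long been a research focus. On the road to intelligence, 3D environment mapping and object recognition for mobile robots have become increasingly active research topics. SLAM, the key technology for mapping, has advanced considerably in recent years. 3D laser sensors map accurately but are too expensive, so exploring 3D environment modeling methods represented by visual SLAM is one of the key technologies for solving the object recognition problem.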

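This paper builds a 3D point cloud map using an RGB-D sensor. It first studies point cloud registration, the most basic and most critical algorithm in 3D modeling, and improves the feature-point-based registration approach used by the vast majority of visual SLAM systems. A feature point quality evaluation method and a three-dimensional random sample consensus (RANSAC) algorithm are proposed, which improve the accuracy and robustness of point cloud stitching while reducing the running time.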

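To achieve fast dense 3D point cloud mapping, this paper improves the ORB-SLAM2 algorithm. Taking full advantage of its accurate pose estimation, the method reconstructs the point cloud from all pixels of each keyframe to build a dense point cloud map. GPU acceleration of feature extraction and matching means that building the dense map does not increase the per-frame processing time, which meets the real-time requirement of fast mapping. Compared with ElasticFusion, an advanced dense mapping algorithm, the improved method produces clearly better map quality.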

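To let the robot truly understand environmental information, this paper proposes an autonomous object localization and segmentation algorithm. The algorithm obtains the region of interest (ROI) autonomously, searches for and localizes the object using a reference point traversal algorithm, and, once the object is found, extracts and segments its point cloud with a voxel-based method. After segmentation, the point cloud is orthogonally projected into images according to a set of projection rules to form the training set, which converts 3D features into 2D features; a transfer learning method is then used for training and classification. Binding each object's class label to its position information yields a semantic index, which completes the construction of the semantic map.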

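This completes the paper's research on 3D modeling of indoor environments and object recognition for mobile robots. Experiments verify that the improved point cloud registration algorithm achieves higher accuracy and faster registration. The fast dense mapping algorithm delivers good point cloud quality and real-time performance, and was used to build a map of an indoor environment. Localization, segmentation, feature extraction, and classification experiments on common objects in the map verify the effectiveness of the object localization and segmentation algorithms, as well as the feasibility and high accuracy of the transfer-learning-based recognition method.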

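Keywords: visual SLAM; point cloud registration; 3D point cloud reconstruction; object localization; 3D object segmentation; transfer learning; object recognition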


Abstract

With the development of robotics and people's desire for a better life, demand for mobile service robots has grown. However, the main obstacle is that their autonomy and intelligence fall far short of people's expectations. Making mobile robots more "human-like" has always been a focus for researchers. 3D mapping and environmental awareness for mobile robots are becoming research hotspots in the pursuit of intelligence. As a key technology for mapping, SLAM has made great progress in recent years. Three-dimensional lidar sensors map accurately, but they are too expensive. Therefore, 3D environment modeling represented by visual SLAM is one of the keys to solving the perception problem.

This paper constructs a 3D point cloud map using an RGB-D sensor. First, point cloud registration, the most basic and critical algorithm in three-dimensional modeling, is studied. We improve the feature-based point cloud registration algorithm used in most visual SLAM applications. A feature point quality evaluation method and a three-dimensional RANSAC method are proposed, which improve the accuracy and robustness of point cloud stitching and reduce its running time.
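The abstract does not detail the proposed feature quality evaluation or the specific three-dimensional RANSAC variant, so the following is only a generic sketch of RANSAC-based rigid registration between matched 3D feature points, not the method proposed here. The function names, the inlier threshold of 0.02 (assumed to be in metres), and the iteration count are all illustrative assumptions.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t (Kabsch/SVD)."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def ransac_rigid_3d(src, dst, iters=200, inlier_thresh=0.02, seed=0):
    """Generic 3D RANSAC over matched points src[i] <-> dst[i] (both (N, 3)).

    iters and inlier_thresh are assumed values chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)   # minimal sample: 3 matches
        R, t = fit_rigid_transform(src[idx], dst[idx])
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = residuals < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found fewer than 3 inliers")
    # Refit on all inliers of the best hypothesis.
    R, t = fit_rigid_transform(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```

Sampling 3D-3D correspondences directly, rather than 2D pixel matches, lets each hypothesis be solved in closed form with a single SVD, which is one plausible reason a 3D formulation can reduce running time.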

This paper improves the ORB-SLAM2 algorithm and makes full use of its accurately estimated poses to achieve real-time dense mapping. All pixels of each keyframe are reconstructed, rather than only the ORB feature points. GPGPU computing is used to accelerate feature extraction and matching, which enables real-time dense mapping without increasing the frame processing time. Compared with ElasticFusion, an advanced dense SLAM algorithm, our method produces maps of clearly higher quality.
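Reconstructing all pixels of a keyframe amounts to back-projecting its depth image through the camera model and moving the points into the world frame with the keyframe pose. The sketch below illustrates this under a pinhole model; the function name, the 4x4 camera-to-world pose convention, and the depth scale of 1000 (raw depth in millimetres) are assumptions and are not taken from ORB-SLAM2 or from this work.

```python
import numpy as np

def backproject_keyframe(depth, K, T_wc, depth_scale=1000.0):
    """Back-project every valid depth pixel of a keyframe into world coordinates.

    depth: (H, W) raw depth image; zeros mark missing measurements.
    K: 3x3 pinhole intrinsics. T_wc: 4x4 camera-to-world pose (assumed convention).
    depth_scale: raw units per metre (assumed 1000, i.e. millimetres).
    Returns an (M, 3) array of world points, ordered row by row over valid pixels.
    """
    h, w = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.indices((h, w))                    # pixel row (v) and column (u) grids
    z = depth.astype(np.float64) / depth_scale
    valid = z > 0
    z = z[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=1)
    # Rotate and translate camera-frame points into the world frame.
    return pts_cam @ T_wc[:3, :3].T + T_wc[:3, 3]
```

Concatenating the outputs of successive keyframes gives a dense map. A practical system would typically also downsample the merged cloud, which is omitted here.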

An autonomous object localization and segmentation algorithm is proposed so that the robot can understand the 3D environment. It autonomously obtains the region of interest (ROI) and searches for the object position using the proposed reference point traversal algorithm. The object point cloud is then grown and segmented with a voxel-based segmentation algorithm. After that, the segmented point cloud is orthogonally projected into images according to a set of projection rules; the images serve as the training set, which converts 3D features into 2D features. A transfer learning method is then used to train on the projection dataset for classification. Combining the class label and position information of each object into a semantic index yields a semantic map.
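The projection rules themselves are not given in this abstract. As a rough illustration of how a segmented object point cloud can be turned into a 2D image for a transfer-learned classifier, the sketch below projects points orthogonally along one coordinate axis into a square depth-coded image. The function name, the choice of axis, the resolution, and the intensity coding are all assumptions.

```python
import numpy as np

def orthographic_projection(points, axis=2, resolution=64):
    """Project an (N, 3) point cloud along `axis` onto a square grayscale image.

    Background pixels are 0. Occupied pixels take values in [0.5, 1.0], with the
    point having the smallest coordinate along `axis` mapped to 1.0
    (an assumed coding, chosen only for illustration).
    """
    keep = [i for i in range(3) if i != axis]
    uv = points[:, keep]                         # in-plane coordinates
    depth = points[:, axis]                      # coordinate along the view axis
    lo = uv.min(axis=0)
    # One common scale for both image axes preserves the object's aspect ratio.
    extent = max((uv.max(axis=0) - lo).max(), 1e-9)
    pix = np.floor((uv - lo) / extent * (resolution - 1)).astype(int)
    d_norm = (depth - depth.min()) / max(depth.max() - depth.min(), 1e-9)
    image = np.zeros((resolution, resolution), dtype=np.float64)
    # When several points land in one pixel, keep the brightest one.
    np.maximum.at(image, (pix[:, 1], pix[:, 0]), 1.0 - 0.5 * d_norm)
    return image
```

Calling the function once per axis (`axis=0`, `1`, `2`) would give three orthogonal views of the same object, which could then be resized to the input size expected by a pretrained 2D network.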

This paper completes the research of 3D modeling and object perception technology