An overview of vision-based 3D reconstruction


TONG Shuai, XU Xiao-gang, YI Cheng-tao, SHAO Cheng-yong

(Dept. of Equipment System & Automatization, Dalian Naval Academy, Dalian, Liaoning 116018, China)

Abstract: As a developing technology, vision-based 3D reconstruction still has limitations in many respects. This paper reviews the main methods of vision-based 3D reconstruction and the related research status, analyzes the advantages and disadvantages of these methods, and aims to provide a general understanding of the field and to indicate future research directions.

Key words: vision-based 3D reconstruction; monocular vision; binocular vision; trinocular vision; overview.

0 Introduction

Building 3D models with modeling software is the conventional approach, but it demands amounts of manpower and material resources that are often prohibitive, and the reconstruction quality is frequently poor. Vision-based 3D reconstruction offers a new way to address this problem.

Vision-based 3D reconstruction, i.e., reconstructing 3D models of objects by computer vision methods, refers to non-contact 3D measurement that uses digital cameras as image sensors, integrates image processing, visual computing, and related technologies, and obtains an object's 3D information by computer. Its advantages are that it is not restricted by the shape of the object, reconstruction is fast, and automatic or semi-automatic modeling can be achieved. It is an important development direction of 3D reconstruction and can be widely applied in fields such as autonomous navigation of mobile robots, aerial and remote-sensing surveying, and industrial automation, with considerable economic benefit.

As an important branch of computer vision, vision-based 3D reconstruction is built on Marr's theoretical framework of vision and has developed into a variety of methods. For example, by the number of cameras used, it can be divided into monocular, binocular, trinocular, and multi-ocular vision methods; by principle, into region-based, feature-based, model-based, and rule-based methods; and by the form of data acquisition, into active and passive vision methods. Drawing on research at home and abroad in recent years, this paper introduces and compares several methods that have seen substantial research and practical application, and points out the main challenges and future directions of development. By the number of cameras used, this paper divides vision-based 3D reconstruction methods into three categories: monocular, binocular, and trinocular vision methods.

1 Monocular vision method

The monocular vision method (monocular vision) reconstructs 3D models from images taken by a single camera. The input may be one or several images from a single viewpoint, or multiple images from multiple viewpoints. The former derives depth information mainly from 2D image features (denoted X), such as shading, texture, focus, and contour, and is therefore also called shape from X. Such methods use simple equipment and can reconstruct a 3D model of an object from a single image or a few images; however, they usually require fairly idealized conditions, perform less well in real applications, and give only mediocre reconstruction results. The latter matches the same feature points across images from different viewpoints and computes the 3D spatial coordinates of these points from the matching constraints, thereby achieving 3D reconstruction. This approach can perform camera self-calibration during reconstruction, can meet the needs of large-scale scene reconstruction, and benefits from abundant image resources.
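As one concrete instance of shape from X, a shape-from-focus sketch is given below. It is a minimal illustration written for this overview, not code from any cited work: the function names, the modified-Laplacian focus measure, and the synthetic focal stack are all assumptions. Each pixel's depth index is taken as the frame in the focal stack where local sharpness peaks.

```python
import numpy as np

def focus_measure(img):
    # Modified-Laplacian focus measure on the interior pixels:
    # |2I(x,y) - I(x-1,y) - I(x+1,y)| + |2I(x,y) - I(x,y-1) - I(x,y+1)|
    lx = np.abs(2 * img[1:-1, 1:-1] - img[:-2, 1:-1] - img[2:, 1:-1])
    ly = np.abs(2 * img[1:-1, 1:-1] - img[1:-1, :-2] - img[1:-1, 2:])
    return lx + ly

def depth_from_focus(stack):
    # For each pixel, pick the frame index with the highest focus measure;
    # that index serves as a coarse per-pixel depth label.
    measures = np.stack([focus_measure(frame) for frame in stack])
    return np.argmax(measures, axis=0)
```

On a synthetic stack whose texture is sharpest in frame 3, `depth_from_focus` labels the textured pixels with index 3; a real system would map frame indices to metric depth via the lens focus settings.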

2 Binocular vision method

The stereo vision method (stereopsis) [63], also known as the binocular vision method (binocular vision) [64], converts binocular disparity information into depth information. It uses two cameras to observe the same object from two viewpoints (usually aligned side by side in parallel, though vertical alignment is also possible), obtains images of the object from different viewing angles, and converts the disparity of matched points into depth by triangulation. The reconstruction process resembles human visual perception and is intuitive and easy to understand.
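For a rectified, parallel-aligned stereo pair, the triangulation mentioned above reduces to the relation Z = f * B / d (depth = focal length times baseline over disparity). A minimal sketch follows; the function name and parameter units (focal length in pixels, baseline in meters) are illustrative assumptions, not from the cited works.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched point in a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal shift of the matched point between views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px
```

For example, with a 700-pixel focal length, a 12 cm baseline, and a 70-pixel disparity, the point lies 1.2 m from the cameras. The inverse relation also shows why quality degrades with distance: depth error grows quadratically as disparity shrinks.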

Binocular vision methods commonly transform the problem into Euclidean geometric terms and then estimate depth by triangulation. The process can be roughly divided into six steps: image acquisition, camera calibration, image rectification, feature extraction, stereo matching, and 3D modeling. Triangulation converts the disparity of each matched point into depth, yielding a dense point cloud; interpolating and meshing the point cloud then produces the 3D model of the object.
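The stereo-matching step in the pipeline above searches, for each pixel in the left image, the same scanline of the right image for the best-matching patch; the horizontal offset of the winner is the disparity. A toy 1D block-matching sketch is shown below. It is an illustrative assumption for this overview (real systems use more robust costs and regularization), and the function name and parameters are invented here.

```python
import numpy as np

def match_scanline(left, right, x, half=3, max_disp=20):
    """Disparity of pixel x on a rectified scanline, by minimizing SSD.

    left, right -- 1D intensity arrays for corresponding scanlines
    x           -- pixel column in the left scanline
    half        -- half-width of the matching window
    max_disp    -- largest disparity considered in the search
    """
    patch = left[x - half : x + half + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(0, max_disp + 1):
        xr = x - d  # the same point appears shifted left in the right image
        if xr - half < 0:
            break
        cand = right[xr - half : xr + half + 1]
        cost = float(np.sum((patch - cand) ** 2))  # sum of squared differences
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d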

Because working in Euclidean geometry places strict demands on camera calibration and rectification, some binocular vision methods instead work under projective geometry and compute the 3D coordinates of spatial points by intersecting rays in space. This relaxes the requirements on calibration and rectification and reduces the computational complexity, but it cannot produce accurate dense stereo matching, so the reconstruction quality falls short of the approach described above.
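The ray-intersection idea mentioned above is usually implemented as linear (DLT) triangulation: given the two projection matrices and a matched pixel pair, the 3D point is recovered as the least-squares intersection of the two viewing rays. The sketch below is a standard textbook formulation, not code from the surveyed papers; the function name is an assumption.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Least-squares intersection of two viewing rays (linear triangulation).

    P1, P2 -- 3x4 camera projection matrices
    x1, x2 -- matched pixel coordinates (u, v) in each image
    Each observation contributes two linear constraints on the homogeneous
    3D point X; the solution is the right singular vector of the stacked system.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

With noise-free matches the recovered point is exact; with noisy matches the SVD gives the algebraic least-squares compromise between the two rays.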

The strengths of the binocular stereo vision method are that it is mature, achieves good and stable reconstruction results, fares better in practical applications than other vision-based 3D reconstruction methods, and has gradually appeared in some commercial products. However, its computational complexity is still high, and the reconstruction quality degrades markedly as the baseline distance increases. Binocular vision is comparatively mature on the whole, and current research focuses on improving reconstruction speed, practical applicability, and so on.

3 Trinocular vision method

The main problems of the binocular vision method in the reconstruction process are: a) repeated or similar features in the images easily give rise to false targets; b) when epipolar constraints are used, edges parallel to the epipolar lines are prone to blur; c) as the baseline distance increases, occlusion worsens and fewer spatial points can be reconstructed, while the enlarged disparity range increases the search space and with it the likelihood of mismatches.

To address these problems, the trinocular vision method (trinocular vision) [65, 66] was proposed. Its basic idea is to add a third camera that provides additional constraints, so as to avoid the problems of the binocular vision method mentioned above. According to the relative positions of the cameras, trinocular vision systems can be divided into two configurations: right-triangle and collinear.
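One common way to exploit the third camera's extra constraint is a reprojection check: a candidate match between cameras 1 and 2 is triangulated, and the resulting point is projected into camera 3; if it lands far from the observation there, the match is rejected as a false target. The sketch below illustrates this idea under assumed names and a pinhole model; it is not the specific algorithm of [65, 66].

```python
import numpy as np

def project(P, X):
    # Pinhole projection of a 3D point into pixel coordinates.
    xh = P @ np.append(X, 1.0)
    return xh[:2] / xh[2]

def triangulate(P1, P2, x1, x2):
    # Minimal DLT triangulation from two views (least-squares ray intersection).
    A = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

def consistent_in_third_view(P1, P2, P3, x1, x2, x3, tol_px=1.0):
    # Accept the candidate match (x1, x2) only if the triangulated point
    # reprojects close to the observation x3 in the third camera.
    X = triangulate(P1, P2, x1, x2)
    return np.linalg.norm(project(P3, X) - x3) < tol_px
```

False targets produced by repeated features in two views rarely stay consistent in a third, which is why this check suppresses them.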

The trinocular vision method can avoid the false targets, edge blur, and mismatches that the binocular method struggles to resolve, and in many cases its reconstruction results are better than those of binocular vision. However, the additional camera makes the equipment more complex, more costly, and harder to control, so its practical adoption has been limited.

4 Comparison and analysis

Vision-based 3D reconstruction has long been a hot topic in computer vision. With the development of optics, electronics, and computer technology, the various reconstruction methods have all progressed: they have become key technologies in fields such as biomedicine and virtual reality, and are also widely used in space remote sensing, military reconnaissance, and other areas. The advantages of vision-based 3D reconstruction are clear: its high degree of automation enables automatic or semi-automatic reconstruction, greatly saving manpower and improving efficiency. Besides the methods covered in this paper, approaches from other perspectives, such as interactive methods based on model databases [67, 68] and multi-view stereo [69], have also achieved good results.

A qualitative comparison of the vision-based 3D reconstruction methods follows. Except for the motion-based method, the monocular vision methods have lower complexity, and their reconstruction time meets or approaches real-time requirements; their weakness is that they rely on particular assumptions, have poor generality, and are susceptible to conditions such as lighting and texture, so the reconstruction quality is unstable. The motion-based method does not depend on particular assumptions or conditions, places few demands on the input images, and generalizes well; its reconstruction quality is related to the number of images, and in general more images give better results, but the computation also grows greatly. The binocular and trinocular vision methods have more mature method systems, stable reconstruction quality, and wide applicability, but the equipment is complex, hard to control, and relatively costly.

Although, judging from current practice and related research, the binocular (trinocular) vision methods see more practical application than the monocular method and in most cases give better reconstruction results, the monocular method has the higher research value. The main reasons are equipment complexity and cost: binocular (trinocular) methods use multiple cameras whose relative positions must be fixed accurately, and the cameras must remain synchronized and stable during capture, which is hard to control and very costly, so these methods can essentially be applied only on a small scale and are hard to popularize. The monocular method, by contrast, uses images from a single camera, places simple demands on equipment, is low-cost and easy to implement, and has a great advantage for widespread application.

5 Conclusion

Research on vision-based 3D reconstruction technology is still at an exploratory stage: the various methods remain some distance from practical application, and many application requirements are yet to be met. Intensive research in this field will therefore still be needed for a long time to come.

