This paper proposes a simple, discriminative framework that uses a graphical model and 3D geometry to interpret diverse urban scenes captured from varying viewpoints. Our algorithm constructs a conditional random field (CRF) over over-segmented superpixels and learns the appearance model from different sets of features for the specific classes of interest. We also introduce a training algorithm that learns a model for the edge potentials between superpixels based on their feature differences. The proposed algorithm gives competitive and visually pleasing results for urban scene segmentation. We show that inference on the trained network improves class-labeling performance compared with using the appearance model alone.
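The interaction between the appearance model and the edge potentials can be illustrated with a small sketch. It is not the training or inference procedure of the paper: it assumes a contrast-sensitive Potts edge weight `lam * exp(-beta * ||f_i - f_j||^2)` with hand-set `lam` and `beta` instead of a learned edge model, and it uses iterated conditional modes (ICM) as a stand-in inference method. The names `edge_weights` and `icm` and all numeric values are hypothetical.

```python
import numpy as np

def edge_weights(features, edges, lam=1.0, beta=1.0):
    """Assumed edge potential: similar superpixels get a large smoothing weight."""
    diffs = features[edges[:, 0]] - features[edges[:, 1]]
    return lam * np.exp(-beta * np.sum(diffs ** 2, axis=1))

def icm(unary, edges, weights, n_iters=10):
    """Greedy label updates that lower unary cost plus Potts edge cost."""
    n_nodes, n_labels = unary.shape
    labels = unary.argmin(axis=1)          # start from the appearance-only labeling
    neighbors = [[] for _ in range(n_nodes)]
    for (i, j), w in zip(edges, weights):
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    for _ in range(n_iters):
        changed = False
        for i in range(n_nodes):
            cost = unary[i].copy()
            for j, w in neighbors[i]:
                # Pay w for every candidate label that disagrees with neighbor j.
                cost += w * (np.arange(n_labels) != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Three superpixels in a chain, two classes; rows are per-class costs.
unary = np.array([[0.1, 2.0], [1.0, 0.8], [0.1, 2.0]])
features = np.array([[0.0], [0.1], [0.0]])   # nearly identical appearance
edges = np.array([[0, 1], [1, 2]])

appearance_only = unary.argmin(axis=1)
labels = icm(unary, edges, edge_weights(features, edges))
```

In this toy chain the middle superpixel weakly prefers the second class on appearance alone, but its two similar neighbors pull it to the first class once the edge terms are included.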
To address wide-baseline stereo image matching across multiple cameras, this paper proposes an image matching method that combines maximally stable extremal regions (MSER) with the scale-invariant feature transform (SIFT). MSER is used to detect feature regions in place of the difference-of-Gaussians detector. Each region is fitted with an ellipse, normalized to a unit circle, and represented by a SIFT descriptor. After initial feature matching, the method estimates the fundamental matrix and removes outliers with auto-maximum a posteriori sample consensus. Experimental results indicate that the method is robust to viewpoint changes, effectively reduces computational complexity, and improves matching accuracy.
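The ellipse normalization step can be sketched as follows. This is a minimal illustration, assuming each region's ellipse is described by the mean and covariance of its pixel coordinates; the abstract does not state how the ellipse is fitted, and the helper names `fit_ellipse` and `normalizing_transform` are hypothetical. The inverse matrix square root of the covariance maps the ellipse to a unit circle, up to a rotation that the descriptor stage would still have to resolve.

```python
import numpy as np

def fit_ellipse(pixels):
    """Fit an ellipse to region pixels (N x 2) via their mean and covariance."""
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    return mean, cov

def normalizing_transform(cov):
    """Return symmetric A = cov^(-1/2), so that A @ cov @ A.T is the identity."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T

# Synthetic elongated, skewed point cloud standing in for one detected region.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]]) + [40.0, 25.0]

mean, cov = fit_ellipse(pixels)
A = normalizing_transform(cov)
normalized = (pixels - mean) @ A.T   # region coordinates in the normalized frame
```

After the transform the region has identity covariance, so regions that differ only by an affine distortion become comparable before the descriptor is computed.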
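To address the shortcomings of existing image stitching algorithms in registration accuracy and speed, an image registration method that combines camera calibration information with phase correlation is proposed, and a detailed derivation of the algorithm is given. Camera calibration yields the projection matrices together with the rotation matrix and translation vector between the camera coordinate systems. The rotation between corresponding images is derived from the rotation matrix between the camera coordinate systems, the scale factor between images is approximated from the z-axis displacement and the average depth of the scene, and the translation parameters between images are obtained by phase correlation. Experimental results confirm that the theoretical derivation is correct and that stitching is accurate when the effective depth range is small; for input data with a resolution of 640×480, the average stitching time reaches 26.8 ms.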
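The translation step can be illustrated with a minimal phase correlation sketch. It assumes the rotation and scale have already been compensated and that the remaining offset is an integer circular shift; the function name `phase_correlation_shift` and the synthetic test image are hypothetical and not taken from the paper.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Estimate the integer (dy, dx) shift that maps ref onto moved."""
    F_ref = np.fft.fft2(ref)
    F_mov = np.fft.fft2(moved)
    cross = F_mov * np.conj(F_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep only the phase
    corr = np.fft.ifft2(cross).real             # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

# Synthetic check: shift a random image by a known offset and recover it.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
moved = np.roll(ref, shift=(5, -3), axis=(0, 1))
shift = phase_correlation_shift(ref, moved)
```

Because only the phase of the cross-power spectrum is kept, the estimate depends on image structure rather than overall brightness, and the cost is dominated by two forward FFTs and one inverse FFT.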
A 3D augmented reality navigation system based on stereoscopic images is developed for teleoperated robot systems. It accurately matches the simulated model to the video image of the actual robot, which helps the operator accomplish remote control tasks correctly and reliably. The system uses a disparity map translation transformation method to generate parallax images for stereoscopic displays, giving the operator an immersive 3D experience. A fast and accurate registration method for dynamic stereo video is also proposed, which enables effective integration of the virtual robot with the real stereo scene. Preliminary experiments show that the operation error of the system stays below 2.2 mm, with average errors of 0.8547, 0.9093 and 0.6972 mm in the x, y and z directions, respectively. Further experiments, such as pressing a button and pulling a drawer, are conducted to evaluate the performance of the system. The feasibility studies show that the depth information of structures can be rapidly recognized at the remote site. By providing intuitive 3D viewing, the augmented reality image overlay system could increase operating accuracy and reduce procedure time.
GAO Xin, HU Huan, JIA Qing-xuan, SUN Han-xu, SONG Jing-zhou