Sparse and Dense Visual SLAM with Single-Image Depth Prediction

  • Author / Creator
    Loo, Shing Yan
  • In this thesis, we investigate the use of single-image depth prediction from convolutional neural networks (CNNs) in sparse and dense monocular visual simultaneous localization and mapping (SLAM). Specifically, we address three problems: (1) data association, (2) dense mapping, and (3) long-term adaptation.
    Accordingly, the thesis is divided into three parts, one for each of these contributions.

    To improve the robustness of data association in visual SLAM, our first proposal extends the state-of-the-art semi-direct visual SLAM algorithm with single-image depth prediction to improve the reliability of feature matching. We use the additional depth information to initialize new features with a small uncertainty centred at the predicted depth. With the reduced depth uncertainty, feature correspondences can be identified within a shorter search range along the epipolar line, resulting in faster convergence of the feature depth and improved mapping performance. As a result, our method outperforms state-of-the-art visual SLAM algorithms in camera tracking error.
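
    As a rough illustration (a Python sketch, not code from the thesis: the function names, the relative-uncertainty factor rel_sigma, the n_sigma bound and the fallback prior are all assumptions), a depth-prior feature initialization and the resulting shorter epipolar search range could look like this:

        def init_feature_depth(d_cnn, rel_sigma=0.1, d_default=2.0, sigma_default=2.0):
            # If a CNN depth prediction is available, centre the depth prior at the
            # predicted value with a small standard deviation; otherwise fall back
            # to a wide, uninformative prior as in standard depth-filter schemes.
            if d_cnn is not None and d_cnn > 0:
                return d_cnn, rel_sigma * d_cnn
            return d_default, sigma_default

        def epipolar_search_range(mu, sigma, n_sigma=2.0):
            # Depth interval searched along the epipolar line: a smaller sigma
            # directly shortens the searched segment, reducing false matches.
            d_min = max(mu - n_sigma * sigma, 1e-3)
            d_max = mu + n_sigma * sigma
            return d_min, d_max

        # Example: a tight prior from a 3 m CNN prediction vs. the wide default.
        print(epipolar_search_range(*init_feature_depth(3.0)))   # ~ (2.4, 3.6)
        print(epipolar_search_range(*init_feature_depth(None)))  # (0.001, 6.0)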

    To recover a dense structure, we densify the semi-dense reconstruction produced by the state-of-the-art direct SLAM algorithm LSD-SLAM. To this end, our second proposal exploits the local depth-gradient consistency of single-image relative depth prediction as a spatial regularizer to densify the semi-dense depth maps. In addition, we propose an adaptive filtering scheme that incorporates the depth and pixel intensity within a local window to reduce noise in the semi-dense structure, which yields a substantial gain in densification accuracy. The optimized semi-dense and densified structures are, in turn, used to refine the pose graph and hence the pose estimates. Experimental results show that our dense reconstruction accuracy outperforms state-of-the-art methods by a large margin.
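
    A minimal sketch of the gradient-consistency regularizer follows (Python/NumPy; the plain gradient-descent solver, step size and weight lam are assumptions of this sketch, and the adaptive depth/intensity filtering of the semi-dense input is omitted): the data term pins the result to the semi-dense measurements, while the regularizer makes local depth gradients follow those of the CNN's relative depth prediction.

        import numpy as np

        def densify(depth_semi, valid, depth_rel, lam=1.0, iters=200, step=0.2):
            # depth_semi: HxW semi-dense depth (defined only where `valid` is True)
            # valid:      HxW boolean mask of semi-dense measurements
            # depth_rel:  HxW relative depth predicted by the CNN
            D = np.where(valid, depth_semi, depth_rel).astype(np.float64)
            gx_R, gy_R = np.gradient(depth_rel)
            for _ in range(iters):
                gx_D, gy_D = np.gradient(D)
                # Gradient of the regularizer: divergence of the gradient mismatch.
                reg = (np.gradient(gx_D - gx_R, axis=0)
                       + np.gradient(gy_D - gy_R, axis=1))
                # Gradient of the data term, active only at semi-dense pixels.
                data = np.where(valid, D - depth_semi, 0.0)
                D -= step * (data - lam * reg)
            return D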

    Nevertheless, single-image depth prediction from CNNs tends to be accurate only on images similar to those seen during training. Therefore, to improve the generality of single-image depth prediction in visual SLAM, our third proposal introduces a long-term adaptation framework that fine-tunes the depth prediction CNN online to improve its accuracy, while leveraging the improved depth predictions to globally optimize the structure and camera pose estimates. In particular, we propose a novel online adaptation method in which fine-tuning is regularized to retain previously learned knowledge while the CNN is continually trained. We demonstrate the use of the fine-tuned depth predictions for map point culling before running global photometric bundle adjustment (BA), resulting in a more accurate map reconstruction than running global photometric BA on all map points.
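
    The sketch below (Python/PyTorch, an illustration rather than the thesis implementation; the importance weights, the penalty weight lam and the culling tolerance rel_tol are assumptions) shows the two ingredients: a fine-tuning step whose loss is augmented with a penalty that keeps parameters close to a pre-adaptation snapshot, and a culling test that keeps only map points whose depths agree with the fine-tuned CNN prediction before global photometric BA.

        import torch

        def regularized_finetune_step(model, batch, task_loss_fn, theta_ref,
                                      importance, optimizer, lam=1.0):
            # theta_ref:  dict of parameter tensors snapshotted before adaptation
            # importance: dict of per-parameter importance weights (e.g. uniform,
            #             or estimated from previously seen data)
            optimizer.zero_grad()
            loss = task_loss_fn(model, batch)   # e.g. a self-supervised depth loss
            penalty = 0.0
            for name, p in model.named_parameters():
                # Penalize drift from the snapshot to retain earlier knowledge.
                penalty = penalty + (importance[name] * (p - theta_ref[name]) ** 2).sum()
            (loss + lam * penalty).backward()
            optimizer.step()

        def cull_map_points(point_depths, cnn_depths, rel_tol=0.15):
            # Keep map points whose depth (projected into a keyframe) agrees with
            # the fine-tuned CNN depth within a relative tolerance; the rest are
            # culled before running global photometric bundle adjustment.
            return (point_depths - cnn_depths).abs() <= rel_tol * cnn_depths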

  • Subjects / Keywords
  • Graduation date
    Spring 2022
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/r3-tp4b-ke15
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.