Applications of Computer Vision and Machine Learning to Three Engineering Problems

Xie, Bowen

doi:10.7939/r3-gb3j-z571

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Xie, Bowen
This thesis applies computer vision and machine learning techniques to three engineering projects: a self-driving vehicle, a predictive display system, and a vision-based robot manipulator joint detector.

In the first project, we build a remote-controlled car and implement three core self-driving features: lane-keeping control, traffic sign and signal detection with distance estimation, and obstacle avoidance. The first two features are benchmarked in a lab environment. We employ a novel end-to-end learning method that controls the vehicle directly from the perceived image, instead of a traditional model-based control design. The YOLO object detector is used to identify different traffic signs, and its bounding boxes are used to estimate their distance to the vehicle. The proposed system demonstrates satisfactory results in both qualitative and quantitative evaluations, and it outperforms human drivers in terms of control consistency and smoothness.

For the predictive display project, we propose a new generative model-based predictive display for robotic teleoperation over high-latency communication links. Our method is capable of rendering photo-realistic images of the scene to the human operator in real time from RGB-D images acquired by the remote robot. A preliminary exploration stage is used to build a coarse 3D map of the remote environment and to train a generative model, both of which are then used to generate photo-realistic images for the human operator based on the commanded pose of the robot. Data captured by the remote robot is used to dynamically update the 3D map, enabling teleoperation in the presence of new and relocated objects. Various experiments validate our proposed method’s performance and benefits over alternative methods.

The third project considers vision-based estimation of robot arm joint locations. Automatic robot arm manipulation is well developed for small robot arms with precise joint feedback, but remains underdeveloped for inexpensive robots and human-operated equipment, which lack precise joint feedback. Manually labelling training data for neural networks in robotic applications is not economical, so simulators are now a popular tool for generating training data. The difficulty is the gap between real-world images and simulated images. Hence we propose a vision-based system with domain adaptation for joint state estimation. The resulting system is implemented and benchmarked against a state-of-the-art approach with favourable results.
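
As an illustration of the first project's idea of estimating distance from a detector's bounding boxes, the following is a minimal sketch under a pinhole-camera assumption. The abstract does not say which formula the thesis uses, so this is one common approach and not necessarily the author's method. The function name, parameters, and numeric values are all hypothetical.

```python
def estimate_distance_m(bbox_height_px: float,
                        real_height_m: float,
                        focal_length_px: float) -> float:
    """Estimate the range to an object of known physical height.

    Assumes an ideal pinhole camera, where an object of height H at
    distance Z projects to h = f * H / Z pixels, so Z = f * H / h.
    This is an illustrative assumption, not taken from the thesis.
    """
    if bbox_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return focal_length_px * real_height_m / bbox_height_px


# Hypothetical example: a 0.125 m tall sign whose detected bounding box
# is 40 px tall, seen by a camera with a 640 px focal length.
distance = estimate_distance_m(bbox_height_px=40.0,
                               real_height_m=0.125,
                               focal_length_px=640.0)
print(f"estimated distance: {distance:.2f} m")
```

Under this model the estimate is inversely proportional to the box height, so a box half as tall implies an object twice as far away. In practice the focal length would come from camera calibration, and the result is sensitive to how tightly the detector's box fits the object.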
Graduation date

Spring 2021
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-gb3j-z571
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Mechanical Engineering
Supervisor / co-supervisor and their department(s)
- Barczyk, Martin (Mechanical Engineering)
- Jagersand, Martin (Computer Science)