Applications of Computer Vision and Machine Learning to Three Engineering Problems

  • Author / Creator
    Xie, Bowen
  • This thesis applies computer vision and machine learning techniques to three engineering projects: a self-driving vehicle, a predictive display system, and a vision-based robot manipulator joint detector. In the first project, we build a remote-controlled car and implement three core self-driving features: lane-keeping control, traffic sign and signal detection with distance estimation, and obstacle avoidance. The first two features are benchmarked in a lab environment. We employ a novel end-to-end learning method that controls the vehicle directly from the perceived image, instead of a traditional model-based control design. The YOLO object detector is used to identify different traffic signs, and the resulting bounding boxes are used to estimate each sign's distance from the vehicle. The proposed system demonstrates satisfactory results in both qualitative and quantitative evaluations, and it outperforms human drivers in terms of control consistency and smoothness.
    For the predictive display project, we propose a new generative model-based predictive display for robotic teleoperation over high-latency communication links. Our method is capable of rendering photo-realistic images of the scene to the human operator in real time from RGB-D images acquired by the remote robot. A preliminary exploration stage is used to build a coarse 3D map of the remote environment and to train a generative model, both of which are then used to generate photo-realistic images for the human operator based on the commanded pose of the robot. Data captured by the remote robot is used to dynamically update the 3D map, enabling teleoperation in the presence of new and relocated objects. Various experiments validate the proposed method's performance and its benefits over alternative methods.
    The third project considers vision-based estimation of robot arm joint locations. Automatic robot arm manipulation is well developed for small robot arms with precise joint feedback, but it remains underdeveloped for inexpensive robots and human-operated equipment, which lack precise joint feedback. Manually labelling neural network training data for robotic objects is not economical, so simulators are now a popular tool for generating training data. The remaining problem is the gap between real-world images and simulated images. We therefore propose a vision-based system with domain adaptation for joint state estimation. The resulting system is implemented and benchmarked against a state-of-the-art approach, with favourable results.

  • Subjects / Keywords
  • Graduation date
    Spring 2021
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.