Deep Learning in Robotics

  • Author / Creator
    Valipour, Sepehr
  • New machine learning methods and parallel computation have opened the door to many applications in computer vision. While computer vision is progressing rapidly as a result, real-world applications in robotics have been fewer and less successful. In this thesis, we investigate two possible causes of this gap and propose potential solutions: using recurrent networks to handle temporal information, and using human-robot interaction for incremental learning. Robotics is inherently time-dependent: the environment is perceived over time, and tasks are performed via time-indexed trajectories. This important property has been largely neglected in the deep learning community, where the main focus is instead on improving benchmarks for single-image tasks or analyzing videos in batches. Consequently, real-time video analysis has received less attention, yet this is exactly what robots need. Processing single images does not provide sufficient information to observe the environment, and processing a batch of images is too delayed to be used for planning and decision-making in real time. As a solution to this problem, we propose a recurrent fully convolutional neural network (RFCNN) for segmentation, which is useful in many robotics scenarios. This type of network accepts a series of images ending with the current image and infers the segmentation for that last image. We show how such networks can be designed and trained in an end-to-end fashion. An extensive set of experiments on several different architectures and benchmarks was conducted, and we observed a consistent improvement from using RFCNNs over their non-recurrent counterparts.
    While deep learning methods are among the most general machine learning solutions available, they still suffer from changes in the data distribution between training and test. Although this problem is not limited to robotics, its effect is most apparent there, mostly because robots must learn complicated reasoning from limited training data. This combination leads to severe overfitting, which becomes obvious when the robot is tested in a slightly different environment. To mitigate this problem, we propose a novel paradigm in which new perception information can be taught to the robot through Human-Robot Interaction (HRI). A complete HRI system is developed that allows human-friendly communication using speech and gesture, and an incremental learning method is designed to improve an object detection network using human inputs. The system was tested successfully on simulated data and in real experiments. We show that humans can easily teach new objects to the robot and that the robot is able to learn and use this information later.
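    The recurrent-segmentation idea from the abstract can be sketched as follows. This is a minimal illustrative toy in numpy, not the thesis's actual architecture: the class name `RecurrentFCN`, the single conv-recurrent layer, and all layer sizes are assumptions made for the example. A hidden feature map is updated frame by frame, and per-pixel class logits are decoded only for the final (current) frame.

    ```python
    import numpy as np

    def conv2d(x, w):
        # 'Same' convolution of an (H, W, Cin) map with a (k, k, Cin, Cout) kernel.
        k = w.shape[0]
        pad = k // 2
        xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
        H, W, _ = x.shape
        out = np.zeros((H, W, w.shape[3]))
        for i in range(H):
            for j in range(W):
                patch = xp[i:i + k, j:j + k, :]
                out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
        return out

    class RecurrentFCN:
        """Toy recurrent fully convolutional net (illustrative only):
        a convolutional hidden state is updated per frame, and the last
        hidden state is decoded into segmentation logits."""
        def __init__(self, in_ch, hid_ch, n_classes, k=3, seed=0):
            rng = np.random.default_rng(seed)
            self.w_x = rng.normal(0, 0.1, (k, k, in_ch, hid_ch))   # input conv
            self.w_h = rng.normal(0, 0.1, (k, k, hid_ch, hid_ch))  # recurrent conv
            self.w_out = rng.normal(0, 0.1, (1, 1, hid_ch, n_classes))  # 1x1 decoder

        def forward(self, frames):
            # frames: (T, H, W, Cin) -> (H, W, n_classes) logits for the last frame
            T, H, W, _ = frames.shape
            h = np.zeros((H, W, self.w_h.shape[3]))
            for t in range(T):
                h = np.tanh(conv2d(frames[t], self.w_x) + conv2d(h, self.w_h))
            return conv2d(h, self.w_out)

    net = RecurrentFCN(in_ch=3, hid_ch=4, n_classes=2)
    seq = np.random.default_rng(1).random((5, 8, 8, 3))  # 5 frames of 8x8 RGB
    logits = net.forward(seq)
    mask = logits.argmax(axis=-1)  # per-pixel labels for the newest frame
    print(mask.shape)  # (8, 8)
    ```

    The point of the structure is the one made in the abstract: the network sees a sequence ending with the current image, so its prediction for that image can draw on temporal context while still being produced online, one frame at a time.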
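    The human-teaching paradigm can likewise be illustrated with a toy sketch. Everything here is an assumption for illustration, not the thesis's method: the class `IncrementalObjectMemory`, the prototype representation, and cosine-similarity matching stand in for the actual object detection network and its incremental update. The idea shown is only that a human can name an object a few times and the robot can recognize it later.

    ```python
    import numpy as np

    class IncrementalObjectMemory:
        """Toy prototype memory (illustrative only): each taught object is
        stored as the running mean of its feature vectors; recognition picks
        the nearest prototype by cosine similarity. A real system would use
        embeddings produced by a detection network."""
        def __init__(self):
            self.protos = {}  # object name -> (mean vector, sample count)

        def teach(self, name, embedding):
            # A human shows the object and says its name; update the running mean.
            v = np.asarray(embedding, dtype=float)
            if name in self.protos:
                mean, n = self.protos[name]
                self.protos[name] = ((mean * n + v) / (n + 1), n + 1)
            else:
                self.protos[name] = (v, 1)

        def recognize(self, embedding):
            # Return the taught object whose prototype is most similar.
            v = np.asarray(embedding, dtype=float)
            def cos(a, b):
                return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
            return max(self.protos, key=lambda k: cos(self.protos[k][0], v))

    mem = IncrementalObjectMemory()
    mem.teach("mug", [1.0, 0.1, 0.0])   # human teaches "mug" twice
    mem.teach("mug", [0.9, 0.2, 0.1])
    mem.teach("book", [0.0, 1.0, 0.2])  # then teaches a second object
    print(mem.recognize([0.95, 0.15, 0.05]))  # -> "mug"
    ```

    New objects are added without retraining on the old ones, which is the incremental property the abstract describes, though the thesis applies it to a full object detection network rather than a prototype table.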

  • Subjects / Keywords
  • Graduation date
    2017-11 (Fall 2017)
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
    • Department of Computing Science
  • Supervisor / co-supervisor and their department(s)
    • Martin Jagersand (Computing Science)
  • Examining committee members and their departments
    • Nilanjan Ray (Computing Science)
    • Hong Zhang (Computing Science)