






This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Deep Learning in Robotics Open Access


Other title
Human Robot Interaction
Incremental Learning
Deep Learning
Video Segmentation
Recurrent Fully Convolutional Neural Network
Degree grantor: University of Alberta
Author or creator: Valipour, Sepehr
Supervisor and department: Martin Jagersand (Computing Science)
Examining committee member and department: Nilanjan Ray (Computing Science); Hong Zhang (Computing Science)
Department of Computing Science
Graduation date: 2017-11:Fall 2017
Master of Science

New machine learning methods and parallel computation have opened the door to many applications in computer vision. While computer vision is progressing rapidly as a result, robotics has seen fewer and less successful real-world applications. In this thesis, we investigate two possible causes of this problem and provide potential solutions: using recurrent networks to deal with temporal information, and using human-robot interaction for incremental learning.

Robotics is time dependent. The environment is perceived through time, and tasks are performed by time-indexed trajectories. This important property has generally been neglected in the deep learning community, where the main focus is instead on improving benchmarks on single-image tasks or analyzing videos in batches. Consequently, real-time video analysis has received less attention, yet this is exactly what robots need. Processing single images does not provide sufficient information to observe the environment, and processing a batch of images is too delayed to be used for real-time planning and decision-making. As a solution, we propose a recurrent fully convolutional neural network (RFCNN) for segmentation, which is useful in many robotics scenarios. This type of network accepts a series of images ending with the current image and infers the segmentation for the last image. We showed how such networks can be designed and trained in an end-to-end fashion. An extensive set of experiments was conducted on several different architectures and benchmarks, and we observed a consistent improvement from using RFCNN over its non-recurrent counterparts.

While deep learning methods are the most general machine learning solutions available, they still suffer from changes in the data distribution between training and test. Although this problem is not limited to robotics, its effect is most apparent there, mostly because robots need to learn complicated reasoning from limited training data. This combination leads to severe overfitting, which becomes obvious when testing in a slightly different environment. To mitigate this problem, we suggest a novel paradigm in which new perception information can be taught to the robot through human-robot interaction (HRI). A complete HRI system is developed that allows human-friendly communication using speech and gesture. An incremental learning method is designed to improve an object detection network using human input. The system is tested successfully on simulated data and in real experiments. We showed that humans can easily teach new objects to the robot and that the robot is able to learn and use this information later.
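
The abstract describes a network that accepts a series of images ending with the current image and infers the segmentation for the last image. The input side of that idea can be sketched as a sliding window over a live frame stream. This is a minimal illustration only, not the thesis implementation: `segment_stream`, `segment_window`, and the default `window_size` of 4 are hypothetical names and values chosen here, and the model itself is left as an arbitrary callable.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Mask = TypeVar("Mask")


def segment_stream(
    frames: Iterable[Frame],
    segment_window: Callable[[List[Frame]], Mask],
    window_size: int = 4,  # assumed value, not taken from the thesis
) -> Iterator[Tuple[Frame, Mask]]:
    """Yield one (current_frame, mask) pair per incoming frame.

    `segment_window` is a hypothetical stand-in for a recurrent segmentation
    network: it receives the most recent frames, oldest first, and returns a
    mask for the last frame only.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    # A bounded deque discards the oldest frame once the window is full.
    window: Deque[Frame] = deque(maxlen=window_size)
    for frame in frames:
        window.append(frame)
        # The window always ends with the current frame, so output is
        # produced per frame instead of waiting for a whole batch.
        yield frame, segment_window(list(window))
```

In this sketch the first few windows are shorter than `window_size` because no earlier frames exist yet. That warm-up behaviour is a choice made for the example; padding or skipping the initial frames would be equally plausible.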
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Valipour, Sepehr, et al. "Recurrent Fully Convolutional Networks for Video Segmentation." Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017.
Siam, Mennatullah, et al. "Convolutional Gated Recurrent Networks for Video Segmentation." arXiv preprint arXiv:1611.05435 (2016).
Valipour, Sepehr, Camilo Perez, and Martin Jagersand. "Incremental Learning for Robot Perception through HRI." arXiv preprint arXiv:1701.04693 (2017).

File Details

Audit Status: Audits have not yet been run on this file.
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 10650214
Last modified: 2017:11:08 16:59:35-07:00
Filename: valipour_sepehr_201708_MSc.pdf
Original checksum: 3c5f424737a970f4868fa3c355b88f6e
Well formed: false
Valid: false
Status message: Unexpected error in findFonts java.lang.ClassCastException: edu.harvard.hul.ois.jhove.module.pdf.PdfSimpleObject cannot be cast to edu.harvard.hul.ois.jhove.module.pdf.PdfDictionary offset=3905
Status message: Invalid name tree offset=10631224
File title: Introduction