Predictive Knowledge in Robots: An Empirical Comparison of Learning Algorithms

Banafsheh Rafiee

doi:10.7939/R31G0J971

Author / Creator

Banafsheh Rafiee
Knowledge is central to intelligence. Intelligence can be thought of as the ability to acquire knowledge and apply it effectively. Although knowledge is a subject of intense interest in artificial intelligence, it is not yet clear what the best approach is for an intelligent system to acquire and maintain a large body of knowledge. One interesting approach, which we pursue in this thesis, is based on the view that much of world knowledge is predictive. For example, to know that a box is heavy is to predict that lifting it will take a lot of effort. We call this approach to acquiring and maintaining knowledge the predictive knowledge approach. In this thesis, we implement an instance of this approach in order to explore and assess it further. To do so, we build upon the techniques and ideas of reinforcement learning (RL). In particular, we use the idea of value functions. In conventional RL, value functions capture predictions about reward. Recently, value functions have been extended to capture more general predictions, which can constitute knowledge. A value function in the extended form is called a general value function (GVF). GVFs provide a language to talk and think about predictions. More generally, we can think of GVFs as a language for representing predictive knowledge.
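To make the idea of a GVF concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a toy setting that does not appear in the thesis: a robot moving deterministically around a cycle of five positions, a hypothetical bump sensor that reads 1 on each transition back into position 0, one-hot features, and a constant continuation factor. The names `td0_gvf_update` and `run_cycle_demo` are illustrative only. The prediction is learned with linear TD(0), where the sensor signal (the cumulant) takes the place that reward has in a conventional value function.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def td0_gvf_update(w, x, cumulant, gamma_next, x_next, alpha):
    """One linear TD(0) step for a GVF whose prediction is dot(w, x).

    cumulant   -- the signal being predicted (plays the role of reward)
    gamma_next -- continuation factor applied to the next prediction
    """
    delta = cumulant + gamma_next * dot(w, x_next) - dot(w, x)  # TD error
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]


def run_cycle_demo(n_states=5, gamma=0.8, alpha=0.1, n_steps=20000):
    """Learn the discounted sum of future bump signals on a toy cycle.

    Assumed toy world: the state advances by one position per step and
    wraps around; the hypothetical bump sensor fires when position 0 is
    re-entered.
    """
    # One-hot (tabular) feature vector for each position.
    features = [[1.0 if i == s else 0.0 for i in range(n_states)]
                for s in range(n_states)]
    w = [0.0] * n_states
    state = 0
    for _ in range(n_steps):
        next_state = (state + 1) % n_states
        cumulant = 1.0 if next_state == 0 else 0.0  # hypothetical bump signal
        w = td0_gvf_update(w, features[state], cumulant, gamma,
                           features[next_state], alpha)
        state = next_state
    return w


if __name__ == "__main__":
    for position, prediction in enumerate(run_cycle_demo()):
        print(position, round(prediction, 4))
```

In this toy world the learned weights can be checked against a closed form: with n positions and continuation factor γ, the true prediction at position s is γ^(n-1-s) / (1 - γ^n), so the prediction grows as the robot approaches the bump. Changing the cumulant or the continuation factor poses a different predictive question without changing the learning rule.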
In this thesis, we develop the predictive knowledge view using the language of GVFs and apply it to several robot domains. Our work has three main contributions. First, we contribute to the idea of predictive knowledge by providing several new examples of it on robot domains, gaining a more substantive understanding of knowledge as predictions. Second, we perform empirical comparisons of many off-policy temporal-difference (TD) learning algorithms including gradient-TD and emphatic-TD families of methods on robot data. Third, we systematically study the learning process on robots. Such studies provide insights about how to effectively evaluate and compare algorithms on real-world systems.
Subjects / Keywords
Graduation date

Fall 2018
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/R31G0J971
License

Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Richard S. Sutton (Computing Science)
- Adam White (Computing Science)