DEVELOPING A PREDICTIVE APPROACH TO KNOWLEDGE

White, Adam, M

doi:doi:10.7939/R3FF3M75H

This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

View

Download

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Usage

575 views
1395 downloads

DEVELOPING A PREDICTIVE APPROACH TO KNOWLEDGE

Author / Creator

White, Adam, M
Understanding how an artificial agent may represent, acquire, update, and use large amounts of knowledge has long been an important research challenge in artificial intelligence. The quantity of knowledge, or knowing a lot, may be nicely thought of as making and updat- ing many predictions about many different courses of action. This predictive approach to knowledge ensures the knowledge is grounded in and learned from low-level data generated by an autonomous agent interacting with the world. Because predictive knowledge can be maintained without human intervention, its acquisition can potentially scale with available data and computing resources. The idea that knowledge might be expressed as prediction has been explored by Cunningham (1972), Becker (1973), Drescher (1990), Sutton and Tanner (2005), Rafols (2006), and Sutton (2009, 2012). Other uses of predictions include representing state with predictions (Littman, Sutton &, Singh 2002; Boots et al. 2010) and modeling partially observable domains (Talvitie & Singh 2011). Unfortunately, technical challenges related to numerical instability, divergence under off-policy sampling, and com- putational complexity have limited the applicability and scalability of predictive knowledge acquisition in practice.
This thesis explores a new approach to representing and acquiring predictive knowledge on a robot. The key idea is that value functions, from reinforcement learning, can be used to represent policy-contingent declarative and goal-oriented predictive knowledge. We use recently developed gradient-TD methods that are compatible with off-policy learning and function approximation to explore the practicality of making and updating many predictions in parallel, while the agent interacts with the world from continuous inputs on a robot.
The work described here includes both empirical demonstrations of the effectiveness of our new approach and new algorithmic contributions useful for scaling prediction learning. We demonstrate that our value functions are practically learnable and can encode a variety of knowledge with several experiments—including a demonstration of the psychological
phenomenon of nexting, learning predictions with refined termination conditions, learn- ing policy-contingent predictions from off-policy samples, and learning procedural goal- directed knowledge—all on two different robot platforms. Our results demonstrate the po- tential scalability of our approach; making and updating thousands of predictions from hun- dreds of thousands of multi-dimensional data samples, in realtime and on a robot—beyond the scalability of related predictive approaches. We also introduce a new online estimate of off-policy learning progress, and demonstrate its usefulness in tracking the performance of thousands of predictions about hundreds of distinct policies. Finally, we conduct a novel empirical investigation of one of our main learning algorithms, GTD(λ), revealing several new insights of particular relevance to predictive knowledge acquisition. All told, the work described here significantly develops the predictive approach to knowledge.
Subjects / Keywords
Graduation date

Fall 2015
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R3FF3M75H
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Richard S. Sutton (Computing Science)
Examining committee members and their departments
- Pierre-Yves Oudeyer (French National Institute for computer science and applied mathematics, Bordeaux France)
- Michael Bowling (Computing Science)
- Marek Reformat (Electrical and Computer Engineering)
- Pierre Boulanger (Computing Science)