Letting the Agent Take the Wheel: Principles for Constructive and Predictive Knowledge

Kearney, Alexandra K

doi:10.7939/r3-45g2-p206

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Kearney, Alexandra K

Abstract

Of all the capabilities of natural intelligence, one of the most exceptional is the ability to expand upon and refine knowledge of the world through subjective experience. Therefore, a longstanding goal of Artificial Intelligence has been to replicate this success: to enable artificial agents to construct knowledge of the world through subjective experiences.

This thesis explores how an agent can come to know its world through its experiences by making many predictions about its future sensations, referred to as Predictive Knowledge. Specifically, I consider predictions expressed as General Value Functions (GVFs): expected accumulations of future sensations conditioned on a particular behaviour. While it has been suggested that GVFs could express all of an agent’s knowledge of the world, few examples of applications of GVFs exist. Many present examples of GVF applications require hand-coded relationships between the predictive inputs and an agent’s decision-making. I argue that two key challenges must be addressed in order for Predictive Knowledge agents to achieve their potential: enabling agents to determine both what their predictions are about and how predictions are learned.
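
To make the idea of a GVF concrete, the following is a minimal sketch of learning one prediction with linear TD(0). It is an illustration only, not code from the thesis: the four-state loop, the one-hot features, the "bump sensor" cumulant, and the names `td0_gvf` and `dot` are all assumptions made for this example. The behaviour being conditioned on is simply "keep going around the loop".

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def td0_gvf(features, cumulants, gamma, alpha=0.5, sweeps=500):
    """Learn a linear GVF estimate with TD(0) on a looping stream of experience.

    features[t]  : feature vector observed at time t
    cumulants[t] : signal observed on the transition from t to t + 1
    gamma        : discount applied to future cumulants
    """
    n_steps = len(features)
    w = [0.0] * len(features[0])
    for _ in range(sweeps):
        for t in range(n_steps):
            x = features[t]
            x_next = features[(t + 1) % n_steps]  # the stream wraps around
            # TD error: cumulant plus discounted next estimate, minus current estimate
            delta = cumulants[t] + gamma * dot(w, x_next) - dot(w, x)
            w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical toy world: four states visited in a loop, one-hot features,
# and a cumulant that fires only on the transition out of the last state.
features = [[1.0 if i == s else 0.0 for i in range(4)] for s in range(4)]
cumulants = [0.0, 0.0, 0.0, 1.0]
w = td0_gvf(features, cumulants, gamma=0.9)
```

Each learned weight approximates the discounted sum of future bump signals from the corresponding state, so states closer to the bump carry larger predictions.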

In this thesis, I provide one particular solution to both challenges. First, I generalize Incremental Delta-Bar-Delta to be used with temporal difference learning, which I name TIDBD. TIDBD allows Predictive Knowledge agents to modify both the rate at which they learn and the weighting of their features independent of designer intervention during learning. I empirically evaluate the performance of TIDBD in synthetic and real-world robotics prediction tasks.
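
As a rough illustration of per-feature step-size adaptation in temporal difference learning, the sketch below applies an IDBD-style meta-update to a linear TD(λ) learner. It is an assumption-laden approximation of the general idea, not the thesis's algorithm as specified: the update order, the clipping of the decay term at zero, the meta step size `theta`, the initial step size of 0.1, and the toy four-state loop are all choices made for this example, and `tidbd_step` is a hypothetical name.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def tidbd_step(w, beta, h, z, x, x_next, cumulant, gamma, lam, theta):
    """One in-place update of weights w, log step sizes beta, memory h, traces z.

    Returns the TD error for the transition.
    """
    delta = cumulant + gamma * dot(w, x_next) - dot(w, x)
    for i in range(len(w)):
        # Meta-update: raise the step size when the current error agrees
        # with the recent history of updates to this weight, lower it otherwise.
        beta[i] += theta * delta * x[i] * h[i]
        alpha_i = math.exp(beta[i])  # step sizes stay positive by construction
        z[i] = gamma * lam * z[i] + x[i]  # accumulating eligibility trace
        w[i] += alpha_i * delta * z[i]
        # Decaying memory of recent weight updates; the decay is clipped at zero.
        h[i] = h[i] * max(0.0, 1.0 - alpha_i * x[i] * z[i]) + alpha_i * delta * z[i]
    return delta


# Hypothetical toy world: four one-hot states in a loop, with a cumulant
# that fires only on the transition out of the last state.
features = [[1.0 if i == s else 0.0 for i in range(4)] for s in range(4)]
cumulants = [0.0, 0.0, 0.0, 1.0]
w = [0.0] * 4
beta = [math.log(0.1)] * 4  # assumed initial step size of 0.1 per feature
h = [0.0] * 4
z = [0.0] * 4
for step in range(8000):
    t = step % 4
    tidbd_step(w, beta, h, z, features[t], features[(t + 1) % 4],
               cumulants[t], gamma=0.9, lam=0.0, theta=0.01)
step_sizes = [math.exp(b) for b in beta]
```

The point of the design is that no step size is tuned by hand after initialization: each feature's step size is itself learned from the same stream of TD errors that drives the prediction.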

Having provided agents with a means of modifying how they learn a prediction, I then explore how an agent might choose what prediction questions to ask. I argue that predictions should be chosen not based solely on their accuracy with respect to some true value but rather with respect to their usefulness in decision-making. Through a series of examples, I demonstrate that selecting predictions based solely on strict measures of accuracy can lead to poor model construction. I show with a worked example how poor model choices can lead to catastrophic performance when model estimates are used in further decision-making. I propose a heuristic for assessing GVF estimates that combines both the accuracy of the prediction and the usefulness of the input features.

Further exploring usefulness in the construction of knowledge, I provide a meta-gradient method that adapts what predictions an agent learns based on feedback from the control learner. I demonstrate that by using meta-gradient descent, an agent can find predictions that resolve partial observability when the control learner uses prediction estimates as additional inputs.

In total, this thesis provides a new perspective on the importance of predictions: prioritizing an artificial agent’s use of predictions over a prediction’s representational accuracy. In the process of developing this perspective, I introduce new algorithms that enable Predictive Knowledge agents to be applied more broadly with less designer intervention.
Subjects / Keywords
Graduation date

Fall 2023
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/r3-45g2-p206
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Sutton, Richard (Computing Science)
- Pilarski, Patrick (Medicine and Computing Science)