Search
- 37 reinforcement learning
- 7 machine learning
- 3 artificial intelligence
- 3 deep learning
- 3 optimization
- 3 planning
- 2 Ady, Nadia M.
- 2 Pilarski, Patrick M.
- 1 Bennett, Brendan
- 1 Carvalho, Tales Henrique
- 1 Chakravarty, Sucheta
- 1 Chan, Alan
- Spring 2021
Temporal difference (TD) methods provide a powerful means of learning to make predictions in an online, model-free, and highly scalable manner. In the reinforcement learning (RL) framework, we formalize these prediction targets in terms of a (possibly discounted) sum of rewards, called the...
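A minimal sketch of the idea (an illustration, not code from the thesis): tabular TD(0) prediction on a toy five-state random walk, where the prediction target is the return, the (possibly discounted) sum of future rewards, and learning proceeds online from single transitions.

    import random

    GAMMA = 1.0       # discount factor (undiscounted episodic task)
    ALPHA = 0.1       # step size
    N_STATES = 5      # non-terminal states 0..4; episodes start in the middle

    def run_episode(v):
        s = N_STATES // 2
        while True:
            s_next = s + random.choice([-1, 1])      # unbiased random walk
            if s_next < 0:                           # left terminal, reward 0
                r, v_next, done = 0.0, 0.0, True
            elif s_next >= N_STATES:                 # right terminal, reward 1
                r, v_next, done = 1.0, 0.0, True
            else:
                r, v_next, done = 0.0, v[s_next], False
            # TD(0): move V(s) toward the one-step bootstrapped estimate of the return
            v[s] += ALPHA * (r + GAMMA * v_next - v[s])
            if done:
                return
            s = s_next

    values = [0.5] * N_STATES
    for _ in range(1000):
        run_episode(values)
    print([round(x, 2) for x in values])  # approaches [0.17, 0.33, 0.5, 0.67, 0.83]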
- Spring 2024
Searching for programmatic policies to solve a reinforcement learning problem can be challenging, particularly when dealing with domain-specific languages (DSLs) that define policies with internal states for partially observable Markov decision processes (POMDPs). This is because they lead to...
- Fall 2024
Value-based reinforcement learning is an approach to sequential decision making in which decisions are informed by learned, long-horizon predictions of future reward. This dissertation aims to understand issues that value-based methods face and develop algorithmic ideas to address these issues....
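A minimal sketch of value-based decision making (an illustration on a hypothetical one-dimensional corridor, not the dissertation's algorithm): tabular Q-learning maintains long-horizon action-value estimates and selects actions greedily with respect to them.

    import random

    N, GOAL = 6, 5                 # corridor states 0..5; the goal is the right end
    ACTIONS = [-1, +1]             # move left / move right
    GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1

    Q = [[0.0, 0.0] for _ in range(N)]   # long-horizon action-value estimates

    def greedy(qs):
        best = max(qs)
        return random.choice([i for i, q in enumerate(qs) if q == best])

    for _ in range(500):                               # episodes
        s = 0
        for _ in range(100):                           # step limit per episode
            a = random.randrange(2) if random.random() < EPS else greedy(Q[s])
            s_next = min(max(s + ACTIONS[a], 0), N - 1)
            r = 1.0 if s_next == GOAL else 0.0
            bootstrap = 0.0 if s_next == GOAL else max(Q[s_next])
            Q[s][a] += ALPHA * (r + GAMMA * bootstrap - Q[s][a])   # value-based update
            if s_next == GOAL:
                break
            s = s_next

    # Decisions informed by the learned values: move right (action index 1) in every state
    print([greedy(q) for q in Q[:GOAL]])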
- Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Fall 2020
Policy gradient methods typically estimate both explicit policy and value functions. The long-extant view of policy gradient methods as approximate policy iteration---alternating between policy evaluation and policy improvement by greedification---is a helpful framework to elucidate algorithmic...
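A minimal sketch of greedification viewed as KL minimization (my illustration, not the thesis code): for a single state, the Boltzmann distribution over the action values can serve as the greedification target, and the forward KL(B‖π) and reverse KL(π‖B) define different improvement objectives (mass-covering versus mode-seeking).

    import math

    def softmax(xs, tau=1.0):
        m = max(x / tau for x in xs)
        exps = [math.exp(x / tau - m) for x in xs]
        z = sum(exps)
        return [e / z for e in exps]

    def kl(p, q):
        # KL(p || q) over a finite action set
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    q_values = [1.0, 0.9, -2.0]             # action values for one state (toy numbers)
    boltzmann = softmax(q_values, tau=0.1)  # greedification target B
    policy = [1/3, 1/3, 1/3]                # current (uniform) policy pi

    print("forward KL(B || pi):", kl(boltzmann, policy))
    print("reverse KL(pi || B):", kl(policy, boltzmann))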
- Fall 2022
Actor-Critics are a popular class of algorithms for control. Their ability to learn complex behaviours in continuous-action environments makes them directly applicable to many real-world scenarios. These algorithms are composed of two parts: a critic and an actor. The critic learns to critique...
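A minimal sketch of this two-part structure (an illustration on a hypothetical two-armed bandit, not the thesis code): a softmax actor adjusts its action preferences using the critic's error signal, while the critic learns a value baseline.

    import math, random

    theta = [0.0, 0.0]     # actor parameters (action preferences)
    v = 0.0                # critic: estimated value of the single state
    ALPHA_ACTOR, ALPHA_CRITIC = 0.05, 0.1

    def softmax(prefs):
        exps = [math.exp(p - max(prefs)) for p in prefs]
        z = sum(exps)
        return [e / z for e in exps]

    for _ in range(5000):
        probs = softmax(theta)
        a = 0 if random.random() < probs[0] else 1
        # hypothetical payoffs: action 0 succeeds 30% of the time, action 1 80%
        r = 1.0 if random.random() < (0.3 if a == 0 else 0.8) else 0.0
        delta = r - v                      # critic's error signal critiques the actor
        v += ALPHA_CRITIC * delta          # critic update
        for i in range(2):                 # actor update: policy-gradient step
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += ALPHA_ACTOR * delta * grad

    print(softmax(theta))   # probability mass concentrates on the better action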
- Spring 2023
Construction labour productivity (CLP) is a key performance indicator for determining the success of construction undertakings, and it notably affects the profitability of construction companies. Accordingly, the construction industry and researchers have pursued better ways of addressing the CLP...
- Spring 2024
Recent strides in lower-limb exoskeleton development have significantly enhanced the potential for more effective rehabilitation and assistance for individuals with mobility impairments. Despite these advancements, the widespread adoption of exoskeletons demands improvements in both hardware and...
- Spring 2022
The concept of state is fundamental to a reinforcement learning agent. The state is the input to the agent's action-selection policy, value functions, and environmental model. A reinforcement learning agent interacts with the environment by performing actions and receiving observations, resulting...
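A minimal sketch of this interaction loop (hypothetical interfaces, not the thesis code): here the agent's state is simply a recency-weighted trace of its observations, and that constructed state is the input to its action-selection policy.

    import random

    def environment_step(action):
        """Hypothetical environment: returns a noisy scalar observation."""
        return action + random.gauss(0.0, 0.1)

    def policy(state):
        """Hypothetical policy: selects an action based on the constructed state."""
        return 1 if state < 0.5 else 0

    state = 0.0                      # agent state: exponential trace of observations
    DECAY = 0.9
    for t in range(10):
        action = policy(state)                               # state -> action
        observation = environment_step(action)               # action -> observation
        state = DECAY * state + (1 - DECAY) * observation    # observation -> new state
        print(t, action, round(observation, 2), round(state, 2))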
- Spring 2024
In this dissertation, I investigate how we can exploit generic problem structure to make reinforcement learning algorithms more efficient. Generic problem structure means basic structure that exists in a wide range of problems (e.g., an action taken in the present does not influence the past), as...