This is a decommissioned version of ERA which is running to enable completion of migration processes. All new collections and items and all edits to existing items should go to our new ERA instance at https://ualberta.scholaris.ca - Please contact us at erahelp@ualberta.ca for assistance!
-
Spring 2024
Chinese Checkers, a traditional game played on a star-shaped board by 2-6 players, has been a domain for game AI research and has been strongly solved up to a 6×6 board with 6 pieces per player in a two-player game. In this work, we apply the AlphaZero algorithm, known for its success in perfect...
-
Spring 2024
Searching for programmatic policies to solve a reinforcement learning problem can be challenging, particularly when dealing with domain-specific languages (DSLs) that define policies with internal states for partially observable Markov decision processes (POMDPs). This is because they lead to...
-
Spring 2024
Recent advances in reinforcement learning (RL) and Human-in-the-Loop (HitL) learning have made it easier for humans to team with AI agents. Leveraging human expertise and experience with AI agents in intelligent systems can be efficient and beneficial. Still, it is unclear to...
-
Spring 2024
In this thesis, we present approximation schemes for the airport and railway problem (AR) on several classes of graphs. The AR problem, introduced by Adamaszek et al., is a combination of the capacitated facility location problem (CFL) and the network design problem. An AR instance comprises a...
-
Improving the reliability of reinforcement learning algorithms through biconjugate Bellman errors
Spring 2024
In this thesis, we seek to improve the reliability of reinforcement learning algorithms for nonlinear function approximation. Semi-gradient temporal difference (TD) update rules form the basis of most state-of-the-art value function learning systems despite clear counterexamples proving their...
-
Spring 2024
Text attribute transfer (TAT) is a natural language processing task that involves transforming some attributes of a given text while preserving other attributes. Recently, prompting approaches have been explored in TAT with the emergence of various pretrained language models (PLMs), where a...
-
Spring 2024
In recent years, significant strides in optimal bidirectional heuristic search (Bi-HS) have deepened our theoretical understanding and boosted performance. Yet, algorithms for Bi-HS in unbounded suboptimal scenarios remain largely unexplored. Despite leveraging front-to-end (F2E) and...
-
Spring 2024
In reinforcement learning, agents solve problems through interactions with the environment. However, when faced with intricate environmental dynamics, learning can become challenging, resulting in sub-optimal policies. A potential remedy to this situation lies in the transfer of knowledge from...
-
Spring 2024
Trajectory data analysis refers to the systematic exploration of spatial and temporal movement patterns in trajectory datasets. Missing trajectory points pose a challenge as they affect downstream tasks that rely on these datasets, such as public transportation management, wildlife monitoring,...
-
Towards Practical Offline Reinforcement Learning: Sample Efficient Policy Selection and Evaluation
Spring 2024
Offline reinforcement learning (RL) involves learning policies from datasets, rather than online interaction. The dissertation first investigates a critical component in offline RL: offline policy selection (OPS). Given that most offline RL algorithms require careful hyperparameter tuning, we...