- Reinforcement Learning (84)
- Games (17)
- Machine Learning (17)
- Artificial Intelligence (11)
- Planning (5)
- Representation Learning (5)
- Graduate and Postdoctoral Studies (GPS), Faculty of (86)
- Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations (86)
- Computing Science, Department of (5)
- Computing Science, Department of / Technical Reports (Computing Science) (5)
- Education, Faculty of (2)
- Education, Faculty of / ENGAGE: Celebration of Research and Teaching Excellence (2)
-
2017-10-13
SSHRC Awarded IG 2018: This Aboriginal and community-based, participatory research project aims to co-create knowledge about the holistic (emotional, mental, physical, and spiritual) benefits to Indigenous youth of participating in northern games, and to identify factors that might be modified to...
-
Spring 2022
The world offers unprecedented amounts of data in real-world domains, from which we can develop successful decision-making systems. It is possible for reinforcement learning (RL) to learn control policies offline from such data but challenging to deploy an agent during learning in safety-critical...
-
A Hierarchical Constrained Reinforcement Learning for Optimization of Bitumen Recovery Rate in a Primary Separation Vessel
2020-01-01
Shafi, Hareem; Velswamy, Kirubakaran; Ibrahim, Fadi; Huang, Biao
This work proposes a two-level hierarchical constrained control structure for reinforcement learning (RL) with application in a Primary Separation Vessel (PSV). The lower level is concerned with servo tracking and regulation of the interface level against variances in ore quality by manipulating...
-
A Long-Short Term Memory Recurrent Neural Network Based Reinforcement Learning Controller for Office Heating Ventilation and Air Conditioning Systems
2017-08-18
Yuan Wang, Kirubakaran Velswamy, Biao Huang
Energy optimization in buildings by controlling the Heating, Ventilation, and Air Conditioning (HVAC) system is being researched extensively. In this paper, a model-free actor-critic Reinforcement Learning (RL) controller is designed using a variant of artificial recurrent neural networks called...
-
Action Selection for Hammer Shots in Curling: Optimization of Non-convex Continuous Actions With Stochastic Action Outcomes
Spring 2017
Optimal decision making in the face of uncertainty is an active area of research in artificial intelligence. In this thesis, I present the sport of curling as a novel application domain for research in optimal decision making. I focus on one aspect of the sport, the hammer shot, the last shot...
-
Fall 2019
Q-learning can be difficult to use in continuous action spaces, because a difficult optimization has to be solved to find the maximal action. Some common strategies have been to discretize the action space, solve the maximization with a powerful optimizer at each step, restrict the functional...
-
Spring 2021
Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this...
-
Spring 2015
Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Methods like policy gradient, that do not learn a value function and instead directly represent policy, often need fewer parameters to learn good policies....
-
Spring 2023
Reinforcement learning (RL) defines a general computational problem where the learner must learn to make good decisions through interactive experience. To be effective in solving this problem, the learner must be able to explore the environment, make accurate predictions about the future, and...
-
Fall 2022
In this thesis, we investigate the empirical performance of several experience replay techniques. Efficient experience replay plays an important role in model-free reinforcement learning by improving sample efficiency through reusing past experience. However, replay-based methods were largely...