- Reinforcement Learning (51)
- Machine Learning (12)
- Artificial Intelligence (7)
- Online Learning (3)
- Policy Gradient (3)
- Representation Learning (3)
- Bowling, Michael (3)
- Schuurmans, Dale (3)
- Wang, Tao (3)
- Jafferjee, Taher (2)
- Lizotte, Daniel (2)
- Abbasi Brujeni, Lena (1)
- Graduate Studies and Research, Faculty of (42)
- Graduate Studies and Research, Faculty of/Theses and Dissertations (42)
- Computing Science, Department of (5)
- Computing Science, Department of/Technical Reports (Computing Science) (5)
- Chemical and Materials Engineering, Department of (2)
- Chemical and Materials Engineering, Department of/Process Systems Engineering (2)
A Hierarchical Constrained Reinforcement Learning for Optimization of Bitumen Recovery Rate in a Primary Separation Vessel
This work proposes a two-level hierarchical constrained control structure for reinforcement learning (RL) with application in a Primary Separation Vessel (PSV). The lower level is concerned with servo tracking and regulation of the interface level against variances in ore quality by manipulating...
A Long-Short Term Memory Recurrent Neural Network Based Reinforcement Learning Controller for Office Heating Ventilation and Air Conditioning Systems
Energy optimization in buildings by controlling the Heating Ventilation and Air Conditioning (HVAC) system is being researched extensively. In this paper, a model-free actor-critic Reinforcement Learning (RL) controller is designed using a variant of artificial recurrent neural networks called...
Q-learning can be difficult to use in continuous action spaces, because a difficult optimization has to be solved to find the maximal action. Some common strategies have been to discretize the action space, solve the maximization with a powerful optimizer at each step, restrict the functional...
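The first strategy the snippet mentions, discretizing the action space, can be sketched in a few lines. This is an illustrative example, not code from the paper; the action range and grid size are assumptions:

```python
import numpy as np

# Hypothetical setup: a 1-D continuous action space [-1, 1] handled by
# discretizing it onto a fixed grid, so the max over actions in the
# Q-learning target becomes a max over finitely many candidates.
ACTIONS = np.linspace(-1.0, 1.0, 11)  # 11-point discretization (assumed)

def greedy_action(q_values):
    """Pick the discretized action with the highest Q-value."""
    return ACTIONS[int(np.argmax(q_values))]

def q_learning_target(reward, next_q_values, gamma=0.99):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * np.max(next_q_values)
```

The trade-off is resolution: a finer grid approximates the continuous maximization better but makes each target computation more expensive, which is exactly the tension the other strategies (a per-step optimizer, or a restricted function class) try to avoid.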
Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this...
Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Methods like policy gradient, that do not learn a value function and instead directly represent policy, often need fewer parameters to learn good policies....
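A direct policy parameterization of the kind this snippet contrasts with value learning can be very small. Below is a generic REINFORCE-style sketch (not the paper's method) using a softmax policy with linear per-action features and no value function anywhere:

```python
import numpy as np

# Illustrative sketch: a softmax policy over discrete actions with a
# linear parameterization theta, updated by the REINFORCE rule
# theta <- theta + lr * G * grad log pi(a|s). Names are assumptions.
def softmax_policy(theta, features):
    """features: (num_actions, num_params) array of per-action features."""
    prefs = features @ theta     # one scalar preference per action
    prefs = prefs - prefs.max()  # shift for numerical stability
    probs = np.exp(prefs)
    return probs / probs.sum()

def reinforce_update(theta, features, action, ret, lr=0.1):
    """One policy-gradient step for the sampled action and return `ret`."""
    probs = softmax_policy(theta, features)
    # grad of log softmax: x_a minus the probability-weighted feature mean
    grad_log_pi = features[action] - probs @ features
    return theta + lr * ret * grad_log_pi
```

Note the parameter count: one weight vector the size of the feature vector, illustrating the snippet's point that directly represented policies can get by with fewer parameters than a value-function approximator over the same features.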
In Dyna style algorithms, reinforcement learning (RL) agents use a model of the environment to generate simulated experience. By updating on this simulated experience, Dyna style algorithms allow agents to potentially learn control policies in fewer environment interactions than agents that use...
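The planning step at the heart of Dyna-style algorithms can be sketched in tabular form. This is a generic Dyna-Q fragment, not the agents studied in the paper; the two-action space and learned deterministic model are assumptions for illustration:

```python
import random

# Minimal Dyna-Q planning sketch: replay transitions stored in a learned
# deterministic model as simulated experience, updating Q with the usual
# Q-learning rule but without touching the real environment.
def dyna_q_planning(Q, model, n_steps, alpha=0.1, gamma=0.95, rng=random):
    """model maps previously seen (s, a) -> (reward, next_state)."""
    seen = list(model.keys())
    for _ in range(n_steps):
        s, a = rng.choice(seen)               # sample a seen (s, a) pair
        r, s_next = model[(s, a)]             # simulated transition
        best_next = max(Q.get((s_next, b), 0.0) for b in (0, 1))  # assumed actions
        q = Q.get((s, a), 0.0)
        Q[(s, a)] = q + alpha * (r + gamma * best_next - q)
    return Q
```

Each planning step costs a model lookup instead of an environment interaction, which is how Dyna-style agents can reach a good policy in fewer real interactions than purely model-free learners.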
Learning auxiliary tasks, such as multiple predictions about the world, can provide many benefits to reinforcement learning systems. A variety of off-policy learning algorithms have been developed to learn such predictions, but as yet there is little work on how to adapt the behavior to gather...
Current medical imaging professional training uses an apprenticeship model with students following an established doctor and viewing their cases, in what is called a practicum. This poses an issue as students are limited to the cases available during their practicum. To resolve this automated...
Technical report TR08-16. We propose a dual approach to dynamic programming and reinforcement learning based on maintaining an explicit representation of visit distributions as opposed to value functions. An advantage of working in the dual is that it allows one to exploit techniques for...
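The dual object the report describes, a visit distribution in place of a value function, has a simple closed form for a fixed policy. A sketch under assumed notation (P is the state-transition matrix under the policy, mu the start distribution, gamma the discount):

```python
import numpy as np

# Normalized discounted visit distribution for a fixed policy:
#   d^T = (1 - gamma) * mu^T (I - gamma * P)^(-1)
# It sums to 1, and the expected return equals d @ r / (1 - gamma),
# so it carries the same information as the value function.
def visit_distribution(P, mu, gamma=0.9):
    n = P.shape[0]
    return (1.0 - gamma) * np.linalg.solve((np.eye(n) - gamma * P).T, mu)
```

Working with d rather than values is what lets the dual view borrow tools for distributions (normalization, convex combinations) that have no direct analogue on the value-function side.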