-
Spring 2016
Game-theoretic solution concepts, such as Nash equilibrium strategies that are optimal against worst-case opponents, provide guidance in finding desirable autonomous agent behaviour. In particular, we wish to approximate solutions to complex, dynamic tasks, such as negotiation or bidding in...
-
Fall 2012
In a partial-monitoring game a player has to make decisions in a sequential manner. In each round, the player suffers some loss that depends on his decision and an outcome chosen by an opponent, after which he receives "some" information about the outcome. The goal of the player is to keep the...
-
Spring 2023
Process industries involve processes that have complex, interdependent, and sometimes uncontrollable/unobservable features that are subject to a variety of uncertainties such as operational fluctuations, sensory noises, process anomalies, human involvement, market volatility, and so forth. In the...
-
Recommender systems to support socio-collaborative learning in educational discussion forums
Fall 2020
With the popularity of online education, many educational technologies have been introduced to support students' learning. Among them, asynchronous discussion forums are widely used to support students' socio-collaborative learning processes. However, the forum's complex thread structure and...
-
Spring 2022
In this dissertation, we study online off-policy temporal-difference learning algorithms, a class of reinforcement learning algorithms that can learn predictions in an efficient and scalable manner. The contributions of this dissertation are of two kinds: (1) empirically studying existing...
-
Fall 2016
In an online learning problem a player makes decisions in a sequential manner. In each round, the player receives some reward that depends on his action and an outcome generated by the environment while some feedback information about the outcome is revealed. The goal of the player can be...
-
Spring 2016
Ideal agent behaviour in multiagent environments depends on the behaviour of other agents. Consequently, acting to maximize utility is challenging since an agent must gather and exploit knowledge about how the other (potentially adaptive) agents behave. In this thesis, we investigate how an...
-
2012
Bowling, Michael, Zinkevich, Martin
Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality is guaranteed. For online...