Spring 2014
The culture of the young may increasingly be seen as a harbinger enticing us to follow a pathway from which will emerge the re-conceptualized educational practices of a new century. This research set out to discover where the highways of the Internet would lead me, the researcher-practitioner, in...
-
Fall 2023
A matroid bandit is the online version of combinatorial optimization on a matroid, in which the learner chooses $K$ actions from a set of $L$ actions that can form a matroid basis. Many real-world applications such as recommendation systems can be modeled as matroid bandits. In such learning...
-
Fall 2017
On the one hand, theoretical analyses of machine learning algorithms are typically performed based on various probabilistic assumptions about the data. While these probabilistic assumptions are important in the analyses, it is debatable whether such assumptions actually hold in practice. Another...
-
Fall 2020
Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity --- a non-stationary world might render past learning useless --- but also because continual tracking in a temporally coherent world can result in better...
-
Spring 2011
Current statistics suggest women form the majority of online learners. Their enrollment levels may be a result of promotional materials suggesting online learning allows learners access to flexible learning opportunities that will complement their busy lives. This research questions those...
-
Motivation and the information behaviours of online learning students: the case of a professionally-oriented, graduate program
Fall 2010
Online learning is a wonderful opportunity for students who cannot attend classes at conventional times and places to further their education. However, to some extent, accessing and sharing information is often quite different and potentially more difficult for this particular group (e.g., they...
-
Spring 2016
Ideal agent behaviour in multiagent environments depends on the behaviour of other agents. Consequently, acting to maximize utility is challenging since an agent must gather and exploit knowledge about how the other (potentially adaptive) agents behave. In this thesis, we investigate how an...
-
Fall 2016
In an online learning problem a player makes decisions in a sequential manner. In each round, the player receives some reward that depends on his action and an outcome generated by the environment while some feedback information about the outcome is revealed. The goal of the player can be...
-
Spring 2022
In this dissertation, we study online off-policy temporal-difference learning algorithms, a class of reinforcement learning algorithms that can learn predictions in an efficient and scalable manner. The contributions of this dissertation are of one of two kinds: (1) empirically studying existing...
-
Fall 2023
We consider stochastic generalized linear bandit (GLB) problems when the reward distributions are log-concave and subgaussian. We consider for this problem the perturbed history exploration (PHE) algorithm. In each round of its operation, PHE perturbs the observed rewards by adding fresh noise to...