Fall 2023
A matroid bandit is the online version of combinatorial optimization on a matroid, in which the learner chooses $K$ actions from a set of $L$ actions that can form a matroid basis. Many real-world applications such as recommendation systems can be modeled as matroid bandits. In such learning...
-
Fall 2016
In an online learning problem a player makes decisions in a sequential manner. In each round, the player receives some reward that depends on his action and an outcome generated by the environment while some feedback information about the outcome is revealed. The goal of the player can be...
-
Fall 2023
We consider stochastic generalized linear bandit (GLB) problems when the reward distributions are log-concave and subgaussian. We consider for this problem the perturbed history exploration (PHE) algorithm. In each round of its operation, PHE perturbs the observed rewards by adding fresh noise to...