-
Spring 2011
Off-policy reinforcement learning is useful in many contexts. Maei, Sutton, Szepesvari, and others have recently introduced a new class of algorithms, the most advanced of which is GQ(lambda), for off-policy reinforcement learning. These algorithms are the first stable methods for general...
-
Fall 2022
OpenSpiel is an open-source software system for implementing high-performance software players for many different computer games. Hex is a two-player game of perfect information used in a variety of computer games research projects. The OpenSpiel project has implemented a version of the AlphaZero...
-
Spring 2022
As a student learns to program, there will be gaps in the student's knowledge that must be addressed for the student to gain a full understanding of the material. A student's answer to a single question may provide some insight into the student's level of understanding. However, a well-chosen...
-
Fall 2020
Motallebi Shabestari, Mohammad Hossein
Present-day advancements in AI have often focused on improving the accuracy of classification models. One lagging aspect, however, is justifying the decisions made by those models. Recently, AI researchers have been paying more attention to filling this gap, leading to the...
-
Spring 2024
In the era of artificial intelligence, neural models have emerged as a powerful tool for tackling a wide range of tasks. However, these models are commonly regarded as black-box systems, making it difficult to understand their internal workings. The natural language explanation task seeks to...
-
Fall 2013
Delay Tolerant Mobile Networks (DTMNs) provide communication despite the occasional presence of disconnected subnetworks. They rely on finding a set of sequential opportunistic encounters between pairs of mobile nodes. In this context, understanding mobile node behaviour is essential to design...
-
Spring 2015
Sampling from a given probability distribution is a key problem in many different disciplines. Markov chain Monte Carlo (MCMC) algorithms approach this problem by constructing a random walk governed by a specially constructed transition probability distribution. As the random walk progresses, the...
-
Spring 2016
This thesis proposes, analyzes, and tests different exploration-based techniques in Greedy Best-First Search (GBFS) for satisficing planning. First, we show the potential of exploration-based techniques by combining GBFS and random walk exploration locally. We then conduct a deep analysis of how...