-
Improving the reliability of reinforcement learning algorithms through biconjugate Bellman errors
Spring 2024
In this thesis, we seek to improve the reliability of reinforcement learning algorithms for nonlinear function approximation. Semi-gradient temporal difference (TD) update rules form the basis of most state-of-the-art value function learning systems despite clear counterexamples proving their...
-
Fall 2022
We have witnessed the rising popularity of real-world applications of reinforcement learning (RL). However, most successful real-world applications of RL rely on high-fidelity simulators that enable rapid iteration of prototypes, hyperparameter selection and policy training. On the other hand, RL...
-
Fall 2018
Semi-dense SLAM systems have become popular in the last few years. They can produce much denser point clouds than sparse SLAM while being computationally efficient (using only CPU). In previous works, the surface of the viewed scene was reconstructed in real-time by combining sparse SLAM system...
-
Fall 2011
This thesis addresses the problem of automatic real-time 3D reconstruction of general scenes from monocular video. Whereas many impressively accurate reconstruction techniques exist in the multi-view stereo literature, most are slow offline batch methods designed to work in highly calibrated...
-
Fall 2017
Model-free off-policy temporal-difference (TD) algorithms form a powerful component of scalable predictive knowledge representation due to their ability to learn numerous counterfactual predictions in a computationally scalable manner. In this dissertation, we address and overcome two...
-
Spring 2012
Natural language text is a prominent source of representing and communicating information and knowledge. It is often desirable to search in granularities of text that are smaller than a document or to query the syntactic roles and relationships within syntactically annotated text sentences, often...
-
Fall 2019
An accurate model of a patient’s individual survival distribution can help determine the appropriate treatment for terminal patients. Unfortunately, risk scores (e.g., from Cox Proportional Hazard models) do not provide survival probabilities, single-time probability models (e.g., the Gail model,...
-
Spring 2013
In this thesis, a framework is described that is designed to perform indoor localization in the Smart Condo (TM). A significant aspect of the framework is that it mainly operates on the basis of binary sensors - including motion sensors and occupancy sensors - and it primarily involves geometric...