An agent in an adversarial, imperfect information environment must sometimes decide whether or not to take an action and, if they take the action, must choose a parameter value associated with that action. Examples include choosing to buy or sell some amount of resources or choosing whether or...
Given nothing but the generative model of the environment, Monte Carlo Tree Search techniques have recently shown spectacular results on domains previously thought to be intractable. In this thesis we try to develop generic techniques for temporal abstraction inside MCTS that would allow the...
Answer typing is an important aspect of the question answering process. Most commonly addressed with the use of a fixed set of possible answer classes via question classification, answer typing influences which answers will ultimately be selected as correct. Answer typing introduces the concept...
Designing competitive Artificial Intelligence (AI) systems for Real-Time Strategy (RTS) games often requires a large amount of expert knowledge (resulting in hard-coded rules for the AI system to follow). However, aspects of an RTS agent can be learned from human replay data. In this thesis, we...
Many important problems can be cast as state-space problems. In this dissertation we study a general paradigm for solving state-space problems which we name Cluster-and-Conquer (C&C). Algorithms that follow the C&C paradigm use the concept of equivalent states to reduce the number of states...
Technical report TR10-02. A ubiquitous feature of planning problems -- problems involving the automatic generation of action sequences for attaining a given goal -- is the need to economize limited resources such as fuel or money. While heuristic search, mostly based on standard algorithms such...
In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. While the base algorithm performs a full tree traversal on each iteration, Monte Carlo CFR (MCCFR) reduces the per...
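As background for the abstract above: CFR applies a regret-matching update at each information set. The sketch below is not CFR or MCCFR itself and is not taken from the thesis; it shows only plain regret matching in self-play on rock-paper-scissors, a one-shot game chosen here purely for illustration. The names `regret_matching`, `self_play`, and `PAYOFF`, and the initial regret values, are assumptions made for this example.

```python
# Illustrative sketch only: regret matching on a one-shot game,
# not the tree-based CFR/MCCFR algorithms the abstract refers to.

# PAYOFF[a][b] is the utility of playing action a against action b
# (0 = rock, 1 = paper, 2 = scissors). The game is symmetric, so the
# same table serves both players.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(cumulative_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret: fall back to uniform play.
        n = len(cumulative_regret)
        return [1.0 / n] * n
    return [p / total for p in positive]


def self_play(iterations, initial_regret=(1.0, 0.0, 0.0)):
    """Run regret matching for both players; return their average strategies."""
    # Player 0 starts with a hypothetical bias toward rock so that the
    # dynamics are not trivially uniform from the first iteration.
    regrets = [list(initial_regret), [0.0, 0.0, 0.0]]
    strategy_sum = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for p in range(2):
            opponent = strategies[1 - p]
            # Expected utility of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(strategies[p][a] * action_values[a] for a in range(3))
            for a in range(3):
                regrets[p][a] += action_values[a] - expected
                strategy_sum[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sum]


if __name__ == "__main__":
    for player, average in enumerate(self_play(10000)):
        print(player, [round(x, 3) for x in average])
```

In two-player zero-sum games it is the average strategy, not the final one, that approaches an equilibrium under this kind of update, which is why the sketch accumulates `strategy_sum`.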
Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality is guaranteed. For online...
Fuegito is an educational software package for learning about programming two-player games. The package provides a simple, yet flexible and extensible framework which allows students to study the core search algorithms of computer game-playing, and extend them easily in projects. The current...
Technical report TR09-13. This article presents a survey of reinforcement learning algorithms for Markov Decision Processes (MDP). In the first half of the article, the problem of value estimation is considered. Here we start by describing the idea of bootstrapping and temporal difference...
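For readers unfamiliar with the bootstrapping idea mentioned in the abstract above, a minimal sketch of a tabular TD(0) value update follows. It is a standard textbook form, not code from the report; the function name `td0_update`, the dictionary-based value table, and the step-size and discount defaults are assumptions made for this example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """Apply one tabular TD(0) update to values[state] and return the new value.

    The target bootstraps: it uses the current estimate of the next state's
    value instead of waiting for the full return.
    """
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


if __name__ == "__main__":
    # Hypothetical two-state example: one observed transition A -> B
    # with reward 0.5.
    values = {"A": 0.0, "B": 1.0}
    print(td0_update(values, "A", 0.5, "B", alpha=0.5))
```

Each call moves the estimate for `state` a fraction `alpha` of the way toward the one-step target, so repeated updates along sampled transitions gradually propagate value information backward through the state space.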