
Online Agent Modelling in Human-Scale Problems

  • Author / Creator
    Bard, Nolan DC
  • Ideal agent behaviour in multiagent environments depends on the behaviour of other agents. Consequently, acting to maximize utility is challenging since an agent must gather and exploit knowledge about how the other (potentially adaptive) agents behave. In this thesis, we investigate how an agent can efficiently tailor its behaviour to other agents during interaction in order to maximize its performance. This thesis presents three main contributions. First and foremost, the thesis characterizes and contrasts the traditional agent modelling approach – where practitioners explicitly estimate and subsequently respond to a generative model of an agent's behaviour – with an alternative approach called implicit modelling. Using traditional explicit modelling in complex human-scale domains is difficult since an agent must efficiently estimate sophisticated behaviours from observations that may be stochastic and partially observable. Even after estimating a generative model, it may be impractical to compute a response that is robust to modelling error during interaction. The implicit modelling framework avoids many of these challenges by estimating the utilities of a portfolio of strategies. Furthermore, implicit modelling naturally affords the opportunity to generate the portfolio offline, which provides practitioners with the time necessary for computationally expensive robust response techniques. We introduce an end-to-end approach for building an implicit modelling agent and empirically validate it in several poker domains. Second, the thesis contributes the first empirical analysis of how the granularity of an agent's representation of a multiagent environment – including its beliefs about the other agents – impacts two common objectives: performance against suboptimal agents and robustness against worst-case agents. 
We show that using asymmetric representations allows practitioners to trade off these objectives, whereas commonplace symmetric representations optimize neither. Third, we contribute a novel decision-theoretic clustering algorithm. While many existing clustering techniques optimize for spatial similarity between objects, we demonstrate that such spatial clustering can fail to capture similarity in how an agent should respond to the clusters to maximize utility. Our algorithm exploits structure in the utility function to allow for an efficient greedy approximation to this computationally hard optimization. We prove worst-case approximation bounds for our algorithm and empirically validate the approach by clustering agent behaviours in extensive-form games. These three contributions provide practitioners with a foundation of practical techniques for constructing an effective portfolio of strategies and using the portfolio to adapt an agent's behaviour. Our empirical evaluation of implicit modelling agents in a variety of poker games demonstrates that implicit modelling is an effective agent modelling approach for online real-time adaptation in complex human-scale domains.
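
The idea of implicit modelling described in the abstract (estimating the utilities of a portfolio of strategies during interaction, rather than fitting a generative model of the other agent) can be illustrated with a minimal sketch. The abstract does not specify the thesis's estimators or selection rule, so everything here is an assumption made for illustration: the class name `PortfolioSelector`, the epsilon-greedy rule, the running-mean utility estimate, and the Gaussian payoff noise in `simulate` are all hypothetical stand-ins, not the author's method.

```python
import random


class PortfolioSelector:
    """Hypothetical epsilon-greedy selector over a fixed portfolio of strategies.

    Assumption: each strategy's utility is estimated by a running mean of the
    payoffs observed when that strategy was played.
    """

    def __init__(self, n_strategies, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_strategies
        self.means = [0.0] * n_strategies

    def choose(self):
        # Explore uniformly with probability epsilon; otherwise play the
        # strategy with the highest estimated utility (first index on ties).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.means))
        return max(range(len(self.means)), key=lambda i: self.means[i])

    def update(self, index, utility):
        # Incremental running mean of the utility observed for this strategy.
        self.counts[index] += 1
        self.means[index] += (utility - self.means[index]) / self.counts[index]


def simulate(true_means, rounds=2000, seed=0):
    """Play `rounds` interactions against a stationary, noisy payoff source.

    Assumption: payoffs are Gaussian around `true_means`; a real opponent
    may be adaptive, which this toy setup does not model.
    """
    rng = random.Random(seed)
    selector = PortfolioSelector(len(true_means), epsilon=0.1, seed=seed)
    for _ in range(rounds):
        i = selector.choose()
        selector.update(i, rng.gauss(true_means[i], 1.0))
    return selector
```

In this sketch the portfolio itself is taken as given, which mirrors the abstract's point that the portfolio can be generated offline; only the per-strategy utility estimates are updated online.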

  • Subjects / Keywords
  • Graduation date
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
    • Department of Computing Science
  • Supervisor / co-supervisor and their department(s)
    • Michael Bowling (Department of Computing Science, University of Alberta)
  • Examining committee members and their departments
    • Peter Stone (Department of Computer Science, The University of Texas at Austin)
    • Martin Müller (Department of Computing Science, University of Alberta)
    • Dale Schuurmans (Department of Computing Science, University of Alberta)
    • Robert Holte (Department of Computing Science, University of Alberta)