Parameter Screening for Curious Reinforcement Learner Motivated by Unexpected Error

  • Curiosity is a critical component of intelligence. One method of motivating curious behaviour in computational systems is to use reinforcement learning to learn which decisions maximize the amount of unexpected error observed by a predictive component. However, reinforcement learning algorithms for prediction and control require the system designer to set multiple parameters, and it is unknown how such a curious system’s behaviour might vary with those parameter settings. Eight parameters were tested in an inscribed central composite experimental design: a learning rate, a continuation probability, and a trace decay parameter for each of prediction and control; epsilon, the probability of a random action under epsilon-greedy control; and the beta-naught parameter used to compute White’s (2015) unexpected error. The response variable was the return. The linear effects on return were significant for epsilon, the learning rate for control, the continuation probability for prediction, and beta-naught. Also significant were the quadratic interactions between epsilon and beta-naught, between epsilon and the continuation probability for prediction, between beta-naught and the continuation probability for prediction, and between the learning rate and the continuation probability for prediction.
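
    For readers unfamiliar with the design named in the abstract, the following is a minimal sketch of how an inscribed central composite design could be laid out for eight factors. It is not the author's code. The full two-level factorial core, the rotatable choice of alpha, the single centre point, and the names inscribed_ccd and decode are all assumptions made for illustration; the abstract does not state which variant was used.

```python
from itertools import product


def inscribed_ccd(k, alpha=None, n_center=1):
    """Return coded design points for an inscribed central composite design.

    Assumptions (not taken from the source): a full 2**k factorial core,
    a rotatable alpha of (2**k) ** 0.25 when none is given, and n_center
    replicated centre points.
    """
    if alpha is None:
        alpha = (2 ** k) ** 0.25
    # In the inscribed variant the axial points sit on the factor limits
    # (+/-1), so the factorial corners are pulled inward to +/-1/alpha.
    corner = 1.0 / alpha
    factorial = [list(p) for p in product((-corner, corner), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def decode(point, bounds):
    """Map a coded point in [-1, 1] to real parameter values.

    bounds is a list of (low, high) pairs, one per factor.
    """
    return [lo + (x + 1.0) / 2.0 * (hi - lo) for x, (lo, hi) in zip(point, bounds)]


if __name__ == "__main__":
    design = inscribed_ccd(8)
    # 256 factorial + 16 axial + 1 centre point under the assumptions above.
    print(len(design))
```

    Because every coded value stays within [-1, 1], a parameter such as epsilon or a continuation probability never leaves its permitted range once decoded, which is the usual reason for choosing the inscribed variant when the factor limits are hard constraints.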

  • License
    Attribution-NonCommercial-NoDerivatives 4.0 International