Bootstrap Learning of Heuristic Functions

Jabbari Arfaee, Shahab

doi:doi:10.7939/R3QP6J

This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

View

Download

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Usage

480 views
526 downloads

Bootstrap Learning of Heuristic Functions

Author / Creator

Jabbari Arfaee, Shahab
We investigate the use of machine learning to create effective
heuristics for single-agent search. Our method aims
to generate a sequence of heuristics from a given weak heuristic
h{0} and a set of unlabeled training instances using a bootstrapping procedure. The training
instances that can be solved using h{0} provide training examples
for a learning algorithm that produces a heuristic h{1} that
is expected to be stronger than h{0}. If h{0}
is so weak that it cannot solve any of the given instances
we use random walks backward from the goal state to create a sequence
of successively more difficult training instances starting with ones
that are guaranteed to be solvable by h{0}.
The bootstrap process is then repeated using h{i} instead of h{i-1}
until a sufficiently strong heuristic is produced.
We test this method on the 15- and
24-sliding tile puzzles, the 17- , 24- , and 35-pancake puzzles, Rubik's Cube, and the 15- and 20-blocks world.
In every case our method produces
heuristics that allow IDA* to solve randomly generated problem instances quickly with solutions very close to optimal.

The total time for the bootstrap process to create strong heuristics for large problems is several days.
To make the process efficient when only a single test instance needs to be solved, we look
for a balance in the time spent on learning better heuristics and the time needed to solve the test instance using the current set of learned heuristics.
%We use two threads in parallel,
We alternate between the execution of two threads, namely the learning thread (to learn better heuristics) and the solving thread (to solve the test instance).
The solving thread is split up into sub-threads.
The first solving sub-thread aims at solving the instance using the initial heuristic.
When a new heuristic is learned in the learning thread, an additional solving sub-thread is started
which uses the new heuristic to try to solve the instance.
The total time by which we evaluate this process is the sum of the times used by both threads
up to the point when the instance is solved in one sub-thread.
The experimental results of this method on large search spaces demonstrate that the single instance of large problems are solved substantially faster
than the total time needed for the bootstrap process while the solutions obtained are still very close to optimal.
Subjects / Keywords
Graduation date

Fall 2010
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/R3QP6J
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Robert C. Holte (Computing Science)
- Sandra Zilles (Computer Science, University of Regina)
Examining committee members and their departments
- Sean Gouglas (Department of History & Classics)
- Jonathan Schaeffer (Computing Science)