- 42 views
- 69 downloads
Learning Policies and Heuristics for Bidirectional Search
-
- Author / Creator
- Tjhia, Kenneth G
-
This thesis empirically investigates the comparative ease of learning policies and heuristics for bidirectional versus unidirectional search in satisficing classical planning. Our research explores the potential advantages of bidirectional search in terms of learnability and efficiency of the resulting guidance mechanisms. We employ a learning framework where neural networks parameterize policies and heuristics, trained on solutions generated by bidirectional and unidirectional search methods across three classical planning domains: Sliding Tile Puzzle, Witness Puzzles (Triangles and Colours), and Pancake Puzzle. We compare base algorithms that span a spectrum from pure heuristic-based to pure policy-based approaches. As models improve, they solve more problems and find different solutions, potentially enhancing the quality of training data. Experimental results suggest that learning satisficing policies and heuristics for bidirectional search is often easier and leads to more efficient search guidance. Consistently, during training, bidirectional searches achieved greater or equal solve rates after seeing fewer problems and using fewer cumulative expansions. The resulting learned bidirectional guidance mechanisms were efficient in terms of expansions and generalized well to test sets. These findings were particularly pronounced in the Witness and Pancake domains. We propose several possible explanations for these observations, including the potential for bidirectional search to yield different solution distributions, allow for complementary interaction between the two directions predictions, and reduce the effective search depth. This research contributes to the ongoing exploration of learning in planning, raising new questions about the relationship between search strategies and the ease of learning effective guidance.
-
- Subjects / Keywords
-
- Graduation date
- Fall 2024
-
- Type of Item
- Thesis
-
- Degree
- Master of Science
-
- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.