A Parameter Selection Framework For Semi-Supervised Clustering Algorithms
- Author / Creator
- Pourrajabi, Mojgan
Many clustering techniques require parameter settings, and depending on an algorithm's sensitivity to a parameter, the choice of its value can be very important. Several approaches have been proposed to find the “best” value of the clustering parameter for different unsupervised clustering methods.
We introduce a general method, denoted as “Cross-validation framework for finding clustering parameters” (CVCP). Given a data set, CVCP selects the “best” parameter value for a semi-supervised clustering method based on available constraints or labels that are given as input to a semi-supervised clustering method. CVCP is evaluated based on selecting the “best” value of k for a semi-supervised Kmeans-based clustering algorithm and the “best” value of MinPts for a semi-supervised density-based clustering algorithm. Our experimental results show that using the framework to select parameters can significantly improve the expected performance of a semi-supervised clustering method when appropriate parameter values often have to be “guessed”.
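The general idea of choosing a parameter by cross-validating over labeled objects can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the names `seeded_kmeans`, `pairwise_f`, and `select_param` are hypothetical, the clusterer is a toy label-seeded k-means rather than the algorithms evaluated in the thesis, and the pairwise F-measure over held-out labels is an assumed scoring choice.

```python
import random
from itertools import combinations


def seeded_kmeans(points, train_labels, k, seed=0, iters=25):
    """Toy stand-in for a semi-supervised clusterer (an assumption, not the
    thesis's algorithm): k-means whose initial centers are labeled training
    points, one per known class, padded with random points up to k."""
    rng = random.Random(seed)
    first_of_class = {}
    for i, y in train_labels.items():
        first_of_class.setdefault(y, i)
    idx = list(first_of_class.values())[:k]
    rest = [i for i in range(len(points)) if i not in idx]
    idx += rng.sample(rest, k - len(idx))
    centers = [points[i] for i in idx]

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return [nearest(p) for p in points]


def pairwise_f(assign, held_out):
    """F-measure over pairs of held-out labeled objects; a pair is positive
    when both objects share a class label (an implied must-link)."""
    tp = fp = fn = 0
    for (i, yi), (j, yj) in combinations(held_out.items(), 2):
        same_class, same_cluster = yi == yj, assign[i] == assign[j]
        tp += same_class and same_cluster
        fp += (not same_class) and same_cluster
        fn += same_class and not same_cluster
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_param(points, labels, candidates, cluster_fn=seeded_kmeans, n_folds=3):
    """Return the candidate value with the best mean held-out score."""
    items = sorted(labels)  # indices of the labeled objects
    # Interleaved folds, no shuffling, so the sketch stays deterministic.
    folds = [items[f::n_folds] for f in range(n_folds)]

    def mean_score(param):
        scores = []
        for fold in folds:
            test = {i: labels[i] for i in fold}
            train = {i: y for i, y in labels.items() if i not in test}
            # Only the training labels reach the clusterer; the held-out
            # labels are used solely to score the resulting partition.
            scores.append(pairwise_f(cluster_fn(points, train, param), test))
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)
```

Here `points` is a list of coordinate tuples and `labels` maps the indices of the labeled objects to class labels, so a call such as `select_param(points, labels, [2, 3, 4, 5])` returns the candidate k with the highest average held-out score. Any clusterer with the same `(points, train_labels, param)` signature could be passed as `cluster_fn`, for example one parameterized by MinPts instead of k.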
- Graduation date
- Fall 2013
- Type of Item
- Master of Science
- This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.