ERA

Permanent link (DOI): https://doi.org/10.7939/R3MC8RH41

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Speedup Clustering with Hierarchical Ranking (Open Access)

Descriptions

Author or creator
Zhou, Jianjun
Sander, Joerg
Additional contributors
Subject/Keyword
clustering algorithms
hierarchical ranking
Type of item
Computing Science Technical Report
Computing science technical report ID
TR08-09
Language
English
Place
Time
Description
Technical report TR08-09. Many clustering algorithms, in particular hierarchical clustering algorithms, do not scale up well to large data sets, especially when an expensive distance function is used. In this paper, we propose a novel approach to perform approximate clustering with high accuracy. We introduce the concept of a pairwise hierarchical ranking to efficiently determine close neighbors for every data object. We also propose two techniques to significantly reduce the overhead of ranking: 1) a frontier search, rather than the sequential scan used in the naïve ranking, to reduce the search space; 2) based on this exact search, an approximate frontier search for pairwise ranking that further reduces the runtime. Empirical results on synthetic and real-life data show a speedup of up to two orders of magnitude over OPTICS while maintaining high accuracy, and of up to one order of magnitude over the previously proposed DATA BUBBLES method, which also tries to speed up OPTICS by trading accuracy for speed.
Date created
2008
DOI
doi:10.7939/R3MC8RH41
License information
Creative Commons Attribution 3.0 Unported
Rights

Citation for previous publication

Source
Link to related item

File Details

Date Uploaded
Date Modified
2014-04-29T17:41:34.476+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 182320
Last modified: 2015:10:12 16:50:56-06:00
Filename: TR08-09.pdf
Original checksum: a350bd65b38d7e92cadd9349681a264d
Well formed: true
Valid: true
File title: Microsoft Word - jul10k6roptics0045.doc
File author: net
Page count: 12