Published by the Foundation for Open Access Statistics
Editors-in-chief: Bettina Grün, Torsten Hothorn, Edzer Pebesma, Achim Zeileis    ISSN 1548-7660; CODEN JSSOBK
Spherical k-Means Clustering | Hornik | Journal of Statistical Software
Authors: Kurt Hornik, Ingo Feinerer, Martin Kober, Christian Buchta
Title: Spherical k-Means Clustering
Abstract: Clustering text documents is a fundamental task in modern data analysis, requiring approaches that perform well in terms of both solution quality and computational efficiency. Spherical k-means clustering is one approach that addresses both issues, employing cosine dissimilarities to perform prototype-based partitioning of term weight representations of the documents.

This paper presents the theory underlying the standard spherical k-means problem and suitable extensions, and introduces the R extension package skmeans, which provides a computational environment for spherical k-means clustering featuring several solvers: a fixed-point algorithm, a genetic algorithm, and interfaces to two external solvers (CLUTO and Gmeans). The performance of these solvers is investigated by means of a large-scale benchmark experiment.
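
As a rough illustration of the idea summarized in the abstract, the sketch below shows one way a fixed-point iteration for spherical k-means could look: documents are scaled to unit length, each is assigned to the prototype with the smallest cosine dissimilarity, and each prototype is then replaced by the normalized sum of its members. This is a minimal pure-Python sketch and not the implementation in the skmeans package; the function names (normalize, cosine_dissimilarity, spherical_kmeans), the random initialization, and the n_iter and seed parameters are assumptions made for illustration only.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_dissimilarity(x, p):
    """Return 1 - cos(x, p); assumes both arguments already have unit length."""
    return 1.0 - sum(a * b for a, b in zip(x, p))


def spherical_kmeans(rows, k, n_iter=100, seed=0):
    """Hypothetical fixed-point solver: alternate assignment and prototype updates."""
    rng = random.Random(seed)
    data = [normalize(r) for r in rows]
    # Assumption: initialize prototypes with k randomly chosen documents.
    prototypes = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest prototype under cosine dissimilarity.
        new_labels = [
            min(range(k), key=lambda j: cosine_dissimilarity(x, prototypes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so a fixed point has been reached
        labels = new_labels
        # Update step: each prototype becomes the normalized sum of its members.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous prototype
                sums = [sum(col) for col in zip(*members)]
                prototypes[j] = normalize(sums)
    return labels, prototypes
```

Calling spherical_kmeans(rows, k=2) on a small list of term weight vectors returns one cluster label per document together with the unit-length prototypes. Real term-document matrices are large and sparse, so a practical solver would use sparse data structures rather than nested lists.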

Page views: 10097. Submitted: 2010-11-19. Published: 2012-09-18.
Paper: Spherical k-Means Clustering, Download PDF (Downloads: 11965)
skmeans_0.2-3.tar.gz: R source package Download (Downloads: 566; 193KB)
Replication materials (code for examples/simulations and data): Download (Downloads: 503; 64KB)
tm.corpus.Oz.Books_2011.10.24.tar.gz: R source package Download (Downloads: 526; 1MB)

DOI: 10.18637/jss.v050.i10

This work is licensed under the following licenses:
Paper: Creative Commons Attribution 3.0 Unported License
Code: GNU General Public License (at least one of version 2 or version 3) or a GPL-compatible license.