Authors: Carlotta Domeniconi, Dimitris Papadopoulos, Dimitrios Gunopulos, Sheng Ma
Title: Subspace Clustering of High Dimensional Data
Conference: SIAM International Conference on Data Mining (SDM)
Year: 2004
Abstract: Clustering suffers from the curse of dimensionality, and
similarity functions that use all input features with equal
relevance may not be effective. We introduce an algorithm that
discovers clusters in subspaces spanned by different combinations
of dimensions via local weightings of features. This
approach avoids the risk of loss of information encountered
in global dimensionality reduction techniques, and does not
assume any data distribution model. Our method associates with
each cluster a weight vector whose values capture
the relevance of features within the corresponding cluster.
We experimentally demonstrate the gain in performance our
method achieves, using both synthetic and real data sets. In
particular, our results show the feasibility of the proposed
technique to perform simultaneous clustering of genes and
conditions in microarray data.