The remarkable simplicity of very high dimensional data: Application of model-based clustering

Fionn Murtagh

Research output: Contribution to journal, Article, peer-reviewed

28 Citations (Scopus)

Abstract

An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying the extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We also discuss application to the segmentation and modeling of very high frequency time series.
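
One way to make "degree of ultrametricity" concrete is sketched below. It is an illustration only, not the coefficient used in the article, which the abstract does not specify. The assumption here is that a triangle counts as ultrametric when its two longest sides agree to within a relative tolerance (the strong triangle inequality, taken approximately). The function names, the tolerance `rel_tol`, and the uniform random test data are all hypothetical choices made for this sketch.

```python
import math
import random


def ultrametric_fraction(points, n_triplets=2000, rel_tol=0.05, seed=0):
    """Fraction of sampled triplets whose two longest sides are nearly equal.

    Assumption for this sketch: a triangle is treated as ultrametric when
    the gap between its two longest sides is at most rel_tol times the
    longest side.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_triplets):
        a, b, c = rng.sample(points, 3)
        d = sorted((math.dist(a, b), math.dist(a, c), math.dist(b, c)))
        # In an ultrametric space every triangle is isosceles with a
        # small base, so the two longest sides coincide.
        if d[2] - d[1] <= rel_tol * d[2]:
            hits += 1
    return hits / n_triplets


def uniform_cloud(n_points, dim, seed=0):
    """Hypothetical test data: points drawn uniformly from the unit hypercube."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n_points)]


if __name__ == "__main__":
    # Compare the fraction across a low and two higher dimensionalities.
    for dim in (2, 20, 500):
        pts = uniform_cloud(60, dim)
        print(dim, ultrametric_fraction(pts, n_triplets=500))
```

Under these assumptions the fraction would be expected to grow with dimensionality, because pairwise distances between random points concentrate around a common value; the exact numbers depend on the tolerance and on how the data are generated.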

Original language: English
Pages (from-to): 249-277
Number of pages: 29
Journal: Journal of Classification
Volume: 26
Issue number: 3
Publication status: Published - 1 Dec 2009
Externally published: Yes
