Hierarchical Clustering for Finding Symmetries and Other Patterns in Massive, High Dimensional Datasets

Fionn Murtagh, Pedro Contreras

Research output: Chapter in Book/Report/Conference proceeding, Chapter, peer-review

4 Citations (Scopus)

Abstract

Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. "Structure" can be understood as symmetry, and a range of symmetries are expressed by hierarchy. Such symmetries point directly to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. We review many aspects of hierarchy here, including ultrametric topology, generalized ultrametrics, and linkages with lattices and other discrete algebraic structures and with p-adic number representations. By focusing on symmetries in data we have a powerful means of structuring and analyzing massive, high dimensional data stores. We illustrate the power of hierarchical clustering in case studies in chemistry and finance, and we provide pointers to other published case studies.

Original language: English
Title of host publication: Data Mining
Subtitle of host publication: Foundations and Intelligent Paradigms
Editors: Dawn Holmes, Lakhmi Jain
Publisher: Springer Verlag
Chapter: 5
Pages: 95-130
Number of pages: 36
Volume: 1: Clustering, Association and Classification
ISBN (Electronic): 9783642231667
ISBN (Print): 9783642231650
Publication status: Published - 1 Dec 2012
Externally published: Yes

Publication series

Name: Intelligent Systems Reference Library
Volume: 23
ISSN (Print): 1868-4394
ISSN (Electronic): 1868-4408
