Symmetry in data mining and analysis

A unifying view based on hierarchy

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational, or otherwise empirical, domain of interest. "Structure" has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Symmetries directly point to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. As our data models change, so too do our perspectives on analyzing data. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
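As an illustration of the last sentence of the abstract, the following minimal Python sketch computes a p-adic distance on integers and checks the ultrametric (strong triangle) inequality, d(x, z) <= max(d(x, y), d(y, z)), on a small sample. It is not taken from the article; the function names `p_adic_valuation`, `p_adic_distance`, and `is_ultrametric` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import permutations


def p_adic_valuation(n: int, p: int) -> int:
    """Return the largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the p-adic valuation of 0 is infinite")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def p_adic_distance(x: int, y: int, p: int) -> Fraction:
    """Return |x - y|_p = p**(-v_p(x - y)), or 0 when x == y."""
    if x == y:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(x - y, p))


def is_ultrametric(points, p: int) -> bool:
    """Check d(x, z) <= max(d(x, y), d(y, z)) over all ordered triples of distinct points."""
    return all(
        p_adic_distance(x, z, p)
        <= max(p_adic_distance(x, y, p), p_adic_distance(y, z, p))
        for x, y, z in permutations(points, 3)
    )


if __name__ == "__main__":
    # 9 - 1 = 8 = 2**3, so the 2-adic distance between 1 and 9 is 1/8.
    print(p_adic_distance(1, 9, 2))
    print(is_ultrametric(range(16), 2))
```

Under the 2-adic distance, integers that agree in more low-order binary digits are closer together, which is one way a distance of this kind can encode a hierarchy (a binary tree of residue classes).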

Original language: English
Pages (from-to): 177-198
Number of pages: 22
Journal: Proceedings of the Steklov Institute of Mathematics
Volume: 265
Issue number: 1
DOI: 10.1134/S0081543809020175
Publication status: Published - Jul 2009
Externally published: Yes

Fingerprint

Data analysis
Data Mining
Symmetry
P-adic numbers
Data Model
Hierarchy
Topology
Invariant
Form

Cite this

@article{c450f5fecc9a4cacb027be8b180be9b5,
title = "Symmetry in data mining and analysis: A unifying view based on hierarchy",
abstract = "Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational, or otherwise empirical, domain of interest. {"}Structure{"} has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Symmetries directly point to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. As our data models change, so too do our perspectives on analyzing data. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.",
keywords = "steklov institute, terminal node, wreath product, haar wavelet, symbol sequence",
author = "Fionn Murtagh",
year = "2009",
month = "7",
doi = "10.1134/S0081543809020175",
language = "English",
volume = "265",
pages = "177--198",
journal = "Proceedings of the Steklov Institute of Mathematics",
issn = "0081-5438",
publisher = "Maik Nauka-Interperiodica Publishing",
number = "1",

}

Symmetry in data mining and analysis: A unifying view based on hierarchy. / Murtagh, Fionn.

In: Proceedings of the Steklov Institute of Mathematics, Vol. 265, No. 1, 07.2009, p. 177-198.

TY - JOUR

T1 - Symmetry in data mining and analysis

T2 - A unifying view based on hierarchy

AU - Murtagh, Fionn

PY - 2009/7

Y1 - 2009/7

AB - Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational, or otherwise empirical, domain of interest. "Structure" has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Symmetries directly point to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. As our data models change, so too do our perspectives on analyzing data. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.

KW - steklov institute

KW - terminal node

KW - wreath product

KW - haar wavelet

KW - symbol sequence

UR - http://www.scopus.com/inward/record.url?scp=70350073872&partnerID=8YFLogxK

U2 - 10.1134/S0081543809020175

DO - 10.1134/S0081543809020175

M3 - Article

VL - 265

SP - 177

EP - 198

JO - Proceedings of the Steklov Institute of Mathematics

JF - Proceedings of the Steklov Institute of Mathematics

SN - 0081-5438

IS - 1

ER -