Fast hierarchical clustering from the Baire distance

Pedro Contreras, Fionn Murtagh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

The Baire or longest common prefix ultrametric allows a hierarchy, a multiway tree, or ultrametric topology embedding, to be constructed very efficiently. The Baire distance is a 1-bounded ultrametric. For high dimensional data, one approach for the use of the Baire distance is to base the hierarchy construction on random projections. In this paper we use the Baire distance on the Sloan Digital Sky Survey (SDSS, http://www.sdss.org) archive. We are addressing the regression of (high quality, more costly to collect) spectroscopic and (lower quality, more readily available) photometric redshifts. Nonlinear regression is used for mapping photometric and astrometric redshifts.
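As a rough illustration of the longest common prefix idea in the abstract, the sketch below computes a Baire-style distance between values written as digit strings and groups them by shared prefix, which gives one level of a multiway tree in a single pass. The function names, the choice of base 10, the use of equal-length digit strings, and the convention that identical strings are at distance 0 are all assumptions made for this example; it is not the authors' implementation.

```python
from collections import defaultdict


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def baire_distance(a: str, b: str, base: int = 10) -> float:
    """Baire-style distance: base ** -(longest common prefix length).

    Assumption for this sketch: identical strings are at distance 0,
    and strings that differ in their first digit are at distance 1,
    so the distance is bounded by 1.
    """
    if a == b:
        return 0.0
    return float(base) ** -common_prefix_len(a, b)


def prefix_buckets(items, depth):
    """Group digit strings by their first `depth` digits.

    Each bucket is one node at that depth of a multiway tree;
    building it takes one pass over the data.
    """
    buckets = defaultdict(list)
    for s in items:
        buckets[s[:depth]].append(s)
    return dict(buckets)


if __name__ == "__main__":
    # Hypothetical values, given as the digits after the decimal point.
    data = ["425", "427", "431", "525"]
    print(baire_distance("425", "427"))  # shares two leading digits
    print(baire_distance("425", "525"))  # shares no leading digit
    print(prefix_buckets(data, 2))
```

Increasing `depth` refines the buckets, so the sequence of groupings for depth 1, 2, 3, ... is nested and can be read as a hierarchy without any pairwise distance computations.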

Original language: English
Title of host publication: Classification as a Tool for Research
Subtitle of host publication: Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft fur Klassifikation e.V.
Editors: Hermann Locarek-Junge, Claus Weihs
Publisher: Springer Berlin
Pages: 235-243
Number of pages: 9
ISBN (Electronic): 9783642107450
ISBN (Print): 9783642107443
DOI: https://doi.org/10.1007/978-3-642-10745-0_25
Publication status: Published - 3 May 2010
Externally published: Yes
Event: 11th Biennial Conference of the International Federation of Classification Societies with the 33rd Annual Conference of the German Classification Society: Classification as a Tool for Research - Dresden, Germany
Duration: 13 Mar 2009 to 18 Mar 2009
Conference number: 11 / 33
http://www.ifcs2009.de/ (Link to Conference Information)

Publication series

Name: Studies in Classification, Data Analysis, and Knowledge Organization
Publisher: Springer
ISSN (Print): 1431-8814

Conference

Conference: 11th Biennial Conference of the International Federation of Classification Societies with the 33rd Annual Conference of the German Classification Society
Abbreviated title: IFCS / GfKl 2009
Country: Germany
City: Dresden
Period: 13/03/09 to 18/03/09

Fingerprint

Hierarchical Clustering
Topology
Random Projection
Nonlinear Regression
Prefix
High-dimensional Data
Regression
Hierarchy

Cite this

Contreras, P., & Murtagh, F. (2010). Fast hierarchical clustering from the Baire distance. In H. Locarek-Junge, & C. Weihs (Eds.), Classification as a Tool for Research : Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft fur Klassifikation e.V. (pp. 235-243). (Studies in Classification, Data Analysis, and Knowledge Organization). Springer Berlin. https://doi.org/10.1007/978-3-642-10745-0_25
@inproceedings{d2684a4d014240618fecc6d15768a5bb,
title = "Fast hierarchical clustering from the Baire distance",
abstract = "The Baire or longest common prefix ultrametric allows a hierarchy, a multiway tree, or ultrametric topology embedding, to be constructed very efficiently. The Baire distance is a 1-bounded ultrametric. For high dimensional data, one approach for the use of the Baire distance is to base the hierarchy construction on random projections. In this paper we use the Baire distance on the Sloan Digital Sky Survey (SDSS, http://www.sdss.org) archive. We are addressing the regression of (high quality, more costly to collect) spectroscopic and (lower quality, more readily available) photometric redshifts. Nonlinear regression is used for mapping photometric and astrometric redshifts.",
keywords = "artificial intelligence, biology, business intelligence, classification, data analysis, linguistics",
author = "Pedro Contreras and Fionn Murtagh",
year = "2010",
month = "5",
day = "3",
doi = "10.1007/978-3-642-10745-0_25",
language = "English",
isbn = "9783642107443",
series = "Studies in Classification, Data Analysis, and Knowledge Organization",
publisher = "Springer Berlin",
pages = "235--243",
editor = "Hermann Locarek-Junge and Claus Weihs",
booktitle = "Classification as a Tool for Research",
address = "Germany",

}



TY - GEN
T1 - Fast hierarchical clustering from the Baire distance
AU - Contreras, Pedro
AU - Murtagh, Fionn
PY - 2010/5/3
Y1 - 2010/5/3
AB - The Baire or longest common prefix ultrametric allows a hierarchy, a multiway tree, or ultrametric topology embedding, to be constructed very efficiently. The Baire distance is a 1-bounded ultrametric. For high dimensional data, one approach for the use of the Baire distance is to base the hierarchy construction on random projections. In this paper we use the Baire distance on the Sloan Digital Sky Survey (SDSS, http://www.sdss.org) archive. We are addressing the regression of (high quality, more costly to collect) spectroscopic and (lower quality, more readily available) photometric redshifts. Nonlinear regression is used for mapping photometric and astrometric redshifts.
KW - artificial intelligence
KW - biology
KW - business intelligence
KW - classification
KW - data analysis
KW - linguistics
UR - http://www.scopus.com/inward/record.url?scp=84879563260&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-10745-0_25
DO - 10.1007/978-3-642-10745-0_25
M3 - Conference contribution
SN - 9783642107443
T3 - Studies in Classification, Data Analysis, and Knowledge Organization
SP - 235
EP - 243
BT - Classification as a Tool for Research
A2 - Locarek-Junge, Hermann
A2 - Weihs, Claus
PB - Springer Berlin
ER -
