Abstract
High-dimensional data typify pattern recognition problems in bioinformatics, information retrieval, and various other fields. We discuss metric space properties in the context of four scenarios: increased dimensionality leading to larger dissimilarities; uniform and Gaussian distributed points in the context of increasing dimensionality; and how pivot-based search can be understood in high dimensions. Conclusions include: (i) preprocessing using an ultrametric data structure (i.e., resulting from a hierarchical clustering) can lead to far faster proximity searching, among other operations; (ii) a locally ultrametric topology is targeted by pivot-based branch and bound searching; but (iii) high-dimensional, structureless data (e.g., uniformly or Gaussian distributed) also become ultrametric.
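Conclusion (iii) can be illustrated with a small numerical experiment. The sketch below is not taken from the paper: the function name `mean_ultrametric_gap`, the sampling scheme (uniform points in the unit hypercube), and the choice of statistic are all assumptions made for illustration. It relies on the fact that in an ultrametric space every triangle has its two longest sides equal, so the relative gap between those two sides gives one rough indication of how far a point set is from being ultrametric.

```python
import math
import random


def mean_ultrametric_gap(dim, n_triples=2000, seed=0):
    """Average relative gap between the two longest sides of random triangles.

    Hypothetical helper, not from the paper. A value near 0 means that
    triangles are close to satisfying the ultrametric (strong triangle)
    inequality, in which the two longest sides are equal.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_triples):
        # Three points drawn uniformly from the unit hypercube [0, 1]^dim.
        x, y, z = ([rng.random() for _ in range(dim)] for _ in range(3))
        # Euclidean side lengths of the triangle, sorted ascending.
        sides = sorted((math.dist(x, y), math.dist(x, z), math.dist(y, z)))
        # Relative difference between the longest and second-longest side.
        total += (sides[2] - sides[1]) / sides[2]
    return total / n_triples


if __name__ == "__main__":
    for dim in (2, 20, 200):
        print(dim, mean_ultrametric_gap(dim, n_triples=500))
```

Under these assumptions the reported gap should shrink as `dim` grows, because pairwise distances between uniformly distributed points concentrate around a common value. The paper may quantify ultrametricity differently; this statistic is only one simple proxy.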
Original language | English |
---|---|
Title of host publication | Proceedings of the 2005 UK Workshop on Computational Intelligence |
Subtitle of host publication | UKCI 2005 |
Editors | Boris Mirkin, George Magoulas |
Publisher | Birkbeck, University of London |
Pages | 226-231 |
Number of pages | 6 |
Publication status | Published - 2005 |
Externally published | Yes |
Event | UK Workshop on Computational Intelligence, Birkbeck, University of London, London, United Kingdom; Duration: 5 Sep 2005 → 7 Sep 2005; http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.3521&rep=rep1&type=pdf |
Conference
Conference | UK Workshop on Computational Intelligence |
---|---|
Abbreviated title | UKCI 2005 |
Country/Territory | United Kingdom |
City | London |
Period | 5/09/05 → 7/09/05 |