Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

Research output: Contribution to journal › Article

Abstract

This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects, such as elliptical, spiral, active, and merging galaxies, at a wide range of redshifts. Our aim is matching and similarity-based analytics that take account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree in which the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach and to demonstrate its computational benefits, we address the well-known photometric redshift or 'photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
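
The idea of binning values by measurement precision and regressing within the matched bin can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single scalar predictor, decimal-digit truncation as the hierarchy, and the bin mean as the cluster-wise estimate. The function names (prefix_key, build_hierarchy, predict) and the toy numbers are hypothetical and are not SDSS data.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_FLOOR


def prefix_key(x, digits):
    """Truncate x to `digits` decimal places and return the result as an integer bin key."""
    scaled = Decimal(str(x)) * (10 ** digits)
    return int(scaled.to_integral_value(rounding=ROUND_FLOOR))


def build_hierarchy(xs, ys, max_digits=3):
    """Map each precision level to {bin key: list of response values in that bin}."""
    levels = {}
    for d in range(max_digits + 1):
        bins = defaultdict(list)
        for x, y in zip(xs, ys):
            bins[prefix_key(x, d)].append(y)
        levels[d] = dict(bins)
    return levels


def predict(levels, x):
    """Return (mean response, matched precision) for the finest non-empty bin containing x."""
    for d in sorted(levels, reverse=True):
        members = levels[d].get(prefix_key(x, d))
        if members:
            return sum(members) / len(members), d
    return None, None  # no bin shares even the coarsest prefix


# Toy data (assumed values): predictor xs, response ys.
xs = [0.123, 0.127, 0.131, 0.452]
ys = [0.10, 0.12, 0.20, 0.50]
levels = build_hierarchy(xs, ys, max_digits=3)
estimate, precision = predict(levels, 0.125)
```

In this sketch a query that matches a training value at the finest precision behaves like a nearest neighbour lookup, while a query that only matches at a coarser precision is estimated from the whole bin, which is one simple reading of cluster-wise regression on a precision hierarchy.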

Original language: English
Pages (from-to): 145-155
Number of pages: 11
Journal: Proceedings of the International Astronomical Union
Volume: 12
Issue number: S325
Publication status: Published - 1 Oct 2016
Externally published: Yes
