Feature subset selection based on relevance

Hui Wang, David Bell, Fionn Murtagh

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

This paper presents an axiomatic characterisation of feature subset selection based on two axioms: a sufficiency axiom (preservation of learning information) and a necessity axiom (minimising encoding length). The sufficiency axiom concerns the existing dataset and rests on the following understanding: any selected feature subset should be able to describe the training dataset without losing information, i.e. it should be consistent with the training dataset. The necessity axiom concerns predictability and is derived from Occam's razor, which states that the simplest among different alternatives is preferred for prediction. The two axioms are then restated concisely in terms of relevance: maximising both the r(X; Y) and the r(Y; X) relevance. Based on this relevance characterisation, four feature subset selection algorithms are presented and analysed: one exhaustive and three heuristic. Experimental results are presented and are encouraging. The algorithms are also compared with some well-known feature subset selection algorithms, in particular with the feature selection mechanism built into C4.5.
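
As an illustration only, the two axioms can be sketched in a few lines of Python. This is not one of the paper's four algorithms, and it does not implement the relevance measure r(X; Y), which the abstract does not define. The sketch assumes discrete feature values, treats "consistent with the training dataset" as "no two rows agree on the selected features but differ in class label", and uses the number of selected features as a crude stand-in for encoding length. The function names is_consistent and smallest_consistent_subset are hypothetical.

```python
from itertools import combinations


def is_consistent(rows, labels, subset):
    """Sufficiency check (assumed reading): the subset is consistent if no two
    rows share the same values on `subset` while carrying different labels."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        # setdefault stores the first label seen for this key and returns it
        if seen.setdefault(key, label) != label:
            return False
    return True


def smallest_consistent_subset(rows, labels):
    """Exhaustive search, smallest subsets first (subset size is an assumed
    proxy for encoding length). Returns a tuple of feature indices, or None
    if even the full feature set is inconsistent with the labels."""
    n = len(rows[0]) if rows else 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_consistent(rows, labels, subset):
                return subset
    return None


if __name__ == "__main__":
    # Toy data: the label is the XOR of features 0 and 1; feature 2 is redundant.
    rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
    labels = [0, 1, 1, 0]
    print(smallest_consistent_subset(rows, labels))
```

The exhaustive search examines up to 2^n subsets for n features, which is why heuristic alternatives such as the three mentioned in the abstract are of practical interest.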

Original language: English
Pages (from-to): 387-396
Number of pages: 10
Journal: Vistas in Astronomy
Volume: 41
Issue number: 3
Publication status: Published - 1997
Externally published: Yes
