TY - JOUR
T1 - Feature subset selection based on relevance
AU - Wang, Hui
AU - Bell, David
AU - Murtagh, Fionn
PY - 1997
Y1 - 1997
AB - This paper presents an axiomatic characterisation of feature subset selection based on two axioms: the sufficiency axiom (preservation of learning information) and the necessity axiom (minimising encoding length). The sufficiency axiom concerns the existing dataset and rests on the following understanding: any selected feature subset should be able to describe the training dataset without losing information, i.e. it should be consistent with the training dataset. The necessity axiom concerns predictability and is derived from Occam's razor, which states that the simplest among different alternatives is preferred for prediction. The two axioms are then restated concisely in terms of relevance: maximising both the r(X; Y) and r(Y; X) relevance. Based on this relevance characterisation, four feature subset selection algorithms are presented and analysed: one is exhaustive and the remaining three are heuristic. Experimental results are also presented and are encouraging. A comparison is made with some well-known feature subset selection algorithms, in particular with the built-in feature selection mechanism in C4.5.
KW - Classification (of information)
KW - Algorithms
KW - Wavelet transforms
UR - http://www.scopus.com/inward/record.url?scp=33750393171&partnerID=8YFLogxK
U2 - 10.1016/S0083-6656(97)00043-3
DO - 10.1016/S0083-6656(97)00043-3
M3 - Article
AN - SCOPUS:33750393171
VL - 41
SP - 387
EP - 396
JO - New Astronomy Reviews
JF - New Astronomy Reviews
SN - 1387-6473
IS - 3
ER -