In this paper an axiomatic characterisation of feature subset selection is presented. Two axioms are presented: sufficiency axiom - preservation of learning information, and necessity axiom - minimising encoding length. The sufficiency axiom concerns the existing dataset and is derived based on the following understanding: any selected feature subset should be able to describe the training dataset without losing information, i.e. it is consistent with the training dataset. The necessity axiom concerns the predictability and is derived from Occam's razor, which states that the simplest among different alternatives is preferred for prediction. The two axioms are then restated in terms of relevance in a concise form: maximising both the r(X; Y) and r(Y; X) relevance. Based on the relevance characterisation, four feature subset selection algorithms are presented and analysed: one is exhaustive and the remaining three are heuristic. Experimentation is also presented and the results are encouraging. Comparison is also made with some well-known feature subset selection algorithms, in particular, with the built-in feature selection mechanism in C4.5.