TY - JOUR
T1 - Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks
AU - Zhang, Kui
AU - Li, Yuhua
AU - Scarf, Philip
AU - Ball, Andrew
PY - 2011/10
Y1 - 2011/10
N2 - The technique of machinery fault diagnosis has been greatly enhanced over recent years with the application of many pattern classification methods. However, these classification methods suffer from the "curse of dimensionality" when applied to high-dimensional fault diagnosis data. In order to solve the problem, this paper proposes a hybrid model which combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features. They include data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives, and lead to different ranking results. Based on the effect of the ranking results on the Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are utilized to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.
AB - The technique of machinery fault diagnosis has been greatly enhanced over recent years with the application of many pattern classification methods. However, these classification methods suffer from the "curse of dimensionality" when applied to high-dimensional fault diagnosis data. In order to solve the problem, this paper proposes a hybrid model which combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features. They include data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives, and lead to different ranking results. Based on the effect of the ranking results on the Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are utilized to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.
KW - Binary search
KW - Fault diagnosis
KW - Feature selection
KW - Radial basis function networks
KW - Sequential backward search
UR - http://www.scopus.com/inward/record.url?scp=80052944609&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2011.03.043
DO - 10.1016/j.neucom.2011.03.043
M3 - Article
AN - SCOPUS:80052944609
VL - 74
SP - 2941
EP - 2952
JO - Neurocomputing
JF - Neurocomputing
SN - 0925-2312
IS - 17
ER -