Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks

Kui Zhang, Yuhua Li, Philip Scarf, Andrew Ball

Research output: Contribution to journal, Article

46 Citations (Scopus)

Abstract

Machinery fault diagnosis has been greatly enhanced in recent years by the application of many pattern classification methods. However, these classification methods suffer from the "curse of dimensionality" when applied to high-dimensional fault diagnosis data. To address this problem, this paper proposes a hybrid model that combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features: data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives and lead to different ranking results. Based on the effect of the ranking results on Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank the features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are used to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.
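The pipeline outlined in the abstract (filter ranking, weighted voting, wrapper search) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only two of the eight filters (variance and Fisher score), the vote weights are hypothetical fixed numbers rather than values derived from RBF classification results, a nearest-centroid classifier stands in for the RBF network, and the backward search is one simple reading of SBS. All function names and the toy data are assumptions made for illustration.

```python
from statistics import mean, pvariance

def variance_scores(X):
    """Filter 1: population variance of each feature column."""
    return [pvariance(col) for col in zip(*X)]

def fisher_scores(X, y):
    """Filter 2: between-class scatter divided by within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for col in zip(*X):
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def to_ranks(scores):
    """Turn scores into ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks

def weighted_vote(rankings, weights):
    """Re-rank features by a weighted sum of (n - rank) points per filter."""
    n = len(rankings[0])
    points = [sum(w * (n - r[j]) for r, w in zip(rankings, weights))
              for j in range(n)]
    return sorted(range(n), key=lambda j: -points[j])

def centroid_accuracy(X, y, features):
    """Stand-in evaluator (NOT an RBF network): nearest-centroid training accuracy."""
    classes = sorted(set(y))
    centroids = {
        c: [mean(row[j] for row, label in zip(X, y) if label == c)
            for j in features]
        for c in classes
    }
    correct = 0
    for row, label in zip(X, y):
        pred = min(classes, key=lambda c: sum(
            (row[j] - m) ** 2 for j, m in zip(features, centroids[c])))
        correct += pred == label
    return correct / len(y)

def backward_search(X, y, ranked):
    """Drop features from the bottom of the ranking while accuracy does not fall."""
    selected = list(ranked)
    best = centroid_accuracy(X, y, selected)
    for j in reversed(ranked):
        if len(selected) == 1:
            break
        trial = [k for k in selected if k != j]
        acc = centroid_accuracy(X, y, trial)
        if acc >= best:
            selected, best = trial, acc
    return selected, best

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is
# nearly constant, feature 2 has high variance but carries little class signal.
X = [[0.1, 5.0, 10.0], [0.2, 5.1, -10.0], [0.0, 4.9, 9.0],
     [1.0, 5.0, -9.0], [1.1, 5.1, 10.0], [0.9, 4.9, -10.0]]
y = [0, 0, 0, 1, 1, 1]

rankings = [to_ranks(variance_scores(X)), to_ranks(fisher_scores(X, y))]
weights = [1.0, 2.0]  # hypothetical weights, one per filter
ranked = weighted_vote(rankings, weights)
selected, accuracy = backward_search(X, y, ranked)
print(ranked, selected, accuracy)
```

On this toy data the two filters disagree (variance favours feature 2, Fisher score favours feature 0), which is the situation the voting step is meant to reconcile; the wrapper step then prunes features that do not help the stand-in classifier.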

Language: English
Pages: 2941-2952
Number of pages: 12
Journal: Neurocomputing
Volume: 74
Issue number: 17
DOI: 10.1016/j.neucom.2011.03.043
Publication status: Published - Oct 2011

Fingerprint

Radial basis function networks
Failure analysis
Machinery
Feature extraction
Politics
Pattern recognition

Cite this

@article{89750b63231542b0b992f3874712cfb0,
title = "Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks",
abstract = "Machinery fault diagnosis has been greatly enhanced in recent years by the application of many pattern classification methods. However, these classification methods suffer from the {"}curse of dimensionality{"} when applied to high-dimensional fault diagnosis data. To address this problem, this paper proposes a hybrid model that combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features: data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives and lead to different ranking results. Based on the effect of the ranking results on Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank the features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are used to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.",
keywords = "Binary search, Fault diagnosis, Feature selection, Radial basis function networks, Sequential backward search",
author = "Kui Zhang and Yuhua Li and Philip Scarf and Andrew Ball",
year = "2011",
month = "10",
doi = "10.1016/j.neucom.2011.03.043",
language = "English",
volume = "74",
pages = "2941--2952",
journal = "Neurocomputing",
issn = "0925-2312",
publisher = "Elsevier",
number = "17",
}

Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks. / Zhang, Kui; Li, Yuhua; Scarf, Philip; Ball, Andrew.

In: Neurocomputing, Vol. 74, No. 17, 10.2011, p. 2941-2952.

TY - JOUR
T1 - Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks
AU - Zhang, Kui
AU - Li, Yuhua
AU - Scarf, Philip
AU - Ball, Andrew
PY - 2011/10
Y1 - 2011/10
N2 - Machinery fault diagnosis has been greatly enhanced in recent years by the application of many pattern classification methods. However, these classification methods suffer from the "curse of dimensionality" when applied to high-dimensional fault diagnosis data. To address this problem, this paper proposes a hybrid model that combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features: data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives and lead to different ranking results. Based on the effect of the ranking results on Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank the features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are used to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.
KW - Binary search
KW - Fault diagnosis
KW - Feature selection
KW - Radial basis function networks
KW - Sequential backward search
UR - http://www.scopus.com/inward/record.url?scp=80052944609&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2011.03.043
DO - 10.1016/j.neucom.2011.03.043
M3 - Article
VL - 74
SP - 2941
EP - 2952
JO - Neurocomputing
T2 - Neurocomputing
JF - Neurocomputing
SN - 0925-2312
IS - 17
ER -