Feature subset selection in large dimensionality domains

Iffat A. Gheyas, Leslie S. Smith

Research output: Contribution to journal › Article

327 Citations (Scopus)

Abstract

Searching for an optimal feature subset in a high-dimensional feature space is known to be an NP-complete problem. We present a hybrid algorithm, SAGA, for this task. SAGA combines simulated annealing's ability to avoid becoming trapped in local minima, the high convergence rate of the genetic-algorithm crossover operator, the strong local-search ability of greedy algorithms, and the computational efficiency of generalized regression neural networks. We compare the performance over time of SAGA and well-known algorithms on synthetic and real datasets. The results show that SAGA outperforms the existing algorithms.
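The abstract names SAGA's ingredients but not its schedule or parameters. The Python sketch below illustrates the general shape of such a hybrid: a population evolved by crossover and mutation, a simulated-annealing acceptance test that lets worse candidates survive while the temperature is high, and a final greedy pruning pass. Everything specific here is an assumption for illustration: the cooling schedule, population size, mutation rate, and the cheap correlation-based fitness proxy. In particular, the paper scores candidate subsets with a generalized regression neural network, which is not reproduced in this sketch.

import math
import random

def fitness(mask, X, y):
    # Stand-in evaluator (assumption): mean absolute correlation of the
    # selected features with the target, minus a small size penalty.
    # SAGA itself scores subsets with a generalized regression neural
    # network; this proxy only keeps the sketch self-contained.
    chosen = [j for j, bit in enumerate(mask) if bit]
    if not chosen:
        return 0.0
    n = len(y)
    my = sum(y) / n
    total = 0.0
    for j in chosen:
        col = [row[j] for row in X]
        mx = sum(col) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        var_x = sum((a - mx) ** 2 for a in col) or 1e-12
        var_y = sum((b - my) ** 2 for b in y) or 1e-12
        total += abs(cov) / math.sqrt(var_x * var_y)
    return total / len(chosen) - 0.01 * len(chosen)

def crossover(a, b):
    # Uniform crossover: each bit is inherited from either parent.
    return [random.choice(bits) for bits in zip(a, b)]

def mutate(mask, rate=0.05):
    # Flip each bit with a small probability (assumed rate).
    return [bit ^ (random.random() < rate) for bit in mask]

def hybrid_search(X, y, pop_size=20, iters=300, t0=1.0, cooling=0.99):
    d = len(X[0])
    pop = [[random.randint(0, 1) for _ in range(d)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, X, y))
    temp = t0
    for _ in range(iters):
        # GA step: breed a child from two random parents.
        child = mutate(crossover(*random.sample(pop, 2)))
        # SA step: replace the weakest member, accepting a worse child
        # with temperature-dependent probability so the search can
        # escape local minima early on.
        weakest = min(range(pop_size), key=lambda i: fitness(pop[i], X, y))
        delta = fitness(child, X, y) - fitness(pop[weakest], X, y)
        if delta > 0 or random.random() < math.exp(delta / max(temp, 1e-9)):
            pop[weakest] = child
        if fitness(child, X, y) > fitness(best, X, y):
            best = child
        temp *= cooling
    # Greedy pass: drop any feature whose removal does not hurt fitness.
    for j in range(d):
        if best[j]:
            trial = best[:j] + [0] + best[j + 1:]
            if fitness(trial, X, y) >= fitness(best, X, y):
                best = trial
    return best

# Toy usage: 50 samples, 10 features; only feature 0 drives the target.
random.seed(0)
X = [[random.random() for _ in range(10)] for _ in range(50)]
y = [row[0] for row in X]
print(hybrid_search(X, y))  # typically keeps a subset containing feature 0

The design choice mirrored here is using the annealing temperature to gate replacement in the GA population, so early iterations explore broadly and later ones converge; how closely this matches SAGA's actual coupling of the two searches is not specified by the abstract.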

Language: English
Pages: 5-13
Number of pages: 9
Journal: Pattern Recognition
Volume: 43
Issue number: 1
Early online date: 24 Jun 2009
DOI: 10.1016/j.patcog.2009.06.009
Publication status: Published - Jan 2010
Externally published: Yes

Fingerprint

Computational efficiency
Simulated annealing
Mathematical operators
Computational complexity
Genetic algorithms
Neural networks
Local search (optimization)

Cite this

Gheyas, Iffat A.; Smith, Leslie S. / Feature subset selection in large dimensionality domains. In: Pattern Recognition. 2010; Vol. 43, No. 1. pp. 5-13.
@article{ea309706663b440bbe445223ff842b29,
title = "Feature subset selection in large dimensionality domains",
abstract = "Searching for an optimal feature subset from a high dimensional feature space is known to be an NP-complete problem. We present a hybrid algorithm, SAGA, for this task. SAGA combines the ability to avoid being trapped in a local minimum of simulated annealing with the very high rate of convergence of the crossover operator of genetic algorithms, the strong local search ability of greedy algorithms and the high computational efficiency of generalized regression neural networks. We compare the performance over time of SAGA and well-known algorithms on synthetic and real datasets. The results show that SAGA outperforms existing algorithms.",
keywords = "Curse of dimensionality, Dimensionality reduction, Feature subset selection, High dimensionality",
author = "Gheyas, {Iffat A.} and Smith, {Leslie S.}",
year = "2010",
month = jan,
doi = "10.1016/j.patcog.2009.06.009",
language = "English",
volume = "43",
pages = "5--13",
journal = "Pattern Recognition",
issn = "0031-3203",
publisher = "Elsevier Limited",
number = "1",

}
