Exploring the Hidden Challenges Associated with the Evaluation of Multi-class Datasets Using Multiple Classifiers

Shamaila Iram, Dhiya Al-jumeily, Paul Fergus, Abir Hussain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Optimizing and evaluating a pattern recognition system requires addressing problems such as multi-class and imbalanced datasets. This paper presents the classification of multi-class datasets, which pose more challenges than binary-class datasets in machine learning. Furthermore, it argues that evaluating a classification model for multi-class imbalanced datasets in terms of a simple "accuracy rate" can give misleading results. Other parameters, such as failure avoidance, true identification of positive and negative instances of a class, and class discrimination, are also very important. In this paper, we hypothesize that "misclassification of true positive patterns should not necessarily be categorized as false negative while evaluating a classifier for multi-class datasets", although categorizing them that way is a common practice in the existing literature. To address these hidden challenges for the generalization of a particular classifier, several evaluation metrics are compared on a multi-class dataset with four classes: three belong to different neurodegenerative diseases and one to control subjects. Three classifiers (linear discriminant, quadratic discriminant, and Parzen) are selected to demonstrate the results with examples.
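The abstract's point about accuracy can be illustrated with a minimal sketch. It assumes a hypothetical four-class confusion matrix (one control class and three disease classes) whose counts are invented for illustration and are not results from the paper. It uses the conventional one-vs-rest counting, in which every missed instance of a class is a false negative for that class; this is the practice the paper questions, not the authors' proposed alternative. The names `LABELS`, `CONFUSION`, `overall_accuracy`, `one_vs_rest_counts`, and `per_class_report` are illustrative.

```python
# Hypothetical 4-class confusion matrix (rows = true class, columns = predicted).
# The counts are invented for illustration; they are NOT results from the paper.
LABELS = ["control", "disease_A", "disease_B", "disease_C"]
CONFUSION = [
    [90, 4, 3, 3],  # 100 control subjects
    [6, 2, 1, 1],   # 10 subjects with disease_A
    [5, 1, 3, 1],   # 10 subjects with disease_B
    [4, 1, 1, 4],   # 10 subjects with disease_C
]


def overall_accuracy(cm):
    """Fraction of all instances that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest_counts(cm, k):
    """Return (TP, FN, FP, TN) for class k, treating all other classes as 'rest'."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                   # true class k, predicted as something else
    fp = sum(row[k] for row in cm) - tp    # other classes predicted as k
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


def per_class_report(cm, labels):
    """Sensitivity and specificity for each class under one-vs-rest counting."""
    report = {}
    for k, name in enumerate(labels):
        tp, fn, fp, tn = one_vs_rest_counts(cm, k)
        report[name] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
            "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        }
    return report


if __name__ == "__main__":
    print(f"accuracy = {overall_accuracy(CONFUSION):.3f}")
    for name, scores in per_class_report(CONFUSION, LABELS).items():
        print(f"{name}: sensitivity={scores['sensitivity']:.2f}, "
              f"specificity={scores['specificity']:.2f}")
```

With these made-up counts the overall accuracy is about 0.76, yet only 2 of the 10 `disease_A` subjects are identified correctly (sensitivity 0.20), which is the kind of imbalance effect a single accuracy figure hides.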
Language: English
Title of host publication: 2014 Eighth International Conference on Complex, Intelligent and Software Intensive Systems (CISIS)
Publisher: IEEE
Pages: 346-352
Number of pages: 7
ISBN (Electronic): 9781479943258
DOI: https://doi.org/10.1109/CISIS.2014.48
Publication status: Published - Jul 2014
Externally published: Yes
Event: 8th International Conference on Complex, Intelligent and Software Intensive Systems - Birmingham City University, Birmingham, United Kingdom
Duration: 2 Jul 2014 to 4 Jul 2014
Conference number: 8
http://voyager.ce.fit.ac.jp/conf/cisis/2014/ (Link to Conference Website)

Conference

Conference: 8th International Conference on Complex, Intelligent and Software Intensive Systems
Abbreviated title: CISIS
Country: United Kingdom
City: Birmingham
Period: 2/07/14 to 4/07/14
Internet address: http://voyager.ce.fit.ac.jp/conf/cisis/2014/

Fingerprint

Classifiers
Neurodegenerative diseases
Pattern recognition systems
Learning systems

Cite this

Iram, S., Al-jumeily, D., Fergus, P., & Hussain, A. (2014). Exploring the Hidden Challenges Associated with the Evaluation of Multi-class Datasets Using Multiple Classifiers. In 2014 Eighth International Conference on Complex, Intelligent and Software Intensive Systems (CISIS) (pp. 346-352). IEEE. https://doi.org/10.1109/CISIS.2014.48
@inproceedings{7b5ced2a1fc243e5a418a9c982798c07,
title = "Exploring the Hidden Challenges Associated with the Evaluation of Multi-class Datasets Using Multiple Classifiers",
keywords = "classifier evaluation, multi-class dataset, multiple classifiers, neurodegenerative diseases, pattern recognition",
author = "Shamaila Iram and Dhiya Al-jumeily and Paul Fergus and Abir Hussain",
year = "2014",
month = "7",
doi = "10.1109/CISIS.2014.48",
language = "English",
pages = "346--352",
booktitle = "2014 Eighth International Conference on Complex, Intelligent and Software Intensive Systems (CISIS)",
publisher = "IEEE",

}
